With next-generation sequencing technologies, experiments that were considered prohibitive only a few years ago are now possible. However, while these technologies have the ability to produce enormous volumes of data, the sequence reads are prone to error. This poses fundamental hurdles when genetic diversity is investigated.
We developed ShoRAH, a computational method for quantifying genetic diversity in a mixed sample and for identifying the individual clones in the population, while accounting for sequencing errors. The software was run on simulated data and on real data obtained in wet lab experiments to assess its reliability.
ShoRAH is implemented in C++, Python, and Perl and has been tested under Linux and Mac OS X. Source code is available under the GNU General Public License at http://www.cbg.ethz.ch/software/shorah.
Circadian rhythms are a nearly universal feature of living organisms and affect almost every biological process. Our innate preference for mornings or evenings is determined by the phase of our circadian rhythms. We conduct a genome-wide association analysis of self-reported morningness, followed by analyses of biological pathways and related phenotypes. We identify 15 significantly associated loci, including seven near established circadian genes (rs12736689 near RGS16, P=7.0 × 10^-18; rs9479402 near VIP, P=3.9 × 10^-11; rs55694368 near PER2, P=2.6 × 10^-9; rs35833281 near HCRTR2, P=3.7 × 10^-9; rs11545787 near RASD1, P=1.4 × 10^-8; rs11121022 near PER3, P=2.0 × 10^-8; rs9565309 near FBXL3, P=3.5 × 10^-8). Circadian and phototransduction pathways are enriched in our results. Morningness is associated with insomnia and other sleep phenotypes, and with body mass index and depression, but we did not find evidence for a causal relationship in our Mendelian randomization analysis. Our findings reinforce current understanding of circadian biology and will guide future studies.
Infectious diseases have a profound impact on our health and many studies suggest that host genetics play a major role in the pathogenesis of most of them. We perform 23 genome-wide association studies for common infections and infection-associated procedures, including chickenpox, shingles, cold sores, mononucleosis, mumps, hepatitis B, plantar warts, positive tuberculosis test results, strep throat, scarlet fever, pneumonia, bacterial meningitis, yeast infections, urinary tract infections, tonsillectomy, childhood ear infections, myringotomy, measles, hepatitis A, rheumatic fever, common colds, rubella and chronic sinus infection, in over 200,000 individuals of European ancestry. We detect 59 genome-wide significant (P < 5 × 10^-8) associations in genes with key roles in immunity and embryonic development. We apply fine-mapping analysis to dissect associations in the human leukocyte antigen region, which suggests important roles of specific amino acid polymorphisms in the antigen-binding clefts. Our findings provide an important step toward dissecting the host genetic architecture of response to common infections.
Susceptibility to infectious diseases is, among others, influenced by the genetic landscape of the host. Here, Tian and colleagues perform genome-wide association studies for 23 common infections and find 59 risk loci for 17 of these, both within the HLA region and non-HLA loci.
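The genome-wide significance level used in the abstract above is conventionally P < 5 × 10^-8. It is commonly motivated as a Bonferroni correction of a 0.05 family-wise error rate across roughly one million independent common variants. The Python sketch below shows only that arithmetic; the figure of one million tests is the usual convention and an assumption here, not a number reported in the study, and the example p-values are made up.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def is_genome_wide_significant(p_value: float, threshold: float = 5e-8) -> bool:
    """True if a p-value falls below the genome-wide significance threshold."""
    return p_value < threshold


# Assumed convention: about one million independent tests across the genome.
threshold = bonferroni_threshold(alpha=0.05, n_tests=1_000_000)
print(f"threshold = {threshold:.1e}")

# Hypothetical p-values, for illustration only.
for p in (3.0e-12, 4.9e-8, 6.0e-8, 1.0e-3):
    print(f"P = {p:.1e}: significant = {is_genome_wide_significant(p, threshold)}")
```

Under this convention, an association with P = 4.9 × 10^-8 passes the threshold while one with P = 6.0 × 10^-8 does not.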
Allergic disease is very common and carries substantial public-health burdens. We conducted a meta-analysis of genome-wide associations with self-reported cat, dust-mite and pollen allergies in 53,862 individuals. We used generalized estimating equations to model shared and allergy-specific genetic effects. We identified 16 shared susceptibility loci with association P<5×10^-8, including 8 loci previously associated with asthma, as well as 4p14 near TLR1, TLR6 and TLR10 (rs2101521, P=5.3×10^-21); 6p21.33 near HLA-C and MICA (rs9266772, P=3.2×10^-12); 5p13.1 near PTGER4 (rs7720838, P=8.2×10^-11); 2q33.1 in PLCL1 (rs10497813, P=6.1×10^-10); 3q28 in LPP (rs9860547, P=1.2×10^-9); 20q13.2 in NFATC2 (rs6021270, P=6.9×10^-9); 4q27 in ADAD1 (rs17388568, P=3.9×10^-8); and 14q21.1 near FOXA1 and TTC6 (rs1998359, P=4.8×10^-8). We identified one locus with substantial evidence of differences in effects across allergies at 6p21.32 in the class II human leukocyte antigen (HLA) region (rs17533090, P=1.7×10^-12), which was strongly associated with cat allergy. Our study sheds new light on the shared etiology of immune and autoimmune disease.
We conducted a genome-wide association study (GWAS) to identify novel predisposition alleles associated with Philadelphia chromosome-negative myeloproliferative neoplasms (MPNs) and JAK2 V617F clonal hematopoiesis in the general population. We recruited a web-based cohort of 726 individuals with polycythemia vera, essential thrombocythemia, and myelofibrosis and 252 637 population controls unselected for hematologic phenotypes. Using a single-nucleotide polymorphism (SNP) array platform with custom probes for the JAK2 V617F mutation (V617F), we identified 497 individuals (0.2%) among the population controls who were V617F carriers. We performed a combined GWAS of the MPN cases plus V617F carriers in the control population (n = 1223) vs the remaining controls who were noncarriers for V617F (n = 252 140). For these MPN cases plus V617F carriers, we replicated the germ line JAK2 46/1 haplotype (rs59384377: odds ratio [OR] = 2.4, P = 6.6 × 10^-89), previously associated with V617F-positive MPN. We also identified genome-wide significant associations in the TERT gene (rs7705526: OR = 1.8, P = 1.1 × 10^-32), in SH2B3 (rs7310615: OR = 1.4, P = 3.1 × 10^-14), and upstream of TET2 (rs1548483: OR = 2.0, P = 2.0 × 10^-9). These associations were confirmed in a separate replication cohort of 446 V617F carriers vs 169 021 noncarriers. In a joint analysis of the combined GWAS and replication results, we identified additional genome-wide significant predisposition alleles associated with CHEK2, ATM, PINT, and GFI1B. All SNP ORs were similar for MPN patients and controls who were V617F carriers. These data indicate that the same germ line variants endow individuals with a predisposition not only to MPN, but also to JAK2 V617F clonal hematopoiesis, a more common phenomenon that may foreshadow the development of an overt neoplasm.
• Germ line variants in TERT, SH2B3, TET2, ATM, CHEK2, PINT, and GFI1B are associated with JAK2 V617F clonal hematopoiesis and MPNs.
• Age-related JAK2 V617F clonal hematopoiesis is found in ∼2 out of 1000 individuals in the general population.
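The group sizes and the carrier frequency quoted in the abstract above follow directly from the reported counts. The short Python sketch below recomputes them as a consistency check; the variable names are illustrative and do not come from the study.

```python
# Counts as reported in the abstract; variable names are illustrative.
mpn_cases = 726
population_controls = 252_637
v617f_carriers = 497

# Frequency of JAK2 V617F carriers among the population controls.
carrier_fraction = v617f_carriers / population_controls
carriers_per_1000 = 1000 * carrier_fraction

# Combined GWAS groups: MPN cases plus carriers vs. the remaining noncarriers.
combined_cases = mpn_cases + v617f_carriers
noncarrier_controls = population_controls - v617f_carriers

print(f"carrier fraction: {carrier_fraction:.2%}")
print(f"carriers per 1000 individuals: {carriers_per_1000:.2f}")
print(f"combined cases: {combined_cases}")
print(f"noncarrier controls: {noncarrier_controls}")
```

The recomputed values match the reported n = 1223 and n = 252 140, and the carrier frequency rounds to the stated 0.2%, or about 2 in 1000.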
Myopia, or nearsightedness, is the most common eye disorder, resulting primarily from excess elongation of the eye. The etiology of myopia, although known to be complex, is poorly understood. Here we report the largest ever genome-wide association study (45,771 participants) on myopia in Europeans. We performed a survival analysis on age of myopia onset and identified 22 significant associations (P < 5 × 10^-8), two of which are replications of earlier associations with refractive error. Ten of the 20 novel associations identified replicate in a separate cohort of 8,323 participants who reported whether they had developed myopia before age 10. These 22 associations in total explain 2.9% of the variance in myopia age of onset and point toward a number of different mechanisms behind the development of myopia. One association is in the gene PRSS56, which has previously been linked to abnormally small eyes; one is in a gene that forms part of the extracellular matrix (LAMA2); two are in or near genes involved in the regeneration of 11-cis-retinal (RGR and RDH5); two are near genes known to be involved in the growth and guidance of retinal ganglion cells (ZIC2, SFRP1); and five are in or near genes involved in neuronal signaling or development. These novel findings point toward multiple genetic factors involved in the development of myopia and suggest that complex interactions between extracellular matrix remodeling, neuronal development, and visual signals from the retina may underlie the development of myopia in humans.
Cancer evolves through the accumulation of mutations, but the order in which mutations occur is poorly understood. Inference of a temporal ordering on the level of genes is challenging because clinically and histologically identical tumors often have few mutated genes in common. This heterogeneity may at least in part be due to mutations in different genes having similar phenotypic effects by acting in the same functional pathway. We estimate the constraints on the order in which alterations accumulate during cancer progression from cross-sectional mutation data using a probabilistic graphical model termed Hidden Conjunctive Bayesian Network (H-CBN). The possible orders are analyzed on the level of genes and, after mapping genes to functional pathways, also on the pathway level. We find stronger evidence for pathway order constraints than for gene order constraints, indicating that temporal ordering results from selective pressure acting at the pathway level. The accumulation of changes in core pathways differs among cancer types, yet a common feature is that progression appears to begin with mutations in genes that regulate apoptosis pathways and to conclude with mutations in genes involved in invasion pathways. H-CBN models provide a quantitative and intuitive model of tumorigenesis showing that the genetic events can be linked to the phenotypic progression on the level of pathways.
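The order constraints in a conjunctive Bayesian network can be read as a simple rule: an alteration may occur only after all of its parent alterations in a partial order have occurred. The Python sketch below illustrates only this compatibility rule on a toy example. It is not the H-CBN estimation procedure, which also models observation error and infers the order from data. The pathway names, the partial order, and the function names are hypothetical and chosen purely for illustration.

```python
from itertools import combinations

# Hypothetical partial order: each pathway maps to the set of pathways
# that must already be altered before it can be altered.
PARENTS = {
    "apoptosis": set(),
    "cell_cycle": {"apoptosis"},
    "dna_repair": {"apoptosis"},
    "invasion": {"cell_cycle", "dna_repair"},
}


def is_compatible(altered, parents=PARENTS):
    """True if every altered pathway has all of its parents altered too."""
    return all(parents[event] <= altered for event in altered)


def compatible_genotypes(parents=PARENTS):
    """Enumerate all sets of altered pathways that respect the partial order."""
    events = sorted(parents)
    result = []
    for size in range(len(events) + 1):
        for subset in combinations(events, size):
            if is_compatible(set(subset), parents):
                result.append(frozenset(subset))
    return result


genotypes = compatible_genotypes()
print(f"{len(genotypes)} of {2 ** len(PARENTS)} possible genotypes are compatible")
for genotype in genotypes:
    print(sorted(genotype))
```

In this toy order, a tumor with only the invasion pathway altered is incompatible, because its two parent pathways are not altered. In a hidden CBN, such an observation would be attributed to observation error rather than ruled out.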
Rosacea is a common, chronic skin disease that is currently incurable. Although environmental factors influence rosacea, the genetic basis of rosacea is not established. In this genome-wide association study, a discovery group of 22,952 individuals (2,618 rosacea cases and 20,334 controls) was analyzed, leading to identification of two significant single-nucleotide polymorphisms (SNPs) associated with rosacea, one of which replicated in a new group of 29,481 individuals (3,205 rosacea cases and 26,262 controls). The confirmed SNP, rs763035 (P=8.0 × 10^-11 discovery group; P=0.00031 replication group), is intergenic between HLA-DRA and BTNL2. Exploratory immunohistochemical analysis of HLA-DRA and BTNL2 expression in papulopustular rosacea lesions from six individuals, including one with the rs763035 variant, revealed staining in the perifollicular inflammatory infiltrate of rosacea for both proteins. In addition, three HLA alleles, all MHC class II proteins, were significantly associated with rosacea in the discovery group and confirmed in the replication group: HLA-DRB1*03:01 (P=1.0 × 10^-8 discovery group; P=4.4 × 10^-6 replication group), HLA-DQB1*02:01 (P=1.3 × 10^-8 discovery group; P=7.2 × 10^-6 replication group), and HLA-DQA1*05:01 (P=1.4 × 10^-8 discovery group; P=7.6 × 10^-6 replication group). Collectively, the gene variants identified in this study support the concept of a genetic component for rosacea, and provide candidate targets for future studies to better understand and treat rosacea.
Despite the recent rapid growth in genome-wide data, much of human variation remains entirely unexplained. A significant challenge in the pursuit of the genetic basis for variation in common human traits is the efficient, coordinated collection of genotype and phenotype data. We have developed a novel research framework that facilitates the parallel study of a wide assortment of traits within a single cohort. The approach takes advantage of the interactivity of the Web both to gather data and to present genetic information to research participants, while taking care to correct for the population structure inherent to this study design. Here we report initial results from a participant-driven study of 22 traits. Replications of associations (in the genes OCA2, HERC2, SLC45A2, SLC24A4, IRF4, TYR, TYRP1, ASIP, and MC1R) for hair color, eye color, and freckling validate the Web-based, self-reporting paradigm. The identification of novel associations for hair morphology (rs17646946, near TCHH; rs7349332, near WNT10A; and rs1556547, near OFCC1), freckling (rs2153271, in BNC2), the ability to smell the methanethiol produced after eating asparagus (rs4481887, near OR2M7), and photic sneeze reflex (rs10427255, near ZEB2, and rs11856995, near NR2F2) illustrates the power of the approach.
Although the causes of Parkinson's disease (PD) are thought to be primarily environmental, recent studies suggest that a number of genes influence susceptibility. Using targeted case recruitment and online survey instruments, we conducted the largest case-control genome-wide association study (GWAS) of PD based on a single collection of individuals to date (3,426 cases and 29,624 controls). We discovered two novel, genome-wide significant associations with PD, rs6812193 near SCARB2 (p = 7.6 × 10^-10, OR = 0.84) and rs11868035 near SREBF1/RAI1 (p = 5.6 × 10^-8, OR = 0.85), both replicated in an independent cohort. We also replicated 20 previously discovered genetic associations (including LRRK2, GBA, SNCA, MAPT, GAK, and the HLA region), providing support for our novel study design. Relying on a recently proposed method based on genome-wide sharing estimates between distantly related individuals, we estimated the heritability of PD to be at least 0.27. Finally, using sparse regression techniques, we constructed predictive models that account for 6%-7% of the total variance in liability and that suggest the presence of true associations just beyond genome-wide significance, as confirmed through both internal and external cross-validation. These results indicate a substantial, but by no means total, contribution of genetics underlying susceptibility to both early-onset and late-onset PD, suggesting that, despite the novel associations discovered here and elsewhere, the majority of the genetic component for Parkinson's disease remains to be discovered.