Integrating single nucleotide polymorphism (SNP) p-values from genome-wide association studies (GWAS) across genes and pathways is a strategy to improve statistical power and gain biological insight. Here, we present Pascal (Pathway scoring algorithm), a powerful tool for computing gene and pathway scores from SNP-phenotype association summary statistics. For gene score computation, we implemented analytic and efficient numerical solutions to calculate test statistics. We examined in particular the sum and the maximum of chi-squared statistics, which measure the average and the strongest association signals per gene, respectively. For pathway scoring, we use a modified Fisher method, which not only offers a significant power improvement over more traditional enrichment strategies, but also eliminates the problem of arbitrary threshold selection inherent in any binary membership based pathway enrichment approach. We demonstrate the marked increase in power by analyzing summary statistics from dozens of large meta-studies for various traits. Our extensive testing indicates that our method not only excels in rigorous type I error control, but also results in more biologically meaningful discoveries.
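The statistics named in the abstract above can be illustrated with a minimal Python sketch. It rests on simplifying assumptions that are not Pascal's: SNPs within a gene and genes within a pathway are treated as independent (Pascal's null distributions account for correlation between SNPs), and the pathway step uses the standard Fisher combination rather than the modified version, whose details the abstract does not give. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def chi2_from_p(p):
    """Convert a two-sided SNP p-value in (0, 1] to a 1-df chi-squared statistic."""
    z = NormalDist().inv_cdf(p / 2.0)  # lower-tail quantile, so z <= 0
    return z * z


def gene_statistics(snp_pvalues):
    """Return (sum, max) of per-SNP chi-squared statistics for one gene.

    The sum reflects the average signal across the gene, the maximum the
    strongest single-SNP signal. No p-value is derived here, because that
    step needs the SNP correlation structure, which this sketch ignores.
    """
    stats = [chi2_from_p(p) for p in snp_pvalues]
    return sum(stats), max(stats)


def fisher_combined_p(gene_pvalues):
    """Standard Fisher combination of k independent p-values (assumption).

    X = -2 * sum(ln p) follows a chi-squared distribution with 2k degrees
    of freedom under the null; for even degrees of freedom the survival
    function has the closed form exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    """
    k = len(gene_pvalues)
    half = -sum(math.log(p) for p in gene_pvalues)  # equals X / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total
```

Because every gene contributes its full p-value to the combined statistic, no significance cut-off is needed to decide which genes count as "hits", which is the threshold-free property the abstract highlights.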
High lipoprotein(a) [Lp(a)] concentrations are an independent risk factor for cardiovascular outcomes. Concentrations are strongly influenced by apo(a) kringle IV repeat isoforms. We aimed to identify genetic loci associated with Lp(a) concentrations using data from five genome-wide association studies (n = 13,781). We identified 48 independent SNPs in the LPA and 1 SNP in the APOE gene region to be significantly associated with Lp(a) concentrations. We also adjusted for apo(a) isoforms to identify loci affecting Lp(a) levels independently from them, which resulted in 31 SNPs (30 in the LPA, 1 in the APOE gene region). Seven SNPs showed a genome-wide significant association with coronary artery disease (CAD) risk. A rare SNP (rs186696265; MAF ∼1%) showed the highest effect on Lp(a) and was also associated with increased risk of CAD (odds ratio = 1.73, P = 3.35 × 10^−30). Median Lp(a) values increased from 2.1 to 91.1 mg/dl with increasing number of Lp(a)-increasing alleles. We found the APOE2-determining allele of rs7412 to be significantly associated with Lp(a) concentrations (P = 3.47 × 10^−10). Each APOE2 allele decreased Lp(a) by 3.34 mg/dl, corresponding to ∼15% of the population's mean values. Performing a gene-based test of association, including suspected Lp(a) receptors and regulators, resulted in one significant association of the TLR2 gene with Lp(a) (P = 3.4 × 10^−4). In summary, we identified a large number of independent SNPs in the LPA gene region, as well as the APOE2 allele, to be significantly associated with Lp(a) concentrations.
Intergenic long noncoding RNAs (lincRNAs) are the largest class of transcripts in the human genome. Although many have recently been linked to complex human traits, the underlying mechanisms for most of these transcripts remain undetermined. We investigated the regulatory roles of a high-confidence and reproducible set of 69 trait-relevant lincRNAs (TR-lincRNAs) in human lymphoblastoid cells whose biological relevance is supported by their evolutionary conservation during recent human history and genetic interactions with other trait-associated loci. Their enrichment in enhancer-like chromatin signatures, interactions with nearby trait-relevant protein-coding loci, and preferential location at topologically associated domain (TAD) boundaries provide evidence that TR-lincRNAs likely regulate proximal trait-relevant gene expression in cis by modulating local chromosomal architecture. This is consistent with the positive and significant correlation found between TR-lincRNA abundance and intra-TAD DNA-DNA contacts. Our results provide insights into the molecular mode of action by which TR-lincRNAs contribute to complex human traits.
• We identify 69 lincRNAs associated with human complex traits (TR-lincRNAs)
• TR-lincRNAs are conserved in humans and interact with other disease-relevant loci
• TR-lincRNAs often associate with cis-regulation of proximal protein-coding gene expression
• TR-lincRNAs are enriched at TAD boundaries and may modulate chromatin architecture
Tan et al. identify and characterize 69 human complex trait/disease-associated lincRNAs in LCLs. They show that these loci are often associated with cis-regulation of gene expression and tend to be localized at TAD boundaries, suggesting that these lincRNAs may influence chromosomal architecture.
To better understand genome regulation, it is important to uncover the role of transcription factors in the process of chromatin structure establishment and maintenance. Here we present a data-driven approach to systematically characterise transcription factors that are relevant for this process. Our method uses a linear mixed modelling approach to combine datasets of transcription factor binding motif enrichments in open chromatin and gene expression across the same set of cell lines. Applying this approach to the ENCODE dataset, we confirm already known and implicate numerous novel transcription factors that play a role in the establishment or maintenance of open chromatin. In particular, our approach rediscovers many factors that have been annotated as pioneer factors.
Genome-wide association studies with metabolic traits (mGWAS) uncovered many genetic variants that influence human metabolism. These genetically influenced metabotypes (GIMs) contribute to our metabolic individuality, our capacity to respond to environmental challenges, and our susceptibility to specific diseases. While metabolic homeostasis in blood is a well investigated topic in large mGWAS with over 150 known loci, metabolic detoxification through urinary excretion has only been addressed by a few small mGWAS with only 11 associated loci so far. Here we report the largest mGWAS to date, combining targeted and non-targeted 1H NMR analysis of urine samples from 3,861 participants of the SHIP-0 cohort and 1,691 subjects of the KORA F4 cohort. We identified and replicated 22 loci with significant associations with urinary traits, 15 of which are new (HIBCH, CPS1, AGXT, XYLB, TKT, ETNPPL, SLC6A19, DMGDH, SLC36A2, GLDC, SLC6A13, ACSM3, SLC5A11, PNMT, SLC13A3). Two-thirds of the urinary loci also have a metabolite association in blood. For all but one of the 6 loci where significant associations target the same metabolite in blood and urine, the genetic effects have the same direction in both fluids. In contrast, for the SLC5A11 locus, we found increased levels of myo-inositol in urine whereas mGWAS in blood reported decreased levels for the same genetic variant. This might indicate less effective re-absorption of myo-inositol in the kidneys of carriers. In summary, our study more than doubles the number of known loci that influence urinary phenotypes. It thus allows novel insights into the relationship between blood homeostasis and its regulation through excretion. The newly discovered loci also include variants previously linked to chronic kidney disease (CPS1, SLC6A13), pulmonary hypertension (CPS1), and ischemic stroke (XYLB).
By establishing connections from gene to disease via metabolic traits our results provide novel hypotheses about molecular mechanisms involved in the etiology of diseases.
Male-pattern baldness (MPB) is a common and highly heritable trait characterized by androgen-dependent, progressive hair loss from the scalp. Here, we carry out the largest GWAS meta-analysis of MPB to date, comprising 10,846 early-onset cases and 11,672 controls from eight independent cohorts. We identify 63 MPB-associated loci (P<5 × 10^−8, METAL) of which 23 have not been reported previously. The 63 loci explain ∼39% of the phenotypic variance in MPB and highlight several plausible candidate genes (FGF5, IRF4, DKK2) and pathways (melatonin signalling, adipogenesis) that are likely to be implicated in the key pathophysiological features of MPB and may represent promising targets for the development of novel therapeutic options. The data provide molecular evidence that rather than being an isolated trait, MPB shares a substantial biological basis with numerous other human phenotypes and may deserve evaluation as an early prognostic marker, for example, for prostate cancer, sudden cardiac arrest and neurodegenerative disorders.
Metabolic traits are molecular phenotypes that can drive clinical phenotypes and may predict disease progression. Here, we report results from a metabolome- and genome-wide association study on 1H-NMR urine metabolic profiles. The study was conducted within an untargeted approach, employing a novel method for compound identification. From our discovery cohort of 835 Caucasian individuals who participated in the CoLaus study, we identified 139 suggestively significant (P<5×10^-8) and independent associations between single nucleotide polymorphisms (SNP) and metabolome features. Fifty-six of these associations replicated in the TasteSensomics cohort, comprising 601 individuals from São Paulo of vastly diverse ethnic background. They correspond to eleven gene-metabolite associations, six of which had been previously identified in the urine metabolome and three in the serum metabolome. Our key novel findings are the associations of two SNPs with NMR spectral signatures pointing to fucose (rs492602, P = 6.9×10^-44) and lysine (rs8101881, P = 1.2×10^-33), respectively. Fine-mapping of the first locus pinpointed the FUT2 gene, which encodes a fucosyltransferase enzyme and has previously been associated with Crohn's disease. This implicates fucose as a potential prognostic disease marker, for which there is already published evidence from a mouse model. The second SNP lies within the SLC7A9 gene, rare mutations of which have been linked to severe kidney damage. The replication of previous associations and our new discoveries demonstrate the potential of untargeted metabolomics GWAS to robustly identify molecular disease markers.
A metabolome-wide genome-wide association study (mGWAS) aims to discover the effects of genetic variants on metabolome phenotypes. Most mGWASes use as phenotypes concentrations of limited sets of metabolites that can be identified and quantified from spectral information. In contrast, in an untargeted mGWAS both identification and quantification are forgone and, instead, all measured metabolome features are tested for association with genetic variants. While the untargeted approach does not discard data that may have eluded identification, the interpretation of associated features remains a challenge. To address this issue, we developed metabomatching to identify the metabolites underlying significant associations observed in untargeted mGWASes on proton NMR metabolome data. Metabomatching capitalizes on genetic spiking, the concept that because metabolome features associated with a genetic variant tend to correspond to the peaks of the NMR spectrum of the underlying metabolite, genetic association can allow for identification. Applied to the untargeted mGWASes in the SHIP and CoLaus cohorts and using 180 reference NMR spectra of the urine metabolome database, metabomatching successfully identified the underlying metabolite in 14 of 19, and 8 of 9 associations, respectively. The accuracy and efficiency of our method make it a strong contender for facilitating or complementing metabolomics analyses in large cohorts, where the availability of genetic, or other data, enables our approach, but targeted quantification is limited.
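The genetic spiking idea described above can be made concrete with a toy Python sketch. This is not the published metabomatching scoring: the scoring rule (summing squared association z-scores of features that fall near a reference peak), the tolerance value, and all names are assumptions chosen for illustration only.

```python
def match_score(feature_shifts, feature_z, peak_shifts, tolerance=0.03):
    """Sum squared z-scores of features lying within `tolerance` ppm of a peak.

    feature_shifts: chemical shift (ppm) of each tested metabolome feature
    feature_z:      association z-score of each feature with one variant
    peak_shifts:    peak positions (ppm) of one reference metabolite spectrum
    """
    score = 0.0
    for shift, z in zip(feature_shifts, feature_z):
        if any(abs(shift - peak) <= tolerance for peak in peak_shifts):
            score += z * z
    return score


def rank_metabolites(feature_shifts, feature_z, reference_peaks, tolerance=0.03):
    """Rank candidate metabolites by match score, best candidate first.

    reference_peaks maps a metabolite name to its list of peak positions.
    """
    scores = {
        name: match_score(feature_shifts, feature_z, peaks, tolerance)
        for name, peaks in reference_peaks.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

A raw sum like this favours reference spectra with many peaks, so a real analysis would need to calibrate each score against its null distribution before comparing candidates.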
Signalling through gap junctions contributes to the control of insulin secretion and, thus, blood glucose levels. Gap junctions of the insulin-producing β-cells are made of connexin 36 (Cx36), which is encoded by the GJD2 gene. Cx36-null mice feature alterations mimicking those observed in type 2 diabetes (T2D). GJD2 is also expressed in neurons, which share a number of common features with pancreatic β-cells. Given that a synonymous exonic single nucleotide polymorphism of human Cx36 (SNP rs3743123) associates with altered function of central neurons in a subset of epileptic patients, we investigated whether this SNP also caused alterations of β-cell function. Transfection of rs3743123 cDNA in connexin-lacking HeLa cells resulted in altered formation of gap junction plaques and cell coupling, as compared to those induced by wild type (WT) GJD2 cDNA. Transgenic mice expressing the very same cDNAs under an insulin promoter revealed that SNP rs3743123 expression consistently led to a post-natal reduction of islet Cx36 levels and β-cell survival, resulting in hyperglycemia in selected lines. These changes were not observed in sex- and age-matched controls expressing WT hCx36. The variant GJD2 associated only marginally with heterogeneous populations of diabetic patients. The data document that a silent polymorphism of GJD2 is associated with altered β-cell function, presumably contributing to T2D pathogenesis.
Red blood cell (RBC) traits are routinely measured in clinical practice as important markers of health. Deviations from the physiological ranges are usually a sign of disease, although variation between healthy individuals also occurs, at least partly due to genetic factors. Recent large scale genetic studies identified loci associated with one or more of these traits; further characterization of known loci and identification of new loci is necessary to better understand their role in health and disease and to identify potential molecular mechanisms. We performed meta-analysis of Metabochip association results for six RBC traits, namely hemoglobin concentration (Hb), hematocrit (Hct), mean corpuscular hemoglobin (MCH), mean corpuscular hemoglobin concentration (MCHC), mean corpuscular volume (MCV) and red blood cell count (RCC), in 11,093 Europeans from seven studies of the UCL-LSHTM-Edinburgh-Bristol (UCLEB) Consortium. We identified 394 non-overlapping SNPs in five loci at genome-wide significance: 6p22.1-6p21.33 (with HFE among others), 6q23.2 (with HBS1L among others), 6q23.3 (contains no genes), 9q34.3 (only ABO gene) and 22q13.1 (with TMPRSS6 among others), replicating previous findings of association with RBC traits at these loci and extending them by imputation to 1000 Genomes. We further characterized associations between ABO SNPs and three traits: hemoglobin, hematocrit and red blood cell count, replicating them in an independent cohort. Conditional analyses indicated the independent association of each of these traits with ABO SNPs and a role for blood group O in mediating the association. The 15 most significant RBC-associated ABO SNPs were also associated with five cardiometabolic traits, with discordance in the direction of effect between groups of traits, suggesting that ABO may act through more than one mechanism to influence cardiometabolic risk.