Recent work has demonstrated that some functional categories of the genome contribute disproportionately to the heritability of complex diseases. Here we analyze a broad set of functional elements, including cell type-specific elements, to estimate their polygenic contributions to heritability in genome-wide association studies (GWAS) of 17 complex diseases and traits with an average sample size of 73,599. To enable this analysis, we introduce a new method, stratified LD score regression, for partitioning heritability from GWAS summary statistics while accounting for linked markers. This new method is computationally tractable at very large sample sizes and leverages genome-wide information. Our findings include a large enrichment of heritability in conserved regions across many traits, a very large immunological disease-specific enrichment of heritability in FANTOM5 enhancers and many cell type-specific enrichments, including significant enrichment of central nervous system cell types in the heritability of body mass index, age at menarche, educational attainment and smoking behavior.
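The regression behind partitioned heritability can be illustrated with a minimal sketch. It assumes the model E[chi2_j] = N * sum_C tau_C * l(j, C) + a, where l(j, C) is the LD score of SNP j with respect to category C and tau_C is the per-SNP heritability contribution of that category. The function names, the simulated data and the unweighted least-squares fit are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fit_stratified_ldsc(chi2, ld_scores, n):
    """Unweighted fit of E[chi2_j] = n * sum_C tau_C * l(j, C) + intercept.

    chi2: (M,) association statistics; ld_scores: (M, C) LD score of each
    SNP to each category; n: GWAS sample size. All names are hypothetical.
    """
    design = np.column_stack([ld_scores, np.ones(len(chi2))])
    coef, *_ = np.linalg.lstsq(design, chi2, rcond=None)
    tau = coef[:-1] / n  # fitted slopes equal n * tau_C, so divide by n
    return tau, coef[-1]

def simulate_chi2(ld_scores, tau, n, rng):
    """Draw chi2_j = z_j**2 with z_j ~ N(0, n * sum_C tau_C * l(j, C) + 1)."""
    expected = n * ld_scores @ tau + 1.0
    z = rng.normal(0.0, np.sqrt(expected))
    return z ** 2

# Toy data (assumed values): two categories with different LD score ranges.
rng = np.random.default_rng(0)
n_snps, n = 200_000, 50_000
ld_scores = np.column_stack([
    rng.uniform(1, 200, n_snps),
    rng.uniform(0, 50, n_snps),
])
true_tau = np.array([2e-7, 1e-6])
chi2 = simulate_chi2(ld_scores, true_tau, n, rng)

tau_hat, intercept = fit_stratified_ldsc(chi2, ld_scores, n)
print("estimated tau:", tau_hat, "intercept:", intercept)
```

In this toy setting the second category has the larger per-SNP coefficient, which is the kind of signal an enrichment analysis looks for. A production analysis would also need regression weights and standard errors, both omitted here.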
Identifying genetic correlations between complex traits and diseases can provide useful etiological insights and help prioritize likely causal relationships. The major challenges preventing estimation of genetic correlation from genome-wide association study (GWAS) data with current methods are the lack of availability of individual-level genotype data and widespread sample overlap among meta-analyses. We circumvent these difficulties by introducing a technique, cross-trait LD Score regression, for estimating genetic correlation that requires only GWAS summary statistics and is not biased by sample overlap. We use this method to estimate 276 genetic correlations among 24 traits. The results include genetic correlations between anorexia nervosa and schizophrenia, anorexia and obesity, and educational attainment and several diseases. These results highlight the power of genome-wide analyses, as there currently are no significantly associated SNPs for anorexia nervosa and only three for educational attainment.
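Why sample overlap does not bias the estimate can be seen in a small sketch. It assumes the model E[z1_j * z2_j] = sqrt(N1 * N2) * rho_g * l_j / M + rho * N_s / sqrt(N1 * N2), where l_j is the LD score of SNP j, M the number of SNPs, rho_g the genetic covariance and N_s the number of shared samples. Overlap enters only through the intercept, so the slope still recovers rho_g. The function name, parameter values and unweighted fit are assumptions for illustration.

```python
import numpy as np

def fit_cross_trait_ldsc(z1, z2, ld, n1, n2, n_snps):
    """Regress z1 * z2 on LD score; slope = sqrt(n1 * n2) * rho_g / n_snps."""
    design = np.column_stack([ld, np.ones(len(ld))])
    (slope, intercept), *_ = np.linalg.lstsq(design, z1 * z2, rcond=None)
    rho_g = slope * n_snps / np.sqrt(n1 * n2)
    return rho_g, intercept

# Toy parameters (assumed): two traits with overlapping samples.
rng = np.random.default_rng(1)
n_snps, n1, n2 = 200_000, 50_000, 40_000
h2_1, h2_2, true_rho_g = 0.4, 0.3, 0.15
overlap_term = 0.1  # stands in for rho * N_s / sqrt(n1 * n2)

ld = rng.uniform(1, 200, n_snps)
var1 = n1 * h2_1 * ld / n_snps + 1.0
var2 = n2 * h2_2 * ld / n_snps + 1.0
cov = np.sqrt(n1 * n2) * true_rho_g * ld / n_snps + overlap_term

# Draw (z1_j, z2_j) from a bivariate normal with the variances and covariance above.
e1 = rng.standard_normal(n_snps)
e2 = rng.standard_normal(n_snps)
z1 = np.sqrt(var1) * e1
z2 = cov / np.sqrt(var1) * e1 + np.sqrt(var2 - cov ** 2 / var1) * e2

rho_g_hat, intercept = fit_cross_trait_ldsc(z1, z2, ld, n1, n2, n_snps)
r_g_hat = rho_g_hat / np.sqrt(h2_1 * h2_2)  # genetic correlation, using the known toy h2 values
print("rho_g:", rho_g_hat, "intercept:", intercept, "r_g:", r_g_hat)
```

Setting overlap_term to zero changes the fitted intercept but leaves the slope, and therefore the genetic covariance estimate, essentially unchanged.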
Testosterone supplementation is commonly used for its effects on sexual function, bone health and body composition, yet its effects on disease outcomes are unknown. To better understand this, we identified genetic determinants of testosterone levels and related sex hormone traits in 425,097 UK Biobank study participants. Using 2,571 genome-wide significant associations, we demonstrate that the genetic determinants of testosterone levels are substantially different between sexes and that genetically higher testosterone is harmful for metabolic diseases in women but beneficial in men. For example, a genetically determined 1 s.d. higher testosterone increases the risks of type 2 diabetes (odds ratio (OR) = 1.37 (95% confidence interval (95% CI): 1.22-1.53)) and polycystic ovary syndrome (OR = 1.51 (95% CI: 1.33-1.72)) in women, but reduces type 2 diabetes risk in men (OR = 0.86 (95% CI: 0.76-0.98)). We also show adverse effects of higher testosterone on breast and endometrial cancers in women and prostate cancer in men. Our findings provide insights into the disease impacts of testosterone and highlight the importance of sex-specific genetic analyses.
We propose a method (GREML-LDMS) to estimate heritability for human complex traits in unrelated individuals using whole-genome sequencing data. We demonstrate using simulations based on whole-genome sequencing data that ∼97% and ∼68% of variation at common and rare variants, respectively, can be captured by imputation. Using the GREML-LDMS method, we estimate from 44,126 unrelated individuals that all ∼17 million imputed variants explain 56% (standard error (s.e.) = 2.3%) of variance for height and 27% (s.e. = 2.5%) of variance for body mass index (BMI), and we find evidence that height- and BMI-associated variants have been under natural selection. Considering the imperfect tagging of imputation and potential overestimation of heritability from previous family-based studies, heritability is likely to be 60-70% for height and 30-40% for BMI. Therefore, the missing heritability is small for both traits. For further discovery of genes associated with complex traits, a study design with SNP arrays followed by imputation is more cost-effective than whole-genome sequencing at current prices.
IMPORTANCE: Body fat distribution, usually measured using waist-to-hip ratio (WHR), is an important contributor to cardiometabolic disease independent of body mass index (BMI). Whether mechanisms that increase WHR via lower gluteofemoral (hip) or via higher abdominal (waist) fat distribution affect cardiometabolic risk is unknown. OBJECTIVE: To identify genetic variants associated with higher WHR specifically via lower gluteofemoral or higher abdominal fat distribution and estimate their association with cardiometabolic risk. DESIGN, SETTING, AND PARTICIPANTS: Genome-wide association studies (GWAS) for WHR combined data from the UK Biobank cohort and summary statistics from previous GWAS (data collection: 2006-2018). Specific polygenic scores for higher WHR via lower gluteofemoral or via higher abdominal fat distribution were derived using WHR-associated genetic variants showing specific association with hip or waist circumference. Associations of polygenic scores with outcomes were estimated in 3 population-based cohorts, a case-cohort study, and summary statistics from 6 GWAS (data collection: 1991-2018). EXPOSURES: More than 2.4 million common genetic variants (GWAS); polygenic scores for higher WHR (follow-up analyses). MAIN OUTCOMES AND MEASURES: BMI-adjusted WHR and unadjusted WHR (GWAS); compartmental fat mass measured by dual-energy x-ray absorptiometry, systolic and diastolic blood pressure, low-density lipoprotein cholesterol, triglycerides, fasting glucose, fasting insulin, type 2 diabetes, and coronary disease risk (follow-up analyses). RESULTS: Among 452 302 UK Biobank participants of European ancestry, the mean (SD) age was 57 (8) years and the mean (SD) WHR was 0.87 (0.09). In genome-wide analyses, 202 independent genetic variants were associated with higher BMI-adjusted WHR (n = 660 648) and unadjusted WHR (n = 663 598). In dual-energy x-ray absorptiometry analyses (n = 18 330), the hip- and waist-specific polygenic scores for higher WHR were specifically associated with lower gluteofemoral and higher abdominal fat, respectively. In follow-up analyses (n = 636 607), both polygenic scores were associated with higher blood pressure and triglyceride levels and higher risk of diabetes (waist-specific score: odds ratio [OR], 1.57 [95% CI, 1.34-1.83], absolute risk increase per 1000 participant-years [ARI], 4.4 [95% CI, 2.7-6.5], P < .001; hip-specific score: OR, 2.54 [95% CI, 2.17-2.96], ARI, 12.0 [95% CI, 9.1-15.3], P < .001) and coronary disease (waist-specific score: OR, 1.60 [95% CI, 1.39-1.84], ARI, 2.3 [95% CI, 1.5-3.3], P < .001; hip-specific score: OR, 1.76 [95% CI, 1.53-2.02], ARI, 3.0 [95% CI, 2.1-4.0], P < .001), per 1-SD increase in BMI-adjusted WHR. CONCLUSIONS AND RELEVANCE: Distinct genetic mechanisms may be linked to gluteofemoral and abdominal fat distribution that are the basis for the calculation of the WHR. These findings may improve risk assessment and treatment of diabetes and coronary disease.
Using a nontargeted metabolomics approach of 447 fasting plasma metabolites, we searched for novel molecular markers that arise before and after hyperglycemia in a large population-based cohort of 2,204 females (115 type 2 diabetic [T2D] case subjects, 192 individuals with impaired fasting glucose [IFG], and 1,897 control subjects) from TwinsUK. Forty-two metabolites from three major fuel sources (carbohydrates, lipids, and proteins) were found to significantly correlate with T2D after adjusting for multiple testing; of these, 22 were previously reported as associated with T2D or insulin resistance. Fourteen metabolites were found to be associated with IFG. Among the metabolites identified, the branched-chain keto-acid metabolite 3-methyl-2-oxovalerate was the strongest predictive biomarker for IFG after glucose (odds ratio [OR] 1.65 [95% CI 1.39-1.95], P = 8.46 × 10⁻⁹) and was moderately heritable (h² = 0.20). The association was replicated in an independent population (n = 720, OR 1.68 [1.34-2.11], P = 6.52 × 10⁻⁶) and validated in 189 twins with urine metabolomics taken at the same time as plasma (OR 1.87 [1.27-2.75], P = 1 × 10⁻³). Results confirm an important role for catabolism of branched-chain amino acids in T2D and IFG. In conclusion, this T2D-IFG biomarker study has surveyed the broadest panel of nontargeted metabolites to date, revealing both novel and known associated metabolites and providing potential novel targets for clinical prediction and a deeper understanding of causal mechanisms.
The negative impacts of social isolation and loneliness on health are well documented. However, little is known about their possible biological determinants. In up to 452,302 UK Biobank study participants, we perform genome-wide association study analyses for loneliness and regular participation in social activities. We identify 15 genomic loci (P < 5 × 10⁻⁸) for loneliness, and demonstrate a likely causal association between adiposity and increased susceptibility to loneliness and depressive symptoms. Further loci were identified for regular attendance at a sports club or gym (N = 6 loci), pub or social club (N = 13) or religious group (N = 18). Across these traits there was strong enrichment for genes expressed in brain regions that control emotional expression and behaviour. We demonstrate aetiological mechanisms specific to each trait, in addition to identifying loci that are pleiotropic across multiple complex traits. Further study of these traits may identify novel modifiable risk factors associated with social withdrawal and isolation.
A number of open questions in human evolutionary genetics would become tractable if we were able to directly measure evolutionary fitness. As a step towards this goal, we developed a method to examine whether individual genetic variants, or sets of genetic variants, currently influence viability. The approach consists in testing whether the frequency of an allele varies across ages, accounting for variation in ancestry. We applied it to the Genetic Epidemiology Research on Adult Health and Aging (GERA) cohort and to the parents of participants in the UK Biobank. Across the genome, we found only a few common variants with large effects on age-specific mortality: tagging the APOE ε4 allele and near CHRNA3. These results suggest that when large, even late-onset effects are kept at low frequency by purifying selection. Testing viability effects of sets of genetic variants that jointly influence 1 of 42 traits, we detected a number of strong signals. In participants of the UK Biobank of British ancestry, we found that variants that delay puberty timing are associated with a longer parental life span (P ~ 6.2 × 10⁻⁶ for fathers and P ~ 2.0 × 10⁻³ for mothers), consistent with epidemiological studies. Similarly, variants associated with later age at first birth are associated with a longer maternal life span (P ~ 1.4 × 10⁻³). Signals are also observed for variants influencing cholesterol levels, risk of coronary artery disease (CAD), body mass index, as well as risk of asthma. These signals exhibit consistent effects in the GERA cohort and among participants of the UK Biobank of non-British ancestry. We also found marked differences between males and females, most notably at the CHRNA3 locus, and variants associated with risk of CAD and cholesterol levels. Beyond our findings, the analysis serves as a proof of principle for how upcoming biomedical data sets can be used to learn about selection effects in contemporary humans.
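The core idea of testing whether an allele's frequency varies with age can be sketched in a few lines. The example below is a deliberately simplified stand-in: a plain linear trend test of allele dosage on age, with no adjustment for ancestry, which the method described above does account for. The function name and all simulated values are hypothetical.

```python
import math
import numpy as np

def allele_age_trend(genotypes, ages):
    """Least-squares slope of allele dosage (0/1/2) on age, with a z statistic."""
    g = np.asarray(genotypes, dtype=float)
    a = np.asarray(ages, dtype=float)
    a_centered = a - a.mean()
    sxx = np.sum(a_centered ** 2)
    slope = np.sum(a_centered * (g - g.mean())) / sxx
    resid = g - g.mean() - slope * a_centered
    se = math.sqrt(np.sum(resid ** 2) / (len(g) - 2) / sxx)
    z = slope / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return slope, z, p_value

# Toy cohort (assumed values): ages 40 to 90, 100,000 individuals.
rng = np.random.default_rng(2)
ages = rng.uniform(40, 90, 100_000)

# A variant that reduces survival: its frequency falls from 0.15 to 0.10 with age.
harmful_freq = 0.15 - 0.001 * (ages - 40)
harmful = rng.binomial(2, harmful_freq)

# A neutral variant: constant frequency at every age.
neutral = rng.binomial(2, 0.30, size=ages.size)

harmful_slope, harmful_z, harmful_p = allele_age_trend(harmful, ages)
neutral_slope, neutral_z, neutral_p = allele_age_trend(neutral, ages)
print("harmful:", harmful_slope, harmful_p)
print("neutral:", neutral_slope, neutral_p)
```

A negative slope for the first variant reflects carriers being depleted among older individuals, whereas the neutral variant should show no systematic trend.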
We introduce an approach to identify disease-relevant tissues and cell types by analyzing gene expression data together with genome-wide association study (GWAS) summary statistics. Our approach uses stratified linkage disequilibrium (LD) score regression to test whether disease heritability is enriched in regions surrounding genes with the highest specific expression in a given tissue. We applied our approach to gene expression data from several sources together with GWAS summary statistics for 48 diseases and traits (average N = 169,331) and found significant tissue-specific enrichments (false discovery rate (FDR) < 5%) for 34 traits. In our analysis of multiple tissues, we detected a broad range of enrichments that recapitulated known biology. In our brain-specific analysis, significant enrichments included an enrichment of inhibitory over excitatory neurons for bipolar disorder, and excitatory over inhibitory neurons for schizophrenia and body mass index. Our results demonstrate that our polygenic approach is a powerful way to leverage gene expression data for interpreting GWAS signals.
Higher cardiorespiratory fitness is associated with lower risk of type 2 diabetes. However, the causality of this relationship and the biological mechanisms that underlie it are unclear. Here, we examine genetic determinants of cardiorespiratory fitness in 450k European-ancestry individuals in UK Biobank, by leveraging the genetic overlap between fitness measured by an exercise test and resting heart rate. We identified 160 fitness-associated loci which we validated in an independent cohort, the Fenland study. Gene-based analyses prioritised candidate genes, such as CACNA1C, SCN10A, MYH11 and MYH6, that are enriched in biological processes related to cardiac muscle development and muscle contractility. In a Mendelian Randomisation framework, we demonstrate that higher genetically predicted fitness is causally associated with lower risk of type 2 diabetes independent of adiposity. Integration with proteomic data identified N-terminal pro B-type natriuretic peptide, hepatocyte growth factor-like protein and sex hormone-binding globulin as potential mediators of this relationship. Collectively, our findings provide insights into the biological mechanisms underpinning cardiorespiratory fitness and highlight the importance of improving fitness for diabetes prevention.