A major goal in human genetics is to use natural variation to understand the phenotypic consequences of altering each protein-coding gene in the genome. Here we used exome sequencing to explore protein-altering variants and their consequences in 454,787 participants in the UK Biobank study. We identified 12 million coding variants, including around 1 million loss-of-function and around 1.8 million deleterious missense variants. When these were tested for association with 3,994 health-related traits, we found 564 genes with trait associations at P ≤ 2.18 × 10^[…]. Rare variant associations were enriched in loci from genome-wide association studies (GWAS), but most (91%) were independent of common variant signals. We discovered several risk-increasing associations with traits related to liver disease, eye disease and cancer, among others, as well as risk-lowering associations for hypertension (SLC9A3R2), diabetes (MAP3K15, FAM234A) and asthma (SLC27A3). Six genes were associated with brain imaging phenotypes, including two involved in neural development (GBE1, PLD1). Of the signals available and powered for replication in an independent cohort, 81% were confirmed; furthermore, association signals were generally consistent across individuals of European, Asian and African ancestry. We illustrate the ability of exome sequencing to identify gene-trait associations, elucidate gene function and pinpoint effector genes that underlie GWAS signals at scale.
The UK Biobank is a prospective study of 502,543 individuals, combining extensive phenotypic and genotypic data with streamlined access for researchers around the world. Here we describe the release of exome-sequence data for the first 49,960 study participants, revealing approximately 4 million coding variants (of which around 98.6% have a frequency of less than 1%). The data include 198,269 autosomal predicted loss-of-function (LOF) variants, a more than 14-fold increase compared to the imputed sequence. Nearly all genes (more than 97%) had at least one carrier with a LOF variant, and most genes (more than 69%) had at least ten carriers with a LOF variant. We illustrate the power of characterizing LOF variants in this population through association analyses across 1,730 phenotypes. In addition to replicating established associations, we found novel LOF variants with large effects on disease traits, including PIEZO1 on varicose veins, COL6A1 on corneal resistance, MEPE on bone density, and IQGAP2 and GMPR on blood cell traits. We further demonstrate the value of exome sequencing by surveying the prevalence of pathogenic variants of clinical importance, and show that 2% of this population has a medically actionable variant. Furthermore, we characterize the penetrance of cancer in carriers of pathogenic BRCA1 and BRCA2 variants. Exome sequences from the first 49,960 participants highlight the promise of genome sequencing in large population-based studies and are now accessible to the scientific community.
Heart failure (HF) is a leading cause of morbidity and mortality worldwide. A small proportion of HF cases are attributable to monogenic cardiomyopathies and existing genome-wide association studies (GWAS) have yielded only limited insights, leaving the observed heritability of HF largely unexplained. We report results from a GWAS meta-analysis of HF comprising 47,309 cases and 930,014 controls. Twelve independent variants at 11 genomic loci are associated with HF, all of which demonstrate one or more associations with coronary artery disease (CAD), atrial fibrillation, or reduced left ventricular function, suggesting shared genetic aetiology. Functional analysis of non-CAD-associated loci implicates genes involved in cardiac development (MYOZ1, SYNPO2L), protein homoeostasis (BAG3), and cellular senescence (CDKN1A). Mendelian randomisation analysis supports causal roles for several HF risk factors, and demonstrates CAD-independent effects for atrial fibrillation, body mass index, and hypertension. These findings extend our knowledge of the pathways underlying HF and may inform new therapeutic strategies.
Observational studies have identified height as a strong risk factor for atrial fibrillation, but this finding may be limited by residual confounding. We aimed to examine genetic variation in height within the Mendelian randomization (MR) framework to determine whether height has a causal effect on risk of atrial fibrillation.
In summary-level analyses, MR was performed using summary statistics from genome-wide association studies of height (GIANT/UK Biobank; 693,529 individuals) and atrial fibrillation (AFGen; 65,446 cases and 522,744 controls), finding that each 1-SD increase in genetically predicted height increased the odds of atrial fibrillation (odds ratio [OR] 1.34; 95% CI 1.29 to 1.40; p = 5 × 10⁻⁴²). This result remained consistent in sensitivity analyses with MR methods that make different assumptions about the presence of pleiotropy, and when accounting for the effects of traditional cardiovascular risk factors on atrial fibrillation. Individual-level phenome-wide association studies of height and a height genetic risk score were performed among 6,567 European-ancestry participants of the Penn Medicine Biobank (median age at enrollment 63 years, interquartile range 55-72; 38% female; recruitment 2008-2015), confirming prior observational associations between height and atrial fibrillation. Individual-level MR confirmed that each 1-SD increase in height increased the odds of atrial fibrillation, including adjustment for clinical and echocardiographic confounders (OR 1.89; 95% CI 1.50 to 2.40; p = 0.007). The main limitations of this study include potential bias from pleiotropic effects of genetic variants, and lack of generalizability of individual-level findings to non-European populations.
In this study, we observed evidence that height is likely a positive causal risk factor for atrial fibrillation. Further study is needed to determine whether risk prediction tools including height or anthropometric risk factors can be used to improve screening and primary prevention of atrial fibrillation, and whether biological pathways involved in height may offer new targets for treatment of atrial fibrillation.
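As a rough plausibility check on the summary-level estimate above, the standard error of a log odds ratio can be recovered from its 95% confidence interval and turned into a z-score and a two-sided p-value. The sketch below assumes a normal approximation on the log-odds scale; the function name `z_and_p_from_or_ci` is hypothetical, and because the inputs are the rounded figures from the abstract, the recomputed p-value should only land in the same general range as the reported one, not match it exactly.

```python
import math


def z_and_p_from_or_ci(or_point, ci_low, ci_high, z_crit=1.959964):
    """Approximate z-score and two-sided p-value from an OR and its 95% CI.

    Assumes the interval is symmetric on the log-odds scale
    (normal approximation); this is an illustrative helper, not
    the method used by the study authors.
    """
    log_or = math.log(or_point)
    # Width of the CI on the log scale equals 2 * z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided tail probability of a standard normal.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Rounded summary-level figures from the abstract: OR 1.34, 95% CI 1.29 to 1.40.
z, p = z_and_p_from_or_ci(1.34, 1.29, 1.40)
print(f"z = {z:.2f}, two-sided p = {p:.2e}")
```

Working on the log scale matters here: confidence intervals for odds ratios are symmetric around log(OR), not around the OR itself.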
Large-scale human exome sequencing can identify rare protein-coding variants with a large impact on complex traits such as body adiposity. We sequenced the exomes of 645,626 individuals from the United Kingdom, the United States, and Mexico and estimated associations of rare coding variants with body mass index (BMI). We identified 16 genes with an exome-wide significant association with BMI, including those encoding five brain-expressed G protein-coupled receptors ([…], […], […], […], and […]). Protein-truncating variants in GPR75 were observed in ~4/10,000 sequenced individuals and were associated with 1.8 kilograms per square meter lower BMI and 54% lower odds of obesity in the heterozygous state. Knockout of GPR75 in mice resulted in resistance to weight gain and improved glycemic control in a high-fat diet model. Inhibition of GPR75 may provide a therapeutic strategy for obesity.
Mutations in PKD1 or PKD2 cause typical autosomal dominant polycystic kidney disease (ADPKD), the most common monogenic kidney disease. Dominantly inherited polycystic kidney and liver diseases on the ADPKD spectrum are also caused by mutations in at least six other genes required for protein biogenesis in the endoplasmic reticulum, the loss of which results in defective production of the PKD1 gene product, the membrane protein polycystin-1 (PC1).
We used whole-exome sequencing in a cohort of 122 patients with a genetically unresolved clinical diagnosis of ADPKD or polycystic liver disease to identify a candidate gene, […], and cell-based assays of PC1 protein maturation to functionally validate it. For further validation, we identified carriers of […] loss-of-function mutations and noncarrier matched controls in a large exome-sequenced population-based cohort and evaluated the occurrence of polycystic phenotypes in both groups.
Two patients in the clinically defined cohort had rare loss-of-function variants in […], which encodes a protein required for addition of specific mannose molecules to the assembling N-glycan precursors in the endoplasmic reticulum lumen. Cell-based assays showed that inactivation of […] results in impaired maturation and defective glycosylation of PC1. Seven of the eight (88%) cases selected from the population-based cohort based on […] mutation carrier state who had abdominal imaging after age 50 had at least four kidney cysts, compared with none in matched controls without […] mutations.
[…] is a novel disease gene in the genetically heterogeneous ADPKD spectrum. This study supports the utility of phenotype characterization in genetically defined cohorts to validate novel disease genes and to provide much-needed genotype-phenotype correlations.
OBJECTIVE: To determine the relationship of a genome-wide polygenic score for coronary artery disease (GPSCAD) with lifetime trajectories of CAD risk, directly compare its predictive capacity to traditional risk factors, and assess its interplay with the Pooled Cohort Equations (PCE) clinical risk estimator.
APPROACH AND RESULTS: We studied GPSCAD in 28 556 middle-aged participants of the Malmö Diet and Cancer Study, of whom 4122 (14.4%) developed CAD over a median follow-up of 21.3 years. A pronounced gradient in lifetime risk of CAD was observed, from 16% for those in the lowest GPSCAD decile to 48% in the highest. We evaluated the discriminative capacity of the GPSCAD, as assessed by change in the C-statistic from a baseline model including age and sex, among 5685 individuals with PCE risk estimates available. The increment for the GPSCAD (+0.045, P<0.001) was higher than for any of 11 traditional risk factors (range +0.007 to +0.032). Minimal correlation was observed between GPSCAD and 10-year risk defined by the PCE (r=0.03), and addition of GPSCAD improved the C-statistic of the PCE model by 0.026. A significant gradient in lifetime risk was observed for the GPSCAD, even among individuals within a given PCE clinical risk stratum. We replicated key findings, noting strikingly consistent results, in 325 003 participants of the UK Biobank.
CONCLUSIONS: GPSCAD, a risk estimator available from birth, stratifies individuals into varying trajectories of clinical risk for CAD. Implementation of GPSCAD may enable identification of high-risk individuals early in life, decades in advance of manifest risk factors or disease.
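The "change in the C-statistic" reported above compares how well two models rank people who develop disease above those who do not. The sketch below illustrates the idea with a simplified binary-outcome concordance (equivalent to the area under the ROC curve). The function name `c_statistic` and the toy scores are hypothetical, and the abstract does not describe the study's own computation, which may differ, for example by accounting for follow-up time.

```python
def c_statistic(scores, outcomes):
    """Fraction of (case, control) pairs in which the case has the higher score.

    Ties count as half-concordant. `outcomes` holds 1 for cases and
    0 for controls. Simplified binary-outcome version for illustration.
    """
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    controls = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Toy data (hypothetical): predicted risks from a baseline model and from
# the same model after adding a further predictor such as a polygenic score.
outcomes = [0, 0, 1, 1, 0, 1]
baseline_risk = [0.2, 0.6, 0.5, 0.7, 0.4, 0.3]
extended_risk = [0.2, 0.5, 0.6, 0.8, 0.3, 0.4]

delta_c = c_statistic(extended_risk, outcomes) - c_statistic(baseline_risk, outcomes)
print(f"change in C-statistic: {delta_c:+.3f}")
```

The pairwise loop is quadratic in cohort size, which is fine for a demonstration but would be replaced by a rank-based computation for cohorts of the size described here.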
The calcium-sensing receptor (CaSR) regulates serum calcium concentrations. CASR loss- or gain-of-function mutations cause familial hypocalciuric hypercalcemia type 1 (FHH1) or autosomal-dominant hypocalcemia type 1 (ADH1), respectively, but the population prevalence of FHH1 or ADH1 is unknown. Rare CASR variants were identified in whole-exome sequences from 51,289 de-identified individuals in the DiscovEHR cohort derived from a single US healthcare system. We integrated bioinformatics pathogenicity triage, mean serum Ca concentrations, and mode of inheritance to identify potential FHH1 or ADH1 variants, and we used a Sequence Kernel Association Test (SKAT) to identify rare variant-associated diseases. We identified predicted heterozygous loss-of-function CASR variants (6 different nonsense/frameshift variants and 12 different missense variants) in 38 unrelated individuals, 21 of whom were hypercalcemic. Missense CASR variants were identified in two unrelated hypocalcemic individuals. Functional studies showed that all hypercalcemia-associated missense variants impaired heterologous expression, plasma membrane targeting, and/or signaling, whereas hypocalcemia-associated missense variants increased expression, plasma membrane targeting, and/or signaling. Thus, 38 individuals with a genetic diagnosis of FHH1 and two individuals with a genetic diagnosis of ADH1 were identified in the 51,289-person cohort, giving a prevalence in this population of 74.1 per 100,000 for FHH1 and 3.9 per 100,000 for ADH1. SKAT combining all nonsense, frameshift, and missense loss-of-function variants revealed associations with cardiovascular, neurological, and other diseases. In conclusion, FHH1 is a common cause of hypercalcemia, with prevalence similar to that of primary hyperparathyroidism, and is associated with altered disease risks, whereas ADH1 is a major cause of non-surgical hypoparathyroidism.
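The prevalence figures quoted above follow directly from the carrier counts and the cohort size. A minimal sketch of that arithmetic, using only the numbers given in the abstract (the helper name `prevalence_per_100k` is an assumption made for illustration):

```python
def prevalence_per_100k(n_affected, cohort_size):
    """Crude prevalence expressed per 100,000 individuals."""
    if cohort_size <= 0:
        raise ValueError("cohort_size must be positive")
    return n_affected / cohort_size * 100_000


COHORT_SIZE = 51_289  # sequenced individuals reported in the abstract

fhh1 = prevalence_per_100k(38, COHORT_SIZE)  # 38 individuals with FHH1
adh1 = prevalence_per_100k(2, COHORT_SIZE)   # 2 individuals with ADH1

print(f"FHH1: {fhh1:.1f} per 100,000")
print(f"ADH1: {adh1:.1f} per 100,000")
```

These are point estimates only; with two ADH1 carriers, any interval estimate around the ADH1 figure would be wide.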
GPR37L1 is an orphan receptor that couples through heterotrimeric G-proteins to regulate physiological functions. Since its role in humans is not fully defined, we used an unbiased computational approach to assess the clinical significance of rare GPR37L1 genetic variants found among 51,289 whole-exome sequences from the DiscovEHR cohort. Rare GPR37L1 coding variants were binned according to predicted pathogenicity and analyzed by sequence kernel association testing to reveal significant associations with disease diagnostic codes for epilepsy and migraine, among others. Since associations do not prove causality, rare GPR37L1 variants were functionally analyzed in SK-N-MC cells to evaluate potential signaling differences and pathogenicity. Notably, receptor variants exhibited varying abilities to reduce cAMP levels, activate mitogen-activated protein kinase (MAPK) signaling, and/or upregulate receptor expression in response to the agonist prosaptide (TX14(A)), as compared with the wild-type receptor. In addition to signaling changes, knock-out (KO) of GPR37L1 or expression of certain rare variants altered cellular cholesterol levels, which were also acutely regulated by administration of the agonist TX14(A) via activation of the MAPK pathway. Finally, to simulate the impact of rare nonsense variants found in the large patient cohort, a KO mouse line lacking GPR37L1 was generated. Although KO animals did not recapitulate an acute migraine phenotype, the loss of this receptor produced sex-specific changes in anxiety-related disorders often seen in chronic migraineurs. Collectively, these observations define the existence of rare GPR37L1 variants associated with neuropsychiatric conditions in the human population and identify the signaling changes contributing to pathological processes.
Osteoarthritis affects over 300 million people worldwide. Here, we conduct a genome-wide association study meta-analysis across 826,690 individuals (177,517 with osteoarthritis) and identify 100 independently associated risk variants across 11 osteoarthritis phenotypes, 52 of which have not been associated with the disease before. We report thumb and spine osteoarthritis risk variants and identify differences in genetic effects between weight-bearing and non-weight-bearing joints. We identify sex-specific and early age-at-onset osteoarthritis risk loci. We integrate functional genomics data from primary patient tissues (including articular cartilage, subchondral bone, and osteophytic cartilage) and identify high-confidence effector genes. We provide evidence for genetic correlation with phenotypes related to pain, the main disease symptom, and identify likely causal genes linked to neuronal processes. Our results provide insights into key molecular players in disease processes and highlight attractive drug targets to accelerate translation.
• A multicohort study identifies 52 previously unknown osteoarthritis genetic risk variants
• Similarities and differences in osteoarthritis genetic risk depend on joint sites
• Osteoarthritis genetic components are associated with pain-related phenotypes
• High-confidence effector genes highlight potential targets for drug intervention
A multicohort genome-wide association meta-analysis of osteoarthritis highlights the impact of joint site types on the features of genetic risk variants and the link between osteoarthritis genetics and pain-related phenotypes, pointing toward potential targets for therapeutic intervention.