Genome-wide association analysis of cohorts with thousands of phenotypes is computationally expensive, particularly when accounting for sample relatedness or population structure. Here we present a novel machine-learning method called REGENIE for fitting a whole-genome regression model for quantitative and binary phenotypes that is substantially faster than alternatives in multi-trait analyses while maintaining statistical efficiency. The method naturally accommodates parallel analysis of multiple phenotypes and requires only local segments of the genotype matrix to be loaded in memory, in contrast to existing alternatives, which must load genome-wide matrices into memory. This results in substantial savings in compute time and memory usage. We introduce a fast, approximate Firth logistic regression test for unbalanced case-control phenotypes. The method is ideally suited to take advantage of distributed computing frameworks. We demonstrate the accuracy and computational benefits of this approach using the UK Biobank dataset with up to 407,746 individuals.
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) enters human host cells via angiotensin-converting enzyme 2 (ACE2) and causes coronavirus disease 2019 (COVID-19). Here, through a genome-wide association study, we identify a variant (rs190509934, minor allele frequency 0.2-2%) that downregulates ACE2 expression by 37% (P = 2.7 × 10
) and reduces the risk of SARS-CoV-2 infection by 40% (odds ratio = 0.60, P = 4.5 × 10
), providing human genetic evidence that ACE2 expression levels influence COVID-19 risk. We also replicate the associations of six previously reported risk variants, of which four were further associated with worse outcomes in individuals infected with the virus (in/near LZTFL1, MHC, DPP9 and IFNAR2). Lastly, we show that common variants define a risk score that is strongly associated with severe disease among cases and modestly improves the prediction of disease severity relative to demographic and clinical factors alone.
Understanding mechanisms of hepatocellular damage may lead to new treatments for liver disease, and genome-wide association studies (GWAS) of alanine aminotransferase (ALT) and aspartate aminotransferase (AST) serum activities have proven useful for investigating liver biology. Here we report 100 loci associating with both enzymes, using GWAS across 411,048 subjects in the UK Biobank. The rare missense variant SLC30A10 Thr95Ile (rs188273166) associates with the largest elevation of both enzymes, and this association replicates in the DiscovEHR study. SLC30A10 excretes manganese from the liver to the bile duct, and rare homozygous loss of function causes the syndrome hypermanganesemia with dystonia-1 (HMNDYT1), which involves cirrhosis. Consistent with hematological symptoms of hypermanganesemia, SLC30A10 Thr95Ile carriers have increased hematocrit and risk of iron deficiency anemia. Carriers also have increased risk of extrahepatic bile duct cancer. These results suggest that genetic variation in SLC30A10 adversely affects more individuals than patients with diagnosed HMNDYT1.
Enlargement of the aorta is an important risk factor for aortic aneurysm and dissection, a leading cause of morbidity in the developed world. Here we performed automated extraction of ascending aortic diameter from cardiac magnetic resonance images of 36,021 individuals from the UK Biobank, followed by genome-wide association. We identified lead variants across 41 loci, including genes related to cardiovascular development (HAND2, TBX20) and Mendelian forms of thoracic aortic disease (ELN, FBN1). A polygenic score significantly predicted prevalent risk of thoracic aortic aneurysm and the need for surgical intervention for patients with thoracic aneurysm across multiple ancestries within the UK Biobank, FinnGen, the Penn Medicine Biobank and the Million Veterans Program (MVP). Additionally, we highlight the primary causal role of blood pressure in reducing aortic dilation using Mendelian randomization. Overall, our findings provide a roadmap for using genetic determinants of human anatomy to understand cardiovascular development while improving prediction of diseases of the thoracic aorta.
Sequencing of large cohorts offers an unprecedented opportunity to identify rare genetic variants and to find novel contributors to human disease. We used gene-based collapsing tests to identify genes associated with glucose, HbA1c and type 2 diabetes (T2D) diagnosis in 379,066 exome-sequenced participants in the UK Biobank. We identified associations for variants in GCK, HNF1A and PDX1, which are known to be involved in Mendelian forms of diabetes. Notably, we uncovered novel associations for GIGYF1, a gene not previously implicated by human genetics in diabetes. GIGYF1 predicted loss of function (pLOF) variants associated with increased levels of glucose (0.77 mmol/L increase, p = 4.42 × 10
) and HbA1c (4.33 mmol/mol, p = 1.28 × 10
) as well as T2D diagnosis (OR = 4.15, p = 6.14 × 10
). Multiple rare variants contributed to these associations, including singleton variants. GIGYF1 pLOF also associated with decreased cholesterol levels as well as an increased risk of hypothyroidism. The association of GIGYF1 pLOF with T2D diagnosis replicated in an independent cohort from the Geisinger Health System. In addition, a common variant association for glucose and T2D was identified at the GIGYF1 locus. Our results highlight the role of GIGYF1 in regulating insulin signaling and protecting from diabetes.
Anterior uveitis (AU) is inflammation of the anterior part of the eye (the iris and ciliary body) and is strongly associated with HLA-B*27. We report AU exome sequencing results from eight independent cohorts consisting of 3,850 cases and 916,549 controls. We identify common genome-wide significant loci in HLA-B (OR = 3.37, p = 1.03e-196) and ERAP1 (OR = 0.86, p = 1.1e-08), and find IPMK (OR = 9.4, p = 4.42e-09) and IDO2 (OR = 3.61, p = 6.16e-08) as genome-wide significant genes based on the burden of rare coding variants. Dividing the cohort into HLA-B*27 positive and negative individuals, we find that the ERAP1 haplotype is strongly protective only for B*27-positive AU (OR = 0.73, p = 5.2e-10). Investigation of B*27-negative AU identifies a common signal near HLA-DPB1 (rs3117230, OR = 1.26, p = 2.7e-08), risk genes IPMK and IDO2, and several additional candidate risk genes, including ADGFR5, STXBP2, and ACHE. Taken together, we decipher the genetics underlying B*27-positive and -negative AU and identify rare and common genetic signals for both subtypes of disease.
Background
Severe alpha‐1‐antitrypsin deficiency (AATD), phenotype PiZZ, was associated with venous thromboembolism (VTE) in a case‐control study.
Objectives
This study aimed to determine the genetic variation in the SERPINA1 gene and a possible thrombotic risk of these variants in a population‐based cohort study.
Patients/Methods
The coding sequence of SERPINA1 was analyzed for the Z (rs28929474), S (rs17580), and other qualifying variants in 28,794 subjects without previous VTE (born 1923–1950, 60% women), who participated in the Malmö Diet and Cancer study (1991–1996). Individuals were followed from baseline until the first event of VTE, death, or 2018.
Results
Resequencing the coding sequence of SERPINA1 identified 84 variants in the total study population: 21 synonymous, 62 missense, and 1 loss‐of‐function variant. Kaplan‐Meier analysis showed that homozygosity for the Z allele increased the risk of VTE, whereas heterozygosity showed no effect. The S (rs17580) variant was not associated with VTE. Thirty‐one rare variants were qualifying and included in collapsing analysis using the following selection criteria: loss of function, in‐frame deletion, or non‐benign (PolyPhen‐2) missense variants with minor allele frequency (MAF) <0.1%. Combining the rare qualifying variants with the Z variant showed that carrying two alleles (ZZ or compound heterozygotes) increased the risk. Cox regression analysis revealed an adjusted hazard ratio of 4.5 (95% confidence interval 2.0–10.0) for combinations of the Z variant and rare qualifying variants. One other variant (rs141620200; MAF = 0.002) showed an increased risk of VTE.
Conclusions
The SERPINA1 ZZ genotype and compound heterozygotes for severe AATD are rare but associated with VTE in a population‐based Swedish study.
Genome-wide association studies have identified hundreds of single nucleotide variations (formerly single nucleotide polymorphisms) associated with several cancers, but the predictive ability of polygenic risk scores (PRSs) is unclear, especially among non-Whites.
PRSs were derived from genome-wide significant single-nucleotide variations for 15 cancers in 20,079 individuals in an academic biobank. We evaluated the improvement in discriminatory accuracy by including cancer-specific PRS in patients of genetically-determined African and European ancestry.
Among the individuals of European genetic ancestry, PRSs for breast, colon, melanoma, and prostate were significantly associated with their respective cancers. Among the individuals of African genetic ancestry, PRSs for breast, colon, prostate, and thyroid were significantly associated with their respective cancers. The area under the curve of the model consisting of age, sex, and principal components was 0.621 to 0.710, and it increased by 1% to 4% with the inclusion of PRS in individuals of European genetic ancestry. In individuals of African genetic ancestry, area under the curve was overall higher in the model without the PRS (0.723-0.810) but increased by <1% with the inclusion of PRS for most cancers.
PRS moderately increased the ability to discriminate the cancer status in individuals of European but not African ancestry. Further large-scale studies are needed to identify ancestry-specific genetic factors in non-White populations to incorporate PRS into cancer risk assessment.
Up to one of every six individuals diagnosed with one cancer will be diagnosed with a second primary cancer in their lifetime. Genetic factors contributing to the development of multiple primary cancers, beyond known cancer syndromes, have been underexplored.
To characterize genetic susceptibility to multiple cancers, we conducted a pan-cancer, whole-exome sequencing study of individuals drawn from two large multi-ancestry populations (6429 cases, 165,853 controls). We created two groupings of individuals diagnosed with multiple primary cancers: (1) an overall combined set with at least two cancers across any of 36 organ sites and (2) cancer-specific sets defined by an index cancer at one of 16 organ sites with at least 50 cases from each study population. We then investigated whether variants identified from exome sequencing were associated with these sets of multiple cancer cases in comparison to individuals with one and, separately, no cancers.
We identified 22 variant-phenotype associations, 10 of which have not been previously discovered and were significantly overrepresented among individuals with multiple cancers, compared to those with a single cancer.
Overall, we describe variants and genes that may play a fundamental role in the development of multiple primary cancers and improve our understanding of shared mechanisms underlying carcinogenesis.
Background Five classic thrombophilias have been recognized: factor V Leiden (rs6025), the prothrombin G20210A variant (rs1799963), and protein C, protein S, and antithrombin deficiencies. This study aimed to determine the thrombotic risk of classic thrombophilias in a cohort of middle-aged and older adults. Methods and Results Factor V Leiden, prothrombin G20210A and protein-coding variants in the PROC (protein C), PROS1 (protein S), and SERPINC1 (antithrombin) anticoagulant genes were determined in 29 387 subjects (born 1923-1950, 60% women) who participated in the Malmö Diet and Cancer study (1991-1996). The Human Gene Mutation Database was used to define 68 disease-causing mutations. Patients were followed up from baseline until the first event of venous thromboembolism (VTE), death, or Dec 31, 2018. Carriership (n=908, 3.1%) for disease-causing mutations in the PROC, PROS1, and SERPINC1 genes was associated with incident VTE: hazard ratio (HR) was 1.6 (95% CI, 1.3-1.9). Variants not in the Human Gene Mutation Database were not linked to VTE (HR, 1.1; 95% CI, 0.8-1.5). Heterozygosity for rs6025 and rs1799963 was associated with incident VTE: HR, 1.8 (95% CI, 1.6-2.0) and HR, 1.6 (95% CI, 1.3-2.0), respectively. The HR for carrying 1 classical thrombophilia variant was 1.7 (95% CI, 1.6-1.9). HR was 3.9 (95% CI, 3.1-5.0) for carriers of ≥2 thrombophilia variants. Conclusions The 5 classic thrombophilias are associated with a dose-graded risk of VTE in middle-aged and older adults. Disease-causing variants in the PROC, PROS1, and SERPINC1 genes were more common than the rs1799963 variant, but the conferred genetic risk was comparable with the rs6025 and rs1799963 variants.