In genome-wide association studies (GWAS) for thousands of phenotypes in large biobanks, most binary traits have substantially fewer cases than controls. Both of the widely used approaches, the linear mixed model and the recently proposed logistic mixed model, perform poorly; they produce large type I error rates when used to analyze unbalanced case-control phenotypes. Here we propose a scalable and accurate generalized mixed model association test that uses the saddlepoint approximation to calibrate the distribution of score test statistics. This method, SAIGE (Scalable and Accurate Implementation of GEneralized mixed model), provides accurate P values even when case-control ratios are extremely unbalanced. SAIGE uses state-of-the-art optimization strategies to reduce computational costs; hence, it is applicable to GWAS for thousands of phenotypes by large biobanks. Through the analysis of UK Biobank data of 408,961 samples from white British participants with European ancestry for >1,400 binary phenotypes, we show that SAIGE can efficiently analyze large sample data, controlling for unbalanced case-control ratios and sample relatedness.
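The calibration idea can be illustrated with a minimal sketch. It is not the SAIGE implementation: it assumes independent samples with known null case probabilities `mu`, whereas SAIGE fits a mixed model that also handles relatedness. The function names (`cgf`, `spa_upper_tail`, `normal_sf`) are hypothetical. The sketch applies the Barndorff-Nielsen form of the saddlepoint approximation to the score statistic S = Σ g_i (y_i − μ_i) and compares it with the normal approximation for a rare outcome.

```python
import math

def normal_sf(z):
    """Upper-tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def cgf(t, g, mu):
    """Return K(t), K'(t), K''(t) for S = sum g_i * (Y_i - mu_i), Y_i ~ Bernoulli(mu_i)."""
    k0 = k1 = k2 = 0.0
    for gi, mi in zip(g, mu):
        e = math.exp(gi * t)
        denom = 1.0 - mi + mi * e
        k0 += math.log(denom) - t * gi * mi
        k1 += gi * mi * e / denom - gi * mi
        k2 += gi * gi * mi * (1.0 - mi) * e / (denom * denom)
    return k0, k1, k2

def spa_upper_tail(s, g, mu):
    """Saddlepoint approximation to P(S > s); s must lie inside the support of S."""
    # K'(t) is increasing in t, so bracket the root of K'(t) = s and bisect.
    lo, hi = -1.0, 1.0
    while cgf(lo, g, mu)[1] > s and lo > -64.0:
        lo *= 2.0
    while cgf(hi, g, mu)[1] < s and hi < 64.0:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if cgf(mid, g, mu)[1] < s:
            lo = mid
        else:
            hi = mid
    t = 0.5 * (lo + hi)
    if abs(t) < 1e-6:
        # s is at the null mean; the formula is singular there, so fall back to 0.5.
        return 0.5
    k0, _, k2 = cgf(t, g, mu)
    w = math.copysign(math.sqrt(2.0 * (t * s - k0)), t)
    v = t * math.sqrt(k2)
    return normal_sf(w + math.log(v / w) / w)

# Illustrative, made-up setting: 1,000 carriers, 1% null case probability,
# and 20 observed cases among carriers (10 expected), so s = 10.
g = [1.0] * 1000
mu = [0.01] * 1000
s = 10.0
p_spa = spa_upper_tail(s, g, mu)
p_normal = normal_sf(s / math.sqrt(cgf(0.0, g, mu)[2]))
print(f"saddlepoint p = {p_spa:.2e}, normal-approximation p = {p_normal:.2e}")
```

In this skewed setting the normal approximation gives a noticeably smaller tail probability than the saddlepoint approximation, which is the kind of miscalibration that inflates type I error for unbalanced phenotypes.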
To identify genetic variation underlying atrial fibrillation, the most common cardiac arrhythmia, we performed a genome-wide association study of >1,000,000 people, including 60,620 atrial fibrillation cases and 970,216 controls. We identified 142 independent risk variants at 111 loci and prioritized 151 functional candidate genes likely to be involved in atrial fibrillation. Many of the identified risk variants fall near genes where more deleterious mutations have been reported to cause serious heart defects in humans (GATA4, MYH6, NKX2-5, PITX2, TBX5), or near genes important for striated muscle function and integrity (for example, CFL2, MYH7, PKP2, RBM20, SGCG, SSPN). Pathway and functional enrichment analyses also suggested that many of the putative atrial fibrillation genes act via cardiac structural remodeling, potentially in the form of an 'atrial cardiomyopathy', either during fetal heart development or as a response to stress in the adult heart.
Cardiovascular disease is the most common cause of death worldwide, especially beyond the age of 65 years, with the vast majority of morbidity and mortality due to myocardial infarction and stroke. Vascular pathology stems from a combination of genetic risk, environmental factors, and the biologic changes associated with aging. The pathogenesis underlying the development of vascular aging, and of vascular calcification with aging in particular, is still not fully understood. Accumulating data suggest that genetic risk, likely compounded by epigenetic modifications, environmental factors, including diabetes and chronic kidney disease, and the plasticity of vascular smooth muscle cells to acquire an osteogenic phenotype are major determinants of age-associated vascular calcification. Understanding the molecular mechanisms underlying genetic and modifiable risk factors in regulating age-associated vascular pathology may inspire strategies to promote healthy vascular aging. This article summarizes current knowledge of concepts and mechanisms of age-associated vascular disease, with an emphasis on vascular calcification.
ABSTRACT
Motivation
In the genome-wide association analysis of population-based biobanks, most diseases have low prevalence, which results in low detection power. One approach to tackle the problem is using family disease history, yet existing methods are unable to address type I error inflation induced by increased correlation of phenotypes among closely related samples, as well as unbalanced phenotypic distribution.
Results
We propose a new method for genetic association testing with family disease history, the mixed-model-based Test with Adjusted Phenotype and Empirical saddlepoint approximation (TAPE), which controls for increased phenotype correlation by adopting a two-variance-component mixed model, accounts for case–control imbalance by using empirical saddlepoint approximation, and is flexible enough to incorporate any existing adjusted phenotypes, such as phenotypes from the LT-FH method. We show through simulation studies and analysis of UK Biobank data of white British samples and the Korean Genome and Epidemiology Study of Korean samples that the proposed method is robust and yields better calibration compared to existing methods while gaining power for detection of variant–phenotype associations.
Availability and implementation
The summary statistics and code generated in this study are available at https://github.com/styvon/TAPE.
Supplementary information
Supplementary data are available at Bioinformatics online.
Polygenic scores (PGSs) offer the ability to predict genetic risk for complex diseases across the life course, a key benefit over short-term prediction models. To produce risk estimates relevant to clinical and public health decision-making, it is important to account for varying effects due to age and sex. Here, we develop a novel framework to estimate country-, age-, and sex-specific estimates of cumulative incidence stratified by PGS for 18 high-burden diseases. We integrate PGS associations from seven studies in four countries (N = 1,197,129) with disease incidences from the Global Burden of Disease. PGS has a significant sex-specific effect for asthma, hip osteoarthritis, gout, coronary heart disease and type 2 diabetes (T2D), with all but T2D exhibiting a larger effect in men. PGS has a larger effect in younger individuals for 13 diseases, with effects decreasing linearly with age. We show for breast cancer that, relative to individuals in the bottom 20% of polygenic risk, the top 5% attain an absolute risk for screening eligibility 16.3 years earlier. Our framework increases the generalizability of results from biobank studies and the accuracy of absolute risk estimates by appropriately accounting for age- and sex-specific PGS effects. Our results highlight the potential of PGS as a screening tool which may assist in the early prevention of common diseases.
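As a rough illustration of how a PGS effect can be combined with age-specific incidence to give cumulative incidence, the sketch below scales baseline rates by a hazard ratio within each age band. It is a deliberate simplification and not the framework described above: it assumes a constant hazard within each band and ignores competing mortality and the rescaling of baseline rates across PGS strata. The function name `cumulative_incidence` and all numbers are hypothetical.

```python
import math

def cumulative_incidence(rates_per_year, band_widths_years, hazard_ratios):
    """Cumulative incidence at the end of each age band.

    rates_per_year: baseline incidence rate in each age band (events per person-year)
    band_widths_years: width of each age band in years
    hazard_ratios: PGS-stratum hazard ratio in each band (may vary with age)
    """
    cumulative_hazard = 0.0
    result = []
    for rate, width, hr in zip(rates_per_year, band_widths_years, hazard_ratios):
        cumulative_hazard += rate * hr * width
        result.append(1.0 - math.exp(-cumulative_hazard))
    return result

# Made-up inputs: three 10-year bands with rising baseline incidence and a
# hazard ratio for a high-PGS stratum that shrinks with age.
baseline = cumulative_incidence([0.001, 0.003, 0.006], [10, 10, 10], [1.0, 1.0, 1.0])
high_pgs = cumulative_incidence([0.001, 0.003, 0.006], [10, 10, 10], [2.5, 2.0, 1.5])
for band, (b, h) in enumerate(zip(baseline, high_pgs), start=1):
    print(f"band {band}: baseline {b:.3f}, high PGS {h:.3f}")
```

Letting the hazard ratio differ by band is what allows an age-dependent PGS effect to shift the age at which a stratum reaches a given absolute risk.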
Molecular mechanisms remain unknown for most type 2 diabetes genome-wide association study identified loci. Variants associated with type 2 diabetes and fasting glucose levels reside in introns of ADCY5, a gene that encodes adenylate cyclase 5. Adenylate cyclase 5 catalyzes the production of cyclic AMP, which is a second messenger molecule involved in cell signaling and pancreatic β-cell insulin secretion. We demonstrated that type 2 diabetes risk alleles are associated with decreased ADCY5 expression in human islets and examined candidate variants for regulatory function. rs11708067 overlaps a predicted enhancer region in pancreatic islets. The type 2 diabetes risk rs11708067-A allele showed fewer H3K27ac ChIP-seq reads in human islets, lower transcriptional activity in reporter assays in rodent β-cells (rat 832/13 and mouse MIN6), and increased nuclear protein binding compared with the rs11708067-G allele. Homozygous deletion of the orthologous enhancer region in 832/13 cells resulted in a 64% reduction in expression level of ADCY5, but not of the adjacent gene, and a 39% reduction in insulin secretion. Together, these data suggest that the rs11708067-A risk allele contributes to type 2 diabetes by disrupting an islet enhancer, which results in reduced ADCY5 expression and impaired insulin secretion.
To evaluate aortic disease progression and reintervention after an initial thoracic aortic dissection in pathogenic variant carriers.
Of 175 participants diagnosed with thoracic aortic dissection, 31 had a pathogenic variant (pathogenic group) across 6 genes (COL3A1, FBN1, LOX, PRKG1, SMAD3, TGFBR2) identified by whole exome sequencing. Those with benign or normal variants (benign/normal group, n = 144) comprised the control group. Clinical data were collected through medical record review (1985-2018) and supplemented with the National Death Index database (December 2018).
The entire cohort (n = 175) consisted of 108 type A aortic dissections and 67 type B aortic dissections, similarly distributed between groups. The pathogenic group was significantly younger (43 vs 56 years, P < .0001) and had significantly more aortic root replacements and similar extents of arch replacement at initial type A aortic dissection repair. The median follow-up time was 7.5 (4.6-12) years. After initial treatment, the pathogenic group required significantly more aortic reinterventions (median 1 vs 0, P < .0001) and mean cumulative aortic reinterventions for each patient (10 years: 1 vs 0.5, P = .029). Both incidence rate (12%/year vs 1.2%/year, P = .0001) and cumulative incidence of reinterventions (9 years: 70% vs 6%, P < .0001) for the preserved native aortic root were significantly higher in the pathogenic group, but were similar for the preserved native aortic arch and distal aorta between groups. Ten-year survival was similar in the pathogenic and benign/normal groups (92% vs 85%).
Aggressive aortic root replacement and similar arch management should be considered in pathogenic variant carriers at initial type A aortic dissection repair compared with benign/normal variant carriers.
A summary of the key findings and their implications: Compared with the patients with benign and normal genetic variants, patients with pathogenic variants had a significantly higher rate of reinterventions only of the native aortic root after initial thoracic aortic dissection.
From whole organisms to individual cells, responses to environmental conditions are influenced by genetic makeup, where the effect of genetic variation on a trait depends on the environmental context. RNA-sequencing quantifies gene expression as a molecular trait, and is capable of capturing both genetic and environmental effects. In this study, we explore opportunities of using allele-specific expression (ASE) to discover cis-acting genotype-environment interactions (GxE), that is, genetic effects on gene expression that depend on an environmental condition. Treating 17 common clinical traits as approximations of the cellular environment of 267 skeletal muscle biopsies, we identify 10 candidate environmental response expression quantitative trait loci (reQTLs) across 6 traits (12 unique gene-environment trait pairs; 10% FDR per trait) including sex, systolic blood pressure, and low-density lipoprotein cholesterol. Although using ASE is in principle a promising approach to detect GxE effects, replication of such signals can be challenging as validation requires harmonization of environmental traits across cohorts and a sufficient sampling of heterozygotes for a transcribed SNP. Comprehensive discovery and replication will require large human transcriptome datasets, or the integration of multiple transcribed SNPs, coupled with standardized clinical phenotyping.
Methods of estimating polygenic scores (PGSs) from genome-wide association studies are increasingly utilized. However, independent method evaluation is lacking, and method comparisons are often limited. Here, we evaluate polygenic scores derived via seven methods in five biobank studies (totaling about 1.2 million participants) across 16 diseases and quantitative traits, building on a reference-standardized framework. We conducted meta-analyses to quantify the effects of method choice, hyperparameter tuning, method ensembling, and the target biobank on PGS performance. We found that no single method consistently outperformed all others. PGS effect sizes were more variable between biobanks than between methods within biobanks when methods were well tuned. Differences between methods were largest for the two investigated autoimmune diseases, seropositive rheumatoid arthritis and type 1 diabetes. For most methods, cross-validation was more reliable for tuning hyperparameters than automatic tuning (without the use of target data). For a given target phenotype, elastic net models combining PGS across methods (ensemble PGS) tuned in the UK Biobank provided consistent, high, and cross-biobank transferable performance, increasing PGS effect sizes (β coefficients) by a median of 5.0% relative to LDpred2 and MegaPRS (the two best-performing single methods when tuned with cross-validation). Our interactively browsable online results and open-source workflow prspipe provide a rich resource and reference for the analysis of polygenic scoring methods across biobanks.
Systematic evaluation of polygenic scoring methods in 1.2 million individuals across five biobanks finds that no single method performs best. Performance varied more between biobanks than between methods, suggesting that future research should address between-biobank variability. Ensembles provided high, robust, and transferable performance. Workflow and results browser are open source.
Background
Variation in mitochondrial DNA (mtDNA) has been indicated in migraine pathogenesis, but genetic studies to date have focused on candidate variants, with sparse findings. We aimed to perform the first mitochondrial genome-wide association study of migraine, examining both single variants and mitochondrial haplogroups.
Methods
In total, 71,860 participants from the population-based Nord-Trøndelag Health Study were genotyped. We excluded samples that did not pass quality control for nuclear genotypes, samples with a low call rate, and samples that were closely maternally related. We analysed 775 mitochondrial DNA variants in 4021 migraine cases and 14,288 headache-free controls, using logistic regression. In addition, we analysed 3831 cases and 13,584 controls who could be reliably assigned to a mitochondrial haplogroup. Lastly, we attempted to replicate previously reported mitochondrial DNA candidate variants.
Results
None of the mitochondrial variants or haplogroups were associated with migraine. In addition, none of the previously reported mtDNA candidate variants replicated in our data.
Conclusions
Our findings do not support a major role of mitochondrial genetic variation in migraine pathophysiology, but a larger sample is needed to detect rare variants and future studies should also examine heteroplasmic variation, epigenetic changes and copy-number variation.