Few studies have explored the impact of rare variants (minor allele frequency < 1%) on highly heritable plasma metabolites identified in metabolomic screens. The Finnish population provides an ideal opportunity for such explorations, given the multiple bottlenecks and expansions that have shaped its history, and the enrichment for many otherwise rare alleles that has resulted. Here, we report genetic associations for 1391 plasma metabolites in 6136 men from the late-settlement region of Finland. We identify 303 novel association signals, more than one third at variants rare or enriched in Finns. Many of these signals identify genes not previously implicated in metabolite genome-wide association studies and suggest mechanisms for diseases and disease-related traits.
Polygenic risk scores (PRSs) have shown promise in predicting susceptibility to common diseases. We estimated their added value in clinical risk prediction of five common diseases, using large-scale biobank data (FinnGen; n = 135,300) and the FINRISK study with clinical risk factors to test genome-wide PRSs for coronary heart disease, type 2 diabetes, atrial fibrillation, breast cancer and prostate cancer. We evaluated the lifetime risk at different PRS levels, and the impact on disease onset and on prediction together with clinical risk scores. Compared to having an average PRS, having a high PRS contributed 21% to 38% higher lifetime risk, and 4 to 9 years earlier disease onset. PRSs improved model discrimination over age and sex in type 2 diabetes, atrial fibrillation, breast cancer and prostate cancer, and over clinical risk in type 2 diabetes, breast cancer and prostate cancer. In all diseases, PRSs improved reclassification over clinical thresholds, with the largest net reclassification improvements for early-onset coronary heart disease, atrial fibrillation and prostate cancer. This study provides evidence for the additional value of PRSs in clinical disease prediction. The practical applications of polygenic risk information for stratified screening or for guiding lifestyle and medical interventions in the clinical setting remain to be defined in further studies.
Genome-wide association studies have identified several genetic variants associated with coronary heart disease (CHD). The aim of this study was to evaluate the genetic risk discrimination and reclassification and apply the results for a 2-stage population risk screening strategy for CHD.
We genotyped 28 genetic variants in 24 124 participants in 4 Finnish population-based, prospective cohorts (recruitment years 1992-2002). We constructed a multilocus genetic risk score and evaluated its association with incident cardiovascular disease events. During the median follow-up time of 12 years (interquartile range 8.75-15.25 years), we observed 1093 CHD, 1552 cardiovascular disease, and 731 acute coronary syndrome events. Adding genetic information to conventional risk factors and family history improved risk discrimination of CHD (C-index 0.856 versus 0.851; P=0.0002) and other end points (cardiovascular disease: C-index 0.840 versus 0.837, P=0.0004; acute coronary syndrome: C-index 0.859 versus 0.855, P=0.001). In a standard population of 100 000 individuals, additional genetic screening of subjects at intermediate risk for CHD would reclassify 2144 subjects (12%) into the high-risk category. Statin allocation for these subjects is estimated to prevent 135 CHD cases over 14 years. Similar results were obtained by external validation, where the effects were estimated from a training data set and applied to a test data set.
Genetic risk score improves risk prediction of CHD and helps to identify individuals at high risk for the first CHD event. Genetic screening for individuals at intermediate cardiovascular risk could help to prevent future cases through better targeting of statins.
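As an illustration of what a multilocus genetic risk score generally looks like, the following minimal Python sketch computes a weighted sum of risk-allele counts. It is not the authors' implementation: the function name, the per-variant weights, and the example values are hypothetical, and the weighting scheme used in the study is not stated here.

```python
from typing import Sequence


def genetic_risk_score(allele_counts: Sequence[int], weights: Sequence[float]) -> float:
    """Return the weighted sum of risk-allele counts for one individual.

    allele_counts: number of risk alleles carried at each variant (0, 1, or 2).
    weights: per-variant weights, e.g. log odds ratios (assumed, not from the source).
    """
    if len(allele_counts) != len(weights):
        raise ValueError("allele_counts and weights must have the same length")
    for count in allele_counts:
        if count not in (0, 1, 2):
            raise ValueError("each allele count must be 0, 1, or 2")
    return sum(count * weight for count, weight in zip(allele_counts, weights))


# Hypothetical individual genotyped at three variants with made-up weights.
example_counts = [0, 1, 2]
example_weights = [0.10, 0.05, 0.20]
example_score = genetic_risk_score(example_counts, example_weights)
print(f"genetic risk score: {example_score:.2f}")
```

In practice such a score is usually standardized across the cohort before being entered into a risk model alongside conventional risk factors.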
Genetics plays an important role in coronary heart disease (CHD) but the clinical utility of genomic risk scores (GRSs) relative to clinical risk scores, such as the Framingham Risk Score (FRS), is unclear. Our aim was to construct and externally validate a CHD GRS, in terms of lifetime CHD risk and relative to traditional clinical risk scores.
We generated a GRS of 49 310 SNPs based on a CARDIoGRAMplusC4D Consortium meta-analysis of CHD, then independently tested it using five prospective population cohorts (three FINRISK cohorts, combined n = 12 676, 757 incident CHD events; two Framingham Heart Study cohorts (FHS), combined n = 3406, 587 incident CHD events). The GRS was associated with incident CHD (FINRISK HR = 1.74, 95% confidence interval (CI) 1.61-1.86 per S.D. of GRS; Framingham HR = 1.28, 95% CI 1.18-1.38), and was largely unchanged by adjustment for known risk factors, including family history. Integration of the GRS with the FRS or ACC/AHA13 scores improved the 10-year risk prediction (meta-analysis C-index: +1.5-1.6%, P < 0.001), particularly for individuals ≥60 years old (meta-analysis C-index: +4.6-5.1%, P < 0.001). Importantly, the GRS captured substantially different trajectories of absolute risk, with men in the top 20% of the GRS distribution attaining 10% cumulative CHD risk 12-18 y earlier than those in the bottom 20%. High genomic risk was partially compensated for by low systolic blood pressure, low cholesterol level, and non-smoking.
A GRS based on a large number of SNPs improves CHD risk prediction and encodes different trajectories of lifetime risk not captured by traditional clinical risk scores.
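The C-index reported above measures how often a model ranks the person who develops disease earlier as higher risk. A minimal sketch of one common estimator (Harrell's concordance index) is given below, under stated assumptions: the exact estimator used in the study is not specified here, the function name and toy data are hypothetical, and pairs with tied event times are simply skipped.

```python
from typing import Sequence


def harrell_c_index(times: Sequence[float], events: Sequence[bool],
                    risks: Sequence[float]) -> float:
    """Estimate the concordance index for right-censored follow-up data.

    A pair (i, j) is comparable when subject i had an event and a strictly
    shorter follow-up time than subject j. The pair is concordant when the
    model assigned subject i the higher risk; tied risks count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier event in a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Hypothetical toy data: follow-up years, event indicator, predicted risk.
toy_times = [2.0, 4.0, 6.0, 8.0]
toy_events = [True, True, False, True]
toy_risks = [0.9, 0.7, 0.3, 0.1]
print(harrell_c_index(toy_times, toy_events, toy_risks))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why the improvements quoted in these abstracts are small in absolute terms.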
The Alzheimer's Disease Sequencing Project (ADSP) undertook whole exome sequencing in 5,740 late-onset Alzheimer disease (AD) cases and 5,096 cognitively normal controls primarily of European ancestry (EA), among whom 218 cases and 177 controls were Caribbean Hispanic (CH). An age-, sex- and APOE based risk score and family history were used to select cases most likely to harbor novel AD risk variants and controls least likely to develop AD by age 85 years. We tested ~1.5 million single nucleotide variants (SNVs) and 50,000 insertion-deletion polymorphisms (indels) for association to AD, using multiple models considering individual variants as well as gene-based tests aggregating rare, predicted functional, and loss of function variants. Sixteen single variants and 19 genes that met criteria for significant or suggestive associations after multiple-testing correction were evaluated for replication in four independent samples; three with whole exome sequencing (2,778 cases, 7,262 controls) and one with genome-wide genotyping imputed to the Haplotype Reference Consortium panel (9,343 cases, 11,527 controls). The top findings in the discovery sample were also followed-up in the ADSP whole-genome sequenced family-based dataset (197 members of 42 EA families and 501 members of 157 CH families). We identified novel and predicted functional genetic variants in genes previously associated with AD. We also detected associations in three novel genes: IGHG3 (p = 9.8 × 10^[…]), an immunoglobulin gene whose antibodies interact with β-amyloid, a long non-coding RNA AC099552.4 (p = 1.2 × 10^[…]), and a zinc-finger protein ZNF655 (gene-based p = 5.0 × 10^[…]). The latter two suggest an important role for transcriptional regulation in AD pathogenesis.
Despite evidence that genetic factors contribute to the duration of gestation and the risk of preterm birth, robust associations with genetic variants have not been identified. We used large data sets that included the gestational duration to determine possible genetic associations.
We performed a genomewide association study in a discovery set of samples obtained from 43,568 women of European ancestry using gestational duration as a continuous trait and term or preterm (<37 weeks) birth as a dichotomous outcome. We used samples from three Nordic data sets (involving a total of 8643 women) to test for replication of genomic loci that had significant genomewide association (P<5.0×10⁻⁸) or an association with suggestive significance (P<1.0×10^[…]) in the discovery set.
In the discovery and replication data sets, four loci (EBF1, EEFSEC, AGTR2, and WNT4) were significantly associated with gestational duration. Functional analysis showed that an implicated variant in WNT4 alters the binding of the estrogen receptor. The association between variants in ADCY5 and RAP2C and gestational duration had suggestive significance in the discovery set and significant evidence of association in the replication sets; these variants also showed genomewide significance in a joint analysis. Common variants in EBF1, EEFSEC, and AGTR2 showed association with preterm birth with genomewide significance. An analysis of mother-infant dyads suggested that these variants act at the level of the maternal genome.
In this genomewide association study, we found that variants at the EBF1, EEFSEC, AGTR2, WNT4, ADCY5, and RAP2C loci were associated with gestational duration and variants at the EBF1, EEFSEC, and AGTR2 loci with preterm birth. Previously established roles of these genes in uterine development, maternal nutrition, and vascular control support their mechanistic involvement. (Funded by the March of Dimes and others.)
Polygenic scores (PSs) are becoming a useful tool to identify individuals with high genetic risk for complex diseases, and several projects are currently testing their utility for translational applications. It is also tempting to use PSs to assess whether genetic variation can explain a part of the geographic distribution of a phenotype. However, it is not well known how the population genetic properties of the training and target samples affect the geographic distribution of PSs. Here, we evaluate geographic differences, and related biases, of PSs in Finland in a geographically well-defined sample of 2,376 individuals from the National FINRISK study. First, we detect geographic differences in PSs for coronary artery disease (CAD), rheumatoid arthritis, schizophrenia, waist-hip ratio (WHR), body-mass index (BMI), and height, but not for Crohn disease or ulcerative colitis. Second, we use height as a model trait to thoroughly assess the possible population genetic biases in PSs and apply similar approaches to the other phenotypes. Most importantly, we detect suspiciously large accumulations of geographic differences for CAD, WHR, BMI, and height, suggesting bias arising from the population’s genetic structure rather than from a direct genotype-phenotype association. This work demonstrates how sensitive the geographic patterns of current PSs are to small biases even within relatively homogeneous populations and provides simple tools to identify such biases. A thorough understanding of the effects of population genetic structure on PSs is essential for translational applications of PSs.
Genetic imputation is a cost-efficient way to improve the power and resolution of genome-wide association (GWA) studies. Current publicly accessible imputation reference panels accurately predict genotypes for common variants with minor allele frequency (MAF)≥5% and low-frequency variants (0.5≤MAF<5%) across diverse populations, but the imputation of rare variation (MAF<0.5%) is still rather limited. In the current study, we compare the imputation accuracy achieved with reference panels from diverse populations against that achieved with a population-specific high-coverage (30 ×) whole-genome sequencing (WGS) based reference panel, comprising 2244 Estonian individuals (0.25% of adult Estonians). Although the Estonian-specific panel contains fewer haplotypes and variants, the imputation confidence and accuracy of imputed low-frequency and rare variants were significantly higher. The results indicate the utility of population-specific reference panels for human genetic studies.
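The frequency classes used above (common: MAF ≥ 5%; low-frequency: 0.5% ≤ MAF < 5%; rare: MAF < 0.5%) can be expressed as a small helper. This is only an illustrative sketch; the function name is hypothetical and MAF is assumed to be given as a proportion between 0 and 0.5.

```python
def maf_class(maf: float) -> str:
    """Classify a variant by minor allele frequency, given as a proportion.

    Thresholds follow the definitions in the abstract above:
    common >= 0.05, low-frequency 0.005 <= MAF < 0.05, rare < 0.005.
    """
    if not 0.0 <= maf <= 0.5:
        raise ValueError("minor allele frequency must lie between 0 and 0.5")
    if maf >= 0.05:
        return "common"
    if maf >= 0.005:
        return "low-frequency"
    return "rare"


# Hypothetical frequencies, one per class.
for frequency in (0.20, 0.01, 0.001):
    print(frequency, maf_class(frequency))
```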
Despite great progress in identifying genetic variants that influence human disease, most inherited risk remains unexplained. A more complete understanding requires genome-wide studies that fully examine less common alleles in populations with a wide range of ancestry. To inform the design and interpretation of such studies, we genotyped 1.6 million common single nucleotide polymorphisms (SNPs) in 1,184 reference individuals from 11 global populations, and sequenced ten 100-kilobase regions in 692 of these individuals. This integrated data set of common and rare alleles, called 'HapMap 3', includes both SNPs and copy number polymorphisms (CNPs). We characterized population-specific differences among low-frequency variants, measured the improvement in imputation accuracy afforded by the larger reference panel, especially in imputing SNPs with a minor allele frequency of ≤5%, and demonstrated the feasibility of imputing newly discovered CNPs and SNPs. This expanded public resource of genome variants in global populations supports deeper interrogation of genomic variation and its role in human disease, and serves as a step towards a high-resolution map of the landscape of human genetic variation.
Early identification of ambulatory persons at high short-term risk of death could benefit targeted prevention. To identify biomarkers for all-cause mortality and enhance risk prediction, we conducted high-throughput profiling of blood specimens in two large population-based cohorts.
106 candidate biomarkers were quantified by nuclear magnetic resonance spectroscopy of non-fasting plasma samples from a random subset of the Estonian Biobank (n = 9,842; age range 18-103 y; 508 deaths during a median of 5.4 y of follow-up). Biomarkers for all-cause mortality were examined using stepwise proportional hazards models. Significant biomarkers were validated and incremental predictive utility assessed in a population-based cohort from Finland (n = 7,503; 176 deaths during 5 y of follow-up). Four circulating biomarkers predicted the risk of all-cause mortality among participants from the Estonian Biobank after adjusting for conventional risk factors: alpha-1-acid glycoprotein (hazard ratio [HR] 1.67 per 1-standard deviation increment, 95% CI 1.53-1.82, p = 5×10⁻³¹), albumin (HR 0.70, 95% CI 0.65-0.76, p = 2×10⁻¹⁸), very-low-density lipoprotein particle size (HR 0.69, 95% CI 0.62-0.77, p = 3×10⁻¹²), and citrate (HR 1.33, 95% CI 1.21-1.45, p = 5×10⁻¹⁰). All four biomarkers were predictive of cardiovascular mortality, as well as death from cancer and other nonvascular diseases. One in five participants in the Estonian Biobank cohort with a biomarker summary score within the highest percentile died during the first year of follow-up, indicating prominent systemic reflections of frailty. The biomarker associations all replicated in the Finnish validation cohort. Including the four biomarkers in a risk prediction score improved risk assessment for 5-y mortality (increase in C-statistics 0.031, p = 0.01; continuous reclassification improvement 26.3%, p = 0.001).
Biomarker associations with cardiovascular, nonvascular, and cancer mortality suggest novel systemic connectivities across seemingly disparate morbidities. The biomarker profiling improved prediction of the short-term risk of death from all causes above established risk factors. Further investigations are needed to clarify the biological mechanisms and the utility of these biomarkers for guiding screening and prevention.