We introduce cross-trait penalized regression (CTPR), a powerful and practical approach for multi-trait polygenic risk prediction in large cohorts. Specifically, we propose a novel cross-trait penalty function with the Lasso and the minimax concave penalty (MCP) to incorporate the shared genetic effects across multiple traits for large-sample GWAS data. Our approach extracts information from the secondary traits that is beneficial for predicting the primary trait based on individual-level genotypes and/or summary statistics. Our novel implementation of a parallel computing algorithm makes it feasible to apply our method to biobank-scale GWAS data. We illustrate our method using large-scale GWAS data (~1M SNPs) from the UK Biobank (N = 456,837). We show that our multi-trait method outperforms the recently proposed multi-trait analysis of GWAS (MTAG) in predictive performance. The prediction accuracy for height with the aid of BMI improves from R² = 35.8% (MTAG) to 42.5% (MCP + CTPR) or 42.8% (Lasso + CTPR) with UK Biobank data.
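The two base penalties named above have standard textbook forms, sketched here for a single coefficient. This is only an illustration under assumptions: the function names and the parameters `lam` and `gamma` are hypothetical, and the cross-trait term of CTPR itself is not reproduced because the abstract does not give its form.

```python
def lasso_penalty(beta, lam):
    """L1 (Lasso) penalty for one coefficient: lam * |beta|."""
    return lam * abs(beta)


def mcp_penalty(beta, lam, gamma):
    """Minimax concave penalty (MCP) for one coefficient.

    Standard definition (not taken from the abstract):
      lam*|beta| - beta^2 / (2*gamma)   if |beta| <= gamma*lam
      gamma * lam^2 / 2                 otherwise
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    b = abs(beta)
    if b <= gamma * lam:
        return lam * b - b * b / (2.0 * gamma)
    return 0.5 * gamma * lam * lam


if __name__ == "__main__":
    # Hypothetical coefficient values, chosen only to show the shapes.
    for beta in (0.1, 0.5, 2.0, 5.0):
        print(beta, lasso_penalty(beta, 1.0), mcp_penalty(beta, 1.0, 3.0))
```

Unlike the Lasso penalty, which keeps growing with the coefficient, the MCP levels off once the coefficient exceeds `gamma * lam`, so large effects are shrunk less.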
Breast parenchymal texture features, including grayscale variation (V), capture the patterns of texture variation on a mammogram and are associated with breast cancer risk, independent of mammographic density (MD). However, knowledge of the genetic basis of these texture features is limited.
We conducted a genome-wide association study of V in 7040 European-ancestry women. V assessments were generated from digitized film mammograms. We used linear regression to test the single-nucleotide polymorphism (SNP)-phenotype associations adjusting for age, body mass index (BMI), MD phenotypes, and the top four genetic principal components. We further calculated genetic correlations and performed SNP-set tests of V with MD, breast cancer risk, and other breast cancer risk factors.
We identified three genome-wide significant loci associated with V: rs138141444 (6q24.1) in ECT2L, rs79670367 (8q24.22) in LINC01591, and rs113174754 (12q22) near PGAM1P5. 6q24.1 and 8q24.22 have not previously been associated with MD phenotypes or breast cancer risk, while 12q22 is a known locus for both MD and breast cancer risk. Among known MD and breast cancer risk SNPs, we identified four variants that were associated with V at the Bonferroni-corrected thresholds accounting for the number of SNPs tested: rs335189 (5q23.2) in PRDM6, rs13256025 (8p21.2) in EBF2, rs11836164 (12p12.1) near SSPN, and rs17817449 (16q12.2) in FTO. We observed significant genetic correlations between V and mammographic dense area (r_g = 0.79, P = 5.91 × 10^[…]), percent density (r_g = 0.73, P = 1.00 × 10^[…]), and adult BMI (r_g = −0.36, P = 3.88 × 10^[…]). Additional significant relationships were observed for non-dense area (z = −4.14, P = 3.42 × 10⁻⁵), estrogen receptor-positive breast cancer (z = 3.41, P = 6.41 × 10⁻⁴), and childhood body fatness (z = −4.91, P = 9.05 × 10⁻⁷) from the SNP-set tests.
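As a rough illustration of how a z statistic relates to a P-value, the sketch below converts the three z-scores quoted above into two-sided P-values. It assumes a standard normal null distribution and a two-sided test, which the abstract does not state; the function name is hypothetical.

```python
import math


def two_sided_p_from_z(z):
    """Two-sided P-value for a z-score, assuming a standard normal null."""
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    # z-scores as quoted in the abstract for the SNP-set tests.
    z_scores = {
        "non-dense area": -4.14,
        "ER-positive breast cancer": 3.41,
        "childhood body fatness": -4.91,
    }
    for label, z in z_scores.items():
        print(f"{label}: z = {z:+.2f}, P = {two_sided_p_from_z(z):.2e}")
```

Because the quoted z-scores are rounded to two decimals, values computed this way can differ slightly in the last digits from P-values calculated on unrounded statistics.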
These findings provide new insights into the genetic basis of mammographic texture variation and its associations with MD, breast cancer risk, and other breast cancer risk factors.
There is strong evidence that environmental risk factors contribute to susceptibility to multiple keratinocyte cancers (mKCs), but whether genes are also involved in mKC susceptibility has not been thoroughly investigated. We investigated whether single nucleotide polymorphisms (SNPs) are associated with susceptibility to mKCs. A genome-wide association study (GWAS) of 1,666 cases with mKCs and 1,950 cases with a single KC (sKCs; controls) from Harvard cohorts (the Nurses' Health Study [NHS], NHS II, and the Health Professionals Follow-Up Study) and the Framingham Heart Study was carried out using over 8 million SNPs (stage 1). We sought to replicate the most significant statistical associations (p-value ≤ 5.5 × 10⁻⁶) in an independent cohort of 574 mKCs and 872 sKCs from the Rotterdam Study (stage 2). In the discovery stage, 40 SNPs with suggestive associations (p-value ≤ 5.5 × 10⁻⁶) were identified, with eight independent SNPs tagging all 40 SNPs. The most significant SNP was located on chromosome 9 (rs7468390; p-value = 3.92 × 10⁻⁷). In stage 2, none of these SNPs replicated, and only two of them were associated with mKCs in the same direction in the combined meta-analysis. We tested the associations for 19 previously reported basal cell carcinoma (BCC)-related SNPs (candidate gene association analysis) and found that rs1805007 (MC1R locus) was significantly associated with risk of mKCs (p-value = 2.80 × 10⁻⁴). Although the SNPs suggestively associated with susceptibility to mKCs were not replicated, we found that previously identified BCC variants may also be associated with mKCs, with the most significant association (rs1805007) located at the MC1R gene.
Background
While polygenic risk scores (PRS) hold significant promise in estimating an individual's risk of developing a complex trait such as obesity, their application in the clinic has, to date, been limited by a lack of data from non‐European populations. As a collaboration model of the International Hundred K+ Cohorts Consortium (IHCC), we endeavored to develop a globally applicable trans‐ethnic PRS for body mass index (BMI) through this relatively new international effort.
Methods
The polygenic risk score (PRS) model was developed, trained and tested at the Center for Applied Genomics (CAG) of The Children's Hospital of Philadelphia (CHOP) based on a BMI meta‐analysis from the GIANT consortium. The validated PRS models were subsequently disseminated to the participating sites. Scores were generated by each site locally on their cohorts and summary statistics returned to CAG for final analysis.
Results
We show that, in the absence of a well‐powered trans‐ethnic GWAS from which to derive marker SNPs and effect estimates for PRS, trans‐ethnic scores can be generated from European ancestry GWAS using Bayesian approaches such as LDpred, by adjusting the summary statistics using trans‐ethnic linkage disequilibrium reference panels. The ported trans‐ethnic scores outperform population‐specific PRS across all non‐European ancestry populations investigated, including East Asians and a three‐way admixed Brazilian cohort.
Conclusions
Here we show that, for a truly polygenic trait such as BMI, adjusting the summary statistics of a well‐powered European ancestry study using a trans‐ethnic LD reference results in a score that is predictive across a range of ancestries, including East Asians and three‐way admixed Brazilians.
This study is a project for the newly established global consortium, the International Hundred K+ Cohorts Consortium.
Polygenic risk scores have significant potential to inform clinical risk; however, research efforts in minor populations are warranted to avoid health disparities.
We present an international collaborative effort on the development of a trans‐ethnic PRS for BMI.
Abstract
Aims
To investigate whether a metabolic signature composed of multiple plasma metabolites can be used to characterize adherence and metabolic response to the Mediterranean diet and whether such a metabolic signature is associated with cardiovascular disease (CVD) risk.
Methods and results
Our primary study cohort included 1859 participants from the Spanish PREDIMED trial, and validation cohorts included 6868 participants from the US Nurses’ Health Studies I and II and the Health Professionals Follow-up Study (NHS/HPFS). Adherence to the Mediterranean diet was assessed using a validated Mediterranean Diet Adherence Screener (MEDAS), and the plasma metabolome was profiled by liquid chromatography-tandem mass spectrometry. We observed substantial metabolomic variation with respect to Mediterranean diet adherence, with nearly one-third of the assayed metabolites significantly associated with MEDAS (false discovery rate < 0.05). Using elastic net regularized regressions, we identified a metabolic signature, comprising 67 metabolites, robustly correlated with Mediterranean diet adherence in both PREDIMED and NHS/HPFS (r = 0.28–0.37 between the signature and MEDAS; P = 3 × 10⁻³⁵ to 4 × 10⁻¹¹⁸). In multivariable Cox regressions, the metabolic signature showed a significant inverse association with CVD incidence after adjusting for known risk factors (PREDIMED: hazard ratio [HR] per standard deviation [SD] increment in the signature = 0.71, P < 0.001; NHS/HPFS: HR = 0.85, P = 0.001), and the association persisted after further adjustment for MEDAS scores (PREDIMED: HR = 0.73, P = 0.004; NHS/HPFS: HR = 0.85, P = 0.004). Further genome-wide association analysis revealed that the metabolic signature was significantly associated with genetic loci involved in fatty acid and amino acid metabolism. Mendelian randomization analyses showed that the genetically inferred metabolic signature was significantly associated with risk of coronary heart disease (CHD) and stroke (odds ratios per SD increment in the genetically inferred metabolic signature = 0.92 for CHD and 0.91 for stroke; P < 0.001).
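The abstract does not say which Mendelian randomization estimator was used. As one common possibility, the sketch below implements the fixed-effect inverse-variance weighted (IVW) estimator from per-variant summary statistics. The function name and all input numbers are hypothetical and are not study data.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW Mendelian randomization estimate.

    beta_exposure: per-variant effects on the exposure (e.g. the signature)
    beta_outcome:  per-variant effects on the outcome (log odds ratios)
    se_outcome:    standard errors of the outcome effects
    Returns (estimate, standard_error) on the log odds ratio scale.
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("inputs must have the same length")
    numerator = 0.0
    denominator = 0.0
    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome):
        weight = 1.0 / (se * se)
        numerator += weight * bx * by
        denominator += weight * bx * bx
    if denominator == 0.0:
        raise ValueError("no informative variants")
    return numerator / denominator, math.sqrt(1.0 / denominator)


if __name__ == "__main__":
    # Hypothetical summary statistics for three variants (illustration only).
    bx = [0.10, 0.20, 0.15]
    by = [-0.010, -0.018, -0.012]
    se = [0.010, 0.012, 0.011]
    est, err = ivw_estimate(bx, by, se)
    print("odds ratio per unit of exposure:", math.exp(est))
```

Exponentiating the estimate gives an odds ratio per unit increase in the genetically predicted exposure, which is the scale on which the abstract reports its results.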
Conclusions
We identified a metabolic signature that robustly reflects adherence and metabolic response to a Mediterranean diet, and predicts future CVD risk independent of traditional risk factors, in Spanish and US cohorts.
Adiposity has been consistently associated with gallstone disease risk. We aimed to characterize associations of anthropometric measures (body mass index [BMI], recent weight change, long-term weight change, waist circumference, and waist-to-hip ratio) with symptomatic gallstone disease according to strata of gallstone disease polygenic risk score (PRS).
We conducted an analysis among 34,626 participants with available genome-wide genetic data within 3 large, prospective, U.S. cohorts: the Nurses' Health Study (NHS), Health Professionals Follow-Up Study, and NHS II. We characterized joint associations of PRS and anthropometric measures and tested for interactions on the relative and absolute risk scales.
Women in the highest BMI and PRS categories (BMI ≥30 kg/m² and PRS ≥1 SD above mean) had an odds ratio for gallstone disease of 5.55 (95% confidence interval, 5.29 to 5.81) compared with those in the lowest BMI and PRS categories (BMI <25 kg/m² and PRS <1 SD below the mean). The corresponding odds ratio among men was 1.65 (95% confidence interval, 1.02 to 2.29). Associations for BMI did not vary within strata of PRS on the relative risk scale. On the absolute risk scale, the incidence rate difference between obese and normal-weight individuals was 1086 per 100,000 person-years within the highest PRS category, compared with 666 per 100,000 person-years in the lowest PRS category, with strong evidence for interaction with the ABCG8 locus.
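Interaction on the absolute scale can be read off as the difference between the two incidence rate differences quoted above. The sketch below does that arithmetic; the function and variable names are hypothetical, and it is not the formal interaction test used in the study.

```python
def additive_interaction_contrast(rate_diff_high, rate_diff_low):
    """Excess rate difference in the high-risk stratum over the low-risk one.

    Both inputs are incidence rate differences (obese minus normal weight)
    in the same units, here cases per 100,000 person-years.
    """
    return rate_diff_high - rate_diff_low


# Rate differences as quoted in the abstract, per 100,000 person-years.
RD_HIGHEST_PRS = 1086
RD_LOWEST_PRS = 666

if __name__ == "__main__":
    excess = additive_interaction_contrast(RD_HIGHEST_PRS, RD_LOWEST_PRS)
    print(f"Excess rate difference: {excess} per 100,000 person-years")  # 420
```

A positive contrast means that the same difference in BMI category corresponds to more cases in absolute terms among those with higher genetic risk, even when relative associations are similar across strata.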
While maintenance of a healthy body weight reduces gallstone disease risk among all individuals, risk reduction is higher among the subset with greater genetic susceptibility to gallstone disease.
Objective
This study aimed to uncover genetic contributors to adiposity in early life.
Methods
A genome‐wide association study of childhood body fatness in 34,401 individuals within the Nurses’ Health Studies and the Health Professionals Follow‐up Study was conducted. Data were imputed to the 1000 Genomes Phase 3 version 5 reference panel.
Results
A total of 1,354 single‐nucleotide polymorphisms (P < 10⁻⁴) were selected for replication in a previously published genome‐wide association study of childhood BMI. Nineteen genome‐wide significant (P < 5 × 10⁻⁸) regions were observed, fourteen of which were previously associated with childhood obesity and five of which were novel: BDNF (P = 7.58 × 10⁻¹³), PRKD1 (P = 1.43 × 10⁻¹⁰), 20p13 (P = 2.05 × 10⁻¹⁰), FHIT (P = 1.77 × 10⁻⁸), and LOC101927575 (P = 3.22 × 10⁻⁸). The BDNF, FHIT, and PRKD1 regions were previously associated with adult BMI. The LOC101927575 and 20p13 regions have not previously been associated with adiposity phenotypes. In a transcriptome‐wide analysis, associations with POMC at 2p23.3 (P = 3.36 × 10⁻⁶) and TMEM18 at 2p25.3 (P = 3.53 × 10⁻⁷) were observed. Childhood body fatness was genetically correlated with hip (r_g = 0.42, P = 4.44 × 10⁻¹⁶) and waist circumference (r_g = 0.39, P = 5.56 × 10⁻¹⁶), as well as age at menarche (r_g = −0.37, P = 7.96 × 10⁻¹⁹).
Conclusions
Additional loci that contribute to childhood adiposity were identified, further explicating its genetic architecture.
Urinary incontinence and fecal incontinence are common disorders in women that negatively impact quality of life. In addition to known health and lifestyle risk factors, genetics may have a role in continence. Identification of genetic variants associated with urinary incontinence and fecal incontinence could result in a better understanding of etiologic pathways, and new interventions and treatments.
We previously generated genome-wide single nucleotide polymorphism data from Nurses' Health Studies participants. The participants provided longitudinal urinary incontinence and fecal incontinence information via questionnaires. Cases of urinary incontinence (6,120) had at least weekly urinary incontinence reported on a majority of questionnaires (3 or 4 across 12 to 16 years) while controls (4,811) consistently had little to no urinary incontinence reported. We classified cases of urinary incontinence in women into stress (1,809), urgency (1,942) and mixed (2,036) subtypes. Cases of fecal incontinence (4,247) had at least monthly fecal incontinence reported on a majority of questionnaires while controls (11,634) consistently had no fecal incontinence reported. We performed a genome-wide association study for each incontinence outcome.
We identified 8 single nucleotide polymorphisms significantly associated (p < 5 × 10⁻⁸) with urinary incontinence located in 2 loci, chromosomes 8q23.3 and 1p32.2. There were no genome-wide significant findings for the urinary incontinence subtype analyses. However, the significant associations for overall urinary incontinence were stronger for the urgency and mixed subtypes than for stress. While no single nucleotide polymorphism reached genome-wide significance for fecal incontinence, 4 single nucleotide polymorphisms had p < 10^[…].
Few studies have collected genetic data and detailed urinary incontinence and fecal incontinence information. This genome-wide association study provides initial evidence of genetic associations for urinary incontinence and merits further research to replicate our findings and identify additional risk variants.
Abstract
Background
Increasing evidence suggests that conventional adenomas (CAs) and serrated polyps (SPs) represent two distinct groups of precursor lesions for colorectal cancer (CRC). The influence of common genetic variants on risk of CAs and SPs remains largely unknown.
Methods
Among 27 426 participants within three prospective cohort studies, we created a weighted genetic risk score (GRS) based on 40 CRC-related single nucleotide polymorphisms (SNPs) identified in previous genome-wide association studies; and we examined the association of GRS (per one standard deviation increment) with risk of CAs, SPs and synchronous CAs and SPs, by multivariable logistic regression. We also analysed individual variants in the secondary analysis.
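A weighted genetic risk score of this kind is typically a weighted sum of risk-allele counts, rescaled so that associations can be reported per one standard deviation. The sketch below assumes the weights are per-SNP log odds ratios from earlier GWAS, which the abstract does not specify; the function names, genotypes, and weights are hypothetical.

```python
import statistics


def weighted_grs(allele_counts, weights):
    """Weighted sum of risk-allele counts (0, 1, or 2 per SNP) for one person."""
    if len(allele_counts) != len(weights):
        raise ValueError("allele_counts and weights must have the same length")
    return sum(w * c for w, c in zip(weights, allele_counts))


def standardize(scores):
    """Rescale scores to mean 0 and sample SD 1, so one unit equals one SD."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)
    return [(s - mean) / sd for s in scores]


if __name__ == "__main__":
    # Hypothetical weights for three SNPs and genotypes for four people.
    weights = [0.10, 0.05, 0.20]
    genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [0, 0, 0]]
    raw = [weighted_grs(g, weights) for g in genotypes]
    print(standardize(raw))
```

The standardized score can then be entered as a covariate in a logistic regression, so that its coefficient is a log odds ratio per one SD increment.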
Results
During 18–20 years of follow-up, we documented 2952 CAs, 1585 SPs and 794 synchronous CAs and SPs. Higher GRS was associated with increased risk of CAs (odds ratio [OR] = 1.17, 95% confidence interval [CI]: 1.12–1.21) and SPs (OR = 1.09, 95% CI: 1.03–1.14), with a stronger association for CAs than SPs (P for heterogeneity = 0.01). Even stronger associations were found for patients with synchronous CAs and SPs (OR = 1.32), advanced CAs (OR = 1.22) and multiple CAs (OR = 1.25). Different sets of variants were associated with CAs and SPs, with a Spearman correlation coefficient of 0.02 between the ORs associating the 40 SNPs with the two lesions. After correcting for multiple testing, three variants were associated with CAs (rs3802842, rs6983267 and rs7136702) and two with SPs (rs16892766 and rs4779584).
Conclusions
Common genetic variants play a potential role in the conventional and serrated pathways of CRC. Different sets of variants are identified for the two pathways, further supporting the aetiological heterogeneity of CRC.