Genome-wide association studies (GWAS) have successfully identified tens of thousands of genetic variants associated with various phenotypes, but together they explain only a fraction of heritability, suggesting many variants have yet to be discovered. Recently it has been recognized that incorporating functional information of genetic variants can improve power for identifying novel loci. For example, S-PrediXcan and TWAS tested the association of predicted gene expression with phenotypes based on GWAS summary statistics by leveraging the information on genetic regulation of gene expression and found many novel loci. However, as genetic variants may have effects on more than one gene and through different mechanisms, these methods likely only capture part of the total effects of these variants. In this paper, we propose a summary statistics-based mixed effects score test (sMiST) that tests for the total effect of a set of variants, combining the effect of the mediator, obtained by imputing genetically predicted gene expression as in S-PrediXcan and TWAS, with the direct effects of individual variants. It allows for multiple functional annotations and multiple genetically predicted mediators. It can also perform conditional association analysis while adjusting for other genetic variants (e.g., known loci for the phenotype). Extensive simulation and real data analyses demonstrate that sMiST yields p-values that agree well with those obtained from individual-level data but with substantially improved computational speed. Importantly, a broad application of sMiST to GWAS is possible, as only summary statistics of genetic variant associations are required. We apply sMiST to a large-scale GWAS of colorectal cancer using summary statistics from ∼120,000 study participants and gene expression data from the Genotype-Tissue Expression (GTEx) project. We identify several novel and secondary independent genetic loci.
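As a rough illustration of the predicted-expression component mentioned above, the sketch below computes a TWAS-style gene-level z-score from SNP-level summary statistics, z_gene = wᵀz / sqrt(wᵀRw), assuming standardized genotypes. It is not the sMiST test itself, which additionally tests the direct effects of individual variants. The function name, the expression weights, the SNP z-scores, and the LD matrix are all hypothetical values chosen for illustration.

```python
import math


def predicted_expression_z(weights, snp_z, ld):
    """TWAS-style gene z-score: w'z / sqrt(w' R w), assuming standardized genotypes."""
    n = len(weights)
    numerator = sum(w * z for w, z in zip(weights, snp_z))
    # Variance of the weighted sum of correlated SNP z-scores under the null.
    variance = sum(weights[i] * ld[i][j] * weights[j]
                   for i in range(n) for j in range(n))
    if variance <= 0:
        raise ValueError("w' R w must be positive")
    return numerator / math.sqrt(variance)


# Hypothetical inputs: expression weights, GWAS z-scores, and an LD matrix.
weights = [0.5, -0.2, 0.3]
snp_z = [2.1, -1.0, 1.5]
ld = [[1.0, 0.2, 0.0],
      [0.2, 1.0, 0.1],
      [0.0, 0.1, 1.0]]

z_gene = predicted_expression_z(weights, snp_z, ld)
# Two-sided normal p-value for the gene-level z-score.
p_gene = math.erfc(abs(z_gene) / math.sqrt(2))
print(z_gene, p_gene)
```

Only summary statistics and a reference LD matrix enter the calculation, which is why this kind of test can be applied without individual-level data.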
Guidelines for initiating colorectal cancer (CRC) screening are based on family history but do not consider lifestyle, environmental, or genetic risk factors. We developed models to determine risk of CRC, based on lifestyle and environmental factors and genetic variants, and to identify an optimal age to begin screening.
We collected data from 9748 CRC cases and 10,590 controls in the Genetics and Epidemiology of Colorectal Cancer Consortium and the Colorectal Transdisciplinary study, from 1992 through 2005. Half of the participants were used to develop the risk determination model and the other half were used to evaluate the discriminatory accuracy (validation set). Models of CRC risk were created based on family history, 19 lifestyle and environmental factors (E-score), and 63 CRC-associated single-nucleotide polymorphisms identified in genome-wide association studies (G-score). We evaluated the discriminatory accuracy of the models by calculating area under the receiver operating characteristic curve values, adjusting for study, age, and endoscopy history for the validation set. We used the models to project the 10-year absolute risk of CRC for a given risk profile and recommend ages to begin screening in comparison to CRC risk for an average individual at 50 years of age, using external population incidence rates for non-Hispanic whites from the Surveillance, Epidemiology, and End Results program registry.
In our models, E-score and G-score each determined risk of CRC with greater accuracy than family history. A model that combined both scores and family history estimated CRC risk with an area under the receiver operating characteristic curve value of 0.63 (95% confidence interval, 0.62–0.64) for men and 0.62 (95% confidence interval, 0.61–0.63) for women; area under the receiver operating characteristic curve values based on only family history ranged from 0.53 to 0.54 and those based only on E-score or G-score ranged from 0.59 to 0.60. Although screening is recommended to begin at age 50 years for individuals with no family history of CRC, starting ages calculated based on combined E-score and G-score differed by 12 years for men and 14 for women, for individuals with the highest vs the lowest 10% of risk.
We used data from 2 large international consortia to develop CRC risk calculation models that included genetic and environmental factors along with family history. These determine risk of CRC and starting ages for screening with greater accuracy than the family history only model, which is based on the current screening guideline. These scoring systems might serve as a first step toward developing individualized CRC prevention strategies.
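The discriminatory accuracy reported above is an area under the receiver operating characteristic curve (AUC). As a minimal sketch, the AUC can be computed as the probability that a randomly chosen case has a higher risk score than a randomly chosen control, with ties counted as one half. This version is unadjusted, whereas the study adjusted for study, age, and endoscopy history; the function name and the scores below are hypothetical.

```python
def auc(case_scores, control_scores):
    """Unadjusted AUC as the Mann-Whitney probability P(case score > control score)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # ties contribute one half
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical risk scores for four cases and four controls.
example_auc = auc([0.8, 0.6, 0.55, 0.4], [0.7, 0.5, 0.4, 0.3])
print(example_auc)
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of cases and controls.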
Background & Aims Risk for colorectal cancer (CRC) can be greatly reduced through screening. To aid in the development of screening strategies, we refined models designed to determine risk of CRC by incorporating information from common genetic susceptibility loci. Methods By using data collected from more than 12,000 participants in 6 studies performed from 1990 through 2011 in the United States and Germany, we developed risk determination models based on sex, age, family history, genetic risk score (number of risk alleles carried at 27 validated common CRC susceptibility loci), and history of endoscopic examinations. The model was validated using data collected from approximately 1800 participants in the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial, conducted from 1993 through 2001 in the United States. Results We identified a CRC genetic risk score that independently predicted which patients in the training set would develop CRC. Compared with determination of risk based only on family history, adding the genetic risk score increased the discriminatory accuracy from 0.51 to 0.59 (P = .0028) for men and from 0.52 to 0.56 (P = .14) for women. We calculated age- and sex-specific 10-year CRC absolute risk estimates based on the number of risk alleles, family history, and history of endoscopic examinations. A model that included a genetic risk score better determined the recommended starting age for screening in subjects with and without family histories of CRC. The starting age for high-risk men (family history of CRC and genetic risk score, 90%) was 42 years, and for low-risk men (no family history of CRC and genetic risk score, 10%) was 52 years. For men with no family history and a high genetic risk score (90%), the starting age would be 47 years; this is an intermediate value that is 5 years earlier than it would be for men with a genetic risk score of 10%. Similar trends were observed in women.
Conclusions By incorporating information on CRC risk alleles, we created a model to determine the risk for CRC more accurately. This model might be used to develop screening and prevention strategies.
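The idea of a risk-based starting age can be sketched as follows: find the earliest age at which a person's 10-year absolute risk reaches the risk of an average individual at the reference age of 50. The sketch makes a strong simplifying assumption, namely that a profile's absolute risk is the average risk multiplied by a constant relative risk, and it ignores competing mortality. The function name and the baseline risks are hypothetical and are not taken from the study.

```python
def recommended_start_age(avg_risk_by_age, relative_risk, reference_age=50):
    """Earliest age at which relative_risk * average 10-year risk reaches
    the average 10-year risk at the reference age (None if never reached)."""
    threshold = avg_risk_by_age[reference_age]
    for age in sorted(avg_risk_by_age):
        if relative_risk * avg_risk_by_age[age] >= threshold:
            return age
    return None


# Hypothetical average 10-year absolute risks by age (not from the study).
avg_risk = {40: 0.0020, 42: 0.0028, 45: 0.0040, 47: 0.0050,
            50: 0.0062, 52: 0.0075, 55: 0.0095}

print(recommended_start_age(avg_risk, 2.0))  # higher-risk profile
print(recommended_start_age(avg_risk, 0.8))  # lower-risk profile
```

A profile with a relative risk above 1 reaches the threshold before age 50, and one with a relative risk below 1 reaches it later, which mirrors the direction of the starting ages reported above.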
A sizable fraction of colorectal cancer (CRC) is expected to be explained by heritable factors, with heritability estimates ranging from 12 to 35% in twin and family studies. Genome-wide association studies (GWAS) have successfully identified a number of common single-nucleotide polymorphisms (SNPs) associated with CRC risk. Although it has been shown that these CRC susceptibility SNPs only explain a small proportion of the genetic risk, it is not clear how much of the heritability these SNPs explain and how much is left to be detected by other, yet to be identified, common SNPs. Therefore, we estimated the heritability of CRC under different scenarios using Genome-Wide Complex Trait Analysis in the Genetics and Epidemiology of Colorectal Cancer Consortium including 8025 cases and 10 814 controls. We estimated that the heritability explained by known common CRC SNPs identified in GWAS was 0.65% (95% CI: 0.3-1%; P = 1.11 × 10^-16), whereas the heritability explained by all common SNPs was at least 7.42% (95% CI: 4.71-10.12%; P = 8.13 × 10^-8), suggesting that many common variants associated with CRC risk remain to be detected. Comparing the heritability explained by the common variants with that from twin and family studies, a fraction of the heritability may be explained by other genetic variants, such as rare variants. In addition, our analysis showed that the gene × smoking interaction explained a significant proportion of the CRC variance (P = 1.26 × 10^-2). In summary, our results suggest that known CRC SNPs only explain a small proportion of the heritability and more common SNPs have yet to be identified.
While evidence indicates that Fusobacterium nucleatum (F. nucleatum) may promote colorectal carcinogenesis through its suppressive effect on T-cell-mediated antitumor immunity, the specific T-cell subsets involved remain uncertain.
We measured F. nucleatum DNA within tumor tissue by quantitative PCR on 933 cases (including 128 F. nucleatum-positive cases) among 4,465 incident colorectal carcinoma cases in two prospective cohorts. Multiplex immunofluorescence combined with digital image analysis and machine learning algorithms for CD3, CD4, CD8, CD45RO (PTPRC isoform), and FOXP3 measured various T-cell subsets. We leveraged data on microsatellite instability (MSI), tumor whole-exome sequencing, and M1/M2-type tumor-associated macrophages (TAM; by CD68, CD86, IRF5, MAF, and MRC1 [CD206] multimarker assay). Using the 4,465 cancer cases and inverse probability weighting method to control for selection bias due to tissue availability, multivariable-adjusted logistic regression analysis assessed the association between F. nucleatum and T-cell subsets.
The amount of F. nucleatum was inversely associated with tumor stromal CD3+ lymphocytes (multivariable OR, 0.47; 95% confidence interval [CI], 0.28-0.79, for F. nucleatum-high vs. -negative category; P = 0.0004) and specifically stromal CD3+CD4+CD45RO+ cells (corresponding multivariable OR, 0.52; 95% CI, 0.32-0.85; P = 0.003). These relationships did not substantially differ by MSI status, neoantigen load, or exome-wide tumor mutational burden. F. nucleatum was not significantly associated with tumor intraepithelial T cells or with M1 or M2 TAMs.
The amount of tissue F. nucleatum is associated with lower density of stromal memory helper T cells. Our findings provide evidence for the interactive pathogenic roles of microbiota and specific immune cells.
Identification of new genetic markers may improve the prediction of colorectal cancer prognosis. Our objective was to examine genome-wide associations of germline genetic variants with disease-specific survival in an analysis of 16,964 cases of colorectal cancer. We analyzed genotype and colorectal cancer-specific survival data from a consortium of 15 studies. Approximately 7.5 million SNPs were examined under the log-additive model using Cox proportional hazards models, adjusting for clinical factors and principal components. Additionally, we ran secondary analyses stratifying by tumor site and disease stage. We used a genome-wide p-value threshold of 5 × 10^-8 to assess statistical significance. No variants were statistically significantly associated with disease-specific survival in the full case analysis or in the stage-stratified analyses. Three SNPs were statistically significantly associated with disease-specific survival for cases with tumors located in the distal colon (rs698022, HR = 1.48, CI 1.30-1.69, p = 8.47 × 10) and the proximal colon (rs189655236, HR = 2.14, 95% CI 1.65-2.77, p = 9.19 × 10 and rs144717887, HR = 2.01, 95% CI 1.57-2.58, p = 3.14 × 10), whereas no associations were detected for rectal tumors. Findings from this large genome-wide association study highlight the potential for anatomical-site-stratified genome-wide studies to identify germline genetic risk variants associated with colorectal cancer-specific survival. Larger sample sizes and further replication efforts are needed to more fully interpret these findings.
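A hazard ratio and its 95% confidence interval together imply an approximate Wald z-score and two-sided p-value, which can then be compared against a genome-wide threshold of 5 × 10^-8. The sketch below performs that back-calculation under a normal approximation on the log hazard ratio scale. The function name is hypothetical, and because published estimates are rounded, the result only approximates the p-value from the original analysis.

```python
import math


def wald_from_hr_ci(hr, lower, upper, z_crit=1.959964):
    """Approximate Wald z and two-sided p from a hazard ratio and its 95% CI."""
    # The CI is symmetric on the log scale: log(upper) - log(lower) = 2 * z_crit * SE.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return z, p


# Rounded estimates for rs189655236 as reported above (HR = 2.14, 95% CI 1.65-2.77).
z, p = wald_from_hr_ci(2.14, 1.65, 2.77)
print(z, p, p < 5e-8)
```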
Associations between candidate germline genetic variants and treatment outcome of oxaliplatin, a drug commonly used for patients with colorectal cancer, have been reported but not robustly established. This study aimed to construct polygenic hazard scores (PHSs) as predictive markers for oxaliplatin treatment outcome by using a supervised principal component approach (PCA).
Genome-wide association analysis for overall survival, including interaction terms (SNP*treatment type), was carried out using two phase III trials, 3,098 resected stage III colon cancer (rCC) patients of NCCTG N0147 and 506 metastatic colorectal cancer (mCRC) patients of NCCTG N9741, separately. SNPs showing interaction with genome-wide significance (P < 5 × 10^-8) were selected for PCA to derive a PHS. PHS interaction with treatment was included in Cox regression models to predict outcome. Replication of prediction models was performed in an independent cohort, DACHS.
The two PHSs based on the first two principal components of selected SNPs (15 SNPs for rCC and 13 SNPs for mCRC) were used to construct interaction terms with treatment type and included in models adjusted for clinical covariables. However, in the DACHS study, the addition of the two PHS terms to clinical models did not improve the prediction error in either patients with rCC or mCRC. PHS interaction was also not replicated.
The PHSs derived using principal components efficiently combined multiple predictive SNPs for estimating likelihood of benefit from oxaliplatin versus other treatment but could not be replicated.
These results highlight the potential but also challenges in generating evidence for a predictive polygenic score for oxaliplatin efficacy.
We conducted the first large genome-wide association study to identify novel genetic variants that predict better (or poorer) prognosis in colorectal cancer patients receiving standard first-line oxaliplatin-based chemotherapy vs chemotherapy without oxaliplatin. We used data from two phase III trials, NCCTG N0147 and NCCTG N9741, and a population-based patient cohort, DACHS. Multivariable Cox proportional hazards models were employed, including an interaction term between each SNP and type of treatment for overall survival (OS) and progression-free survival. The analysis was performed for studies individually, and the results were combined using fixed-effect meta-analyses separately for resected stage III colon cancer (3098 patients from NCCTG N0147 and 549 patients from DACHS) and mCRC (505 patients from NCCTG N9741 and 437 patients from DACHS). We further performed gene-based analysis as well as in silico bioinformatics analysis for CRC-relevant functional genomic annotation of identified loci. In stage III colon cancer patients, a locus on chr22 (rs11912167) was associated with significantly poorer OS after oxaliplatin-based chemotherapy vs chemotherapy without oxaliplatin (P < 5 × 10). For mCRC patients, three loci on chr1 (rs1234556), chr12 (rs11052270) and chr15 (rs11858406) were found to be associated with differential OS (P < 5 × 10). The locus on chr1 located in the intronic region of RCSD1 was replicated in an independent cohort of 586 mCRC patients from CALGB/SWOG 80405 (P = .04). The GWA gene-based analysis yielded for RCSD1 the most significant association with differential OS in mCRC (P = 6.6 × 10). With further investigation into its biological mechanisms, this finding could potentially be used to individualize first-line treatment and improve clinical outcomes.
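The fixed-effect meta-analysis used to combine study-specific results can be sketched with inverse-variance weighting: each study's log hazard ratio for the SNP-by-treatment interaction is weighted by the reciprocal of its squared standard error. The function name and the two study estimates below are hypothetical and are not values from the trials or from DACHS.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted fixed-effect estimate, SE, z, and two-sided p."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p


# Hypothetical interaction log hazard ratios and standard errors from two studies.
beta, se, z, p = fixed_effect_meta([0.30, 0.10], [0.10, 0.20])
print(beta, se, z, p)
```

The more precise study dominates the pooled estimate, so here the result sits closer to 0.30 than to 0.10.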
Abstract
Background
The incidence of colorectal cancer (CRC) among individuals aged younger than 50 years has been increasing. As screening guidelines lower the recommended age of screening initiation, concerns including the burden on screening capacity and costs have been recognized, suggesting that an individualized approach may be warranted. We developed risk prediction models for early-onset CRC that incorporate an environmental risk score (ERS), including 16 lifestyle and environmental factors, and a polygenic risk score (PRS) of 141 variants.
Methods
Relying on risk score weights for ERS and PRS derived from studies of CRC at all ages, we evaluated risks for early-onset CRC in 3486 cases and 3890 controls aged younger than 50 years. Relative and absolute risks for early-onset CRC were assessed according to values of the ERS and PRS. The discriminatory performance of these scores was estimated using the covariate-adjusted area under the receiver operating characteristic curve.
Results
Increasing values of ERS and PRS were associated with increasing relative risks for early-onset CRC (odds ratio per SD of ERS = 1.14, 95% confidence interval [CI] = 1.08 to 1.20; odds ratio per SD of PRS = 1.59, 95% CI = 1.51 to 1.68), both contributing to case-control discrimination (area under the curve = 0.631, 95% CI = 0.615 to 0.647). Based on absolute risks, we can expect 26 excess cases per 10 000 men and 21 per 10 000 women among those scoring at the 90th percentile for both risk scores.
Conclusions
Personal risk scores have the potential to identify individuals at differential relative and absolute risk for early-onset CRC. Improved discrimination may aid in targeted CRC screening of younger, high-risk individuals, potentially improving outcomes.
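An odds ratio per standard deviation can be translated into an odds ratio for a given score percentile relative to the mean score. The sketch below assumes the score is normally distributed and that the log odds are linear in the score; both are assumptions of this illustration, and the function name is hypothetical. The per-SD odds ratio of 1.59 is the PRS estimate reported above.

```python
import math
from statistics import NormalDist


def or_vs_mean(or_per_sd, percentile):
    """Odds ratio at a score percentile relative to the mean score,
    assuming a normally distributed score and a log-linear effect."""
    z = NormalDist().inv_cdf(percentile)  # score in SD units at this percentile
    return math.exp(z * math.log(or_per_sd))


# PRS odds ratio per SD from the abstract, evaluated at the 90th percentile.
or_90 = or_vs_mean(1.59, 0.90)
print(round(or_90, 2))
```

Under these assumptions, a person at the 90th percentile of the PRS has roughly 1.8 times the odds of early-onset CRC of a person with a mean score.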
Colorectal cancer (CRC) is a heterogeneous disease with evidence of distinct tumor types that develop through different somatically altered pathways. To better understand the impact of the host genome on somatically mutated genes and pathways, we assessed associations of germline variations with somatic events via two complementary approaches. We first analyzed the association between individual germline genetic variants and the presence of non-silent somatic mutations in genes in 1375 CRC cases with genome-wide SNP data and a tumor sequencing panel targeting 205 genes. In the second analysis, we tested if germline variants located within previously identified regions of somatic allelic imbalance were associated with overall CRC risk using summary statistics from a recent large-scale GWAS (n ≃ 125k CRC cases and controls). The first analysis revealed that a variant (rs78963230) located within a CNA region associated with TLR3 was also associated with a non-silent mutation within gene FBXW7. In the second analysis, the variant rs2302274 located in CDX1/PDGFRB, a region frequently gained/lost in colorectal tumors, was associated with overall CRC risk (OR = 0.96, p = 7.50e-7). In summary, we demonstrate that an integrative analysis of somatic and germline variation can lead to new insights about CRC.