Integrative analysis of multi-omics data is a powerful approach for gaining functional insights into biological and medical processes. Conducting these multifaceted analyses on human samples is often complicated by the fact that the raw sequencing output is rarely available under open access. The Personal Genome Project UK (PGP-UK) is one of the few resources that recruits its participants under open consent and makes the resulting multi-omics data freely and openly available. As part of this resource, we describe the PGP-UK multi-omics reference panel consisting of ten genomic, methylomic and transcriptomic data. Specifically, we outline the data processing, quality control and validation procedures which were implemented to ensure data integrity and exclude sample mix-ups. In addition, we provide a REST API to facilitate the download of the entire PGP-UK dataset. The data are also available from two cloud-based environments, providing platforms for free integrated analysis. In conclusion, the genotype-validated PGP-UK multi-omics human reference panel described here provides a valuable new open access resource for integrated analyses in support of personal and medical genomics.
Background: A significant challenge within the field of personalized neoantigen therapies is the determination of which neoantigen targets will elicit durable, therapeutically relevant immune responses. T cell responses can be detected for circa 10–20% of neoepitopes selected for use in vaccines. Screening of memory responses in tumor-infiltrating T cells shows much lower rates of 1–2%. Of this small percentage of neoantigens capable of driving an immune response, only a subset will be resistant to methods of tumor immune evasion. Therefore, it is paramount that both these challenges are faced in order to obtain a durable clinical response. Across different types of neoantigens, the relationship between clonal neoantigens and response to immunotherapy has previously been demonstrated across multiple indications, supporting the key role of clonal neoantigens as substrate for T cell recognition of tumors.
Methods: Achilles Therapeutics aims to deliver precision immunotherapies specifically targeting clonal neoantigens identified through the Achilles Clonality Engine methodology within our PELEUS™ bioinformatics platform. The PELEUS™ platform incorporates a Bayesian approach allowing for the determination of the probability of each potential neoantigen being clonal. In addition to clonality, and to improve our ability to select for immunogenic neoantigens, we have developed an extensive pipeline for identification of tumor-derived memory T cell responses to clonal neoantigens.
Results: Through the use of data obtained by screening circa 10,000 neoantigens for T cell reactivity in expanded tumor-infiltrating lymphocytes, we developed and validated an AI method, NeoRanker, for predicting neoantigen immunogenicity. Using a small set of features incorporating genomic, transcriptomic and proteomic data for training purposes, NeoRanker is able to preferentially enrich our clonal neoantigen list for those capable of driving either CD8+ or CD4+ T cell responses. When benchmarked against well-known tools in the field including BigMHC and Prime, NeoRanker displayed the best performance as measured by the area under the receiver operating characteristic curve.
Conclusions: We believe this technology has broad applicability for optimising target selection across all types of personalized neoantigen vaccines and cell therapies.
Trial Registration: NCT03997474; NCT04032847; NCT03517917
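The benchmark metric named in the abstract above, the area under the receiver operating characteristic curve (AUC), can be illustrated with a minimal Python sketch. This is not the NeoRanker evaluation code; the function name `roc_auc` and the toy labels and scores are assumptions made purely for illustration. It uses the rank interpretation of the AUC: the probability that a randomly chosen immunogenic neoantigen is scored above a randomly chosen non-immunogenic one, with ties counted as one half.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC via pairwise comparison (Mann-Whitney U statistic).

    labels: iterable of 0/1 (1 = reactive / immunogenic in this toy setting)
    scores: iterable of predicted scores, higher = more likely positive
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    # Count positive-negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical scores for four candidate neoantigens.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
```

The pairwise form is quadratic in the number of candidates, which is acceptable for a sketch; a sort-based implementation would be preferable for screens of the size described above.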
The interplay between an evolving cancer and a dynamic immune microenvironment remains unclear. Here we analyse 258 regions from 88 early-stage, untreated non-small-cell lung cancers using RNA sequencing and histopathology-assessed tumour-infiltrating lymphocyte estimates. Immune infiltration varied both between and within tumours, with different mechanisms of neoantigen presentation dysfunction enriched in distinct immune microenvironments. Sparsely infiltrated tumours exhibited a waning of neoantigen editing during tumour evolution, indicative of historical immune editing, or copy-number loss of previously clonal neoantigens. Immune-infiltrated tumour regions exhibited ongoing immunoediting, with either loss of heterozygosity in human leukocyte antigens or depletion of expressed neoantigens. We identified promoter hypermethylation of genes that contain neoantigenic mutations as an epigenetic mechanism of immunoediting. Our results suggest that the immune microenvironment exerts a strong selection pressure in early-stage, untreated non-small-cell lung cancers that produces multiple routes to immune evasion, which are clinically relevant and forecast poor disease-free survival.
Abstract
The phenomenon of mixed/heterogeneous treatment responses to cancer therapies within an individual patient presents a challenging clinical scenario. Furthermore, the molecular basis of mixed intra-patient tumor responses remains unclear. Here, we show that patients with metastatic lung adenocarcinoma harbouring co-mutations of EGFR and TP53 are more likely to have mixed intra-patient tumor responses to EGFR tyrosine kinase inhibition (TKI), compared to those with an EGFR mutation alone. The combined presence of whole genome doubling (WGD) and TP53 co-mutations leads to increased genome instability and genomic copy number aberrations in genes implicated in EGFR TKI resistance. Using mouse models and an in vitro isogenic p53-mutant model system, we provide evidence that WGD provides diverse routes to drug resistance by increasing the probability of acquiring copy-number gains or losses relative to non-WGD cells. These data provide a molecular basis for mixed tumor responses to targeted therapy, within an individual patient, with implications for therapeutic strategies.
DNA methylation has long been known to play a role in tumourigenesis. To this day, interpretation of bulk tumour bisulphite sequencing data has been hampered by normal contamination levels and tumour copy number. To address this issue, we develop two computational tools: (1) ASCAT.m, which allows Allele-Specific Copy number Analysis of Tumour methylation data directly from bulk tumour reduced representation bisulphite sequencing (RRBS) data and (2) CAMDAC, a method for Copy Number-Aware Methylation Deconvolution Analysis of Cancer, from bulk tumour and adjacent normal RRBS data. We describe a set of rules to compute allelic imbalance independently of bisulphite conversion and correct normalised read coverage estimates for sequencing biases. We apply ASCAT.m to non-small cell lung cancers from the epiTRACERx study with multi-region bulk tumour RRBS and adjacent normal. ASCAT.m genotypes, allele-specific copy numbers and tumour purity and ploidy estimates are in excellent agreement with those obtained from matched whole-exome and a subset of whole-genome sequencing of the same samples. We observe a correlation between whole-genome doubling and relapse-free survival in lung squamous cell carcinoma but not in adenocarcinoma. We see widespread genomic instability across both histological subtypes. We model bulk tumour methylation rates as a mixture of tumour and normal signals weighted for tumour purity and copy number and formalise this relationship into CAMDAC equations. The errors between predicted and observed methylation rates were low. Normal infiltrates purified from the bulk tumour by fluorescence-activated cell sorting (FACS) were similar in composition to the adjacent matched normal lung, suggesting the latter is a suitable proxy for deconvolution. Single nucleotide variant (SNV)- and FACS-purified tumour methylation rates are in good agreement with CAMDAC deconvoluted estimates. 
Purification successfully removes shared normal signal, decreasing correlations between patients and with normal tissue. Samples with shared ancestry remain highly correlated. Purified methylation rates yield accurate tumour-normal and tumour-tumour differential methylation calls independent of tumour purity and copy number. We find hundreds of ubiquitously early clonal gene promoter epimutations across the epiTRACERx cohort, showcasing the potential of DNA methylation markers for early cancer detection. CAMDAC purified profiles reveal both phylogenetic and inter-tumour relationships and provide insight into tumour evolutionary history. Quantifying allele-specific methylation on chromosome X in females, we uncover extraction biases against the Barr body. X inactivation is random at the scale of our normal lung cancer samples. Phasing of methylation rates with polymorphisms confirms the presence of allele-specific methylation at the H19/IGF2 locus. Loss of imprinting is observed in 5 tumours, all involving demethylation of the maternal allele. We attempt to quantify the ratio of clonal allele-specific to bi-allelic epimutations in tumours in regions of 1+1, which we define as regulatory and stochastic methylation changes, respectively. Utilising this ratio, we try to extract the number of stochastic epimutations in regions of 2+0 with copy numbers 1 and 2 and time those copy number gains. We find that SNVs at gene promoters often lead to hypermethylation of neighbouring CpGs on the same copy or allele, suggesting the ablation of a transcription factor binding site. Non-expressed neo-antigens are enriched for promoter hypermethylation, indicating that methylation plays a role in immune escape. To conclude, CAMDAC purified methylation rates are key to unlocking insights into comparative cancer epigenomics and intra-tumour epigenetic heterogeneity.
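The mixture model described in the abstract above (bulk methylation as a blend of tumour and normal signal weighted by purity and copy number) can be sketched in Python. This is a minimal illustration under stated assumptions, not the CAMDAC implementation: the function names, the per-CpG scalar inputs and the default normal copy number of 2 are all hypothetical, and the weighting shown is one natural reading of "weighted for tumour purity and copy number".

```python
def bulk_methylation(m_tumour, m_normal, purity, cn_tumour, cn_normal=2):
    """Forward model: expected bulk methylation rate at one CpG.

    Each cell population contributes DNA copies in proportion to its
    cell fraction times its total copy number at the locus (assumption).
    """
    w_t = purity * cn_tumour           # tumour DNA copies per average cell
    w_n = (1.0 - purity) * cn_normal   # normal DNA copies per average cell
    return (w_t * m_tumour + w_n * m_normal) / (w_t + w_n)


def purified_tumour_methylation(m_bulk, m_normal, purity, cn_tumour, cn_normal=2):
    """Invert the forward model to estimate the tumour-only methylation rate."""
    w_t = purity * cn_tumour
    w_n = (1.0 - purity) * cn_normal
    if w_t <= 0:
        raise ValueError("no tumour DNA at this locus; rate is not identifiable")
    m_t = (m_bulk * (w_t + w_n) - w_n * m_normal) / w_t
    # Sampling noise can push the estimate outside [0, 1]; clip it.
    return min(1.0, max(0.0, m_t))


# Toy locus: purity 0.5, tumour copy number 4, diploid normal.
toy_bulk = bulk_methylation(0.8, 0.2, purity=0.5, cn_tumour=4)
toy_purified = purified_tumour_methylation(toy_bulk, 0.2, purity=0.5, cn_tumour=4)
```

In this toy locus the bulk rate is 0.6 even though the tumour rate is 0.8, which shows why uncorrected bulk rates understate tumour-specific hypermethylation when purity is low.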
B cells are frequently found in the margins of solid tumours as organized follicles in ectopic lymphoid organs called tertiary lymphoid structures (TLS). Although TLS have been found to correlate with improved patient survival and response to immune checkpoint blockade (ICB), the underlying mechanisms of this association remain elusive. Here we investigate lung-resident B cell responses in patients from the TRACERx 421 (Tracking Non-Small-Cell Lung Cancer Evolution Through Therapy) and other lung cancer cohorts, and in a recently established immunogenic mouse model for lung adenocarcinoma. We find that both human and mouse lung adenocarcinomas elicit local germinal centre responses and tumour-binding antibodies, and further identify endogenous retrovirus (ERV) envelope glycoproteins as a dominant anti-tumour antibody target. ERV-targeting B cell responses are amplified by ICB in both humans and mice, and by targeted inhibition of KRAS(G12C) in the mouse model. ERV-reactive antibodies exert anti-tumour activity that extends survival in the mouse model, and ERV expression predicts the outcome of ICB in human lung adenocarcinoma. Finally, we find that effective immunotherapy in the mouse model requires CXCL13-dependent TLS formation. Conversely, therapeutic CXCL13 treatment potentiates anti-tumour immunity and synergizes with ICB. Our findings provide a possible mechanistic basis for the association of TLS with immunotherapy response.
Abstract
Introduction: Analyses of bulk RNA sequencing data are central to most large-scale tumor sequencing studies. Bulk expression data represent population averages and their interpretation is confounded by both normal cell contamination and somatic copy number alterations. Several computational methods to deconvolve tumor and normal expression profiles from bulk RNA-seq have been developed recently. However, these methods often rely on a set of cell-type-specific reference signatures and ignore the effect of copy number changes. Methods: To address these issues, we have developed a method that formalizes the relationship between allele-specific copy number, expression and sample purity to deconvolve the expression profiles of tumor and normal cells from bulk RNA-seq in an unbiased manner. Our method was applied to sequencing data produced by the TRACERx consortium, a longitudinal study with multi-region whole-exome and RNA-seq of non-small-cell lung cancers. A total of 414 primary tumor regions and 140 adjacent normal tissue samples from 140 patients with matched DNA and RNA-seq data were processed. Results: We were able to directly deconvolve a median of ~2,000 genes per sample and indirectly infer tumor and normal expression profiles of ~10,000 genes using a scaled and weighted mean approach. The accuracy of the deconvolution was validated using (1) in-silico mixtures of patient-derived tumor and normal cells, (2) pseudo-bulk scRNAseq and (3) regions with loss of heterozygosity (LOH) directly on the bulk sequencing data, where the total fraction of expression attributed to tumor cells can be computed directly using somatic mutations. CREDAC allows for single sample bulk-level tumor-normal differential expression analysis and revealed a strong and constitutive genome-wide overexpression in cancer compared to admixed normal cells, with a more pronounced overexpression in lung squamous cell carcinoma than lung adenocarcinoma (p<0.001). 
CREDAC adjusts for variations in copy number, thus facilitating the investigation of cancer-specific dosage compensatory responses triggered by aneuploidy. We revealed that a majority of genes (~65%) exhibit proportionality to copy number alterations, while a substantial proportion (~35%) shows anti-scaling or dosage compensation strategies. We show an enrichment of genes involved in cell cycle and expression processes within the dosage compensated group and revealed methylation as a key mechanism driving this compensatory process. Conclusion: Overall, our results suggest that CREDAC is able to accurately disentangle the expression of tumor and normal cells from bulk RNA-seq without any previous knowledge. It has potential applications in many studies that include matched RNA-seq and copy number data and can provide new insights into functional characterization, the taxonomy of cancer, and tumor evolution. Citation Format: Carla Castignani, Jonas Demeulemeester, Oriol Pich, Tom Lesluyes, Robert E. Hynds, David R. Pearce, Elizabeth Larose Cadieux, Stefan C. Dentro, TRACERx Consortium, Nnenna Kanu, Charles Swanton, Peter Van Loo. CREDAC: Copy number-based reference-free expression deconvolution analysis of cancers [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl):Abstract nr 2268.
Abstract
Introduction: Analyses of bulk RNA sequencing (RNA-seq) data are central to most large-scale tumour sequencing studies. Solid tumors often present a dynamic, heterogeneous environment consisting of both cancer (sub)clones and normal cells. Bulk expression data represent population averages and their interpretation is confounded by both normal cell contamination and somatic copy number alterations. Several computational methods to deconvolve tumour and normal expression profiles from bulk RNA-seq have been developed recently. However, these methods often rely on a set of cell-type-specific reference signatures and ignore the effect of copy number changes.
Methods: To address these issues, we have developed a method that formalizes the relationship between allele-specific copy number, expression and sample purity to deconvolve the expression profiles of tumor and normal cells from bulk RNA-seq data in an unbiased manner. Our method was applied to sequencing data produced by the TRACERx consortium, a longitudinal study with multi-region whole-exome and RNA-seq of non-small-cell lung cancers. A total of 414 primary tumor regions and 140 adjacent normal tissue samples from 140 TRACERx patients with matched DNA and RNA sequencing data were processed.
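To make the idea in the Methods paragraph concrete, the following Python sketch shows how allele-specific copy number and purity can, in principle, separate tumour from normal expression at a single heterozygous site. It is a toy model and not the authors' published equations: it assumes that every DNA copy of the gene is expressed at the same per-copy level within a cell type (no allele-specific expression), that normal cells carry one copy of each allele, and that the tumour copy numbers of the two alleles differ. All function and parameter names are hypothetical.

```python
def deconvolve_expression(bulk_a, bulk_b, purity, cn_a, cn_b):
    """Solve a 2x2 linear system for per-copy tumour and normal expression.

    Toy model (assumption):
        bulk_a = purity * cn_a * t + (1 - purity) * n
        bulk_b = purity * cn_b * t + (1 - purity) * n
    where t and n are per-copy expression in tumour and normal cells.
    """
    if not 0.0 < purity < 1.0:
        raise ValueError("purity must lie strictly between 0 and 1")
    if cn_a == cn_b:
        raise ValueError("balanced copy number: the system is not identifiable")
    # Subtracting the two equations cancels the normal contribution.
    t = (bulk_a - bulk_b) / (purity * (cn_a - cn_b))
    # Back-substitute to recover the normal per-copy expression.
    n = (bulk_a - purity * cn_a * t) / (1.0 - purity)
    return t, n


# Toy gene: purity 0.6, tumour alleles at 2 and 1 copies,
# true per-copy expression 10 (tumour) and 5 (normal).
toy_bulk_a = 0.6 * 2 * 10 + 0.4 * 5   # allele A signal in bulk
toy_bulk_b = 0.6 * 1 * 10 + 0.4 * 5   # allele B signal in bulk
toy_t, toy_n = deconvolve_expression(toy_bulk_a, toy_bulk_b, 0.6, 2, 1)
```

The identifiability check reflects why allelic imbalance matters here: with balanced copy numbers the two equations coincide, so in this toy setting only genes in regions of allelic imbalance could be deconvolved directly.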
Results: Here, we were able to directly deconvolve a median of ~2,000 genes per sample and indirectly infer tumor and normal expression profiles of ~10,000 genes. The accuracy of the deconvolution was validated using in-silico mixtures of patient-derived tumour and normal cells and in regions with loss of heterozygosity (LOH) directly on the bulk sequencing data, where the total fraction of expression attributed to tumor cells can be computed directly using somatic mutations. Our method revealed a strong and constitutive genome-wide overexpression in cancer cells compared to admixed normal cells; this overexpression was more pronounced in lung squamous cell carcinoma than lung adenocarcinoma (p<0.001). Multidimensional projection of the purified tumor and normal expression profiles together with the adjacent normal tissue showed clear separation between the purified expression profiles and, notably, the deconvolved normal expression was more similar to the normal adjacent profiles than the purified tumor profiles.
Conclusion: Overall, these results suggest that our method is able to accurately disentangle the expression of tumour and normal cells from bulk RNA-seq without any previous knowledge. It has potential applications in many studies that include matched RNA-seq and copy number data and can provide new insights into functional characterization, the taxonomy of cancer, and tumor evolution.
Citation Format: Carla Castignani, Jonas Demeulemeester, Elizabeth Larose Cadieux, Robert E. Hynds, David R. Pearce, Stefan C. Dentro, Peter Van Loo, Charles Swanton, TRACERx Consortium. Allele-specific copy-number based deconvolution of bulk tumour RNA sequencing data from the TRACERx study [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2022; 2022 Apr 8-13. Philadelphia (PA): AACR; Cancer Res 2022;82(12_Suppl):Abstract nr 1211.
Abstract
Introduction: Lung TRACERx is a prominent study employing multi-region and longitudinal multi-omics sequencing to unravel the evolutionary trajectories of lung cancer. Aberrant DNA methylation patterns have been widely described in nearly all human cancers, yet their interplay with DNA mutations in lung cancer is not well understood. Incorporating the contribution of epigenetic modifications to cancer evolution trajectories within TRACERx could improve our understanding of the intricate relationship between genetic and epigenetic changes in non-small cell lung cancer (NSCLC) evolution.
Methods: Multi-region sampling from 38 TRACERx patients including 112 tumor regions and 37 matched normal adjacent tissue samples was performed. Reduced representation bisulfite sequencing (RRBS) was performed to assess DNA methylation and CAMDAC (Larose-Cadieux et al, 2020) was applied to estimate purified tumor methylation rates and correct for copy number changes. Whole exome sequencing was performed, somatic copy number alterations (SCNAs) were inferred using the ASCAT tool (Van Loo et al, 2010), and Methsig (Pan et al, 2021) was applied to discover new methylation driver genes.
Results: Using multi-region sequencing, we identified ubiquitous hypermethylation of 29 known driver genes in both lung adenocarcinoma (LUAD) and squamous cell lung cancer (LUSC), together with an additional 9 and 27 genes exclusive to LUSC and LUAD, respectively. We also identified 13 and 7 driver genes non-ubiquitously hypermethylated exclusively in LUSC and LUAD, respectively. Using a differential methylation based approach, we describe a method to determine the extent of intra-tumor methylation heterogeneity akin to established ITH scores based on genomics data. In addition, we report the identification of novel subtype-specific methylation driver genes enriched in HOX family members which are related to cancer progression. Through integration of DNA methylation and genomic sequencing data, we identify parallel mechanisms contributing towards ubiquitous tumor suppressor gene alterations. At the patient level, multiple driver genes such as NSD1, GATA3 and MGA were subject to repression by both copy number loss and DNA hypermethylation. Finally, we describe dosage-compensation of genes such as the Notch ligands JAG2 and DLK1 that are proximal to amplified oncogenes and hypermethylated during tumor evolution.
Conclusion: We describe the contribution of DNA methylation and genomic alterations to altering the landscape of NSCLC. Leveraging DNA methylation, we can determine the extent of convergent repression mechanisms in different regions of the same tumor, assess DNA methylation heterogeneity, and discover DNA methylation-based driver genes in NSCLC.
Citation Format: Francisco Gimeno-Valiente, Carla Castignani, Elizabeth Larose-Cadieux, Kezhong Chen, Nana Mensah, Olga Chervova, Thomas Watkins, Pawan Dhami, Heli Vaikkinen, Andrew Feber, TRACERx Consortium, Jonas Demeulemeester, Miljana Tanic, Stephan Beck, Peter Van Loo, Charles Swanton, Nnennaya Kanu. Identification of convergent gene repression mechanisms through integrative genomic and DNA methylation analysis in TRACERx [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2022; 2022 Apr 8-13. Philadelphia (PA): AACR; Cancer Res 2022;82(12_Suppl):Abstract nr 5710.
Abstract
Background: Genomic intra-tumor heterogeneity (ITH) drives tumor evolution, leading to immune evasion and resistance to therapy. Emerging evidence implicates the transcriptome as a source of important variation that impacts tumor phenotype. Here, we perform a genomic-transcriptomic analysis of intra-tumor transcriptomic diversity in 941 tumor regions taken from 357 TRACERx non-small cell lung cancers (NSCLC) across primary and metastatic disease subjected to high-depth bulk DNA and RNA sequencing, as well as 91 tumor-adjacent normal tissue samples.
Results: Genomic and transcriptomic diversity are linked across primary and metastatic disease, with expression signatures of proliferation being enriched in the metastasis seeding subclone of the primary tumor relative to non-metastasis seeding subclones. Copy-number independent allele-specific expression (CN-independent ASE), a source of transcriptome-specific diversity, affects 1% (± 0.5%) of genes and is underpinned by aberrant allele-specific methylation (OR=7.58, p≤2.2×10⁻¹⁶), thus providing a window to the NSCLC epigenomic landscape. Driver mutations in chromatin remodellers and histone modifiers, in particular SETD2 and KDM5C, are associated with increased global levels of CN-independent ASE (p=0.0001). In genomically stable tumors, high levels of CN-independent ASE are linked to expression signatures consistent with genomic instability and proliferation (R=0.58, p=0.001), highlighting convergence between the genome and transcriptome in tumor evolution. For the first time, we uncover mutational signatures of RNA editing. Analysis of their activity links the expression of ADAR and APOBEC enzymes to editing processes, revealing otherwise hidden APOBEC activity within tumors at sampling (RNA APOBEC activity identified in 188 tumor regions (32%) without evidence of DNA APOBEC activity). RNA editing activity is shared between primary tumors and paired metastases, but not paired tumor-adjacent normal tissue, suggestive of heritability of this somatic transcriptional process. Finally, we combine multiple measures of genomic and transcriptomic variation in a multi-region approach to define important variation within cancer genes. We illustrate examples that would be missed with a purely genomic focus and demonstrate genomic-transcriptomic parallel evolution, converging on disruption to single cancer genes, such as FAT1 and APC, in different regions of a tumor.
Conclusions: This work highlights the importance of the transcriptome during tumor evolution, as well as the power of integrative multi-omic assessments of ITH, and provides novel insight into the role of transcriptomic variation in lung cancer.
Citation Format: James R. M. Black, Carlos Martinez-Ruiz, Clare Puttick, Jonas Demeulemeester, Elizabeth Larose Cadieux, Kerstin Thol, Thomas P. Jones, Selvaraju Veeriah, Cristina Naceur-Lombardelli, Andrew Rowan, Sophia Ward, Michelle Dietzen, Ariana Huebner, Maise Al Bakir, Miljana Tanic, Thomas B. Watkins, Emilia L. Lim, Ali M. Al Rashed, Daniel E. Cook, Rachel Rosenthal, Gareth Wilson, Alexander M. Frankell, Nnennaya Kanu, Kevin Litchfield, Nicolai J. Birkbak, Allan Hackshaw, Stephan Beck, Peter Van Loo, Mariam Jamal-Hanjani, the lung TRACERx Consortium, Charles Swanton, Nicholas McGranahan. Genomic transcriptomic evolution in TRACERx lung cancer and metastasis [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2022; 2022 Apr 8-13. Philadelphia (PA): AACR; Cancer Res 2022;82(12_Suppl):Abstract nr 1603.