Somatic mutations in cancers affecting protein coding genes can give rise to potentially therapeutic neoepitopes. These neoepitopes can guide Adoptive Cell Therapies and Peptide- and RNA-based Neoepitope Vaccines to selectively target tumor cells using autologous patient cytotoxic T-cells. Currently, researchers have to independently align their data, call somatic mutations and haplotype the patient's HLA to use existing neoepitope prediction tools. We present ProTECT, a fully automated, reproducible, scalable, and efficient end-to-end analysis pipeline to identify and rank therapeutically relevant tumor neoepitopes in terms of potential immunogenicity starting directly from raw patient sequencing data, or from pre-processed data. The ProTECT pipeline encompasses alignment, HLA haplotyping, mutation calling (single nucleotide variants, short insertions and deletions, and gene fusions), peptide:MHC binding prediction, and ranking of final candidates. We demonstrate the scalability, efficiency, and utility of ProTECT on 326 samples from the TCGA Prostate Adenocarcinoma cohort, identifying recurrent potential neoepitopes from TMPRSS2-ERG fusions, and from SNVs in SPOP. We also compare ProTECT with results from published tools. ProTECT can be run on a standalone computer, a local cluster, or on a compute cloud using a Mesos backend. ProTECT is highly scalable and can process TCGA data in under 30 min per sample (on average) when run in large batches. ProTECT is freely available at https://www.github.com/BD2KGenomics/protect.
A newly developed transcription-mediated amplification assay was used to detect chikungunya virus infection in 3 of 557 asymptomatic donors (0.54%) from Puerto Rico during the 2014-2015 Caribbean epidemic. Viral detection was confirmed by using PCR, microarray, and next-generation sequencing. Molecular clock analysis dated the emergence of the Puerto Rico strains to early 2013.
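The reported detection rate follows directly from the counts given above (3 positives among 557 donors). The sketch below reproduces that percentage and, as an added assumption not taken from the study, attaches a 95% Wilson score interval to show how wide the uncertainty is with only three positives. The function name `detection_rate` is hypothetical.

```python
import math


def detection_rate(positives, tested, z=1.96):
    """Return (rate, lower, upper) using a Wilson score interval.

    The interval is an illustrative addition; the abstract reports
    only the point estimate.
    """
    p = positives / tested
    denom = 1 + z ** 2 / tested
    center = (p + z ** 2 / (2 * tested)) / denom
    half = z * math.sqrt(p * (1 - p) / tested + z ** 2 / (4 * tested ** 2)) / denom
    return p, center - half, center + half


# Counts taken from the abstract: 3 of 557 asymptomatic donors.
rate, lower, upper = detection_rate(3, 557)
print(f"{rate:.2%} (95% Wilson interval {lower:.2%} to {upper:.2%})")
```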
Predictive biomarkers of immune checkpoint inhibitor (ICI) efficacy are currently lacking for non-small cell lung cancer (NSCLC). Here, we describe the results from the Anti-PD-1 Response Prediction DREAM Challenge, a crowdsourced initiative that enabled the assessment of predictive models by using data from two randomized controlled clinical trials (RCTs) of ICIs in first-line metastatic NSCLC.
Participants developed and trained models using public resources. These were evaluated with data from the CheckMate 026 trial (NCT02041533), according to the model-to-data paradigm to maintain patient confidentiality. The generalizability of the models with the best predictive performance was assessed using data from the CheckMate 227 trial (NCT02477826). Both trials were phase III RCTs with a chemotherapy control arm, which supported the differentiation between predictive and prognostic models. Isolated model containers were evaluated using a bespoke strategy that considered the challenges of handling transcriptome data from clinical trials.
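One way to see why a chemotherapy control arm helps separate predictive from prognostic models is to score the same model output in each arm. A prognostic score orders patient outcomes in both arms, whereas a predictive score orders them mainly in the ICI arm. The sketch below uses Harrell's concordance index for this comparison. It is an illustration under stated assumptions, not the Challenge's actual scoring strategy, and the toy arrays are synthetic values invented for the example.

```python
def concordance_index(risk, time, event):
    """Harrell's C: share of comparable pairs in which the patient with the
    higher risk score has the earlier observed event. Ties in risk count 0.5."""
    concordant = 0.0
    comparable = 0
    for i in range(len(risk)):
        if not event[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(len(risk)):
            if time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable


# Synthetic toy data (hypothetical): the same risk scores in two arms.
risk = [0.9, 0.7, 0.5, 0.3, 0.1]

# ICI arm: higher risk always fails earlier; the third patient is censored.
c_ici = concordance_index(risk, time=[2, 4, 6, 8, 10], event=[1, 1, 0, 1, 1])

# Chemotherapy arm: the score carries no ordering information.
c_chemo = concordance_index(risk, time=[6, 2, 10, 8, 4], event=[1, 1, 1, 1, 1])

# A large gap between arms is what a predictive (not merely prognostic)
# score would look like under this simplified view.
print(c_ici, c_chemo, c_ici - c_chemo)
```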
A total of 59 teams participated, with 417 models submitted. Multiple predictive models, as opposed to a prognostic model, were generated for predicting overall survival, progression-free survival, and progressive disease status with ICIs. Variables within the models submitted by participants included tumor mutational burden (TMB), programmed death ligand 1 (PD-L1) expression, and gene-expression-based signatures. The best-performing models showed improved predictive power over reference variables, including TMB or PD-L1.
This DREAM Challenge is the first successful attempt to use protected phase III clinical data for a crowdsourced effort towards generating predictive models for ICI clinical outcomes and could serve as a blueprint for similar efforts in other tumor types and disease states, setting a benchmark for future studies aiming to identify biomarkers predictive of ICI efficacy.
CheckMate 026; NCT02041533, registered January 22, 2014. CheckMate 227; NCT02477826, registered June 23, 2015.
Histone post-translational modifications play vital roles in a variety of nuclear processes, including DNA repair. It has been previously shown that histone H3K79 methylation is important for the cellular response to DNA damage caused by ultraviolet (UV) radiation, with evidence that specific methylation states play distinct roles in UV repair. Here, we report that H3K79 methylation is reduced in response to UV exposure in
This reduction is specific to the dimethylated state, as trimethylation levels are minimally altered by UV exposure. Inhibition of this reduction has a deleterious effect on UV-induced sister chromatid exchange, suggesting that H3K79 dimethylation levels play a regulatory role in UV repair. Further evidence implicates an additional role for H3K79 dimethylation levels in error-free translesion synthesis, but not in UV-induced G1/S checkpoint activation or double-stranded break repair. Additionally, we find that H3K79 dimethylation levels are influenced by acetylatable lysines on the histone H4 N-terminal tail, which are hyperacetylated in response to UV exposure. Preclusion of H4 acetylation prevents UV-induced reduction of H3K79 dimethylation, and similarly has a negative effect on UV-induced sister chromatid exchange. These results point to the existence of a novel histone crosstalk pathway that is important for the regulation of UV-induced DNA damage repair.
Precision oncology has primarily relied on coding mutations as biomarkers of response to therapies. While transcriptome analysis can provide valuable information, incorporation into workflows has been difficult. For example, the relative rather than absolute gene expression level needs to be considered, requiring differential expression analysis across samples. However, expression programs related to the cell-of-origin and tumor microenvironment effects confound the search for cancer-specific expression changes. To address these challenges, we developed an unsupervised clustering approach for discovering differential pathway expression within cancer cohorts using gene expression measurements. The hydra approach uses a Dirichlet process mixture model to automatically detect multimodally distributed genes and expression signatures without the need for matched normal tissue. We demonstrate that the hydra approach is more sensitive than widely-used gene set enrichment approaches for detecting multimodal expression signatures. Application of the hydra analysis framework to small blue round cell tumors (including rhabdomyosarcoma, synovial sarcoma, neuroblastoma, Ewing sarcoma, and osteosarcoma) identified expression signatures associated with changes in the tumor microenvironment. The hydra approach also identified an association between ATRX deletions and elevated immune marker expression in high-risk neuroblastoma. Notably, hydra analysis of all small blue round cell tumors revealed similar subtypes, characterized by changes to infiltrating immune and stromal expression signatures.
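The idea of flagging a multimodally distributed gene can be illustrated with a much simpler stand-in for the Dirichlet process mixture model. The sketch below is not the hydra implementation: it swaps in a two-component Gaussian mixture with a shared variance, fitted by expectation-maximization, and compares it with a single Gaussian using the Bayesian information criterion (BIC). All function names and the synthetic expression values are assumptions made for the example.

```python
import math
from statistics import NormalDist


def log_normal(x, mean, var):
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def loglik_one(xs):
    """Log-likelihood of a single Gaussian fitted by maximum likelihood."""
    n = len(xs)
    mean = sum(xs) / n
    var = max(sum((x - mean) ** 2 for x in xs) / n, 1e-6)
    return sum(log_normal(x, mean, var) for x in xs)


def loglik_two(xs, iters=200):
    """EM for a two-component mixture with a shared variance.

    Returns the log-likelihood at the parameters entering the last iteration.
    Assumes at least two values. Sharing the variance avoids the degenerate
    solutions that free per-component variances allow.
    """
    xs = sorted(xs)
    n = len(xs)
    half = n // 2
    # Deterministic start: lower and upper halves of the sorted values.
    means = [sum(xs[:half]) / half, sum(xs[half:]) / (n - half)]
    var = max((sum((x - means[0]) ** 2 for x in xs[:half])
               + sum((x - means[1]) ** 2 for x in xs[half:])) / n, 1e-6)
    weights = [0.5, 0.5]
    loglik = 0.0
    for _ in range(iters):
        # E-step: responsibilities, computed in log space for stability.
        resp = []
        loglik = 0.0
        for x in xs:
            logs = [math.log(weights[k]) + log_normal(x, means[k], var)
                    for k in range(2)]
            top = max(logs)
            exps = [math.exp(v - top) for v in logs]
            total = sum(exps)
            loglik += top + math.log(total)
            resp.append([e / total for e in exps])
        # M-step: update weights, means, and the shared variance.
        sizes = [max(sum(r[k] for r in resp), 1e-9) for k in range(2)]
        means = [sum(r[k] * x for r, x in zip(resp, xs)) / sizes[k]
                 for k in range(2)]
        var = max(sum(r[k] * (x - means[k]) ** 2
                      for r, x in zip(resp, xs) for k in range(2)) / n, 1e-6)
        weights = [s / n for s in sizes]
    return loglik


def is_multimodal(xs):
    """Flag a gene when the two-component model has the lower BIC."""
    n = len(xs)
    bic_one = 2 * math.log(n) - 2 * loglik_one(xs)  # mean, variance
    bic_two = 4 * math.log(n) - 2 * loglik_two(xs)  # 2 means, variance, weight
    return bic_two < bic_one


def quantile_sample(mean, sd, n):
    """Deterministic synthetic 'expression' values from Gaussian quantiles."""
    dist = NormalDist(mean, sd)
    return [dist.inv_cdf((i + 0.5) / n) for i in range(n)]


unimodal_gene = quantile_sample(5.0, 1.0, 200)
bimodal_gene = quantile_sample(2.0, 0.5, 100) + quantile_sample(8.0, 0.5, 100)
print(is_multimodal(unimodal_gene), is_multimodal(bimodal_gene))
```

A Dirichlet process mixture differs from this sketch in that it does not fix the number of components in advance, which is what lets the approach described above detect multimodal genes automatically.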
Diffuse midline gliomas with histone H3 K27M (H3K27M) mutations occur in early childhood and are marked by an invasive phenotype and global decrease in H3K27me3, an epigenetic mark that regulates differentiation and development. H3K27M mutation timing and effect on early embryonic brain development are not fully characterized.
We analyzed multiple publicly available RNA sequencing datasets to identify differentially expressed genes between H3K27M and non-K27M pediatric gliomas. We found that genes involved in the epithelial-mesenchymal transition (EMT) were significantly overrepresented among differentially expressed genes. Overall, the expression of pre-EMT genes was increased in the H3K27M tumors as compared to non-K27M tumors, while the expression of post-EMT genes was decreased. We hypothesized that H3K27M may contribute to gliomagenesis by stalling an EMT required for early brain development, and evaluated this hypothesis by using another publicly available dataset of single-cell and bulk RNA sequencing data from developing cerebral organoids. This analysis revealed similarities between H3K27M tumors and pre-EMT normal brain cells. Finally, a previously published single-cell RNA sequencing dataset of H3K27M and non-K27M gliomas revealed subgroups of cells at different stages of EMT. In particular, H3.1K27M tumors resemble a later EMT stage compared to H3.3K27M tumors.
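A statement that a gene set is overrepresented among differentially expressed genes is commonly backed by a one-sided hypergeometric test. The sketch below shows that calculation. Whether the study used this exact test is an assumption, the function name is hypothetical, and the gene counts in the usage lines are invented for illustration rather than taken from the analysis.

```python
from math import comb


def overrepresentation_pvalue(n_universe, n_set, n_de, n_overlap):
    """P(X >= n_overlap) when n_de genes are drawn without replacement from
    n_universe genes, of which n_set belong to the gene set of interest."""
    upper = min(n_set, n_de)
    total = comb(n_universe, n_de)
    tail = sum(comb(n_set, k) * comb(n_universe - n_set, n_de - k)
               for k in range(n_overlap, upper + 1))
    return tail / total


# Hypothetical counts: 20,000 measured genes, a 200-gene EMT set,
# 500 differentially expressed genes, 20 of them in the EMT set.
# Chance alone would give about 5 overlapping genes.
p_value = overrepresentation_pvalue(20000, 200, 500, 20)
print(p_value)
```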
Our data analyses indicate that this mutation may be associated with a differentiation stall evident from the failure to proceed through the EMT-like developmental processes, and that H3K27M cells preferentially exist in a pre-EMT cell phenotype. This study demonstrates how novel biological insights could be derived from combined analysis of several previously published datasets, highlighting the importance of making genomic data available to the community in a timely manner.
Cancer cell lines have been widely used for decades to study biological processes driving cancer development, and to identify biomarkers of response to therapeutic agents. Advances in genomic sequencing have made possible large-scale genomic characterizations of collections of cancer cell lines and primary tumors, such as the Cancer Cell Line Encyclopedia (CCLE) and The Cancer Genome Atlas (TCGA). These studies allow for the first time a comprehensive evaluation of the comparability of cancer cell lines and primary tumors on the genomic and proteomic level. Here we employ bulk mRNA and micro-RNA sequencing data from thousands of samples in CCLE and TCGA, and proteomic data from partner studies in the MD Anderson Cell Line Project (MCLP) and The Cancer Proteome Atlas (TCPA), to characterize the extent to which cancer cell lines recapitulate tumors. We identify dysregulation of a long non-coding RNA and microRNA regulatory network in cancer cell lines, associated with differential expression between cell lines and primary tumors in four key cancer driver pathways: KRAS signaling, NFKB signaling, IL2/STAT5 signaling and TP53 signaling. Our results emphasize the necessity for careful interpretation of cancer cell line experiments, particularly with respect to therapeutic treatments targeting these important cancer pathways.
Accelerating cures for children with cancer remains an immediate challenge as a result of extensive oncogenic heterogeneity between and within histologies, distinct molecular mechanisms evolving between diagnosis and relapsed disease, and limited therapeutic options. To systematically prioritize and rationally test novel agents in preclinical murine models, researchers within the Pediatric Preclinical Testing Consortium are continuously developing patient-derived xenografts (PDXs), many of which are refractory to current standard-of-care treatments, from high-risk childhood cancers. Here, we genomically characterize 261 PDX models from 37 unique pediatric cancers; demonstrate faithful recapitulation of histologies and subtypes; and refine our understanding of relapsed disease. In addition, we use expression signatures to classify tumors for TP53 and NF1 pathway inactivation. We anticipate that these data will serve as a resource for pediatric oncology drug development and will guide rational clinical trial design for children with cancer.
• Multiplatform analysis facilitates genomic resource of 261 pediatric cancer PDX models
• PPTC PDX models are reflective of high-risk and chemotherapy resistant disease
• Inferred TP53 pathway inactivation correlates with pediatric cancer copy number burden
• Pediatric cancer PDX models will be useful for drug development prioritization
Rokita et al. provide an extensively annotated genomic dataset of somatic oncogenic regulation across 37 distinct pediatric malignancies. The 261 patient-derived xenograft models are available to the scientific community, and the genomic annotations will enable rational preclinical agent prioritization and acceleration of therapeutic targets for early-phase pediatric oncology clinical trials.
Abstract
Introduction: Leukemia is the most common cancer in children, accounting for approximately one third of all malignancies that occur in the pediatric age group. Acute Lymphoblastic Leukemia (ALL) and Acute Myeloid Leukemia (AML) account for most leukemia diagnosed in this age group. While known markers for poor prognosis include higher age, higher white blood cell count at diagnosis and certain translocations, innovative approaches in tumor RNA sequencing (RNA-Seq) data analysis can discover novel prognostic factors that could be exploited for future therapeutic development in fusion-negative ALL and AML.
Methods: To reveal gene expression signatures among fusion-negative leukemias, we used a novel unsupervised analysis model called Hydra. Hydra uses a Dirichlet process mixture model to detect multimodally expressed genes to use in characterizing clusters within cancer cohorts. This approach can detect subtle yet robust differences in gene expression without the reliance on reference normal RNA-Seq datasets. The Hydra model reveals clusters of the cancer cohort, and differences among these clusters can be investigated by finding enriched pathways via Gene Set Enrichment Analysis (GSEA). The cluster-specific enriched pathways can be used in conjunction with survival data to determine how certain pathways are associated with outcome. This analysis used publicly available data from the National Cancer Institute (NCI) Therapeutically Applicable Research to Generate Effective Treatments (TARGET) database that was uniformly processed by the Treehouse Childhood Cancer Initiative.
Results: First, 202 fusion-negative AML and fusion-negative B-cell precursor ALL samples were run through Hydra and five clusters were identified. These clusters had different enriched pathways, such as high mitochondrial activity, high cell proliferation, and high cell signaling. Though these are characteristics of all cancer cells, each cluster demonstrated that one pathway was most distinctive of those samples. Most clusters were differentiated by disease; however, one cluster with enriched heme metabolism and immunoglobulin pathways contained almost equal numbers of AML and ALL samples, suggesting that specific cohorts of AML and ALL patients had increased inflammatory response. Another cluster contained 72 AML samples and 4 ALL samples. The four ALL samples in this cluster showed lowered expression of CD19, a B-cell lineage immune marker, and elevated expression of CD14, a myeloid lineage immune marker. These ALL patients exhibited genomic characteristics of AML, which may suggest a more specialized treatment regimen.
Discussion: Despite extensive characterization of pediatric high-risk leukemias using genomic approaches, there is ample opportunity to study RNA-Seq-derived gene expression profiles to help accurately diagnose and treat pediatric patients.
Citation Format: Sneha S. Jariwala, Alfred Geoffrey Lyle, Jacob Pfeil, Lauren Sanders, Holly C. Beale, Ellen T. Kephart, Katrina Learned, Allison Cheney, Olena M. Vaske. Molecular classification of pediatric high-risk leukemias using expression profiles of multimodally expressed genes [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr 3033.