Long noncoding RNAs (lncRNAs) are emerging as important regulators of tissue physiology and disease processes including cancer. To delineate genome-wide lncRNA expression, we curated 7,256 RNA sequencing (RNA-seq) libraries from tumors, normal tissues and cell lines comprising over 43 Tb of sequence from 25 independent studies. We applied ab initio assembly methodology to this data set, yielding a consensus human transcriptome of 91,013 expressed genes. Over 68% (58,648) of genes were classified as lncRNAs, of which 79% were previously unannotated. About 1% (597) of the lncRNAs harbored ultraconserved elements, and 7% (3,900) overlapped disease-associated SNPs. To prioritize lineage-specific, disease-associated lncRNA expression, we employed non-parametric differential expression testing and nominated 7,942 lineage- or cancer-associated lncRNA genes. The lncRNA landscape characterized here may shed light on normal biology and cancer pathogenesis and may be valuable for future biomarker development.
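The abstract above mentions non-parametric differential expression testing without naming the statistic. As a rough illustration only, the sketch below uses a two-sided Mann-Whitney U test (normal approximation, no tie correction of the variance) as a stand-in; this choice, the function name `mann_whitney_u`, and the expression values are assumptions for illustration and are not taken from the study.

```python
from math import erfc, sqrt


def mann_whitney_u(group_a, group_b):
    """Two-sided Mann-Whitney U test via the normal approximation.

    Illustrative stand-in for a non-parametric differential expression
    test; not the method used in the study. Returns (U for group_a, p).
    """
    # Pool the values, remembering which group each came from.
    pooled = sorted([(v, 0) for v in group_a] + [(v, 1) for v in group_b])

    # Assign 1-based ranks, giving tied values their average rank.
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2

    # Normal approximation to the null distribution of U.
    mean_u = n_a * n_b / 2
    sd_u = sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u_a - mean_u) / sd_u
    p_value = erfc(abs(z) / sqrt(2))
    return u_a, p_value


# Hypothetical expression values for one lncRNA in two sample groups.
tumor = [8.1, 9.4, 7.7, 10.2, 8.8]
normal = [0.2, 0.5, 0.1, 0.9, 0.3]
u_stat, p = mann_whitney_u(tumor, normal)
print(u_stat, p)
```

A rank-based test of this kind makes no assumption about the distribution of expression values, which is the general appeal of non-parametric testing on heterogeneous RNA-seq cohorts. In practice a library routine with exact p-values and tie handling would be preferable to this hand-rolled version.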
Pseudogene transcripts can provide a novel tier of gene regulation through generation of endogenous siRNAs or miRNA-binding sites. Characterization of pseudogene expression, however, has remained confined to anecdotal observations due to analytical challenges posed by the extremely close sequence similarity with their counterpart coding genes. Here, we describe a systematic analysis of pseudogene “transcription” from an RNA-Seq resource of 293 samples, representing 13 cancer and normal tissue types, and observe a surprisingly prevalent, genome-wide expression of pseudogenes that could be categorized as ubiquitously expressed or lineage and/or cancer specific. Further, we explore disease subtype specificity and functions of selected expressed pseudogenes. Taken together, we provide evidence that transcribed pseudogenes are a significant contributor to the transcriptional landscape of cells and are positioned to play significant roles in cellular differentiation and cancer progression, especially in light of the recently described ceRNA networks. Our work provides a transcriptome resource that enables high-throughput analyses of pseudogene expression.
► Large-scale transcriptome analysis examines the expression of human pseudogenes
► Pseudogene expression can be ubiquitous or lineage or cancer specific
► ATP8A2Ψ and CXADRΨ show specific breast and prostate cancer expression, respectively
► The study provides a framework for analyzing pseudogene expression from RNA-Seq data
A large-scale analysis of human pseudogenes from 13 tumor and normal tissue types shows tissue- and cancer-specific expression patterns. Examples from prostate and breast cancer illustrate functional roles for a class of genes often thought of as “junk DNA.”
Using integrative genomic analysis of 360 metastatic castration-resistant prostate cancer (mCRPC) samples, we identified a novel subtype of prostate cancer typified by biallelic loss of CDK12 that is mutually exclusive with tumors driven by DNA repair deficiency, ETS fusions, and SPOP mutations. CDK12 loss is enriched in mCRPC relative to clinically localized disease and characterized by focal tandem duplications (FTDs) that lead to increased gene fusions and marked differential gene expression. FTDs associated with CDK12 loss result in highly recurrent gains at loci of genes involved in the cell cycle and DNA replication. CDK12 mutant cases are baseline diploid and do not exhibit DNA mutational signatures linked to defects in homologous recombination. CDK12 mutant cases are associated with elevated neoantigen burden ensuing from fusion-induced chimeric open reading frames and increased tumor T cell infiltration/clonal expansion. CDK12 inactivation thereby defines a distinct class of mCRPC that may benefit from immune checkpoint immunotherapy.
•CDK12 biallelic inactivating mutations define a distinct subtype of prostate cancer
•CDK12 loss is associated with genomic instability and focal tandem duplications
•CDK12 loss leads to increased gene fusions, neoantigen burden, and T cell infiltration
•Patients with CDK12 mutant tumors may benefit from immune checkpoint inhibition
Loss of both alleles of the CDK12 gene defines a molecular subtype of metastatic castration-resistant prostate cancer that is potentially targetable with immune checkpoint inhibitors.
Forkhead box A1 (FOXA1) is a pioneer transcription factor that is essential for the normal development of several endoderm-derived organs, including the prostate gland. FOXA1 is frequently mutated in hormone-receptor-driven prostate, breast, bladder and salivary-gland tumours. However, it is unclear how FOXA1 alterations affect the development of cancer, and FOXA1 has previously been ascribed both tumour-suppressive and oncogenic roles. Here we assemble an aggregate cohort of 1,546 prostate cancers and show that FOXA1 alterations fall into three structural classes that diverge in clinical incidence and genetic co-alteration profiles, with a collective prevalence of 35%. Class-1 activating mutations originate in early prostate cancer without alterations in ETS or SPOP, selectively recur within the wing-2 region of the DNA-binding forkhead domain, enable enhanced chromatin mobility and binding frequency, and strongly transactivate a luminal androgen-receptor program of prostate oncogenesis. By contrast, class-2 activating mutations are acquired in metastatic prostate cancers, truncate the C-terminal domain of FOXA1, enable dominant chromatin binding by increasing DNA affinity and, through TLE3 inactivation, promote metastasis driven by the WNT pathway. Finally, class-3 genomic rearrangements are enriched in metastatic prostate cancers, consist of duplications and translocations within the FOXA1 locus, and structurally reposition a conserved regulatory element, herein denoted FOXA1 mastermind (FOXMIND), to drive overexpression of FOXA1 or other oncogenes. Our study reaffirms the central role of FOXA1 in mediating oncogenesis driven by the androgen receptor, and provides mechanistic insights into how the classes of FOXA1 alteration promote the initiation and/or metastatic progression of prostate cancer. These results have direct implications for understanding the pathobiology of other hormone-receptor-driven cancers and rationalize the co-targeting of FOXA1 activity in therapeutic strategies.
Circular RNAs (circRNAs) are an intriguing class of RNA due to their covalently closed structure, high stability, and implicated roles in gene regulation. Here, we used an exome capture RNA sequencing protocol to detect and characterize circRNAs across >2,000 cancer samples. When compared against Ribo-Zero and RNase R, capture sequencing significantly enhanced the enrichment of circRNAs and preserved accurate circular-to-linear ratios. Using capture sequencing, we built the most comprehensive catalog of circRNA species to date: MiOncoCirc, the first database to be composed primarily of circRNAs directly detected in tumor tissues. Using MiOncoCirc, we identified candidate circRNAs to serve as biomarkers for prostate cancer and were able to detect circRNAs in urine. We further detected a novel class of circular transcripts, termed read-through circRNAs, that involved exons originating from different genes. MiOncoCirc will serve as a valuable resource for the development of circRNAs as diagnostic or therapeutic targets across cancer types.
•Use of exome capture transcriptome sequencing to compile a cancer circRNA landscape
•MiOncoCirc is the most comprehensive catalog of cancer-based circRNA species
•MiOncoCirc contains circRNA from cancer cell lines as well as tumor samples
•Novel biomarkers can be nominated through MiOncoCirc
MiOncoCirc provides a reference of the circular RNA landscape across 40 cancer types.
Characterization of the prostate cancer transcriptome and genome has identified chromosomal rearrangements and copy number gains and losses, including ETS gene family fusions, PTEN loss and androgen receptor (AR) amplification, which drive prostate cancer development and progression to lethal, metastatic castration-resistant prostate cancer (CRPC). However, less is known about the role of mutations. Here we sequenced the exomes of 50 lethal, heavily pre-treated metastatic CRPCs obtained at rapid autopsy (including three different foci from the same patient) and 11 treatment-naive, high-grade localized prostate cancers. We identified low overall mutation rates even in heavily treated CRPCs (2.00 per megabase) and confirmed the monoclonal origin of lethal CRPC. Integrating exome copy number analysis identified disruptions of CHD1 that define a subtype of ETS gene family fusion-negative prostate cancer. Similarly, we demonstrate that ETS2, which is deleted in approximately one-third of CRPCs (commonly through TMPRSS2:ERG fusions), is also deregulated through mutation. Furthermore, we identified recurrent mutations in multiple chromatin- and histone-modifying genes, including MLL2 (mutated in 8.6% of prostate cancers), and demonstrate interaction of the MLL complex with the AR, which is required for AR-mediated signalling. We also identified novel recurrent mutations in the AR collaborating factor FOXA1, which is mutated in 5 of 147 (3.4%) prostate cancers (both untreated localized prostate cancer and CRPC), and showed that mutated FOXA1 represses androgen signalling and increases tumour growth. Proteins that physically interact with the AR, such as the ERG gene fusion product, FOXA1, MLL2, UTX (also known as KDM6A) and ASXL1 were found to be mutated in CRPC. In summary, we describe the mutational landscape of a heavily treated metastatic cancer, identify novel mechanisms of AR signalling deregulated in prostate cancer, and prioritize candidates for future study.
A 44-year-old woman with recurrent solitary fibrous tumor (SFT)/hemangiopericytoma was enrolled in a clinical sequencing program including whole-exome and transcriptome sequencing. A gene fusion of the transcriptional repressor NAB2 with the transcriptional activator STAT6 was detected. Transcriptome sequencing of 27 additional SFTs identified the presence of a NAB2-STAT6 gene fusion in all tumors. Using RT-PCR and sequencing, we detected this fusion in all 51 SFTs, indicating high levels of recurrence. Expression of NAB2-STAT6 fusion proteins was confirmed in SFT, and the predicted fusion products harbor the early growth response (EGR)-binding domain of NAB2 fused to the activation domain of STAT6. Overexpression of the NAB2-STAT6 gene fusion induced proliferation in cultured cells and activated the expression of EGR-responsive genes. These studies establish NAB2-STAT6 as the defining driver mutation of SFT and provide an example of how neoplasia can be initiated by converting a transcriptional repressor of mitogenic pathways into a transcriptional activator.
Breast cancer is the most prevalent cancer in women, and over two-thirds of cases express estrogen receptor-α (ER-α, encoded by ESR1). Through a prospective clinical sequencing program for advanced cancers, we enrolled 11 patients with ER-positive metastatic breast cancer. Whole-exome and transcriptome analysis showed that six cases harbored mutations of ESR1 affecting its ligand-binding domain (LBD), all of whom had been treated with anti-estrogens and estrogen deprivation therapies. A survey of The Cancer Genome Atlas (TCGA) identified four endometrial cancers with similar mutations of ESR1. The five new LBD-localized ESR1 mutations identified here (encoding p.Leu536Gln, p.Tyr537Ser, p.Tyr537Cys, p.Tyr537Asn and p.Asp538Gly) were shown to result in constitutive activity and continued responsiveness to anti-estrogen therapies in vitro. Taken together, these studies suggest that activating mutations in ESR1 are a key mechanism in acquired endocrine resistance in breast cancer therapy.
Merkel cell carcinoma (MCC) is a rare but highly aggressive cutaneous neuroendocrine tumor. Merkel cell polyomavirus (MCPyV) may contribute to tumorigenesis in a subset of tumors via inhibition of tumor suppressors such as retinoblastoma (RB1) by mutated viral T antigens, but the molecular pathogenesis of MCPyV-negative MCC is largely unexplored. Through our MI-ONCOSEQ precision oncology study, we performed integrative sequencing on two cases of MCPyV-negative MCC, as well as a validation cohort of 14 additional MCC cases (n = 16). In addition to previously identified mutations in TP53, RB1, and PIK3CA, we discovered activating mutations of oncogenes, including HRAS and loss-of-function mutations in PRUNE2 and NOTCH family genes in MCPyV-negative MCC. MCPyV-negative tumors also displayed high overall mutation burden (10.09 ± 2.32 mutations/Mb) and were characterized by a prominent UV-signature pattern with C > T transitions comprising 85% of mutations. In contrast, mutation burden was low in MCPyV-positive tumors (0.40 ± 0.09 mutations/Mb) and lacked a UV signature. These findings suggest a potential ontologic dichotomy in MCC, characterized by either viral-dependent or UV-dependent tumorigenic pathways.
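The two summary quantities quoted above, mutation burden in mutations/Mb and the share of C > T transitions, reduce to simple arithmetic. The sketch below shows one plausible way to compute them; the function names, the size of the callable region, and the example substitutions are hypothetical, and counting G > A as the reverse-strand equivalent of C > T is an assumption about strand handling rather than something the abstract specifies.

```python
def mutations_per_mb(n_mutations, callable_bases):
    """Mutation burden: somatic mutations per megabase of callable sequence."""
    return n_mutations / (callable_bases / 1_000_000)


def c_to_t_fraction(substitutions):
    """Fraction of single-base substitutions that are C>T transitions.

    `substitutions` is a list of (ref, alt) pairs. G>A is counted as
    C>T read from the opposite strand (an assumption of this sketch).
    """
    ct_like = {("C", "T"), ("G", "A")}
    hits = sum(1 for ref, alt in substitutions if (ref, alt) in ct_like)
    return hits / len(substitutions)


# Hypothetical tumor: 350 somatic mutations over 35 Mb of callable exome.
print(mutations_per_mb(350, 35_000_000))  # 10.0

# Hypothetical substitution calls: three of four are C>T-like.
calls = [("C", "T"), ("G", "A"), ("A", "G"), ("C", "T")]
print(c_to_t_fraction(calls))  # 0.75
```

Normalizing by the callable region rather than the full target size keeps burden estimates comparable between samples with different coverage.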
Through a prospective clinical sequencing program for advanced cancers, four index cases were identified that harbor gene rearrangements of FGFR2, including patients with cholangiocarcinoma, breast cancer, and prostate cancer. After extending our assessment of FGFR rearrangements across multiple tumor cohorts, we identified additional FGFR fusions with intact kinase domains in lung squamous cell cancer, bladder cancer, thyroid cancer, oral cancer, glioblastoma, and head and neck squamous cell cancer. All FGFR fusion partners tested exhibit oligomerization capability, suggesting a shared mode of kinase activation. Overexpression of FGFR fusion proteins induced cell proliferation. Two bladder cancer cell lines that harbor FGFR3 fusion proteins exhibited enhanced susceptibility to pharmacologic inhibition in vitro and in vivo. Because of the combinatorial possibilities of FGFR family fusion to a variety of oligomerization partners, clinical sequencing efforts that incorporate transcriptome analysis for gene fusions are poised to identify rare, targetable FGFR fusions across diverse cancer types.