Calling fusion genes from RNA-seq data is well established, but other transcriptional variants are difficult to detect using existing approaches. To identify all types of variants in transcriptomes we developed MINTIE, an integrated pipeline for RNA-seq data. We take a reference-free approach, combining de novo assembly of transcripts with differential expression analysis to identify up-regulated novel variants in a case sample. We compare MINTIE with eight other approaches; MINTIE detects > 85% of variants, which no other method achieves. We posit that MINTIE will be able to identify new disease variants across a range of disease types.
Visualisation of the transcriptome relative to a reference genome is fraught with sparsity. This is due to RNA sequencing (RNA-Seq) reads being predominantly mapped to exons that account for just under 3% of the human genome. Recently, we have used exon-only references, superTranscripts, to improve visualisation of aligned RNA-Seq data through the omission of supposedly unexpressed regions such as introns. However, variation within these regions can lead to novel splicing events that may drive a pathogenic phenotype. In these cases, the loss of information in only retaining annotated exons presents significant drawbacks. Here we present Slinker, a bioinformatics pipeline written in Python and Bpipe that uses a data-driven approach to assemble sample-specific superTranscripts. At its core, Slinker uses Stringtie2 to assemble transcripts with any sequence across any gene. This assembly is merged with reference transcripts and converted to a superTranscript, from which rich visualisations are made through Plotly with associated annotation and coverage information. Slinker was validated on five novel splicing events of rare disease samples from a cohort of primary muscular disorders. In addition, Slinker was shown to be effective in visualising deletion events within transcriptomes of tumour samples in the important leukemia gene, IKZF1. Slinker offers a succinct visualisation of RNA-Seq alignments across typically sparse regions and is freely available on GitHub.
B-cell acute lymphoblastic leukemia (B-ALL) is the most common childhood cancer. Subtypes within B-ALL are distinguished by characteristic structural variants and mutations, which in some instances strongly correlate with responses to treatment. The World Health Organisation (WHO) recognises seven distinct classifications, or subtypes, as of 2016. However, recent studies have demonstrated that B-ALL can be segmented into 23 subtypes based on a combination of genomic features and gene expression profiles. A method to identify a patient's subtype would have clear utility. Despite this, no publicly available classification methods using RNA-Seq exist for this purpose. Here we present ALLSorts: a publicly available method that uses RNA-Seq data to classify B-ALL samples to 18 known subtypes and five meta-subtypes. ALLSorts is the result of a hierarchical supervised machine learning algorithm applied to a training set of 1223 B-ALL samples aggregated from multiple cohorts. Validation revealed that ALLSorts can accurately attribute samples to subtypes and can attribute multiple subtypes to a sample. Furthermore, when applied to both paediatric and adult cohorts, ALLSorts was able to classify previously undefined samples into subtypes. ALLSorts is available and documented on GitHub (https://github.com/Oshlack/AllSorts/).
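To make the idea of hierarchical classification with multiple subtype calls concrete, the following is a minimal toy sketch and not the ALLSorts implementation. The hierarchy, the gene names, the logistic scorers and the 0.5 threshold are all hypothetical stand-ins chosen for illustration; a real classifier would use models trained on expression data at each node.

```python
import math
from typing import Callable, Dict, List

Sample = Dict[str, float]          # gene name -> expression value (hypothetical units)
Scorer = Callable[[Sample], float]  # returns a pseudo-probability in (0, 1)

# Hypothetical two-level hierarchy: meta-subtype -> member subtypes.
HIERARCHY: Dict[str, List[str]] = {
    "meta_A": ["subtype_A1", "subtype_A2"],
    "meta_B": ["subtype_B1"],
}


def make_scorer(gene: str, midpoint: float) -> Scorer:
    """Toy stand-in for a trained node classifier: a logistic curve on one gene."""
    def score(sample: Sample) -> float:
        return 1.0 / (1.0 + math.exp(-(sample.get(gene, 0.0) - midpoint)))
    return score


# One hypothetical scorer per node in the hierarchy.
SCORERS: Dict[str, Scorer] = {
    "meta_A": make_scorer("gene_1", 5.0),
    "subtype_A1": make_scorer("gene_2", 5.0),
    "subtype_A2": make_scorer("gene_3", 5.0),
    "meta_B": make_scorer("gene_4", 5.0),
    "subtype_B1": make_scorer("gene_5", 5.0),
}


def classify(sample: Sample, threshold: float = 0.5) -> List[str]:
    """Walk the hierarchy; a subtype is only tested if its meta-subtype passes."""
    calls: List[str] = []
    for meta, subtypes in HIERARCHY.items():
        if SCORERS[meta](sample) < threshold:
            continue  # parent rejected, so its children are never evaluated
        calls.extend(s for s in subtypes if SCORERS[s](sample) >= threshold)
    return calls or ["Unclassified"]


sample = {"gene_1": 9, "gene_2": 9, "gene_3": 1, "gene_4": 9, "gene_5": 9}
print(classify(sample))  # ['subtype_A1', 'subtype_B1']
```

Because every node is thresholded independently, a sample can receive more than one subtype call, and a sample that passes no node is reported as unclassified.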
Genomic profiling efforts have revealed a rich diversity of oncogenic fusion genes. While there are many methods for identifying fusion genes from RNA-sequencing (RNA-seq) data, visualizing these transcripts and their supporting reads remains challenging.
Clinker is a bioinformatics tool written in Python, R, and Bpipe that leverages the superTranscript method to visualize fusion genes. We demonstrate the use of Clinker to obtain interpretable visualizations of the RNA-seq data that lead to fusion calls. In addition, we use Clinker to explore multiple fusion transcripts with novel breakpoints within the P2RY8-CRLF2 fusion gene in B-cell acute lymphoblastic leukemia.
Clinker is freely available software that allows visualization of fusion genes and the RNA-seq data used in their discovery.
Genomic markers define molecular subtypes and measurable residual disease (MRD) targets in B-cell acute lymphoblastic leukemia/lymphoma (B-ALL) and are essential determinants of treatment. Current diagnostic approaches typically involve serial multi-step testing utilizing conventional cytogenetics (CC)/FISH and molecular genetic (RT-qPCR, MLPA, clonality PCR, NGS panel) techniques, which are time- and sample-consuming and ultimately may not adequately identify genomically complex B-ALL subtypes. In contrast, single-step comprehensive genomic profiling by whole genome and whole transcriptome sequencing (WGS/WTS) may be more efficient for the molecular classification of established and newly described entities which are of increasing therapeutic relevance. We have instituted a multimodal platform for molecular testing in B-ALL performing WGS/WTS in parallel with deep NGS-based immunoglobulin (IG) rearrangement MRD and exploratory DNA-breakpoint based MRD assays. We aimed to determine the utility of this approach for subtype classification compared to a standard-of-care diagnostic approach of CC/FISH testing.
Forty-two consecutive adult patients underwent both standard-of-care diagnostic testing and WGS/WTS. 20/42 (48%) patients had an abnormal CC/FISH result supporting classification into recognized molecular subgroups. WGS/WTS assessment incorporating somatic coding and non-coding mutations, structural variants, fusions, copy number abnormalities and gene expression subtype prediction (ALLSorts, https://github.com/Oshlack/ALLSorts) was performed with concordant results in all 20 patients. 16/22 patients that were unclassified by CC/FISH were successfully reclassified by WGS/WTS including subtypes enriched for cryptic rearrangements (Ph-like, DUX4, MEF2D) and groups characterized by heterogeneous genomic alterations or a distinctive gene expression signature (PAX5alt, ZEB2/CEBP). A low hypodiploid karyotype was observed in two cases with an apparently normal karyotype by CC. The six patients who remained without a subtype defining driver genetic alteration after comprehensive testing frequently harbored novel IGH translocations or a Ph-like expression signature but without a described fusion.
In order to understand the relative contribution from WGS versus WTS, analysis of 36 patients was performed using a truth classification. WGS and WTS produced equivalent classifications for 22 cases. Two cases were based solely on WGS findings (iAMP21 and ZEB2/CEBP) and three cases were based solely on WTS findings (DUX4). Importantly the combination of both WGS and WTS was critical to correctly classify nine cases (Ph-like and PAX5alt).
MRD was assessed by a sensitive NGS assay to IG rearrangements (Adaptive Biotechnologies) and by quantitative probe-based droplet digital PCR (ddPCR) assays designed to structural rearrangement DNA breakpoints from genome data (analytical sensitivity 10⁻⁴). Patient-specific ddPCR assays were designed to eight structural variants (KMT2A and IGH translocations, and IKZF1 deletions) in seven patients and assessed in 36 remission samples with parallel testing by multiparametric flow cytometry (MFC). Concordant MFC and ddPCR results were observed in 30/36 samples (19 MRD pos, 11 MRD neg). Discordances included two MRD pos by MFC-only and four MRD pos by ddPCR-only; the latter often occurred in the setting of antigen directed therapy or in ALL with a less informative immunophenotype, demonstrating the additional utility of non-MFC based MRD assessment in specific clinical settings. 27/42 patients in our cohort had ≥1 genomic structural rearrangement identified by WGS that could be used for patient-specific MRD monitoring to complement existing MRD assessment.
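As a quick consistency check on the MRD comparison, the sample counts quoted above can be laid out as a 2×2 table. This is only an illustrative tally; the variable names are ours and the numbers are taken directly from the text.

```python
# Counts quoted in the text: 36 remission samples tested by both MFC and ddPCR.
both_pos = 19        # MRD positive by MFC and by ddPCR
both_neg = 11        # MRD negative by MFC and by ddPCR
mfc_only_pos = 2     # MRD positive by MFC only
ddpcr_only_pos = 4   # MRD positive by ddPCR only

total = both_pos + both_neg + mfc_only_pos + ddpcr_only_pos
concordant = both_pos + both_neg
concordance_pct = 100 * concordant / total

print(f"{concordant}/{total} concordant ({concordance_pct:.1f}%)")  # 30/36 concordant (83.3%)
```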
In conclusion, WGS/WTS provided a molecular subtype classification in 86% of our cohort compared to 48% by standard-of-care diagnostic testing, highlighting that CC/FISH alone is inadequate for contemporary molecular classification of B-ALL, which may have implications for treatment decisions. Importantly, the combination of WGS and WTS was superior to WGS-only or WTS-only for correct molecular subtype assignment. This approach has the potential to improve risk assessment in adult B-ALL, and the routine feasibility, improvement in clinical outcomes and health economic impact warrant further assessment.
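The headline percentages follow from the patient counts reported in the results. The short sketch below, with variable names of our own choosing, reproduces them and confirms that the per-assay breakdown accounts for every classified patient.

```python
# Counts quoted from the results above.
cohort = 42
classified_by_cc_fish = 20
reclassified_by_wgs_wts = 16  # of the 22 patients unclassified by CC/FISH
classified_by_wgs_wts = classified_by_cc_fish + reclassified_by_wgs_wts


def percent(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# Which assay supported the classification, as reported for the 36 patients.
contribution = {
    "WGS and WTS equivalent": 22,
    "WGS only": 2,
    "WTS only": 3,
    "both required": 9,
}

print(percent(classified_by_cc_fish, cohort))   # 48
print(percent(classified_by_wgs_wts, cohort))   # 86
print(sum(contribution.values()))               # 36
```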
Bajel: Abbvie, Amgen, Novartis, Pfizer: Honoraria; Amgen: Speakers Bureau. Dickinson: Janssen: Consultancy, Honoraria; Amgen: Honoraria; Celgene: Research Funding; Gilead Sciences: Consultancy, Honoraria, Speakers Bureau; MSD: Consultancy, Honoraria, Research Funding, Speakers Bureau; Takeda: Research Funding; Bristol-Myers Squibb: Consultancy, Honoraria; Roche: Consultancy, Honoraria, Other: travel, accommodation, expenses, Research Funding, Speakers Bureau; Novartis: Consultancy, Honoraria, Research Funding, Speakers Bureau. Tiong: Servier: Consultancy, Speakers Bureau; Amgen: Speakers Bureau; Pfizer: Consultancy.
Transcriptome sequencing has identified multiple subtypes of B-progenitor acute lymphoblastic leukemia (B-ALL) of prognostic significance, but a minority of cases lack a known genetic driver. Here, we used integrated whole-genome (WGS) and -transcriptome sequencing (RNA-seq), enhancer mapping, and chromatin topology analysis to identify previously unrecognized genomic drivers in B-ALL. Newly diagnosed (n = 3221) and relapsed (n = 177) B-ALL cases with tumor RNA-seq were studied. WGS was performed to detect mutations, structural variants, and copy number alterations. Integrated analysis of histone 3 lysine 27 acetylation and chromatin looping was performed using HiChIP. We identified a subset of 17 newly diagnosed and 5 relapsed B-ALL cases with a distinct gene expression profile and 2 universal and unique genomic alterations resulting from aberrant recombination-activating gene activation: a focal deletion downstream of PAN3 at 13q12.2 resulting in CDX2 deregulation by the PAN3 enhancer and a focal deletion of exons 18-21 of UBTF at 17q21.31 resulting in a chimeric fusion, UBTF::ATXN7L3. A subset of cases also had rearrangement and increased expression of the PAX5 gene, which is otherwise uncommon in B-ALL. Patients were more commonly female and young adult with median age 35 (range, 12-70 years). The immunophenotype was characterized by CD10 negativity and immunoglobulin M positivity. Among 16 patients with known clinical response, 9 (56.3%) had high-risk features including relapse (n = 4) or minimal residual disease >1% at the end of remission induction (n = 5). CDX2-deregulated, UBTF::ATXN7L3 rearranged (CDX2/UBTF) B-ALL is a high-risk subtype of leukemia in young adults for which novel therapeutic approaches are required.
•CDX2 deregulation and UBTF fusion define a B-ALL subtype with distinct immunophenotype, expression profile, and high-risk feature.
•Somatic 13q12.2 deletions spanning FLT3 promoter lead to upregulation of CDX2 through a mechanism of enhancer retargeting.
Background: Philadelphia-like acute lymphoblastic leukaemia (Ph-like ALL) is a high-risk subtype of ALL driven by a range of tyrosine kinase and cytokine receptor rearrangements. ABL1-class rearrangements (ABL1, ABL2, CSF1R and PDGFRB) account for 17% of Ph-like ALL cases in children, and are clinically important to identify as they can be therapeutically targeted with tyrosine kinase inhibitors (TKIs). While the p190 BCR-ABL1 fusion is well described, less is known about the function and downstream signalling by rare ABL1 fusions. We identified a rare ABL1 fusion, SFPQ-ABL1, in a paediatric B-ALL patient using RNA-sequencing. This fusion lacks the ABL1 Src-homology-3 (SH3) and part of the SH2 domain, which are retained in BCR-ABL1. Other ABL1 fusions, RCSD1-ABL1 and SNX2-ABL1, have a similar structure. In this work we have utilised phosphoproteomics and Stable Isotope Labelling by Amino Acids in Cell Culture (SILAC), as well as in vitro and in vivo models, to determine differential signalling pathways between SFPQ-ABL1 and BCR-ABL1.
Methods: We cloned SFPQ-ABL1 from patient cDNA, and engineered SFPQ-ABL1 and BCR-ABL1 fusions to include or delete the SH2 and SH3 domains. We performed proliferation and viability assays to assess the ability of these fusions to transform Ba/F3 cells and test sensitivity to TKIs. We performed total phosphopeptide and phosphotyrosine enrichments and utilised mass spectrometry to identify the phosphoproteome activated by canonical SFPQ-ABL1 and BCR-ABL1. Over-representation analysis was performed on phosphopeptides significantly differing between BCR-ABL1 and SFPQ-ABL1 (log fold change cut-off > 2.5) using the Gene Ontology (GO) knowledge base under the biological process category. Furthermore, we compared the phosphoproteome of canonical SFPQ-ABL1 to SFPQ-ABL1 with the SH2 and SH3 domains reintroduced (SFPQ-ABL1+SH). We have also developed novel mouse models, using syngeneic transplantation, of SFPQ-ABL1 and SNX2-ABL1 driven leukaemia.
Results: SFPQ-ABL1 expressing Ba/F3 cells are sensitive to cell death induced by TKIs that block ABL1. Interestingly, while SFPQ-ABL1 and BCR-ABL1 both effectively blocked apoptosis, SFPQ-ABL1 was less able to drive cytokine-independent proliferation. Phosphoproteomic analysis showed that BCR-ABL1 and SFPQ-ABL1 differentially activate downstream signalling pathways, including SH-binding proteins. Hierarchical clustering of phosphopeptides quantified from cells expressing canonical BCR-ABL1, SFPQ-ABL1, and SFPQ-ABL1+SH demonstrated that BCR-ABL1 and SFPQ-ABL1+SH were more similar to each other than to SFPQ-ABL1. SFPQ-ABL1 expression resulted in phosphorylation of proteins involved in RNA processing, metabolism and splicing, suggesting that the SFPQ region of SFPQ-ABL1 also contributes to signalling.
Conclusions: In this study, we have utilised phosphoproteomics for the unbiased identification of signalling nodes that are required for the function of different classes of ABL fusions. We have developed novel in vitro and in vivo models to further understand how these fusions function to drive leukaemia. Our data also suggests that ABL1 fusion partners play a role beyond dimerization and transphosphorylation of the kinase domains in oncogenic signalling, but further study is needed to establish the contribution to leukaemogenesis. Establishing signalling pathways that are critical to the function of rare ABL1 fusions may inform clinical approaches to treating this disease.
No relevant conflicts of interest to declare.
Acute lymphoblastic leukemia (ALL) is the most common childhood malignancy, and implementation of risk-adapted therapy has been instrumental in the dramatic improvements in clinical outcomes. A key to risk-adapted therapy is the identification of genomic features of individual tumors, including chromosome number (for hyper- and hypodiploidy) and gene fusions, notably ETV6-RUNX1, TCF3-PBX1, and BCR-ABL1 in B-cell ALL (B-ALL). RNA-sequencing (RNA-seq) of large ALL cohorts has expanded the number of recurrent gene fusions recognized as drivers in ALL, and identification of these new entities will contribute to refining ALL risk stratification. We used RNA-seq on 126 ALL patients from our clinical service to test the utility of including RNA-seq in standard-of-care diagnostic pipelines to detect gene rearrangements and IKZF1 deletions. RNA-seq identified 86% of rearrangements detected by standard-of-care diagnostics. KMT2A (MLL) rearrangements, although usually identified, were the most commonly missed by RNA-seq as a result of low expression. RNA-seq identified rearrangements that were not detected by standard-of-care testing in 9 patients. These were found in patients who were not classifiable using standard molecular assessment. We developed an approach to detect the most common IKZF1 deletion from RNA-seq data and validated this using an RQ-PCR assay. We applied an expression classifier to identify Philadelphia chromosome–like B-ALL patients. T-ALL proved a rich source of novel gene fusions, which have clinical implications or provide insights into disease biology. Our experience shows that RNA-seq can be implemented within an individual clinical service to enhance the current molecular diagnostic risk classification of ALL.
•RNA-seq can be implemented into a clinical service to inform risk stratification of patients with nonstandard molecular features.
•RNA-seq data can be used to identify IKZF1 deletions.
We report two patients with leukaemia driven by the rare CNTRL‐FGFR1 fusion oncogene. This fusion arises from a t(8;9)(p12;q33) translocation, and is a rare driver of biphenotypic leukaemia in children. We used RNA sequencing to report novel features of expressed CNTRL‐FGFR1, including CNTRL‐FGFR1 fusion alternative splicing. From this knowledge, we designed and tested a Droplet Digital PCR assay that detects CNTRL‐FGFR1 expression to approximately one cell in 100 000 using fusion breakpoint‐specific primers and probes. We also utilised cell‐line models to show that effective tyrosine kinase inhibitors, which may be included in treatment regimens for this disease, are only those that block FGFR1 phosphorylation.