A feature common to all DNA sequencing technologies is the presence of base-call errors in the sequenced reads. The implications of such errors are application-specific, ranging from minor informatics nuisances to major problems affecting biological inferences. Recently developed "next-gen" sequencing technologies have greatly reduced the cost of sequencing, but have been shown to be more error-prone than previous technologies. Both position-specific (depending on the location in the read) and sequence-specific (depending on the sequence in the read) errors have been identified in Illumina and Life Technology sequencing platforms. We describe a new type of systematic error that manifests as statistically unlikely accumulations of errors at specific genome (or transcriptome) locations.
We characterize and describe systematic errors using overlapping paired reads from high-coverage data. We show that such errors occur in approximately 1 in 1000 base pairs, and that they are highly replicable across experiments. We identify motifs that are frequent at systematic error sites, and describe a classifier that distinguishes heterozygous sites from systematic error. Our classifier is designed to accommodate data from experiments in which the allele frequencies at heterozygous sites are not necessarily 0.5 (such as in the case of RNA-Seq), and can be used with single-end datasets.
Systematic errors can easily be mistaken for heterozygous sites in individuals, or for SNPs in population analyses. Systematic errors are particularly problematic in low coverage experiments, or in estimates of allele-specific expression from RNA-Seq data. Our characterization of systematic error has allowed us to develop a program, called SysCall, for identifying and correcting such errors. We conclude that correction of systematic errors is important to consider in the design and interpretation of high-throughput sequencing experiments.
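As a rough illustration of what a "statistically unlikely accumulation of errors" means, the sketch below flags sites whose mismatch count would be improbable if base-call errors were independent with a fixed per-base rate. This is not SysCall's classifier; the function names, the 1% error rate, and the significance cutoff are assumptions chosen only for illustration.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    if k <= 0:
        return 1.0
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def flag_suspicious_sites(sites, error_rate=0.01, alpha=1e-6):
    """Flag positions with more mismatches than independent errors would explain.

    sites: iterable of (position, coverage, mismatches) tuples.
    error_rate and alpha are hypothetical defaults, not values from the paper.
    """
    flagged = []
    for position, coverage, mismatches in sites:
        # Probability of seeing at least this many mismatches by chance alone.
        if binomial_tail(mismatches, coverage, error_rate) < alpha:
            flagged.append(position)
    return flagged
```

A flag from such a test does not by itself separate a systematic error from a heterozygous site, which is the distinction the classifier described above is built to make.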
Single‐cell transcriptomic studies are identifying novel cell populations with exciting functional roles in various in vivo contexts, but identification of succinct gene marker panels for such populations remains a challenge. In this work, we introduce COMET, a computational framework for the identification of candidate marker panels consisting of one or more genes for cell populations of interest identified with single‐cell RNA‐seq data. We show that COMET outperforms other methods for the identification of single‐gene panels and enables, for the first time, prediction of multi‐gene marker panels ranked by relevance. Staining by flow cytometry assay confirmed the accuracy of COMET's predictions in identifying marker panels for cellular subtypes, at both the single‐ and multi‐gene levels, validating COMET's applicability and accuracy in predicting favorable marker panels from transcriptomic input. COMET is a general non‐parametric statistical framework and can be used as‐is on various high‐throughput datasets in addition to single‐cell RNA‐sequencing data. COMET is available for use via a web interface (http://www.cometsc.com/) or a stand‐alone software package (https://github.com/MSingerLab/COMETSC).
Synopsis
COMET is a computational tool for marker‐panel selection from single‐cell RNA‐seq data. It generates ranked predictions of single‐ and multiple‐gene marker panels for a cell population of interest.
COMET is a computational tool for combinatorial prediction of marker panels from single‐cell transcriptomic data.
COMET's statistical framework enables controlling for specificity and sensitivity in predicted marker panels.
Staining by flow‐cytometry validates that COMET identifies novel and favorable single‐ and multi‐gene marker panels for cellular subtypes.
COMET is available via a web interface (http://www.cometsc.com/) or downloadable software package (https://github.com/MSingerLab/COMETSC).
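To make the idea of scoring a candidate marker concrete, here is a minimal sketch. It is not COMET's implementation: the hypergeometric enrichment test, the function names, and the fixed expression threshold are illustrative assumptions. It binarizes one gene at a threshold and reports sensitivity, specificity, and an enrichment p-value for the cell population of interest.

```python
from math import comb


def hypergeom_tail(k: int, N: int, K: int, n: int) -> float:
    """P(X >= k) when n cells are drawn from N, of which K are in the cluster."""
    upper = min(K, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)


def score_marker(expression, in_cluster, threshold):
    """Score one gene as a marker for a cluster (hypothetical helper).

    expression: per-cell expression values for the gene.
    in_cluster: per-cell booleans, True if the cell is in the population of interest.
    threshold:  cells with expression above this value count as marker-positive.
    """
    positive = [x > threshold for x in expression]
    pairs = list(zip(positive, in_cluster))
    tp = sum(p and c for p, c in pairs)
    fp = sum(p and not c for p, c in pairs)
    fn = sum((not p) and c for p, c in pairs)
    tn = sum((not p) and (not c) for p, c in pairs)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    # Enrichment of marker-positive cells inside the cluster.
    p_value = hypergeom_tail(tp, len(expression), tp + fn, tp + fp)
    return sensitivity, specificity, p_value
```

Under the same assumptions, a multi-gene panel could be scored by first combining the binarized genes, for example by requiring that two genes both exceed their thresholds, and then ranking panels by the resulting statistics.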
Brain tumor initiating cells (BTICs), also known as cancer stem cells, hijack high-affinity glucose uptake active normally in neurons to maintain energy demands. Here we link metabolic dysregulation in human BTICs to a nexus between MYC and de novo purine synthesis, mediating glucose-sustained anabolic metabolism. Inhibiting purine synthesis abrogated BTIC growth, self-renewal and in vivo tumor formation by depleting intracellular pools of purine nucleotides, supporting purine synthesis as a potential therapeutic point of fragility. In contrast, differentiated glioma cells were unaffected by the targeting of purine biosynthetic enzymes, suggesting selective dependence of BTICs. MYC coordinated the control of purine synthetic enzymes, supporting its role in metabolic reprogramming. Elevated expression of purine synthetic enzymes correlated with poor prognosis in glioblastoma patients. Collectively, our results suggest that stem-like glioma cells reprogram their metabolism to self-renew and fuel the tumor hierarchy, revealing potential BTIC cancer dependencies amenable to targeted therapy.
Patients with myelodysplastic syndromes (MDSs) display severe anemia but the mechanisms underlying this phenotype are incompletely understood. Right open-reading-frame kinase 2 (RIOK2) encodes a protein kinase located at 5q15, a region frequently lost in patients with MDS del(5q). Here we show that hematopoietic cell-specific haploinsufficient deletion of Riok2 (Riok2f/+Vav1cre) led to reduced erythroid precursor frequency leading to anemia. Proteomic analysis of Riok2f/+Vav1cre erythroid precursors suggested immune system activation, and transcriptomic analysis revealed an increase in p53-dependent interleukin (IL)-22 in Riok2f/+Vav1cre CD4+ T cells (TH22). Further, we discovered that the IL-22 receptor, IL-22RA1, was unexpectedly present on erythroid precursors. Blockade of IL-22 signaling alleviated anemia not only in Riok2f/+Vav1cre mice but also in wild-type mice. Serum concentrations of IL-22 were increased in the subset of patients with del(5q) MDS as well as patients with anemia secondary to chronic kidney disease. This work reveals a possible therapeutic opportunity for reversing many stress-induced anemias by targeting IL-22 signaling.
It has long been appreciated that tumors are diverse, varying in mutational status, composition of cellular infiltrate, and organizational architecture. For the most part, the information embedded in this diversity has gone untapped due to the limited resolution and dimensionality of assays for analyzing nucleic acid expression in cells. The advent of high-throughput, next-generation sequencing (NGS) technologies that measure nucleic acids, particularly at the single-cell level, is fueling the characterization of the many components that comprise the tumor microenvironment (TME), with a strong focus on immune composition. Understanding the immune and nonimmune components of the TME, how they interact, and how this shapes their functional properties requires the development of novel computational methods and, eventually, the application of systems-based approaches. The continued development and application of NGS technologies holds great promise for accelerating discovery in the cancer immunology field.
IL-7 therapy has been evaluated in patients who do not regain normal CD4 T cell counts after virologically successful antiretroviral therapy. IL-7 increases total circulating CD4 and CD8 T cell counts; however, its effect on HIV-specific CD8 T cells has not been fully examined. TRAF1, a prosurvival signaling adaptor required for 4-1BB-mediated costimulation, is lost from chronically stimulated virus-specific CD8 T cells with progression of HIV infection in humans and during chronic lymphocytic choriomeningitis infection in mice. Previous results showed that IL-7 can restore TRAF1 expression in virus-specific CD8 T cells in mice, rendering them sensitive to anti-4-1BB agonist therapy. In this article, we show that IL-7 therapy in humans increases the number of circulating HIV-specific CD8 T cells. For a subset of patients, we also observed an increased frequency of TRAF1+ HIV-specific CD8 T cells 10 wk after completion of IL-7 treatment. IL-7 treatment increased levels of phospho-ribosomal protein S6 in HIV-specific CD8 T cells, suggesting increased activation of the metabolic checkpoint kinase mTORC1. Thus, IL-7 therapy in antiretroviral therapy-treated patients induces sustained changes in the number and phenotype of HIV-specific T cells.
Chronic viral infections and cancer often lead to the emergence of dysfunctional or ‘exhausted’ CD8+ T cells, and the restoration of their functions is currently the focus of therapeutic interventions. In this review, we detail recent advances in the annotation of the gene modules and the epigenetic landscape associated with T-cell dysfunction. Together with analysis of single-cell transcriptomes, these findings have enabled a deeper and more precise understanding of the transcriptional mechanisms that induce and maintain the dysfunctional state and highlight the heterogeneity of CD8+ T-cell phenotypes present in chronically inflamed tissue. We discuss the relevance of these findings for understanding the transcriptional and spatial regulation of dysfunctional T cells and for the design of therapeutics.
Recent advances have identified novel CD8+ T-cell functional states in chronic inflammatory conditions associated with distinct transcriptional programs.
Single-cell analysis has revealed extensive transcriptional heterogeneity in the CD8+ T-cell response in cancer.
CRISPR/Cas9 genome editing in mature CD8+ T cells has enabled testing of candidate regulators in vivo.
Analysis of the chromatin landscape in CD8+ T cells has revealed distinct epigenetic changes associated with distinct functional states.
A commonplace analysis in high-throughput DNA methylation studies is the comparison of methylation extent between different functional regions, computed by averaging methylation states within region types and then comparing averages between regions. For example, it has been reported that methylation is more prevalent in coding regions as compared to their neighboring introns or UTRs, leading to hypotheses about novel forms of epigenetic regulation.
We have identified and characterized a bias present in these seemingly straightforward comparisons that results in the false detection of differences in methylation intensities across region types. This bias arises due to differences in conservation rates, rather than methylation rates, and is broadly present in the published literature. When controlling for conservation at coding start sites, the differences in DNA methylation rates disappear. Moreover, a re-evaluation of methylation rates at intron-exon junctions reveals that the magnitude of previously reported differences is greatly exaggerated. We introduce two correction methods to address this bias, an inference-based matrix completion algorithm and an averaging approach, tailored to address different underlying biological questions. We evaluate how analysis using these corrections affects the detection of differences in DNA methylation across functional boundaries.
We report here on a bias in DNA methylation comparative studies that originates in conservation rate differences and manifests itself in the false discovery of differences in DNA methylation intensities and their extents. We have characterized this bias and its broad implications, and show how to control for it so as to enable the study of a variety of biological questions.
DNA methylation is an important epigenetic marker associated with gene expression regulation in eukaryotes. While promoter methylation is relatively well characterized, the role of intragenic DNA methylation remains unclear. Here, we investigated the relationship of DNA methylation at exons and flanking introns with gene expression and histone modifications generated from a human fibroblast cell line and primary B cells. Consistent with previous work, we found that intragenic methylation is positively correlated with gene expression and that exons are more highly methylated than their neighboring intronic environment. Intriguingly, in this study we identified a unique subset of hypomethylated exons that demonstrate significantly lower methylation levels than their surrounding introns. Furthermore, we observed a negative correlation between exon methylation and the density of the majority of histone modifications. Specifically, we demonstrate that hypomethylated exons at highly expressed genes are associated with open chromatin and have a characteristic histone code comprised of significantly high levels of histone markings. Overall, our comprehensive analysis of the human exome supports the presence of regulatory hypomethylated exons in protein-coding genes. In particular, our results reveal a previously unrecognized diverse and complex role of the epigenetic landscape within the gene body.