Efforts to precisely identify tumor human leukocyte antigen (HLA) bound peptides capable of mediating T cell-based tumor rejection still face important challenges. Recent studies suggest that non-canonical tumor-specific HLA peptides derived from annotated non-coding regions could elicit anti-tumor immune responses. However, sensitive and accurate mass spectrometry (MS)-based proteogenomics approaches are required to robustly identify these non-canonical peptides. We present an MS-based analytical approach that characterizes the non-canonical tumor HLA peptide repertoire, by incorporating whole exome sequencing, bulk and single-cell transcriptomics, ribosome profiling, and two MS/MS search tools in combination. This approach results in the accurate identification of hundreds of shared and tumor-specific non-canonical HLA peptides, including an immunogenic peptide derived from an open reading frame downstream of the melanoma stem cell marker gene ABCB5. These findings hold great promise for the discovery of previously unknown tumor antigens for cancer immunotherapy.
Translation has a fundamental function in defining the fate of the transcribed genome. RNA-sequencing (RNA-seq) data enable the quantification of complex transcript mixtures, often detecting several transcript isoforms of unknown functions for one gene. Here, we describe ORFquant, a method to annotate and quantify translation at the level of single open reading frames (ORFs), using information from Ribo-seq data. By developing an approach for transcript filtering, we quantify translation transcriptome-wide, revealing translated ORFs on multiple isoforms per gene. For most genes, one ORF represents the dominant translation product, but we also detect genes with translated ORFs on multiple transcript isoforms, including targets of RNA surveillance mechanisms. Measuring translation across human cell lines reveals the extent of gene-specific differences in protein production, supported by steady-state protein abundance estimates. Computational analysis of Ribo-seq data with ORFquant (https://github.com/lcalviell/ORFquant) provides insights into the heterogeneous functions of complex transcriptomes.
RNA-sequencing protocols can quantify gene expression regulation from transcription to protein synthesis. Ribosome profiling (Ribo-seq) maps the positions of translating ribosomes over the entire transcriptome. We have developed RiboTaper (available at https://ohlerlab.mdc-berlin.de/software/), a rigorous statistical approach that identifies translated regions on the basis of the characteristic three-nucleotide periodicity of Ribo-seq data. We used RiboTaper with deep Ribo-seq data from HEK293 cells to derive an extensive map of translation that covered open reading frame (ORF) annotations for more than 11,000 protein-coding genes. We also found distinct ribosomal signatures for several hundred upstream ORFs and ORFs in annotated noncoding genes (ncORFs). Mass spectrometry data confirmed that RiboTaper achieved excellent coverage of the cellular proteome. Although dozens of novel peptide products were validated in this manner, few of the currently annotated long noncoding RNAs appeared to encode stable polypeptides. RiboTaper is a powerful method for comprehensive de novo identification of actively used ORFs from Ribo-seq data.
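The three-nucleotide periodicity mentioned in the abstract above can be illustrated with a simple frame-counting sketch. This is not RiboTaper's statistical procedure, which the abstract does not detail; it is a minimal heuristic written for illustration only. The function names, the coordinate convention (0-based, end-exclusive), and the 0.6 cutoff are all assumptions.

```python
from collections import Counter


def frame_fractions(psite_positions, orf_start, orf_end):
    """Fraction of P-site positions in each codon frame (0, 1, 2) of an ORF.

    psite_positions: iterable of 0-based transcript coordinates (hypothetical input).
    orf_start, orf_end: 0-based, end-exclusive ORF bounds (assumed convention).
    """
    counts = Counter(
        (pos - orf_start) % 3
        for pos in psite_positions
        if orf_start <= pos < orf_end
    )
    total = sum(counts.values())
    if total == 0:
        return [0.0, 0.0, 0.0]
    return [counts[frame] / total for frame in range(3)]


def looks_periodic(psite_positions, orf_start, orf_end, min_fraction=0.6):
    """Crude heuristic: frame 0 dominates. The 0.6 cutoff is an arbitrary assumption."""
    return frame_fractions(psite_positions, orf_start, orf_end)[0] >= min_fraction


# Toy data: 8 of the 10 in-ORF positions fall in frame 0; position 5 lies outside the ORF.
positions = [10, 13, 16, 19, 22, 25, 28, 31, 11, 15, 5]
print(frame_fractions(positions, 10, 40))  # [0.8, 0.1, 0.1]
```

A translated ORF is expected to concentrate P-site positions in one frame, whereas background signal spreads roughly evenly across all three; a real analysis would replace the fixed cutoff with a proper statistical test.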
Pervasive transcription of the human genome results in a heterogeneous mix of coding RNAs and long noncoding RNAs (lncRNAs). Only a small fraction of lncRNAs have demonstrated regulatory functions, thus making functional lncRNAs difficult to distinguish from nonfunctional transcriptional byproducts. This difficulty has resulted in numerous competing human lncRNA classifications that are complicated by a steady increase in the number of annotated lncRNAs. To address these challenges, we quantitatively examined transcription, splicing, degradation, localization and translation for coding and noncoding human genes. We observed that annotated lncRNAs had lower synthesis and higher degradation rates than mRNAs and discovered mechanistic differences explaining slower lncRNA splicing. We grouped genes into classes with similar RNA metabolism profiles, containing both mRNAs and lncRNAs to varying extents. These classes exhibited distinct RNA metabolism, different evolutionary patterns and differential sensitivity to cellular RNA-regulatory pathways. Our classification provides an alternative to genomic context-driven annotations of lncRNAs.
DNase-seq and ATAC-seq are broadly used methods to assay open chromatin regions genome-wide. The single nucleotide resolution of DNase-seq has been further exploited to infer transcription factor binding sites (TFBSs) in regulatory regions through footprinting. Recent studies have demonstrated the sequence bias of DNase I and its adverse effects on footprinting efficiency. However, footprinting and the impact of sequence bias have not been extensively studied for ATAC-seq.
Here, we undertake a systematic comparison of the two methods and show that a modification to the ATAC-seq protocol increases its yield and its agreement with DNase-seq data from the same cell line. We demonstrate that the two methods have distinct sequence biases and correct for these protocol-specific biases when performing footprinting. Despite the differences in footprint shapes, the locations of the inferred footprints in ATAC-seq and DNase-seq are largely concordant. However, the protocol-specific sequence biases in conjunction with the sequence content of TFBSs impact the discrimination of footprint from the background, which leads to one method outperforming the other for some TFs. Finally, we address the depth required for reproducible identification of open chromatin regions and TF footprints.
We demonstrate that the impact of bias correction on footprinting performance is greater for DNase-seq than for ATAC-seq and that DNase-seq footprinting leads to better performance. It is possible to infer concordant footprints by using replicates, highlighting the importance of reproducibility assessment. The results presented here provide an overview of the advantages and limitations of footprinting analyses using ATAC-seq and DNase-seq.
MicroRNAs (miRNAs) are key mediators of post-transcriptional gene expression silencing. So far, no comprehensive experimental annotation of functional miRNA target sites exists in Drosophila. Here, we generated a transcriptome-wide in vivo map of miRNA-mRNA interactions in Drosophila melanogaster, making use of single nucleotide resolution in Argonaute1 (AGO1) crosslinking and immunoprecipitation (CLIP) data. Absolute quantification of cellular miRNA levels presents the miRNA pool in Drosophila cell lines to be more diverse than previously reported. Benchmarking two CLIP approaches, we identify a similar predictive potential to unambiguously assign thousands of miRNA-mRNA pairs from AGO1 interaction data at unprecedented depth, achieving higher signal-to-noise ratios than with computational methods alone. Quantitative RNA-seq and sub-codon resolution ribosomal footprinting data upon AGO1 depletion enabled the determination of miRNA-mediated effects on target expression and translation. We thus provide the first comprehensive resource of miRNA target sites and their quantitative functional impact in Drosophila.
The chromatin regulator FACT (facilitates chromatin transcription) is essential for ensuring stable gene expression by promoting transcription. In a genetic screen using Caenorhabditis elegans, we identified that FACT maintains cell identities and acts as a barrier for transcription factor-mediated cell fate reprogramming. Strikingly, FACT’s role as a barrier to cell fate conversion is conserved in humans as we show that FACT depletion enhances reprogramming of fibroblasts. Such activity is unexpected because FACT is known as a positive regulator of gene expression, and previously described reprogramming barriers typically repress gene expression. While FACT depletion in human fibroblasts results in decreased expression of many genes, a number of FACT-occupied genes, including reprogramming-promoting factors, show increased expression upon FACT depletion, suggesting a repressive function of FACT. Our findings identify FACT as a cellular reprogramming barrier in C. elegans and humans, revealing an evolutionarily conserved mechanism for cell fate protection.
• Chromatin regulator FACT blocks cellular reprogramming in C. elegans and humans
• FACT maintains cell fates and antagonizes induction of ectopic fates in C. elegans
• FACT depletion in human cells primes the transcriptome for reprogramming
Known barriers to cell fate reprogramming repress gene expression to prevent ectopic fates. Kolundzic et al. now show that the histone chaperone FACT, a positive regulator of gene expression, safeguards cell identities and acts as an evolutionarily conserved barrier for cell fate reprogramming in both C. elegans and human cells.
Direct cell programming via overexpression of transcription factors (TFs) aims to control cell fate with the degree of precision needed for clinical applications. However, the regulatory steps involved in successful terminal cell fate programming remain obscure. We have investigated the underlying mechanisms by looking at gene expression, chromatin states, and TF binding during the uniquely efficient Ngn2, Isl1, and Lhx3 motor neuron programming pathway. Our analysis reveals a highly dynamic process in which Ngn2 and the Isl1/Lhx3 pair initially engage distinct regulatory regions. Subsequently, Isl1/Lhx3 binding shifts from one set of targets to another, controlling regulatory region activity and gene expression as cell differentiation progresses. Binding of Isl1/Lhx3 to later motor neuron enhancers depends on the Ebf and Onecut TFs, which are induced by Ngn2 during the programming process. Thus, motor neuron programming is the product of two initially independent transcriptional modules that converge with a feedforward transcriptional logic.
• ESC expression of Ngn2/Isl1/Lhx3 induces rapid transcriptional and chromatin changes
• At early stages, Isl1/Lhx3 (homeodomain) and Ngn2 (bHLH) target distinct genomic sites
• As programming progresses, Isl1/Lhx3 binding shows dynamic relocalization
• Ngn2-induced factors guide Isl1/Lhx3 redistribution to initially inaccessible sites
Mazzoni and colleagues show that transcription factor-directed programming of ESCs to motor neurons involves two distinct regulatory modules that converge when programming TFs are relocated by the activity of factors induced in the earlier stage of the process.
Divergent transcription from promoters and enhancers is pervasive in many species, but it remains unclear if it is a general feature of all eukaryotic cis regulatory elements. To address this, here we define cis regulatory elements in C. elegans, D. melanogaster and H. sapiens and investigate the determinants of their transcription directionality. In all three species, we find that divergent transcription is initiated from two separate core promoter sequences and promoter regions display competition between histone modifications on the +1 and -1 nucleosomes. In contrast, promoter directionality, sequence composition surrounding promoters, and positional enrichment of chromatin states differ across species. Integrative models of H3K4me3 levels and core promoter sequence are highly predictive of promoter and enhancer directionality and support two directional classes, skewed and balanced. The relative importance of features to these models is clearly distinct for promoters and enhancers. Differences in regulatory architecture within and between metazoans are therefore abundant, arguing against a unified eukaryotic model.
The development of multicellular organisms is accompanied by gene expression changes in differentiating cells. Profiling stage-specific expression during development may reveal important insights into gene sets that contributed to the morphological diversity across the animal kingdom.
We sequenced RNA-seq libraries throughout a developmental timecourse of the nematode Pristionchus pacificus. The transcriptomes reflect early larval stages, adult worms including late larvae, and growth-arrested dauer larvae and allowed the identification of developmentally regulated gene clusters. Our data reveal trends similar to those of previous transcriptome profiling of dauer worms and represent the first expression data for early larvae in P. pacificus. Gene expression clusters characterizing early larval stages show the most significant enrichment of chaperones, while collagens are most significantly enriched in transcriptomes of late larvae and adult worms. By combining expression data with phylogenetic analysis, we found that developmentally regulated genes occur in paralogous clusters that have arisen through lineage-specific duplications after the split from the Caenorhabditis elegans branch.
We propose that gene duplications of developmentally regulated genes represent a plausible evolutionary mechanism to increase the dosage of stage-specific expression. Consequently, this may contribute to the substantial divergence in expression profiles that has been observed across larger evolutionary time scales.