Over the last decade, long noncoding RNAs (lncRNAs) have emerged as a fundamental molecular class whose members play pivotal roles in the regulation of the genome. The observation of pervasive transcription of mammalian genomes in the early 2000s sparked a revolution in the understanding of information flow in eukaryotic cells and the incredible flexibility and dynamic nature of the transcriptome. As a molecular class, distinct loci yielding lncRNAs are set to outnumber those yielding mRNAs. However, like many important discoveries, the road leading to uncovering this diverse class of molecules that act through a remarkable repertoire of mechanisms was not a straight one. The same characteristic that most distinguishes lncRNAs from mRNAs, i.e. their developmental-stage, tissue-, and cell-specific expression, was one of the major impediments to their discovery and recognition as potentially functional regulatory molecules. With growing numbers of lncRNAs being assigned to biological functions, the specificity of lncRNA expression is now increasingly recognized as a characteristic that imbues lncRNAs with great potential as biomarkers and for the development of highly targeted therapeutics. Here we review the history of lncRNA research and how technological advances and insight into biological complexity have gone hand-in-hand in shaping this revolution. We anticipate that as increasing numbers of these molecules, often described as the dark matter of the genome, are characterized and the structure–function relationship of lncRNAs becomes better understood, it may ultimately be feasible to decipher what these non-(protein)-coding genes encode. This article is part of a Special Issue entitled: Clues to long noncoding RNA taxonomy, edited by Dr. Tetsuro Hirose and Dr. Shinichi Nakagawa.
•Long noncoding RNAs are increasingly appreciated as key molecules in cell fate and behavior.
•The history and technology leading to this current knowledge resembles historical conceptual leaps.
•Much remains to be understood about the complexity of lncRNA biology.
The previous decade has seen long non-coding RNAs (lncRNAs) rise from obscurity to being defined as a category of genetic elements, leaving their mark on the field of cancer biology. With the current number of curated lncRNAs increasing by 10,000 in the last five years, the field is moving from annotation of lncRNA expression in various tumours to understanding their importance in the key cancer signalling networks and characteristic behaviours. Here, we summarize the previously identified as well as recently discovered mechanisms of lncRNA function and their roles in the hallmarks of cancer. Furthermore, we identify novel technologies for investigation of lncRNA properties and their function in carcinogenesis, which will be important for their translation to the clinic as novel biomarkers and therapeutic targets.
Despite the prevalence of long noncoding RNA (lncRNA) genes in eukaryotic genomes, only a small proportion have been examined for biological function. lncRNAdb, available at http://lncrnadb.org, provides users with a comprehensive, manually curated reference database of 287 eukaryotic lncRNAs that have been described independently in the scientific literature. In addition to capturing a great proportion of the recent literature describing functions for individual lncRNAs, lncRNAdb now offers an improved user interface enabling greater accessibility to sequence information, expression data and the literature. The new features in lncRNAdb include the integration of Illumina Body Atlas expression profiles, nucleotide sequence information, a BLAST search tool and easy export of content via direct download or a REST API. lncRNAdb is now endorsed by RNAcentral and is in compliance with the International Nucleotide Sequence Database Collaboration.
Human Mitochondrial Transcriptome
Mercer, Tim R; Neph, Shane; Dinger, Marcel E ...
Cell, 08/2011, Volume 146, Issue 4
Journal Article, Peer reviewed, Open access
The human mitochondrial genome comprises a distinct genetic system transcribed as precursor polycistronic transcripts that are subsequently cleaved to generate individual mRNAs, tRNAs, and rRNAs. Here, we provide a comprehensive analysis of the human mitochondrial transcriptome across multiple cell lines and tissues. Using directional deep sequencing and parallel analysis of RNA ends, we demonstrate wide variation in mitochondrial transcript abundance and precisely resolve transcript processing and maturation events. We identify previously undescribed transcripts, including small RNAs, and observe the enrichment of several nuclear RNAs in mitochondria. Using high-throughput in vivo DNaseI footprinting, we establish the global profile of DNA-binding protein occupancy across the mitochondrial genome at single-nucleotide resolution, revealing regulatory features at mitochondrial transcription initiation sites and functional insights into disease-associated variants. This integrated analysis of the mitochondrial transcriptome reveals unexpected complexity in the regulation, expression, and processing of mitochondrial RNA and provides a resource for future studies of mitochondrial function (accessed at http://mitochondria.matticklab.com).
RNA-sequencing has become the gold standard for whole-transcriptome gene expression quantification. Multiple algorithms have been developed to derive gene counts from sequencing reads. While a number of benchmarking studies have been conducted, the question remains how individual methods perform at accurately quantifying gene expression levels from RNA-sequencing reads. We performed an independent benchmarking study using RNA-sequencing data from the well-established MAQCA and MAQCB reference samples. RNA-sequencing reads were processed using five workflows (Tophat-HTSeq, Tophat-Cufflinks, STAR-HTSeq, Kallisto and Salmon) and resulting gene expression measurements were compared to expression data generated by wet-lab validated qPCR assays for all protein coding genes. All methods showed high gene expression correlations with qPCR data. When comparing gene expression fold changes between MAQCA and MAQCB samples, about 85% of the genes showed consistent results between RNA-sequencing and qPCR data. Of note, each method revealed a small but specific gene set with inconsistent expression measurements. A significant proportion of these method-specific inconsistent genes were reproducibly identified in independent datasets. These genes were typically smaller, had fewer exons, and were expressed at lower levels than genes with consistent expression measurements. We propose that careful validation is warranted when evaluating RNA-seq-based expression profiles for this specific gene set.
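The fold-change comparison described above can be illustrated with a minimal sketch. The abstract does not state how "consistent" was defined, so the rule used here (the same up/down/unchanged call at an absolute log2 fold change of 1), the pseudocount, and all function names and expression values are assumptions made purely for illustration.

```python
import math

def log2_fold_change(expr_a, expr_b, pseudocount=1.0):
    """log2 ratio of sample A over sample B; the pseudocount avoids division by zero."""
    return math.log2((expr_a + pseudocount) / (expr_b + pseudocount))

def call(lfc, threshold=1.0):
    """Assumed rule: |log2FC| above the threshold counts as a change."""
    if lfc > threshold:
        return "up"
    if lfc < -threshold:
        return "down"
    return "unchanged"

def concordance(rnaseq, qpcr, threshold=1.0):
    """Fraction of shared genes whose call agrees between the two platforms.

    rnaseq, qpcr: dict mapping gene -> (expression in A, expression in B).
    Returns (fraction concordant, sorted list of discordant genes).
    """
    shared = sorted(set(rnaseq) & set(qpcr))
    discordant = [
        g for g in shared
        if call(log2_fold_change(*rnaseq[g]), threshold)
        != call(log2_fold_change(*qpcr[g]), threshold)
    ]
    return 1 - len(discordant) / len(shared), discordant

# Hypothetical expression values, not taken from the study.
rnaseq = {"g1": (100, 10), "g2": (50, 55), "g3": (5, 80), "g4": (200, 20)}
qpcr = {"g1": (90, 12), "g2": (40, 45), "g3": (6, 70), "g4": (30, 28)}
fraction, discordant = concordance(rnaseq, qpcr)
print(fraction, discordant)
```

In this toy input only `g4` receives different calls on the two platforms, which is the kind of method-specific inconsistent gene the abstract recommends validating carefully.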
Clinical genomics promises unprecedented precision in understanding the genetic basis of disease. Understanding the impact of variation across the genome is required to realize this potential. Currently, clinical genomics analyses focus on protein-coding genes. However, the noncoding genome is substantially larger than the protein-coding counterpart, and contains structural, regulatory, and transcribed information that needs to be incorporated into genome annotations if the full extent of the opportunity to use genomic information in healthcare is to be realized. This article reviews the challenges and opportunities in unlocking the clinical significance of coding and noncoding genomic information and translating its utility in practice.
Transcriptomic analyses have identified tens of thousands of intergenic, intronic, and cis-antisense long noncoding RNAs (lncRNAs) that are expressed from mammalian genomes. Despite progress in functional characterization, little is known about the post-transcriptional regulation of lncRNAs and their half-lives. Although many are easily detectable by a variety of techniques, it has been assumed that lncRNAs are generally unstable, but this has not been examined genome-wide. Utilizing a custom noncoding RNA array, we determined the half-lives of ~800 lncRNAs and ~12,000 mRNAs in the mouse Neuro-2a cell line. We find only a minority of lncRNAs are unstable. LncRNA half-lives vary over a wide range, comparable to, although on average less than, that of mRNAs, suggestive of complex metabolism and widespread functionality. Combining half-lives with comprehensive lncRNA annotations identified hundreds of unstable (half-life < 2 h) intergenic, cis-antisense, and intronic lncRNAs, as well as lncRNAs showing extreme stability (half-life > 16 h). Analysis of lncRNA features revealed that intergenic and cis-antisense RNAs are more stable than those derived from introns, as are spliced lncRNAs compared to unspliced (single exon) transcripts. Subcellular localization of lncRNAs indicated widespread trafficking to different cellular locations, with nuclear-localized lncRNAs more likely to be unstable. Surprisingly, one of the least stable lncRNAs is the well-characterized paraspeckle RNA Neat1, suggesting Neat1 instability contributes to the dynamic nature of this subnuclear domain. We have created an online interactive resource (http://stability.matticklab.com) that allows easy navigation of lncRNA and mRNA stability profiles and provides a comprehensive annotation of ~7200 mouse lncRNAs.
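As an illustration of how a half-life might be derived and then binned with the thresholds quoted above (< 2 h unstable, > 16 h extremely stable), here is a minimal sketch. It assumes first-order decay after transcription is blocked, N(t) = N0·exp(−kt), so that the half-life is ln 2 / k. The abstract does not describe the fitting procedure actually used, and the function names and time-course values are hypothetical.

```python
import math

def half_life_hours(times_h, abundances):
    """Estimate half-life by least-squares fit of ln(abundance) against time.

    Assumes simple first-order decay; returns math.inf if no decay is observed.
    """
    logs = [math.log(a) for a in abundances]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    slope = sxy / sxx  # equals -k under the decay model
    if slope >= 0:
        return math.inf
    return math.log(2) / -slope

def classify(half_life, unstable_below=2.0, stable_above=16.0):
    """Bin a half-life (hours) using the thresholds quoted in the abstract."""
    if half_life < unstable_below:
        return "unstable"
    if half_life > stable_above:
        return "extremely stable"
    return "intermediate"

# Hypothetical time course: signal halves every hour.
estimate = half_life_hours([0, 1, 2, 4], [100, 50, 25, 6.25])
print(round(estimate, 3), classify(estimate))
```

A log-linear fit is only one option; noisy array data may be better served by a weighted or nonlinear fit.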
Memory B cells (MBCs) and plasma cells (PCs) constitute the two cellular outputs of germinal center (GC) responses that together facilitate long-term humoral immunity. Although expression of the transcription factor BLIMP-1 identifies cells undergoing PC differentiation, no such marker exists for cells committed to the MBC lineage. Here, we report that the chemokine receptor CCR6 uniquely marks MBC precursors in both mouse and human GCs. CCR6+ GC B cells were highly enriched within the GC light zone (LZ), were the most quiescent of all GC B cells, exhibited a cell-surface phenotype and gene expression signature indicative of an MBC transition, and possessed the augmented response characteristics of MBCs. MBC precursors within the GC LZ predominantly possessed a low affinity for antigen but also included cells from within the high-affinity pool. These data indicate a fundamental dichotomy between the processes that drive MBC and PC differentiation during GC responses.
•Memory B cell precursors in mouse and human germinal centers are marked by CCR6
•Memory B cell precursors localize to the germinal-center light zone
•Memory B cell precursors primarily have a low affinity for antigen
•Memory B cell precursors have acquired rapid response characteristics
Although memory B cells sustain long-term humoral immunity, the nature of their precursors within the germinal center has remained elusive. Suan et al. demonstrate that these cells are uniquely identified by CCR6 expression in both mouse and human germinal centers, that they are the most quiescent B cells in these structures, and that they are generated within the light zone. Memory B cell precursors have a primarily low affinity for antigen but also include cells emerging from the high-affinity compartment.
The identification of cancer-associated long noncoding RNAs (lncRNAs) and the investigation of their molecular and biological functions are important to understand the molecular biology of cancer and its progression. Although the functions of lncRNAs and the mechanisms regulating their expression are largely unknown, recent studies are beginning to unravel their importance in human health and disease. Here, we report that a number of lncRNAs are differentially expressed in melanoma cell lines in comparison to melanocytes and keratinocyte controls. One of these lncRNAs, SPRY4-IT1 (GenBank accession ID AK024556), is derived from an intron of the SPRY4 gene and is predicted to contain several long hairpins in its secondary structure. RNA-FISH analysis showed that SPRY4-IT1 is predominantly localized in the cytoplasm of melanoma cells, and SPRY4-IT1 RNAi knockdown results in defects in cell growth, differentiation, and higher rates of apoptosis in melanoma cell lines. Differential expression of both SPRY4 and SPRY4-IT1 was also detected in vivo, in 30 distinct patient samples, classified as primary in situ, regional metastatic, distant metastatic, and nodal metastatic melanoma. The elevated expression of SPRY4-IT1 in melanoma cells compared to melanocytes, its accumulation in cell cytoplasm, and effects on cell dynamics, including increased rate of wound closure on SPRY4-IT1 overexpression, suggest that the higher expression of SPRY4-IT1 may have an important role in the molecular etiology of human melanoma.
Advanced high-throughput sequencing technologies have produced massive amounts of read data, and algorithms have been specially designed to reduce the size of these datasets for efficient storage and transmission. Reordering reads with regard to their positions in de novo assembled contigs or in explicit reference sequences has proven to be one of the most effective read compression approaches. As there is usually no good prior knowledge about the reference sequence, the current focus is on the novel construction of de novo assembled contigs.
We introduce a new de novo compression algorithm named minicom. This algorithm uses large k-minimizers to index the reads and subgroup those that have the same minimizer. Within each subgroup, a contig is constructed. Then some pairs of the contigs derived from the subgroups are merged into longer contigs according to a (w, k)-minimizer-indexed suffix-prefix overlap similarity between two contigs. This merging process is repeated after the longer contigs are formed until no pair of contigs can be merged. We compare the performance of minicom with two reference-based methods and four de novo methods on 18 datasets (13 RNA-seq datasets and 5 whole genome sequencing datasets). In the compression of single-end reads, minicom obtained the smallest file size for 22 of 34 cases with significant improvement. In the compression of paired-end reads, minicom achieved 20-80% compression gain over the best state-of-the-art algorithm. Our method also achieved a 10% size reduction of compressed files in comparison with the best algorithm under the reads-order preserving mode. This performance is mainly attributed to the exploitation of the redundancy of the repetitive substrings in the long contigs.
https://github.com/yuansliu/minicom.
Supplementary data are available at Bioinformatics online.
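The first step described for minicom, indexing reads by a k-minimizer and grouping those that share it, can be sketched as follows. This is a simplified illustration, not minicom's implementation: it takes the lexicographically smallest k-mer of each read as its minimizer and ignores reverse complements, hashing, and the later contig construction and merging stages. The function names and example reads are hypothetical.

```python
from collections import defaultdict

def minimizer(read, k):
    """Return the lexicographically smallest k-mer in the read (assumed ordering)."""
    if len(read) < k:
        raise ValueError("read is shorter than k")
    return min(read[i:i + k] for i in range(len(read) - k + 1))

def group_by_minimizer(reads, k):
    """Bucket reads that share the same minimizer; each bucket is a contig candidate."""
    groups = defaultdict(list)
    for read in reads:
        groups[minimizer(read, k)].append(read)
    return dict(groups)

# Two overlapping reads and one unrelated read (made-up sequences).
reads = ["GATTACAGG", "TTACAGGCA", "CCGTTTGGA"]
groups = group_by_minimizer(reads, k=4)
print(groups)
```

Reads that overlap substantially tend to share a minimizer, so the two overlapping reads here fall into the same bucket while the unrelated read is kept apart. Large values of k make chance collisions between unrelated reads less likely.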