Recent advances in RNA-sequencing technologies have led to the discovery of thousands of previously unannotated noncoding transcripts, including many long noncoding RNAs (lncRNAs) whose functions remain largely unknown. Here we discuss considerations and best practices in lncRNA identification and annotation, which we hope will foster functional and mechanistic exploration.
We have known for decades that long noncoding RNAs (lncRNAs) can perform essential functions across most forms of life. The maintenance of chromosome length requires an lncRNA (e.g., hTERC), and two lncRNAs in the ribosome are required for protein synthesis. Thus, lncRNAs can represent powerful RNA machines. More recently, it has become clear that mammalian genomes encode thousands more lncRNAs. Thus, we raise the question: Which, if any, of these lncRNAs could also represent RNA-based machines? Here we synthesize studies that are beginning to address this question by investigating fundamental properties of lncRNA genes, revealing new insights into the RNA structure-function relationship, determining cis- and trans-acting lncRNAs in vivo, and generating new developments in high-throughput screening used to identify functional lncRNAs. Overall, these findings provide a context toward understanding the molecular grammar underlying lncRNA biology.
There is growing evidence that transcription and nuclear organization are tightly linked. Yet, whether transcription of thousands of long noncoding RNAs (lncRNAs) could play a role in this packaging process remains elusive. Although some lncRNAs have been found to have clear roles in nuclear architecture (e.g., FIRRE, NEAT1, XIST, and others), the vast majority remain poorly understood. In this Perspective, we highlight how the act of transcription can affect nuclear architecture. We synthesize several recent findings into a proposed model where the transcription of lncRNAs can serve as guide-posts for shaping genome organization. This model is similar to the game “cat’s cradle,” where the shape of a string is successively changed by opening up new sites for finger placement. Analogously, transcription of lncRNAs could serve as “grip holds” for nuclear proteins to pull the genome into new positions. This model could explain general lncRNA properties such as low abundance and tissue specificity. Overall, we propose a general framework for how the act of lncRNA transcription could play a role in organizing the 3D genome.
While some abundant lncRNAs have direct, defined functions, there are tens of thousands of lncRNAs of unknown function. Melé and Rinn propose a model in which transcription of lncRNAs generates guide-posts to shape 3D genome organization that may explain why the majority of lncRNAs are expressed at low levels and in specific tissues.
The central dogma of gene expression is that DNA is transcribed into messenger RNAs, which in turn serve as the template for protein synthesis. The discovery of extensive transcription of large RNA transcripts that do not code for proteins, termed long noncoding RNAs (lncRNAs), provides an important new perspective on the centrality of RNA in gene regulation. Here, we discuss genome-scale strategies to discover and characterize lncRNAs. An emerging theme from multiple model systems is that lncRNAs form extensive networks of ribonucleoprotein (RNP) complexes with numerous chromatin regulators and then target these enzymatic activities to appropriate locations in the genome. Consistent with this notion, lncRNAs can function as modular scaffolds to specify higher-order organization in RNP complexes and in chromatin states. The importance of these modes of regulation is underscored by the newly recognized roles of long RNAs for proper gene control across all kingdoms of life.
It is clear that RNA has a diverse set of functions and is more than just a messenger between gene and protein. The mammalian genome is extensively transcribed, giving rise to thousands of non-coding transcripts. Whether all of these transcripts are functional is debated, but it is evident that there are many functional large non-coding RNAs (ncRNAs). Recent studies have begun to explore the functional diversity and mechanistic role of these large ncRNAs. Here we synthesize these studies to provide an emerging model whereby large ncRNAs might achieve regulatory specificity through modularity, assembling diverse combinations of proteins and possibly RNA and DNA interactions.
Only relatively recently has it become clear that mammalian genomes encode tens of thousands of long non-coding RNAs (lncRNAs). A striking 40% of these are expressed specifically in the brain, where they show precisely regulated temporal and spatial expression patterns. This raises the question: what is the functional role of these many lncRNA transcripts in the brain? Here we canvass a growing number of mechanistic studies that have elucidated central roles for lncRNAs in the regulation of nervous system development and function. We also survey studies indicating that neurological and psychiatric disorders may ensue when these mechanisms break down. Finally, we synthesize these insights with evidence from comparative genomics to argue that lncRNAs may have played important roles in brain evolution, by virtue of their abundant sequence innovation in mammals and plausible mechanistic connections to the adaptive processes that occurred recently in the primate and human lineages.
Thousands of lncRNAs exhibit precise spatiotemporal expression in the nervous system. Briggs et al. canvass emerging studies showing the mechanistic importance of lncRNAs in brain development, function, and disease. They synthesize these functions with comparative genomic evidence to implicate lncRNAs in brain evolution.
Cellular homeostasis is achieved by the proper balance of regulatory networks that, if disrupted, can lead to cellular transformation. These cell circuits are fine-tuned and maintained by the coordinated function of proteins and non-coding RNAs (ncRNAs). In addition to the well-characterized protein-coding and microRNA constituents, large ncRNAs are also emerging as important regulatory molecules in tumor-suppressor and oncogenic pathways. Recent studies have revealed mechanistic insight into how large ncRNAs regulate key cancer pathways at the transcriptional, post-transcriptional and epigenetic levels. Here we synthesize these latest advances within the context of their mechanistic roles in regulating and maintaining cellular equilibrium. We posit that, similar to protein-coding genes, large ncRNAs are a newly emerging class of oncogenic and tumor-suppressor genes. Our growing knowledge of the role of large ncRNAs in cellular transformation is pointing towards their potential use as biomarkers and targets for novel therapeutic approaches in the future.
The biochemistry of RNA-Seq library preparation results in cDNA fragments that are not uniformly distributed within the transcripts they represent. This non-uniformity must be accounted for when estimating expression levels, and we show how to perform the needed corrections using a likelihood-based approach. We find improvements in expression estimates as measured by correlation with independently performed qRT-PCR and show that correction of bias leads to improved replicability of results across libraries and sequencing technologies.
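As a rough illustration of why non-uniform fragment sampling matters, the sketch below assumes the simplest possible case: every fragment maps unambiguously to one transcript, and each transcript has a known per-position sampling weight. Under those assumptions the maximum-likelihood relative abundance is proportional to the fragment count divided by a bias-adjusted effective length (the sum of the weights). The function names, the weights, and the counts are hypothetical and are not taken from the paper, whose model is more general.

```python
def effective_length(bias_weights):
    """Bias-adjusted effective length: the sum of per-start-position weights.

    With uniform sampling every weight is 1.0 and this reduces to the
    number of possible fragment start positions.
    """
    return sum(bias_weights)


def relative_abundances(counts, bias_weights_by_transcript):
    """Estimate relative abundances from unambiguously assigned fragments.

    Assumption (hypothetical, simplified): the chance of sampling a fragment
    from transcript t is proportional to abundance_t * effective_length_t,
    so the maximum-likelihood abundance is proportional to
    count_t / effective_length_t.
    """
    rates = {
        tid: count / effective_length(bias_weights_by_transcript[tid])
        for tid, count in counts.items()
    }
    total = sum(rates.values())
    return {tid: rate / total for tid, rate in rates.items()}


# Toy data: transcript B is sampled half as efficiently as transcript A.
counts = {"A": 100, "B": 50}
uniform = {"A": [1.0] * 4, "B": [1.0] * 4}
biased = {"A": [1.0] * 4, "B": [0.5] * 4}

uncorrected = relative_abundances(counts, uniform)
corrected = relative_abundances(counts, biased)
```

In this toy case the uncorrected estimate assigns transcript A two thirds of the total, whereas accounting for the lower sampling efficiency of B yields equal abundances for the two transcripts.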
Genome-wide sequencing has led to the discovery of thousands of long non-coding RNA (lncRNA) loci in the human genome, but evidence of functional significance has remained controversial for many lncRNAs. Genetically engineered model organisms are considered the gold standard for linking genotype to phenotype. Recent advances in CRISPR-Cas genome editing have led to a rapid increase in the use of mouse models to more readily survey lncRNAs for functional significance. Here, we review strategies to investigate the physiological relevance of lncRNA loci by highlighting studies that have used genetic mouse models to reveal key in vivo roles for lncRNAs, from fertility to brain development. We illustrate how an investigative approach, starting with whole-gene deletion followed by transcription termination and/or transgene rescue strategies, can provide definitive evidence for the in vivo function of mammalian lncRNAs.
The complex language of eukaryotic gene expression remains incompletely understood. Despite the importance suggested by many noncoding variants statistically associated with human disease, nearly all such variants have unknown mechanisms. Here, we address this challenge using an approach based on a recent machine learning advance: deep convolutional neural networks (CNNs). We introduce the open source package Basset to apply CNNs to learn the functional activity of DNA sequences from genomics data. We trained Basset on a compendium of accessible genomic sites mapped in 164 cell types by DNase-seq, and demonstrate greater predictive accuracy than previous methods. Basset predictions for the change in accessibility between variant alleles were far greater for genome-wide association study (GWAS) SNPs that are likely to be causal relative to nearby SNPs in linkage disequilibrium with them. With Basset, a researcher can perform a single sequencing assay in their cell type of interest and simultaneously learn that cell's chromatin accessibility code and annotate every mutation in the genome with its influence on present accessibility and latent potential for accessibility. Thus, Basset offers a powerful computational approach to annotate and interpret the noncoding genome.
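The idea of scoring a variant by the change in predicted activity between alleles can be sketched with a toy stand-in for a trained network. The example below is not Basset's API and uses no trained model: it assumes a single hand-written convolutional filter, applies it to one-hot encoded DNA with a ReLU and max pooling, and reports the alternate-minus-reference difference. All names, the filter, and the sequences are hypothetical.

```python
BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows (A, C, G, T).

    Any other character (for example N) becomes an all-zero row.
    """
    return [[1.0 if b == base else 0.0 for base in BASES] for b in seq.upper()]


def conv_max_score(seq, filt):
    """Slide one filter (width x 4) along the sequence and max-pool.

    Starting the running maximum at 0.0 acts as a ReLU on each activation.
    """
    x = one_hot(seq)
    width = len(filt)
    best = 0.0
    for start in range(len(x) - width + 1):
        activation = sum(
            x[start + i][j] * filt[i][j] for i in range(width) for j in range(4)
        )
        best = max(best, activation)
    return best


def variant_effect(ref_seq, pos, alt_base, filt):
    """Score difference (alt minus ref) for a single-base substitution.

    `pos` is a 0-based index into `ref_seq`.
    """
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    return conv_max_score(alt_seq, filt) - conv_max_score(ref_seq, filt)


# Hypothetical filter that responds most strongly to the 3-mer "TGA".
tga_filter = [
    [0.0, 0.0, 0.0, 1.0],  # position 1 prefers T
    [0.0, 0.0, 1.0, 0.0],  # position 2 prefers G
    [1.0, 0.0, 0.0, 0.0],  # position 3 prefers A
]

# Substituting C with A at index 4 creates a full "TGA" match.
effect = variant_effect("AATGCAA", 4, "A", tga_filter)
```

A trained CNN would use many learned filters and further layers, but the alternate-versus-reference comparison has the same shape: run the model on both alleles in their sequence context and take the difference in predicted accessibility.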