Over the past decade, it has become clear that mammalian genomes encode thousands of long non-coding RNAs (lncRNAs), many of which are now implicated in diverse biological processes. Recent work studying the molecular mechanisms of several key examples, including Xist, which orchestrates X chromosome inactivation, has provided new insights into how lncRNAs can control cellular functions by acting in the nucleus. Here we discuss emerging mechanistic insights into how lncRNAs can regulate gene expression by coordinating regulatory proteins, localizing to target loci and shaping three-dimensional (3D) nuclear organization. We explore these principles to highlight biological challenges in gene regulation, in which lncRNAs are well suited to perform roles that cannot be carried out by DNA elements or protein regulators alone, such as acting as spatial amplifiers of regulatory signals in the nucleus.
Pseudouridine is the most abundant RNA modification, yet except for a few well-studied cases, little is known about the modified positions and their function(s). Here, we develop Ψ-seq for transcriptome-wide quantitative mapping of pseudouridine. We validate Ψ-seq with spike-ins and de novo identification of previously reported positions and discover hundreds of unique sites in human and yeast mRNAs and snoRNAs. Perturbing pseudouridine synthases (PUS) uncovers which pseudouridine synthase modifies each site and their target sequence features. mRNA pseudouridylation depends on both site-specific and snoRNA-guided pseudouridine synthases. Upon heat shock in yeast, Pus7p-mediated pseudouridylation is induced at >200 sites, and PUS7 deletion decreases the levels of otherwise pseudouridylated mRNA, suggesting a role in enhancing transcript stability. rRNA pseudouridine stoichiometries are conserved but reduced in cells from dyskeratosis congenita patients, where the PUS DKC1 is mutated. Our work identifies an enhanced, transcriptome-wide scope for pseudouridine and methods to dissect its underlying mechanisms and function.
• Ψ-seq for high-resolution, transcriptome-wide profiling of pseudouridine
• Many distinct sites in mRNA; dynamically regulated in heat shock
• Sites depend on conserved cognate pseudouridine synthases in yeast and human
• Reduced rRNA and TERC pseudouridine in dyskeratosis congenita patients
Transcriptome-wide pseudouridine mapping reveals extensive, dynamic pseudouridylation of mRNA and noncoding RNA in yeast and human.
Intermolecular RNA-RNA interactions are used by many noncoding RNAs (ncRNAs) to achieve their diverse functions. To identify these contacts, we developed a method based on RNA antisense purification to systematically map RNA-RNA interactions (RAP-RNA) and applied it to investigate two ncRNAs implicated in RNA processing: U1 small nuclear RNA, a component of the spliceosome, and Malat1, a large ncRNA that localizes to nuclear speckles. U1 and Malat1 interact with nascent transcripts through distinct targeting mechanisms. Using differential crosslinking, we confirmed that U1 directly hybridizes to 5′ splice sites and 5′ splice site motifs throughout introns and found that Malat1 interacts with pre-mRNAs indirectly through protein intermediates. Interactions with nascent pre-mRNAs cause U1 and Malat1 to localize proximally to chromatin at active genes, demonstrating that ncRNAs can use RNA-RNA interactions to target specific pre-mRNAs and genomic sites. RAP-RNA is sensitive to lower abundance RNAs as well, making it generally applicable for investigating ncRNAs.
• A general method to identify RNA-RNA interactions for many RNAs (>80 nucleotides)
• Distinguishes direct and indirect RNA-RNA interactions using different crosslinkers
• U1 snRNA interacts with pre-mRNAs directly, whereas Malat1 lncRNA interacts indirectly
• RNA-RNA interactions target U1 and Malat1 to chromatin at active gene loci
Comprehensive mapping of intermolecular RNA-RNA interactions for U1 snRNA and Malat1 lncRNA reveals mechanisms for targeting noncoding RNAs to chromatin at active gene loci.
The let-7 tumor suppressor microRNAs are known for their regulation of oncogenes, while the RNA-binding proteins Lin28a/b promote malignancy by inhibiting let-7 biogenesis. We have uncovered unexpected roles for the Lin28/let-7 pathway in regulating metabolism. When overexpressed in mice, both Lin28a and LIN28B promote an insulin-sensitized state that resists high-fat-diet-induced diabetes. Conversely, muscle-specific loss of Lin28a or overexpression of let-7 results in insulin resistance and impaired glucose tolerance. These phenomena occur, in part, through the let-7-mediated repression of multiple components of the insulin-PI3K-mTOR pathway, including IGF1R, INSR, and IRS2. In addition, the mTOR inhibitor, rapamycin, abrogates Lin28a-mediated insulin sensitivity and enhanced glucose uptake. Moreover, let-7 targets are enriched for genes containing SNPs associated with type 2 diabetes and control of fasting glucose in human genome-wide association studies. These data establish the Lin28/let-7 pathway as a central regulator of mammalian glucose metabolism.
► Lin28a/b promote glucose tolerance and insulin sensitivity in mice
► Overexpression of let-7 microRNA impairs glucose tolerance in mice
► Lin28a/b promote and the let-7s repress components of insulin-PI3K-mTOR signaling
► Let-7 targets are enriched for type II diabetes-associated SNPs in human GWAS
The microRNA, let-7, represses mTOR pathway components, contributing to the diabetic phenotypes of insulin resistance and impaired glucose tolerance in mice and humans.
Mammalian genomes are pervasively transcribed to produce thousands of long non-coding RNAs (lncRNAs). A few of these lncRNAs have been shown to recruit regulatory complexes through RNA-protein interactions to influence the expression of nearby genes, and it has been suggested that many other lncRNAs can also act as local regulators. Such local functions could explain the observation that lncRNA expression is often correlated with the expression of nearby genes. However, these correlations have been challenging to dissect and could alternatively result from processes that are not mediated by the lncRNA transcripts themselves. For example, some gene promoters have been proposed to have dual functions as enhancers, and the process of transcription itself may contribute to gene regulation by recruiting activating factors or remodelling nucleosomes. Here we use genetic manipulation in mouse cell lines to dissect 12 genomic loci that produce lncRNAs and find that 5 of these loci influence the expression of a neighbouring gene in cis. Notably, none of these effects requires the specific lncRNA transcripts themselves and instead involves general processes associated with their production, including enhancer-like activity of gene promoters, the process of transcription, and the splicing of the transcript. Furthermore, such effects are not limited to lncRNA loci: we find that four out of six protein-coding loci also influence the expression of a neighbour. These results demonstrate that cross-talk among neighbouring genes is a prevalent phenomenon that can involve multiple mechanisms and cis-regulatory signals, including a role for RNA splice sites. These mechanisms may explain the function and evolution of some genomic loci that produce lncRNAs and broadly contribute to the regulation of both coding and non-coding genes.
Enhancer elements in the human genome control how genes are expressed in specific cell types and harbor thousands of genetic variants that influence risk for common diseases. Yet, we still do not know how enhancers regulate specific genes, and we lack general rules to predict enhancer-gene connections across cell types. We developed an experimental approach, CRISPRi-FlowFISH, to perturb enhancers in the genome, and we applied it to test >3,500 potential enhancer-gene connections for 30 genes. We found that a simple activity-by-contact model substantially outperformed previous methods at predicting the complex connections in our CRISPR dataset. This activity-by-contact model allows us to construct genome-wide maps of enhancer-gene connections in a given cell type, on the basis of chromatin state measurements. Together, CRISPRi-FlowFISH and the activity-by-contact model provide a systematic approach to map and predict which enhancers regulate which genes, and will help to interpret the functions of the thousands of disease risk variants in the noncoding genome.
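The abstract names the activity-by-contact model without giving its form. As a minimal sketch, assume each candidate element's score for a gene is its activity multiplied by its contact frequency with the gene's promoter, normalized by the sum of that product over all candidate elements near the gene. The function name `abc_scores`, the element labels, and the toy numbers are hypothetical and chosen only for illustration.

```python
from typing import Dict


def abc_scores(activity: Dict[str, float], contact: Dict[str, float]) -> Dict[str, float]:
    """Return each element's share of total activity-times-contact for one gene.

    activity: per-element activity (assumed here to be a chromatin-state signal).
    contact:  per-element contact frequency with the gene's promoter.
    Both dictionaries are assumed to share the same element keys.
    """
    products = {element: activity[element] * contact[element] for element in activity}
    total = sum(products.values())
    if total == 0:
        # No element has both activity and contact; report zero for all.
        return {element: 0.0 for element in products}
    return {element: product / total for element, product in products.items()}


# Hypothetical toy data: three candidate elements for a single gene.
activity = {"e1": 10.0, "e2": 5.0, "e3": 1.0}
contact = {"e1": 0.02, "e2": 0.10, "e3": 0.50}
scores = abc_scores(activity, contact)

for element, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{element}: {score:.3f}")
```

In this toy case the most active element (e1) scores lowest because it rarely contacts the promoter, which illustrates why a model combining both quantities can behave differently from one based on activity or distance alone.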
Many large noncoding RNAs (lncRNAs) regulate chromatin, but the mechanisms by which they localize to genomic targets remain unexplored. We investigated the localization mechanisms of the Xist lncRNA during X-chromosome inactivation (XCI), a paradigm of lncRNA-mediated chromatin regulation. During the maintenance of XCI, Xist binds broadly across the X chromosome. During initiation of XCI, Xist initially transfers to distal regions across the X chromosome that are not defined by specific sequences. Instead, Xist identifies these regions by exploiting the three-dimensional conformation of the X chromosome. Xist requires its silencing domain to spread across actively transcribed regions and thereby access the entire chromosome. These findings suggest a model in which Xist coats the X chromosome by searching in three dimensions, modifying chromosome structure, and spreading to newly accessible locations.
The human genome contains thousands of long non-coding RNAs, but specific biological functions and biochemical mechanisms have been discovered for only about a dozen. A specific long non-coding RNA, non-coding RNA activated by DNA damage (NORAD), has recently been shown to be required for maintaining genomic stability, but its molecular mechanism is unknown. Here we combine RNA antisense purification and quantitative mass spectrometry to identify proteins that directly interact with NORAD in living cells. We show that NORAD interacts with proteins involved in DNA replication and repair in steady-state cells and localizes to the nucleus upon stimulation with replication stress or DNA damage. In particular, NORAD interacts with RBMX, a component of the DNA-damage response, and contains the strongest RBMX-binding site in the transcriptome. We demonstrate that NORAD controls the ability of RBMX to assemble a ribonucleoprotein complex, which we term NORAD-activated ribonucleoprotein complex 1 (NARC1), that contains the known suppressors of genomic instability topoisomerase I (TOP1), ALYREF and the PRPF19-CDC5L complex. Cells depleted for NORAD or RBMX display an increased frequency of chromosome segregation defects, reduced replication-fork velocity and altered cell-cycle progression, which represent phenotypes that are mechanistically linked to TOP1 and PRPF19-CDC5L function. Expression of NORAD in trans can rescue defects caused by NORAD depletion, but rescue is significantly impaired when the RBMX-binding site in NORAD is deleted. Our results demonstrate that the interaction between NORAD and RBMX is important for NORAD function, and that NORAD is required for the assembly of the previously unknown topoisomerase complex NARC1, which contributes to maintaining genomic stability. In addition, we uncover a previously unknown function for long non-coding RNAs in modulating the ability of an RNA-binding protein to assemble a higher-order ribonucleoprotein complex.
Chromosomal translocations are frequent features of cancer genomes that contribute to disease progression. These rearrangements result from formation and illegitimate repair of DNA double-strand breaks (DSBs), a process that requires spatial colocalization of chromosomal breakpoints. The "contact first" hypothesis suggests that translocation partners colocalize in the nuclei of normal cells, prior to rearrangement. It is unclear, however, to what extent spatial interactions based on three-dimensional genome architecture contribute to chromosomal rearrangements in human disease. Here we intersect Hi-C maps of three-dimensional chromosome conformation with collections of 1,533 chromosomal translocations from cancer and germline genomes. We show that many translocation-prone pairs of regions genome-wide, including the cancer translocation partners BCR-ABL and MYC-IGH, display elevated Hi-C contact frequencies in normal human cells. Considering tissue specificity, we find that translocation breakpoints reported in human hematologic malignancies have higher Hi-C contact frequencies in lymphoid cells than those reported in sarcomas and epithelial tumors. However, translocations from multiple tissue types show significant correlation with Hi-C contact frequencies, suggesting that both tissue-specific and universal features of chromatin structure contribute to chromosomal alterations. Our results demonstrate that three-dimensional genome architecture shapes the landscape of rearrangements directly observed in human disease and establish Hi-C as a key method for dissecting these effects.
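One way to picture intersecting Hi-C maps with translocation breakpoints is a permutation test: compare the mean contact frequency of the observed breakpoint pairs with that of randomly drawn bin pairs. The sketch below is an illustration under simplifying assumptions, not the analysis used in the study; the function `contact_enrichment`, the toy 6-bin matrix, and the breakpoint pairs are all hypothetical, and the null model ignores genomic distance and chromosome membership, which a real analysis would need to control for.

```python
import random
from statistics import mean


def contact_enrichment(contacts, pairs, n_perm=1000, seed=0):
    """Compare mean contact frequency of observed bin pairs to random bin pairs.

    contacts: square matrix (list of lists) of contact frequencies between bins.
    pairs:    list of (i, j) bin indices for the observed breakpoint pairs.
    Returns the observed mean and a one-sided empirical p-value.
    """
    rng = random.Random(seed)
    n_bins = len(contacts)
    observed = mean(contacts[i][j] for i, j in pairs)
    hits = 0
    for _ in range(n_perm):
        # Draw as many random off-diagonal bin pairs as there are observed pairs.
        null_pairs = []
        while len(null_pairs) < len(pairs):
            i, j = rng.randrange(n_bins), rng.randrange(n_bins)
            if i != j:
                null_pairs.append((i, j))
        if mean(contacts[i][j] for i, j in null_pairs) >= observed:
            hits += 1
    # Add-one correction keeps the empirical p-value strictly positive.
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value


# Hypothetical toy matrix: uniform background with two high-contact bin pairs.
contacts = [[1.0] * 6 for _ in range(6)]
breakpoint_pairs = [(0, 3), (1, 4)]
for i, j in breakpoint_pairs:
    contacts[i][j] = contacts[j][i] = 9.0

observed, p_value = contact_enrichment(contacts, breakpoint_pairs)
print(f"observed mean contact: {observed}, empirical p-value: {p_value:.3f}")
```

A fixed seed is used so the toy result is reproducible; with real data, the choice of null pairs (for example, matched for distance or restricted to inter-chromosomal pairs) largely determines what the p-value means.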
Genomic analysis of tumours has led to the identification of hundreds of cancer genes on the basis of the presence of mutations in protein-coding regions. By contrast, much less is known about cancer-causing mutations in non-coding regions. Here we perform deep sequencing in 360 primary breast cancers and develop computational methods to identify significantly mutated promoters. Clear signals are found in the promoters of three genes. FOXA1, a known driver of hormone-receptor positive breast cancer, harbours a mutational hotspot in its promoter leading to overexpression through increased E2F binding. RMRP and NEAT1, two non-coding RNA genes, carry mutations that affect protein binding to their promoters and alter expression levels. Our study shows that promoter regions harbour recurrent mutations in cancer with functional consequences and that the mutations occur at frequencies similar to those in coding regions. Power analyses indicate that more such regions remain to be discovered through deep sequencing of adequately sized cohorts of patients.