The proper activities of enhancers and gene promoters are essential for coordinated transcription within a cell. Although diverse methodologies have been developed to identify enhancers and promoters, most have tacitly assumed that these elements are distinct. However, studies have unexpectedly shown that regulatory elements may have both enhancer and promoter functions. Here we review these results, focusing on the factors that determine the promoter and/or enhancer activity of regulatory elements. We discuss emerging models that define regulatory elements by accessible DNA and their non-mutually-exclusive abilities to drive transcription initiation (promoter activity) and/or to enhance transcription at other such regions (enhancer activity).
Alternative usage of transcript isoforms from the same gene has been hypothesized to be an important feature of cancers. However, differential usage of gene transcripts between conditions (isoform switching) has not been comprehensively characterized in and across cancer types. To this end, we developed methods for identification and visualization of isoform switches with predicted functional consequences. Using these methods, we characterized isoform switching in RNA-seq data from >5,500 cancer patients covering 12 solid cancer types. Isoform switches with potential functional consequences were common, affecting approximately 19% of multiple-transcript genes. Among these, isoform switches leading to loss of DNA sequence encoding protein domains were more frequent than expected, particularly in pancancer switches. We identified several isoform switches as powerful biomarkers: 31 switches were highly predictive of patient survival independent of cancer type. Our data constitute an important resource for cancer researchers, available through interactive web tools. Moreover, our methods, available as an R package, enable systematic analysis of isoform switches from other RNA-seq datasets.
This study indicates that isoform switches with predicted functional consequences are common and important in dysfunctional cells, which in turn means that gene expression should be analyzed at the isoform level.
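The notion of differential isoform usage between conditions can be made concrete with a small sketch. The abstract does not say how usage is quantified, so the code below assumes one common convention: an isoform fraction (isoform expression divided by the total expression of its parent gene) and the difference in that fraction between two conditions. The function names, the `min_dif` cutoff and the toy expression values are all hypothetical, and a real analysis would add replicate-aware statistical testing.

```python
from collections import defaultdict


def isoform_fractions(expression, isoform_to_gene):
    """Return isoform fraction: isoform expression / total expression of its gene."""
    gene_totals = defaultdict(float)
    for isoform, value in expression.items():
        gene_totals[isoform_to_gene[isoform]] += value
    fractions = {}
    for isoform, value in expression.items():
        total = gene_totals[isoform_to_gene[isoform]]
        # Usage is undefined for unexpressed genes; this sketch reports 0.0.
        fractions[isoform] = value / total if total > 0 else 0.0
    return fractions


def candidate_switches(cond_a, cond_b, isoform_to_gene, min_dif=0.1):
    """List (gene, isoform, change) where usage shifts by at least min_dif (assumed cutoff)."""
    if_a = isoform_fractions(cond_a, isoform_to_gene)
    if_b = isoform_fractions(cond_b, isoform_to_gene)
    switches = []
    for isoform in cond_a:
        dif = if_b[isoform] - if_a[isoform]
        if abs(dif) >= min_dif:
            switches.append((isoform_to_gene[isoform], isoform, round(dif, 3)))
    return sorted(switches)


# Toy, invented expression values (arbitrary units) for two conditions.
isoform_to_gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB", "tx4": "geneB"}
normal = {"tx1": 80.0, "tx2": 20.0, "tx3": 50.0, "tx4": 50.0}
tumor = {"tx1": 30.0, "tx2": 70.0, "tx3": 100.0, "tx4": 100.0}

switches = candidate_switches(normal, tumor, isoform_to_gene)
print(switches)
```

In this toy input, geneB doubles in overall expression but keeps the same isoform proportions, so it is not reported; only geneA, whose dominant isoform changes, is flagged. This is the distinction between gene-level and isoform-level analysis that the statement above draws.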
Abstract
Summary
Alternative splicing is an important mechanism involved in health and disease. Recent work highlights the importance of investigating genome-wide changes in splicing patterns and the subsequent functional consequences. Current computational methods only support such analysis on a gene-by-gene basis. Therefore, we extended the IsoformSwitchAnalyzeR R library to enable analysis of genome-wide changes in specific types of alternative splicing and predicted functional consequences of the resulting isoform switches. As a case study, we analyzed RNA-seq data from The Cancer Genome Atlas and found systematic changes in alternative splicing and the consequences of the associated isoform switches.
Availability and implementation
Windows, Linux and Mac OS: http://bioconductor.org/packages/IsoformSwitchAnalyzeR.
Supplementary information
Supplementary data are available at Bioinformatics online.
Highlights
• Recent results challenge the notion that promoters and enhancers are distinct entities.
• Enhancers can independently work as promoters.
• Gene promoters can have enhancer activity.
• The primary function of a regulatory element is context dependent.
• We propose a unified model of regulatory elements.
JASPAR (http://jaspar.genereg.net/) is an open-access database containing manually curated, non-redundant transcription factor (TF) binding profiles for TFs across six taxonomic groups. In this 9th release, we expanded the CORE collection with 341 new profiles (148 for plants, 101 for vertebrates, 85 for urochordates, and 7 for insects), which corresponds to a 19% expansion over the previous release. We added 298 new profiles to the Unvalidated collection when no orthogonal evidence was found in the literature. All the profiles were clustered to provide familial binding profiles for each taxonomic group. Moreover, we revised the structural classification of DNA binding domains to consider plant-specific TFs. This release introduces word clouds to represent the scientific knowledge associated with each TF. We updated the genome tracks of TFBSs predicted with JASPAR profiles in eight organisms; the human and mouse TFBS predictions can be visualized as native tracks in the UCSC Genome Browser. Finally, we provide a new tool to perform JASPAR TFBS enrichment analysis in user-provided genomic regions. All the data is accessible through the JASPAR website, its associated RESTful API, the R/Bioconductor data package, and a new Python package, pyJASPAR, that facilitates serverless access to the data.
JASPAR (http://jaspar.genereg.net) is an open-access database of curated, non-redundant transcription factor (TF)-binding profiles stored as position frequency matrices (PFMs) for TFs across multiple species in six taxonomic groups. In this 8th release of JASPAR, the CORE collection has been expanded with 245 new PFMs (169 for vertebrates, 42 for plants, 17 for nematodes, 10 for insects, and 7 for fungi), and 156 PFMs were updated (125 for vertebrates, 28 for plants and 3 for insects). These new profiles represent an 18% expansion compared to the previous release. JASPAR 2020 comes with a novel collection of unvalidated TF-binding profiles for which our curators did not find orthogonal supporting evidence in the literature. This collection has a dedicated web form to engage the community in the curation of unvalidated TF-binding profiles. Moreover, we created a Q&A forum to ease the communication between the user community and JASPAR curators. Finally, we updated the genomic tracks, inference tool, and TF-binding profile similarity clusters. All the data is available through the JASPAR website, its associated RESTful API, and through the JASPAR2020 R/Bioconductor package.
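Because the profiles are stored as position frequency matrices, a common downstream step is to convert the counts into log-odds weights and scan a sequence for the best-scoring site. The following is a minimal sketch of that general technique, not JASPAR's own code. The four-column count matrix is invented for illustration, and the uniform background (0.25 per base) and pseudocount of 0.5 are assumptions.

```python
import math

BASES = "ACGT"


def pfm_to_pwm(pfm, pseudocount=0.5, background=0.25):
    """Convert per-base count rows into log2-odds weights, one dict per position."""
    length = len(pfm["A"])
    pwm = []
    for pos in range(length):
        column_total = sum(pfm[base][pos] for base in BASES)
        weights = {}
        for base in BASES:
            # The pseudocount avoids log(0) for bases never observed at this position.
            prob = (pfm[base][pos] + pseudocount) / (column_total + 4 * pseudocount)
            weights[base] = math.log2(prob / background)
        pwm.append(weights)
    return pwm


def score_site(pwm, site):
    """Sum the position-specific weights for a site as long as the motif."""
    return sum(column[base] for column, base in zip(pwm, site))


def best_hit(pwm, sequence):
    """Return (score, start) of the best window; assumes len(sequence) >= motif width."""
    width = len(pwm)
    windows = range(len(sequence) - width + 1)
    return max((score_site(pwm, sequence[i:i + width]), i) for i in windows)


# Hypothetical counts (each column sums to 20); the consensus is TGCA.
toy_pfm = {
    "A": [1, 0, 0, 18],
    "C": [1, 1, 19, 1],
    "G": [2, 18, 0, 0],
    "T": [16, 1, 1, 1],
}
toy_pwm = pfm_to_pwm(toy_pfm)
score, start = best_hit(toy_pwm, "AATGCATT")
print(start, round(score, 2))
```

The scan here covers only the forward strand; a practical scanner would also score the reverse complement and apply a score threshold rather than keeping a single best window.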
JASPAR (http://jaspar.genereg.net) is an open-access database of curated, non-redundant transcription factor (TF)-binding profiles stored as position frequency matrices (PFMs) and TF flexible models (TFFMs) for TFs across multiple species in six taxonomic groups. In the 2018 release of JASPAR, the CORE collection has been expanded with 322 new PFMs (60 for vertebrates and 262 for plants) and 33 PFMs were updated (24 for vertebrates, 8 for plants and 1 for insects). These new profiles represent a 30% expansion compared to the 2016 release. In addition, we have introduced 316 TFFMs (95 for vertebrates, 218 for plants and 3 for insects). This release incorporates clusters of similar PFMs in each taxon and each TF class per taxon. The JASPAR 2018 CORE vertebrate collection of PFMs was used to predict TF-binding sites in the human genome. The predictions are made available to the scientific community through a UCSC Genome Browser track data hub. Finally, this update comes with a new web framework with an interactive and responsive user-interface, along with new features. All the underlying data can be retrieved programmatically using a RESTful API and through the JASPAR 2018 R/Bioconductor package.
The quality of gene annotation determines the interpretation of results obtained in transcriptomic studies. The growing amount of genome sequence information calls for experimental and computational pipelines for de novo transcriptome annotation. Ideally, gene and transcript models should be called from a limited set of key experimental data. We developed TranscriptomeReconstructoR, an R package which implements a pipeline for automated transcriptome annotation. It relies on integrating features from independent and complementary datasets: (i) full-length RNA-seq for detection of splicing patterns and (ii) high-throughput 5' and 3' tag sequencing data for accurate definition of gene borders. The pipeline can also take a nascent RNA-seq dataset to supplement the called gene model with transient transcripts. Our proof-of-concept data suggest a cost-efficient strategy for rapid and accurate annotation of complex eukaryotic transcriptomes. We combine the choice of library preparation methods and sequencing platforms with the dedicated computational pipeline implemented in the TranscriptomeReconstructoR package. The pipeline only requires prior knowledge of the reference genomic DNA sequence, but not the transcriptome. The package seamlessly integrates with Bioconductor packages for downstream analysis.
JASPAR (http://jaspar.genereg.net) is an open-access database storing curated, non-redundant transcription factor (TF) binding profiles representing transcription factor binding preferences as position frequency matrices for multiple species in six taxonomic groups. For this 2016 release, we expanded the JASPAR CORE collection with 494 new TF binding profiles (315 in vertebrates, 11 in nematodes, 3 in insects, 1 in fungi and 164 in plants) and updated 59 profiles (58 in vertebrates and 1 in fungi). The introduced profiles represent an 83% expansion and 10% update when compared to the previous release. We updated the structural annotation of the TF DNA binding domains (DBDs) following a published hierarchical structural classification. In addition, we introduced 130 transcription factor flexible models trained on ChIP-seq data for vertebrates, which capture dinucleotide dependencies within TF binding sites. This new JASPAR release is accompanied by a new web tool to infer JASPAR TF binding profiles recognized by a given TF protein sequence. Moreover, we provide the users with a Ruby module complementing the JASPAR API to ease programmatic access and use of the JASPAR collection of profiles. Finally, we provide the JASPAR2016 R/Bioconductor data package with the data of this release.
In animals, RNA polymerase II initiates transcription bidirectionally from gene promoters to produce pre-mRNAs on the forward strand and promoter upstream transcripts (PROMPTs) on the reverse strand. PROMPTs are degraded by the nuclear exosome. Previous studies based on nascent RNA approaches concluded that Arabidopsis does not produce PROMPTs. Here, we used steady-state RNA sequencing in mutants defective in nuclear RNA decay including the exosome to reassess the existence of Arabidopsis PROMPTs. While they are rare, we identified ∼100 cases of exosome-sensitive PROMPTs in Arabidopsis. Such PROMPTs are sources of small interfering RNAs in exosome-deficient mutants, perhaps explaining why plants have evolved mechanisms to suppress PROMPTs. In addition, we found ∼200 long, unspliced and exosome-sensitive antisense RNAs that arise from transcription start sites within parts of the genome encoding 3'-untranslated regions on the sense strand. The previously characterized noncoding RNA that regulates expression of the key seed dormancy regulator is a typical representative of this class of RNAs. Transcription factor genes are overrepresented among loci with exosome-sensitive antisense RNAs, suggesting a potential for widespread control of gene expression via this class of noncoding RNAs. Lastly, we assess the use of alternative promoters in Arabidopsis and compare the accuracy of existing TSS annotations.