Abstract
The FANTOM web resource (http://fantom.gsc.riken.jp/) was developed to provide easy access to the data produced by the FANTOM project. It contains the most complete and comprehensive sets of actively transcribed enhancers and promoters in the human and mouse genomes. We determined the transcription activities of these regulatory elements by CAGE (Cap Analysis of Gene Expression) for both steady and dynamic cellular states in all major and some rare cell types, consecutive stages of differentiation and responses to stimuli. We have expanded the resource by employing different assays, such as RNA-seq, short RNA-seq and a paired-end protocol for CAGE (CAGEscan), to provide new angles to study the transcriptome. That yielded additional atlases of long noncoding RNAs, miRNAs and their promoters. We have also expanded the CAGE analysis to cover rat, dog, chicken, and macaque species for a limited number of cell types. The CAGE data obtained from human and mouse were reprocessed to make them available on the latest genome assemblies. Here, we report the recent updates of both data and interfaces in the FANTOM web resource.
The FANTOM5 project investigates transcription initiation activities in more than 1,000 human and mouse primary cells, cell lines and tissues using CAGE. Based on manual curation of sample information and development of an ontology for sample classification, we assemble the resulting data into a centralized data resource (http://fantom.gsc.riken.jp/5/). This resource contains web-based tools and data-access points for the research community to search and extract data related to samples, genes, promoter activities, transcription factors and enhancers across the FANTOM5 atlas.
Next-generation sequencing experiments have shown that microRNAs (miRNAs) are expressed in many different isoforms (isomiRs), whose biological relevance is often unclear. We found that mature miR-21, the most widely researched miRNA because of its importance in human disease, is produced in two prevalent isomiR forms that differ by 1 nt at their 3′ end, and moreover that the 3′ end of miR-21 is posttranscriptionally adenylated by the noncanonical poly(A) polymerase PAPD5. PAPD5 knockdown caused an increase in the miR-21 expression level, suggesting that PAPD5-mediated adenylation of miR-21 leads to its degradation. Exoribonuclease knockdown experiments followed by small-RNA sequencing suggested that PARN degrades miR-21 in the 3′-to-5′ direction. In accordance with this model, microarray expression profiling demonstrated that PAPD5 knockdown results in a down-regulation of miR-21 target mRNAs. We found that disruption of the miR-21 adenylation and degradation pathway is a general feature in tumors across a wide range of tissues, as evidenced by data from The Cancer Genome Atlas, as well as in the noncancerous proliferative disease psoriasis. We conclude that PAPD5 and PARN mediate degradation of oncogenic miRNA miR-21 through a tailing and trimming process, and that this pathway is disrupted in cancer and other proliferative diseases.
Lung cancer is the leading cause of cancer-related deaths worldwide. The majority of cancer driver mutations have been identified; however, relevant epigenetic regulation involved in tumorigenesis has only been fragmentarily analyzed. Epigenetically regulated genes have a great theranostic potential, especially in tumors with no apparent driver mutations. Here, epigenetically regulated genes were identified in lung cancer by an integrative analysis of promoter-level expression profiles from Cap Analysis of Gene Expression (CAGE) of 16 non-small cell lung cancer (NSCLC) cell lines and 16 normal lung primary cell specimens with DNA methylation data of 69 NSCLC cell lines and 6 normal lung epithelial cells. A core set of 49 coding genes and 10 long noncoding RNAs (lncRNA), which are upregulated in NSCLC cell lines due to promoter hypomethylation, was uncovered. Twenty-two epigenetically regulated genes were validated (upregulated genes with hypomethylated promoters) in the adenocarcinoma and squamous cell cancer subtypes of lung cancer using The Cancer Genome Atlas data. Furthermore, it was demonstrated that multiple copies of the REP522 DNA repeat family are prominently upregulated due to hypomethylation in NSCLC cell lines, which leads to cancer-specific expression of lncRNAs, such as RP1-90G24.10, AL022344.4, and PCAT7. Finally, Myeloma Overexpressed (MYEOV) was identified as the most promising candidate. Functional studies demonstrated that MYEOV promotes cell proliferation, survival, and invasion. Moreover, high MYEOV expression levels were associated with poor prognosis.
This report identifies a robust list of 22 candidate driver genes that are epigenetically regulated in lung cancer; such genes may complement the known mutational drivers.
http://mcr.aacrjournals.org/content/molcanres/15/10/1354/F1.large.jpg
Super-enhancers (SEs), which activate genes involved in cell-type specificity, have mainly been defined as genomic regions with top-ranked enrichment(s) of histone H3 with acetylated K27 (H3K27ac) and/or transcription coactivator(s) including a bromodomain and extra-terminal domain (BET) family protein, BRD4. However, BRD4 preferentially binds to multi-acetylated histone H4, typically with acetylated K5 and K8 (H4K5acK8ac), leading us to hypothesize that SEs should be defined by high H4K5acK8ac enrichment at least as well as by that of H3K27ac. Here, we conducted genome-wide profiling of H4K5acK8ac and H3K27ac, BRD4 binding, and the transcriptome by using a BET inhibitor, JQ1, in three human glial cell lines. When SEs were defined as having the top ranks for H4K5acK8ac or H3K27ac signal, 43% of H4K5acK8ac-ranked SEs were distinct from H3K27ac-ranked SEs in a glioblastoma stem-like cell (GSC) line. CRISPR-Cas9-mediated deletion of the H4K5acK8ac-preferred SEs associated with MYCN and NFIC decreased the stem-like properties in GSCs. Collectively, our data highlight H4K5acK8ac's utility for identifying genes regulating cell-type specificity.
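The rank-based comparison described above can be illustrated with a minimal Python sketch. It ranks regions by signal for each mark, takes a fixed number of top-ranked regions as SEs, and reports the fraction of H4K5acK8ac-ranked SEs that are absent from the H3K27ac-ranked set. The fixed top-N cutoff, the function names and the toy signal values are assumptions made for illustration only; they are not the procedure or the data used in the study.

```python
def top_ranked(signal_by_region, n):
    """Return the set of the n region IDs with the highest signal."""
    ranked = sorted(signal_by_region, key=signal_by_region.get, reverse=True)
    return set(ranked[:n])


def fraction_distinct(se_a, se_b):
    """Fraction of regions in se_a that do not appear in se_b."""
    if not se_a:
        return 0.0
    return len(se_a - se_b) / len(se_a)


# Hypothetical enrichment signal per candidate region (toy values).
h4k5ack8ac = {"r1": 90.0, "r2": 75.0, "r3": 60.0, "r4": 10.0, "r5": 5.0}
h3k27ac = {"r1": 80.0, "r2": 12.0, "r3": 70.0, "r4": 65.0, "r5": 3.0}

# Assumed cutoff: the top 3 regions for each mark are called SEs.
h4_ses = top_ranked(h4k5ack8ac, 3)
h3_ses = top_ranked(h3k27ac, 3)

distinct = fraction_distinct(h4_ses, h3_ses)
print(f"H4K5acK8ac-ranked SEs not in the H3K27ac-ranked set: {distinct:.0%}")
```

With real data, the region IDs would have to refer to a shared set of candidate regions (or be matched by genomic overlap) before the two rankings can be compared in this way.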
Gene expression profiles in homologous tissues have been observed to be different between species, which may be due to differences between species in the gene expression program in each cell type, but may also reflect differences in cell type composition of each tissue in different species. Here, we compare expression profiles in matching primary cells in human, mouse, rat, dog, and chicken using Cap Analysis Gene Expression (CAGE) and short RNA (sRNA) sequencing data from FANTOM5. While we find that expression profiles of orthologous genes in different species are highly correlated across cell types, in each cell type many genes were differentially expressed between species. Expression of genes with products involved in transcription, RNA processing, and transcriptional regulation was more likely to be conserved, while expression of genes encoding proteins involved in intercellular communication was more likely to have diverged during evolution. Conservation of expression correlated positively with the evolutionary age of genes, suggesting that divergence in expression levels of genes critical for cell function was restricted during evolution. Motif activity analysis showed that both promoters and enhancers are activated by the same transcription factors in different species. An analysis of expression levels of mature miRNAs and of primary miRNAs identified by CAGE revealed that evolutionarily old miRNAs are more likely to have conserved expression patterns than young miRNAs. We conclude that key aspects of the regulatory network are conserved, while differential expression of genes involved in cell-to-cell communication may contribute greatly to phenotypic differences between species.
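The distinction drawn above, between profiles that correlate across cell types and expression levels that still differ between species, can be illustrated with a small Python sketch. It computes the Pearson correlation of log-transformed expression for one orthologous gene pair across matched cell types. The log2 transform with a pseudocount, the function names and the toy TPM values are assumptions for illustration and are not taken from the study.

```python
import math


def log_expression(tpm_values, pseudocount=1.0):
    """log2-transform expression values after adding a pseudocount."""
    return [math.log2(v + pseudocount) for v in tpm_values]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical TPM values of one orthologous gene in five matched cell types.
human_tpm = [0.0, 1.0, 3.0, 7.0, 15.0]
mouse_tpm = [1.0, 3.0, 7.0, 15.0, 31.0]

r = pearson(log_expression(human_tpm), log_expression(mouse_tpm))
print(f"Pearson r across cell types: {r:.3f}")
```

In this toy case the mouse values are higher than the human values in every cell type, yet the two profiles are perfectly correlated, which is the kind of pattern the correlation alone cannot distinguish from identical expression.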
CAGE (cap analysis gene expression) and RNA-seq are two major technologies used to identify transcript abundances as well as structures. They measure expression by sequencing from either the 5′ end of capped molecules (CAGE) or tags randomly distributed along the length of a transcript (RNA-seq). Library protocols for clonally amplified (Illumina, SOLiD, 454 Life Sciences Roche, Ion Torrent), second-generation sequencing platforms typically employ PCR preamplification prior to clonal amplification, while third-generation, single-molecule sequencers can sequence unamplified libraries. Although these transcriptome profiling platforms have been demonstrated to be individually reproducible, no systematic comparison has been carried out between them. Here we compare CAGE, using both second- and third-generation sequencers, and RNA-seq, using a second-generation sequencer based on a panel of RNA mixtures from two human cell lines to examine power in the discrimination of biological states, detection of differentially expressed genes, linearity of measurements, and quantification reproducibility. We found that the quantified levels of gene expression are largely comparable across platforms and conclude that CAGE and RNA-seq are complementary technologies that can be used to improve incomplete gene models. We also found systematic bias in the second- and third-generation platforms, which is likely due to steps such as linker ligation, cleavage by restriction enzymes, and PCR amplification. This study provides a perspective on the performance of these platforms, which will be a baseline in the design of further experiments to tackle complex transcriptomes uncovered in a wide range of cell types.
Neuroinflammation is highly influenced by microglia, particularly through activation of the NLRP3 inflammasome and subsequent release of IL-1β. Extracellular ATP is a strong activator of NLRP3 by inducing K⁺ efflux as a key signaling event, suggesting that K⁺-permeable ion channels could have high therapeutic potential. In microglia, these include ATP-gated THIK-1 K⁺ channels and P2X7 receptors, but their interactions and potential therapeutic role in the human brain are unknown. Using a novel specific inhibitor of THIK-1 in combination with patch-clamp electrophysiology in slices of human neocortex, we found that THIK-1 generated the main tonic K⁺ conductance in microglia that sets the resting membrane potential. Extracellular ATP stimulated K⁺ efflux in a concentration-dependent manner only via P2X7 and metabotropic potentiation of THIK-1. We further demonstrated that activation of P2X7 was mandatory for ATP-evoked IL-1β release, which was strongly suppressed by blocking THIK-1. Surprisingly, THIK-1 contributed only marginally to the total K⁺ conductance in the presence of ATP, which was dominated by P2X7. This suggests a previously unknown, K⁺-independent mechanism of THIK-1 for NLRP3 activation. Nuclear sequencing revealed almost selective expression of THIK-1 in human brain microglia, while P2X7 had a much broader expression. Thus, inhibition of THIK-1 could be an effective and, in contrast to P2X7, microglia-specific therapeutic strategy to contain neuroinflammation.
Cap Analysis of Gene Expression (CAGE) in combination with single-molecule sequencing technology allows precision mapping of transcription start sites (TSSs) and genome-wide capture of promoter activities in differentiated and steady state cell populations. Much less is known about whether TSS profiling can characterize diverse and non-steady state cell populations, such as the approximately 400 transitory and heterogeneous cell types that arise during ontogeny of vertebrate animals. To gain such insight, we used the chick model and performed CAGE-based TSS analysis on embryonic samples covering the full 3-week developmental period. In total, 31,863 robust TSS peaks (>1 tag per million [TPM]) were mapped to the latest chicken genome assembly, of which 34% to 46% were active in any given developmental stage. ZENBU, a web-based, open-source platform, was used for interactive data exploration. TSSs of genes critical for lineage differentiation could be precisely mapped and their activities tracked throughout development, suggesting that non-steady state and heterogeneous cell populations are amenable to CAGE-based transcriptional analysis. Our study also uncovered a large set of extremely stable housekeeping TSSs and many novel stage-specific ones. We furthermore demonstrated that TSS mapping could expedite motif-based promoter analysis for regulatory modules associated with stage-specific and housekeeping genes. Finally, using Brachyury as an example, we provide evidence that precise TSS mapping in combination with Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR)-on technology enables us, for the first time, to efficiently target endogenous avian genes for transcriptional activation. Taken together, our results represent the first report of genome-wide TSS mapping in birds and the first systematic developmental TSS analysis in any amniote species (birds and mammals). By facilitating promoter-based molecular analysis and genetic manipulation, our work also underscores the value of avian models in unravelling the complex regulatory mechanism of cell lineage specification during amniote development.
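The tags-per-million (TPM) threshold mentioned above can be made concrete with a short Python sketch: raw tag counts per peak are scaled to a library size of one million, and peaks strictly above 1 TPM are kept. Applying the threshold to a single library, the function names and the toy counts are assumptions for illustration; the study's actual criteria for calling a peak robust may involve additional conditions.

```python
def tags_per_million(counts_by_peak):
    """Scale raw tag counts so that the library total equals one million."""
    total = sum(counts_by_peak.values())
    if total == 0:
        raise ValueError("library contains no tags")
    return {peak: count * 1_000_000 / total
            for peak, count in counts_by_peak.items()}


def peaks_above_threshold(tpm_by_peak, threshold=1.0):
    """Return the peaks whose TPM is strictly greater than the threshold."""
    return {peak for peak, tpm in tpm_by_peak.items() if tpm > threshold}


# Hypothetical raw tag counts for three TSS peaks in one library.
counts = {"peak_a": 500_000, "peak_b": 499_999, "peak_c": 1}

tpm = tags_per_million(counts)
kept = peaks_above_threshold(tpm)
print(sorted(kept))
```

Here peak_c sits at exactly 1 TPM and is therefore excluded by the strict ">1" comparison.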