Populus trichocarpa (P. trichocarpa) is a model tree for the investigation of wood formation. In recent years, researchers have generated a large amount of high-throughput sequencing data in P. trichocarpa. However, no comprehensive database that provides multi-omics associations for the investigation of secondary growth in response to diverse stresses has been reported. Therefore, we developed a public repository that presents comprehensive measurements of gene expression and post-transcriptional regulation by integrating 144 RNA-Seq, 33 ChIP-seq, and six single-molecule real-time (SMRT) isoform sequencing (Iso-seq) libraries prepared from tissues subjected to different stresses. All samples from the different studies were analyzed with unified parameters to obtain gene expression, co-expression networks, and differentially expressed genes (DEGs), which allows results from different studies and treatments to be compared. In addition to gene expression, we also identified and deposited pre-processed data on alternative splicing (AS), alternative polyadenylation (APA), and alternative transcription initiation (ATI). The post-transcriptional regulation, differential expression, and co-expression network datasets were integrated into a new P. trichocarpa Stem Differentiating Xylem (PSDX) database (http://forestry.fafu.edu.cn/db/SDX), which further highlights gene families of RNA-binding proteins and stress-related genes. PSDX also provides tools for data query and visualization, a genome browser, and a BLAST option for sequence-based queries. Much of the data is also available for bulk download. PSDX supports research on secondary growth in response to stresses in P. trichocarpa and will provide new insights that can be useful for improving stress tolerance in woody plants.
Moso bamboo is an important forest species with a variety of ecological, economic, and cultural values. However, the gene annotation of moso bamboo is based only on transcriptome sequencing and lacks proteomic evidence. The lignification and fiber content of moso bamboo make protein extraction with conventional methods difficult, which seriously hinders proteomic research on this species. The purpose of this study is to establish efficient methods for extracting total proteins from moso bamboo for subsequent mass spectrometry-based quantitative proteome identification. Here, we established a set of efficient methods for extracting total proteins from moso bamboo, followed by mass spectrometry-based label-free quantitative proteome identification, which further improved the protein annotation of moso bamboo genes. In this study, 10,376 predicted coding genes were confirmed by quantitative proteomics, accounting for 35.8% of all annotated protein-coding genes. Proteome analysis also revealed the protein-coding potential of 1015 predicted long noncoding RNAs (lncRNAs), accounting for 51.03% of annotated lncRNAs. Thus, mass spectrometry-based proteomics provides a reliable method for gene annotation. In particular, quantitative proteomics revealed the translation patterns of proteins in moso bamboo. In addition, 3284 transcript isoforms from 2663 genes identified by Pacific Biosciences (PacBio) single-molecule real-time long-read isoform sequencing (Iso-Seq) were confirmed at the protein level by mass spectrometry. Furthermore, domain analysis of mass spectrometry-identified proteins encoded by the same genomic locus revealed variations in domain composition, pointing towards functional diversification of protein isoforms. Finally, we found that some transcripts targeted by nonsense-mediated mRNA decay (NMD) could also be translated into proteins.
In summary, the proteomic analysis in this study improves the proteomics-assisted genome annotation of moso bamboo and is valuable for large-scale functional genomics research in this species. It also provides a theoretical basis and technical support for targeted gene function analysis at the proteomic level in moso bamboo.
Summary
Moso bamboo (Phyllostachys edulis) is one of the fastest‐spreading plants in the world, due in part to its well‐developed rhizome system. However, the post‐transcriptional mechanism for the development of the rhizome system in bamboo has not been comprehensively studied. We therefore used a combination of single‐molecule long‐read sequencing technology and polyadenylation site sequencing (PAS‐seq) to re‐annotate the bamboo genome and to identify genome‐wide alternative splicing (AS) and alternative polyadenylation (APA) in the rhizome system. In total, 145 522 mapped full‐length non‐chimeric (FLNC) reads were analyzed, resulting in the correction of 2241 mis‐annotated genes and the identification of 8091 previously unannotated loci. Notably, more than 42 280 distinct splicing isoforms were derived from 128 667 intron‐containing FLNC reads, including a large number of AS events associated with rhizome systems. In addition, we characterized 25 069 polyadenylation sites from 11 450 genes, 6311 of which have APA sites. Further analysis of intronic polyadenylation revealed that LTR/Gypsy and LTR/Copia were the two major transposable elements within the intronic polyadenylation region. Furthermore, this study provides a quantitative atlas of poly(A) usage. Several hundred differential poly(A) sites in the rhizome‐root system were identified. Taken together, these results suggest that post‐transcriptional regulation may have a vital role in the underground rhizome‐root system.
Significance Statement
Moso bamboo (Phyllostachys edulis) is one of the fastest‐spreading plants in the world, due in part to its well‐developed rhizome system. However, the post‐transcriptional mechanism for the development of the rhizome system in bamboo has not been well studied. We therefore used a combination of single‐molecule long‐read sequencing technology and polyadenylation site sequencing to re‐annotate the bamboo genome, and subsequently identify alternative splicing and alternative polyadenylation in the rhizome system.
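The polyadenylation counts quoted in the summary above imply two simple ratios: the share of genes with APA and the mean number of poly(A) sites per gene. The following is a minimal Python sketch that uses only the figures stated in the abstract; the variable names are illustrative and do not come from the original study.

```python
# Counts taken from the abstract; variable names are illustrative assumptions.
poly_a_sites = 25069        # polyadenylation sites characterized
genes_with_sites = 11450    # genes carrying at least one characterized site
genes_with_apa = 6311       # genes with alternative polyadenylation (APA) sites

# Share of genes with characterized poly(A) sites that show APA.
apa_fraction = genes_with_apa / genes_with_sites

# Mean number of characterized poly(A) sites per gene.
sites_per_gene = poly_a_sites / genes_with_sites

print(f"Genes with APA: {apa_fraction:.1%}")
print(f"Mean poly(A) sites per gene: {sites_per_gene:.2f}")
```

On these figures, slightly more than half of the genes with a characterized poly(A) site use more than one site.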
We report the complete telomere-to-telomere genome assembly of Oldenlandia diffusa, which is renowned in traditional Chinese medicine. The assembly comprises 16 chromosomes and spans 499.7 Mb, with 28 telomeres and only five gaps. Repeat sequences constitute 46.41% of the genome, and 49,701 potential protein-coding genes have been predicted. Compared with O. corymbosa, O. diffusa exhibits chromosome duplication and fusion events; the two species diverged 20.34 million years ago. Additionally, a total of 11 terpene synthase clusters have been identified. The comprehensive genome sequence, gene catalog, and terpene synthase clusters of O. diffusa detailed in this study will contribute significantly to genetic, genomic, and pharmacological research on this species.
Epigenetic changes play an important role in plant growth and development and in stress response. However, the DNA methylation pattern of Moso bamboo and its relationship with expression changes of non-coding RNAs and mRNAs in response to abiotic stress are still largely unknown. In this work, we used whole-genome bisulfite sequencing in combination with whole-transcriptome sequencing to analyze the DNA methylation and transcription patterns of mRNAs and non-coding RNAs in Moso bamboo under abiotic stresses such as cold, heat, ultraviolet (UV) and salinity. We found that CHH methylation in the promoter region was positively correlated with gene expression, while CHG and CHH methylation in the gene body regions was negatively associated with gene expression. Moreover, CG and CHG methylation in the promoter regions was negatively correlated with the transcript abundance of long non-coding RNAs (lncRNAs), microRNAs (miRNAs) and circular RNAs (circRNAs). Similarly, the methylation levels of all three contexts in the genic regions were negatively correlated with the transcript abundance of lncRNAs and miRNAs but positively correlated with that of circRNAs. In addition, our results suggest that the reduction of 21-nt and 24-nt small interfering RNA (siRNA) expression tended to increase methylation levels in the genic regions. We found that stress-responsive genes such as CRPK1, HSFB2A and CIPK were differentially methylated and expressed. Our results also suggest that DNA methylation may regulate the expression of transcription factors (TFs) and plant hormone signalling genes such as IAA9, MYC2 and ERF110 in response to abiotic stress. This study is the first to report the abiotic stress-responsive DNA methylation pattern and its involvement in the expression of coding and non-coding RNAs in Moso bamboo.
The results expand the knowledge of epigenetic mechanisms in Moso bamboo under abiotic stress and support in-depth deciphering of the function of specific non-coding RNAs in future studies.
•Dynamic DNA methylation in moso bamboo was correlated with expression changes of genes involved in stress response.
•DNA methylation was related to non-coding RNA expression, and the latter further regulated target gene expression.
•The 21-nt and 24-nt siRNA expressions tended to increase DNA methylation levels in moso bamboo.
Summary
Casuarina equisetifolia (C. equisetifolia), a conifer‐like angiosperm with resistance to typhoons and stress tolerance, is mainly cultivated in the coastal areas of Australasia. These traits make C. equisetifolia a valuable model for studying secondary growth‐associated genes and stress‐tolerance traits. However, the genome sequence is unavailable, and therefore wood‐associated growth rate and stress resistance remain largely unexplored at the molecular level. We therefore constructed a high‐quality draft genome sequence of C. equisetifolia by combining Illumina second‐generation sequencing reads and Pacific Biosciences single‐molecule real‐time (SMRT) long reads to advance the investigation of this species. Here, we report the genome assembly, which contains approximately 300 megabases (Mb) and has a scaffold N50 of 1.06 Mb. Additionally, gene annotation, assisted by a combination of prediction and RNA‐seq data, generated 29 827 annotated protein‐coding genes and 1983 non‐coding genes. Furthermore, we found that repetitive sequences account for one‐third of the genome assembly. We also constructed a genome‐wide map of DNA modifications, including two novel forms, N6‐methyladenine (6mA) and N4‐methylcytosine (4mC), at single‐nucleotide resolution using SMRT sequencing. Interestingly, we found that 17% of 6mA‐modified genes and 15% of 4mC‐modified genes also included alternative splicing events. Finally, we investigated cellulose‐, hemicellulose‐, and lignin‐related genes, which were associated with secondary growth and contained different DNA modifications. The high‐quality genome sequence and annotation of C. equisetifolia in this study provide a valuable resource to strengthen our understanding of the diverse traits of trees.
Significance Statement
We constructed a high‐quality draft genome sequence of C. equisetifolia and systematically characterized 29,827 annotated protein‐coding genes and 1,983 non‐coding genes. Furthermore, we constructed a genome‐wide map of DNA modifications, including two novel forms, N6‐methyladenine (6mA) and N4‐methylcytosine (4mC), covering genes involved in the regulation of lignin and cellulose. The high‐quality genome resource in this study strengthens our understanding of the diverse traits of trees.
Abstract
Summary
Single-molecule real-time (SMRT) isoform sequencing (Iso-Seq) on the Pacific Biosciences (PacBio) platform has received increasing attention for its ability to explore full-length isoforms. Thus, comprehensive tools for Iso-Seq bioinformatics analysis are extremely useful. Here, we present a one-stop solution for Iso-Seq analysis, called PRAPI, which comprehensively analyzes alternative transcription initiation (ATI), alternative splicing (AS), alternative cleavage and polyadenylation (APA), natural antisense transcripts (NAT), and circular RNAs (circRNAs). PRAPI is capable of combining Iso-Seq full-length isoforms with short-read data, such as RNA-Seq or polyadenylation site sequencing (PAS-seq), for differential expression analysis of NAT, AS, APA and circRNAs. Furthermore, PRAPI can annotate new genes and correct mis-annotated genes when gene annotation is available. Finally, PRAPI generates high-quality vector graphics to visualize and highlight the Iso-Seq results.
Availability and implementation
The Dockerfile of PRAPI is available at http://www.bioinfor.org/tool/PRAPI.
Understanding gene expression and regulation requires insights into RNA transcription, processing, modification, and translation. However, the relationship between the epitranscriptome and the proteome under drought stress remains undetermined in poplar (Populus trichocarpa). In this study, we used Nanopore direct RNA sequencing and tandem mass tag-based proteomic analysis to examine epitranscriptomic and proteomic regulation induced by drought treatment in stem-differentiating xylem (SDX). Our results revealed a decreased full-length read ratio under drought treatment and, in particular, a decreased association between transcriptome and proteome changes in response to drought. Epitranscriptome analysis of cellulose- and lignin-related genes revealed an increased N6-methyladenosine (m6A) ratio, which was accompanied by decreased RNA abundance and translation, under drought stress. Interestingly, usage of the distal poly(A) site increased during drought stress. Finally, we found that transcripts of highly expressed genes tend to have shorter poly(A) tail length (PAL), and drought stress increased the percentage of transcripts with long PAL. These findings provide insights into the interplay among m6A, polyadenylation, PAL, and translation under drought stress in P. trichocarpa SDX.
SUMMARY
Australian pine (Casuarina spp.) is extensively planted in tropical and subtropical regions for wood production, shelterbelts, environmental protection, and ecological restoration due to its superior biological characteristics, such as rapid growth, wind and salt tolerance, and nitrogen fixation. To analyze the genomic diversity of Casuarina, we sequenced the genomes and constructed de novo genome assemblies of the three most widely planted Casuarina species: C. equisetifolia, C. glauca, and C. cunninghamiana. We generated chromosome‐scale genome sequences using both Pacific Biosciences (PacBio) Sequel sequencing and chromosome conformation capture technology (Hi‐C). The total genome sizes for C. equisetifolia, C. glauca, and C. cunninghamiana are 268 942 579 bp, 296 631 783 bp, and 293 483 606 bp, respectively, of which 25.91, 27.15, and 27.74% were annotated as repetitive sequences. We annotated 23 162, 24 673, and 24 674 protein‐coding genes in C. equisetifolia, C. glauca, and C. cunninghamiana, respectively. We then collected branchlets from male and female individuals for whole‐genome bisulfite sequencing (BS‐seq) to explore the epigenetic regulation of sex determination in these three species. Transcriptome sequencing (RNA‐seq) revealed differential expression of phytohormone‐related genes between male and female plants. In summary, we generated three chromosome‐level genome assemblies and comprehensive DNA methylation and transcriptome datasets from both male and female material for three Casuarina species, providing a basis for the comprehensive investigation of genomic diversity and functional gene discovery of Casuarina in the future.
Significance Statement
In this study we generated three chromosome‐level genome assemblies and comprehensive DNA methylation and transcriptome datasets from both male and female material for three Casuarina species, providing a basis for the comprehensive investigation of genomic diversity and functional gene discovery of Casuarina in the future.
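The genome sizes and repeat percentages reported for the three Casuarina assemblies can be converted into approximate repeat content in megabases. The following is a minimal Python sketch that uses only the figures from the summary above; the dictionary layout and names are illustrative assumptions, and 1 Mb is taken as 1,000,000 bp.

```python
# Genome size (bp) and annotated repeat percentage, as stated in the summary.
# The data structure and names are illustrative, not from the original study.
assemblies = {
    "C. equisetifolia": (268_942_579, 25.91),
    "C. glauca": (296_631_783, 27.15),
    "C. cunninghamiana": (293_483_606, 27.74),
}

BP_PER_MB = 1_000_000  # assumption: 1 Mb = 10^6 bp

# Approximate repetitive sequence content per species, in Mb.
repeat_mb = {
    species: size_bp * pct / 100 / BP_PER_MB
    for species, (size_bp, pct) in assemblies.items()
}

for species, mb in repeat_mb.items():
    print(f"{species}: ~{mb:.1f} Mb annotated as repetitive")
```

These values are back-calculated from rounded percentages, so they are estimates rather than reported results.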
Circular RNAs, including circular exonic RNAs (circRNAs), circular intronic RNAs (ciRNAs) and exon-intron circRNAs (EIciRNAs), are a new type of noncoding RNA. Growing shoots of moso bamboo (Phyllostachys edulis) represent an excellent model of fast growth, but their circular RNAs have not yet been studied. To understand the potential regulation of circular RNAs, we systematically characterized circular RNAs from eight different developmental stages of rapidly growing shoots. Here, we identified 895 circular RNAs, including a subset of mutually inclusive circRNAs. These circular RNAs were generated from 759 corresponding parental coding genes involved in cellulose, hemicellulose and lignin biosynthetic processes. Gene co-expression analysis revealed that hub genes, such as DEFECTIVE IN RNA-DIRECTED DNA METHYLATION 1 (DRD1), MAINTENANCE OF METHYLATION (MOM), dicer-like 3 (DCL3) and ARGONAUTE 1 (AGO1), were significantly enriched among the genes giving rise to circular RNAs. According to transcriptome sequencing, the expression levels of these circular RNAs correlated with those of their linear counterparts. Further protoplast transformation experiments indicated that overexpressing circ-bHLH93, which is generated from a transcription factor gene, decreased the abundance of its linear transcript. Finally, the expression profiles suggested that circular RNAs may interact with miRNAs to regulate their cognate linear mRNAs, which was further supported by the observation that overexpressing miRNA156 decreased the abundance of circ-TRF-1 and of the linear transcripts of TRF-1. Taken together, the overall profile of circular RNAs provides new insight into an unexplored category of long noncoding RNA regulation in moso bamboo.