Summary
Moso bamboo (Phyllostachys edulis) represents one of the fastest‐spreading plants in the world, due in part to its well‐developed rhizome system. However, the post‐transcriptional mechanism for the development of the rhizome system in bamboo has not been comprehensively studied. We therefore used a combination of single‐molecule long‐read sequencing technology and polyadenylation site sequencing (PAS‐seq) to re‐annotate the bamboo genome, and to identify genome‐wide alternative splicing (AS) and alternative polyadenylation (APA) in the rhizome system. In total, 145 522 mapped full‐length non‐chimeric (FLNC) reads were analyzed, resulting in the correction of 2241 mis‐annotated genes and the identification of 8091 previously unannotated loci. Notably, more than 42 280 distinct splicing isoforms were derived from 128 667 intron‐containing FLNC reads, including a large number of AS events associated with rhizome systems. In addition, we characterized 25 069 polyadenylation sites from 11 450 genes, 6311 of which have APA sites. Further analysis of intronic polyadenylation revealed that LTR/Gypsy and LTR/Copia were two major transposable elements within the intronic polyadenylation region. Furthermore, this study provided a quantitative atlas of poly(A) usage. Several hundred differential poly(A) sites in the rhizome‐root system were identified. Taken together, these results suggest that post‐transcriptional regulation may have a vital role in the underground rhizome‐root system.
Significance Statement
Moso bamboo (Phyllostachys edulis) is one of the fastest‐spreading plants in the world, due in part to its well‐developed rhizome system. However, the post‐transcriptional mechanism for the development of the rhizome system in bamboo has not been well studied. We therefore used a combination of single‐molecule long‐read sequencing technology and polyadenylation site sequencing to re‐annotate the bamboo genome, and subsequently identify alternative splicing and alternative polyadenylation in the rhizome system.
Soil microbiome has a pivotal role in ecosystem functioning, yet little is known about its build-up from local to regional scales. In a multi-year regional-scale survey involving 1251 plots and long-read third-generation sequencing, we found that soil pH has the strongest effect on the diversity of fungi and its multiple taxonomic and functional groups. The pH effects were typically unimodal, usually both direct and indirect through tree species, soil nutrients or mold abundance. Individual tree species, particularly Pinus sylvestris, Picea abies, and Populus x wettsteinii, and overall ectomycorrhizal plant proportion had relatively stronger effects on the diversity of biotrophic fungi than saprotrophic fungi. We found strong temporal sampling and investigator biases for the abundance of molds, but generally all spatial, temporal and microclimatic effects were weak. Richness of fungi and several functional groups was highest in woodlands and around ruins of buildings but lowest in bogs, with marked group-specific trends. In contrast to our expectations, diversity of soil fungi tended to be higher in forest island habitats potentially due to the edge effect, but fungal richness declined with island distance and in response to forest fragmentation. Virgin forests supported somewhat higher fungal diversity than old non-pristine forests, but there were no differences in richness between natural and anthropogenic habitats such as parks and coppiced gardens. Diversity of most fungal groups suffered from management of seminatural woodlands and parks and thinning of forests, but especially for forests the results depended on fungal group and time since partial harvesting. We conclude that the positive effects of tree diversity on overall fungal richness represent a combined niche effect of soil properties and intimate associations.
In order to provide a comprehensive resource for human structural variants (SVs), we generated long-read sequence data and analyzed SVs for fifteen human genomes. We sequence resolved 99,604 insertions, deletions, and inversions including 2,238 (1.6 Mbp) that are shared among all discovery genomes with an additional 13,053 (6.9 Mbp) present in the majority, indicating minor alleles or errors in the reference. Genotyping in 440 additional genomes confirms the most common SVs in unique euchromatin are now sequence resolved. We report a ninefold SV bias toward the last 5 Mbp of human chromosomes with nearly 55% of all VNTRs (variable number of tandem repeats) mapping to this portion of the genome. We identify SVs affecting coding and noncoding regulatory loci improving annotation and interpretation of functional variation. These data provide the framework to construct a canonical human reference and a resource for developing advanced representations capable of capturing allelic diversity.
• We sequence resolve and annotate 99,604 common human structural variants
• 55% of VNTRs map to the end of chromosomes and correlate with double-strand breaks
• Alternate alleles facilitate accurate genotyping with short reads and new associations
• We patch the reference and add diversity needed for developing a pan human genome
Long-read sequencing allows generation of a large catalog of human structural variants and the development of an algorithm for genotyping SVs from short-read data, clarifying the spectrum and importance of structural variation in the human genome.
• This is the first investigation on microbial community of mixed silage using SMRT.
• Mixed silage is characterized by high bacterial diversity.
• Lactobacillus plantarum was the main species affecting silage fermentation.
• Heterofermentative Lactobacillus species produced acetate in the silage.
• The SMRT method provides deep insights into the microbial composition of silages.
The bacterial community determined via PacBio single molecule, real-time sequencing technology (SMRT) and the fermentation characteristics of Italian ryegrass (IR, 82% moisture) silage prepared with corn stover (CS) were investigated. A selected strain of Lactobacillus plantarum (L694) and a commercial inoculant strain of Lactobacillus plantarum (LP) were used as additives. Lactic acid bacteria (LAB) effectively improved silage quality. After fermentation, Lactobacillus plantarum was the dominant species in IR + LP and IR + L694 treatments, which led to higher (P < 0.05) lactic acid and lower (P < 0.05) butyric acid production. Lactobacillus plantarum, Lactobacillus hammesii, Lactobacillus brevis, and Lactobacillus coryniformis were abundantly present in IR + CS + LP and IR + CS + L694 treatments, and acetic acid contents of these were higher (P < 0.05) than those of other silages. This study demonstrated that addition of CS and LAB can change the microbial community and influence the silage fermentation of IR, and PacBio SMRT reveals more specific microbial information.
The present study investigated the species level based microbial community and metabolome in corn silage inoculated with or without homofermentative
and heterofermentative
using PacBio SMRT sequencing and gas chromatography time-of-flight mass spectrometry (GC-TOF/MS). Chopped whole crop corn was treated with (1) deionized water (control), (2)
, or (3)
. The chopped whole crop corn was ensiled in vacuum-sealed polyethylene bags containing 300 g of fresh forage for 90 days, with three replicates for each treatment. The results showed that a total of 979 substances were detected, and 316 different metabolites were identified. Some metabolites with antimicrobial activity were detected in whole crop corn silage, such as catechol, 3-phenyllactic acid, 4-hydroxybenzoic acid, azelaic acid, 3,4-dihydroxybenzoic acid and 4-hydroxycinnamic acid. Catechol, pyrogallol and ferulic acid with antioxidant properties, 4-hydroxybutyrate with nervine activity, and linoleic acid with cholesterol-lowering effects were detected in the present study. In addition, a flavoring agent of myristic acid and a depression mitigation substance of phenylethylamine were also found in this study. Samples treated with inoculants presented more biofunctional metabolites of organic acids, amino acids and phenolic acids than untreated samples. The
species covered over 98% after ensiling, and were mainly comprised by the
and
. As compared to the control silage, inoculation of
increased the relative abundances of
and
, and a considerable decline in the proportion of
was observed; whereas an obvious decrease in
and increases in
and
were observed in the
inoculated silage. Therefore, inoculation of
and
regulated the microbial composition and metabolome of the corn silage with different behaviors. The present results indicated that profiling of silage microbiome and metabolome might improve our current understanding of the biological process underlying silage formation.
Summary
Casuarina equisetifolia (C. equisetifolia), a conifer‐like angiosperm with resistance to typhoon and stress tolerance, is mainly cultivated in the coastal areas of Australasia. These traits make C. equisetifolia a valuable model to study secondary growth associated genes and stress‐tolerance traits. However, the genome sequence is unavailable and therefore wood‐associated growth rate and stress resistance at the molecular level are largely unexplored. We therefore constructed a high‐quality draft genome sequence of C. equisetifolia by a combination of Illumina second‐generation sequencing reads and Pacific Biosciences single‐molecule real‐time (SMRT) long reads to advance the investigation of this species. Here, we report the genome assembly, which contains approximately 300 megabases (Mb) with a scaffold N50 of 1.06 Mb. Additionally, gene annotation, assisted by a combination of prediction and RNA‐seq data, generated 29 827 annotated protein‐coding genes and 1983 non‐coding genes. Furthermore, we found that repetitive sequences account for one‐third of the genome assembly. We also constructed a genome‐wide map of DNA modification, including two novel forms, N6‐methyladenine (6mA) and N4‐methylcytosine (4mC), at single‐nucleotide resolution using SMRT sequencing. Interestingly, we found that 17% of 6mA modification genes and 15% of 4mC modification genes also included alternative splicing events. Finally, we investigated cellulose, hemicellulose, and lignin‐related genes, which were associated with secondary growth and contained different DNA modifications. The high‐quality genome sequence and annotation of C. equisetifolia in this study provide a valuable resource to strengthen our understanding of the diverse traits of trees.
Significance Statement
We constructed a high‐quality draft genome sequence of C. equisetifolia and systematically characterized 29,827 annotated protein‐coding genes and 1,983 non‐coding genes. Furthermore, we constructed a genome‐wide map of DNA modification, including two novel forms, N6‐methyladenine (6mA) and N4‐methylcytosine (4mC), covering the genes involved in the regulation of lignin and cellulose. The high‐quality genome sequence in this study provides a valuable resource to strengthen our understanding of the diverse traits of trees.
ABSTRACT
The cytochrome P450‐2D6 (CYP2D6) enzyme metabolizes ∼25% of common medications, yet homologous pseudogenes and copy number variants (CNVs) make interrogating the polymorphic CYP2D6 gene with short‐read sequencing challenging. Therefore, we developed a novel long‐read, full gene CYP2D6 single molecule real‐time (SMRT) sequencing method using the Pacific Biosciences platform. Long‐range PCR and CYP2D6 SMRT sequencing of 10 previously genotyped controls identified expected star (*) alleles, but also enabled suballele resolution, diplotype refinement, and discovery of novel alleles. Coupled with an optimized variant‐calling pipeline, CYP2D6 SMRT sequencing was highly reproducible as triplicate intra‐ and inter‐run nonreference genotype results were completely concordant. Importantly, targeted SMRT sequencing of upstream and downstream CYP2D6 gene copies characterized the duplicated allele in 15 control samples with CYP2D6 CNVs. The utility of CYP2D6 SMRT sequencing was further underscored by identifying the diplotypes of 14 samples with discordant or unclear CYP2D6 configurations from previous targeted genotyping, which again included suballele resolution, duplicated allele characterization, and discovery of a novel allele and tandem arrangement. Taken together, long‐read CYP2D6 SMRT sequencing is an innovative, reproducible, and validated method for full‐gene characterization, duplication allele‐specific analysis, and novel allele discovery, which will likely improve CYP2D6 metabolizer phenotype prediction for both research and clinical testing applications.
Long‐read single molecule real‐time (SMRT) full gene sequencing of cytochrome P450‐2D6 (CYP2D6). Illustrated is the CYP2D6 gene and overview of amplicon preparation for SMRT sequencing.
The human microbiome includes trillions of bacteria, many of which play a vital role in host physiology. Numerous studies have now detected bacterial DNA in first-pass meconium and amniotic fluid samples, suggesting that the human microbiome may commence in utero. However, these data have remained contentious due to underlying contamination issues. Here, we have used a previously described method for reducing contamination in microbiome workflows to determine if there is a fetal bacterial microbiome beyond the level of background contamination. We recruited 50 women undergoing non-emergency cesarean section deliveries with no evidence of intra-uterine infection and collected first-pass meconium and amniotic fluid samples. Full-length 16S rRNA gene sequencing was performed using PacBio SMRT cell technology, to allow high resolution profiling of the fetal gut and amniotic fluid bacterial microbiomes. Levels of inflammatory cytokines were measured in amniotic fluid, and levels of immunomodulatory short chain fatty acids (SCFAs) were quantified in meconium. All meconium samples and most amniotic fluid samples (36/43) contained bacterial DNA. The meconium microbiome was dominated by reads that mapped to
. Aside from this species, the meconium microbiome was remarkably heterogeneous between patients. The amniotic fluid microbiome was more diverse and contained mainly reads that mapped to typical skin commensals, including
and
spp. All meconium samples contained acetate and propionate, at ratios similar to those previously reported in infants.
reads were inversely correlated with meconium propionate levels. Amniotic fluid cytokine levels were associated with the amniotic fluid microbiome. Our results demonstrate that bacterial DNA and SCFAs are present in utero, and have the potential to influence the developing fetal immune system.
Infectious disease is both a major force of selection in nature and a prime cause of yield loss in agriculture. In plants, disease resistance is often conferred by nucleotide-binding leucine-rich repeat (NLR) proteins, intracellular immune receptors that recognize pathogen proteins and their effects on the host. Consistent with extensive balancing and positive selection, NLRs are encoded by one of the most variable gene families in plants, but the true extent of intraspecific NLR diversity has been unclear. Here, we define a nearly complete species-wide pan-NLRome in Arabidopsis thaliana based on sequence enrichment and long-read sequencing. The pan-NLRome largely saturates with approximately 40 well-chosen wild strains, with half of the pan-NLRome being present in most accessions. We chart NLR architectural diversity, identify new architectures, and quantify selective forces that act on specific NLRs and NLR domains. Our study provides a blueprint for defining pan-NLRomes.
• Species-wide NLR diversity is high but not unlimited
• A large fraction of NLR diversity is recovered with 40–50 accessions
• Presence/absence variation in NLRs is widespread, resulting in a mosaic population
• A high diversity of NLR-integrated domains favors known virulence targets
In plants, NLR proteins are important intracellular receptors with roles in innate immunity and disease resistance. This work provides a panoramic view of this diverse and complicated gene family in the model species A. thaliana and provides a foundation for the identification and functional study of disease-resistance genes in agronomically important species with complex genomes.