As a result of a high rate of mutations and recombination events, an RNA virus exists as a heterogeneous "swarm" of mutant variants. The long read length offered by single-molecule sequencing technologies allows each mutant variant to be sequenced in a single pass. However, the high error rate limits the ability to reconstruct a heterogeneous viral population composed of rare, related mutant variants. In this article, we present two single-nucleotide variants (2SNV), a method able to tolerate the high error rate of the single-molecule protocol and reconstruct mutant variants. 2SNV uses linkage between single-nucleotide variations to efficiently distinguish them from read errors. To benchmark the sensitivity of 2SNV, we performed a single-molecule sequencing experiment on a sample containing a titrated level of known viral mutant variants. Our method is able to accurately reconstruct clones with a frequency of 0.2% and distinguish clones that differ in only two nucleotides distantly located on the genome. 2SNV outperforms existing methods for full-length viral mutant reconstruction.
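The linkage idea in the abstract above can be illustrated with a small sketch. This is not the published 2SNV algorithm; it is a simplified illustration that assumes reads are already aligned to a common coordinate system and that sequencing errors at two positions occur independently. The function names `binom_sf` and `linkage_pvalue` are hypothetical. If two minor alleles appear together in far more reads than independent errors would predict, they are more plausibly a real variant than noise.

```python
from math import exp, lgamma, log


def binom_sf(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p), summed in log space."""
    if p <= 0.0:
        return 1.0 if k <= 0 else 0.0
    if p >= 1.0:
        return 1.0 if k <= n else 0.0
    total = 0.0
    for i in range(max(k, 0), n + 1):
        log_term = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                    + i * log(p) + (n - i) * log(1.0 - p))
        total += exp(log_term)
    return min(total, 1.0)


def linkage_pvalue(reads, i, a, j, b):
    """Test whether allele a at position i and allele b at position j
    co-occur in the same reads more often than independence predicts.

    reads: equal-length aligned read strings (an assumption of this sketch).
    Returns (number of reads carrying both alleles, one-sided p-value).
    """
    n = len(reads)
    n_a = sum(r[i] == a for r in reads)
    n_b = sum(r[j] == b for r in reads)
    n_ab = sum(r[i] == a and r[j] == b for r in reads)
    # Joint probability if the two alleles were independent read errors.
    p_joint = (n_a / n) * (n_b / n)
    return n_ab, binom_sf(n_ab, n, p_joint)


# Toy data: a 2% variant carrying both changes, plus isolated errors.
reads = (["ACGTACGT"] * 960 + ["TCGTACGA"] * 20
         + ["TCGTACGT"] * 10 + ["ACGTACGA"] * 10)
both, pval = linkage_pvalue(reads, 0, "T", 7, "A")
print(both, pval)
```

In the toy data, about one read would be expected to carry both alleles by chance, so observing 20 such reads gives a very small p-value. A real implementation would also need to handle alignment gaps, multiple testing across position pairs, and position-specific error rates.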
Strategies for sequencing fungal genomes on next-generation sequencing (NGS) platforms depend on the characteristics of the genome of the targeted species, quantity and quality of the genomic DNA, and cost considerations. Massively parallel sequencing with the sequencing-by-synthesis (SBS) approach by Illumina produces terabases of short read sequences (i.e., ~300 bp) in a time- and cost-effective manner, though the read length can limit assembly, particularly in repetitive regions. The single molecule, real-time (SMRT) sequencing approach by Pacific Biosciences (PacBio) produces longer reads (i.e., ~12,500 bp), which can facilitate de novo assembly of genomes that contain long repetitive sequences, though due to the lower throughput of this platform, achieving the coverage needed for assembly is more expensive than by SBS. Additionally, the Illumina SBS platforms can handle low quantity/quality of genomic DNA materials, while the SMRT system requires undamaged long DNA fragments as input to ensure that high-quality data is produced. Both platforms are discussed in this chapter, including key decision-making points.
: The tadpole tail develops from the tailbud, an apparently homogeneous mass of cells at the posterior of the embryo. While much progress has been made in understanding the origin and the induction of the tailbud, the subsequent outgrowth and differentiation have received much less attention, particularly with regard to global gene expression changes.
: By using RNA-seq with SMRT and further analyses, we report the transcriptome profiles at four key stages of tail development, from a small tailbud to the onset of feeding (S18, S19, S21 and S28) in
, an anuran with a number of advantages for developmental and genetic studies.
: We obtained 48,826 transcripts and discovered 8807 differentially expressed transcripts (DETs, q < 0.05) among these four developmental stages. We functionally classified these DETs by using GO and KEGG analyses and revealed 110 significantly enriched GO categories and 6 highly enriched KEGG pathways (Protein digestion and absorption; ECM-receptor interaction; Pyruvate metabolism; Fatty acid degradation; Valine, leucine and isoleucine degradation; and Glyoxylate and dicarboxylate metabolism) that are likely critically involved in developmental changes in the tail. In addition, analyses of DETs between any two individual stages demonstrated the involvement of distinct biological pathways/GO terms at different stages of tail development. Furthermore, the most dramatic changes in gene expression profile are those between S28 and any of the other three stages. The upregulated DETs at S28 are highly enriched in "myosin complex" and "potassium channel activity", which are important for muscle contraction, a critical function of the tail that the animal needs by the end of embryogenesis. Additionally, many DETs and enriched pathways discovered here during tail development, such as HDAC1, Hes1 and Hippo signaling pathway, have also been reported to be vital for the tissue/organ regeneration, suggesting conserved functions between development and regeneration.
: The present study provides a global overview of gene expression patterns and new insights into the mechanisms involved in anuran tail development and regeneration.
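The abstract above selects differentially expressed transcripts at q < 0.05 but does not state how the q-values were computed. As a hedged illustration only, the sketch below assumes the common Benjamini-Hochberg procedure for turning raw p-values into q-values; the function name `benjamini_hochberg` and the example p-values are invented for this sketch and are not taken from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values),
    in the same order as the input list."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda idx: pvalues[idx])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


# Hypothetical per-transcript p-values from one stage comparison.
pvals = [0.001, 0.008, 0.039, 0.041, 0.9]
qvals = benjamini_hochberg(pvals)
significant = [i for i, q in enumerate(qvals) if q < 0.05]
print(qvals, significant)
```

With these made-up inputs, only the first two transcripts pass the q < 0.05 threshold, even though four of the raw p-values are below 0.05. That gap is the reason for reporting q-values when thousands of transcripts are tested at once.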
Nitraria tangutorum Bobrov is a halophyte that is resistant to salt and alkali and is widely distributed in northwestern China. However, its genome has not been sequenced, thereby limiting studies on this particular species. For species without a reference genome, the full-length transcriptome is a convenient and rapid way to obtain reference gene information. To better study N. tangutorum, we used PacBio single-molecule real-time technology to perform full-length transcriptome analysis of this halophyte. In this study, a total of 21.83 Gb of data were obtained, and 198,300 transcripts, 51,875 SSRs (simple sequence repeats), 55,574 CDSs (coding sequences), and 74,913 lncRNAs (long non-coding RNAs) were identified. In addition, using this full-length transcriptome, we identified the key Na+/H+ antiporter (NHX) genes that maintain ion balance in plants and found that their expression is induced under salt stress. The results indicate that the full-length transcriptome of N. tangutorum can be used as a database and be utilized in elucidating the salt tolerance mechanism of N. tangutorum.
PacBio’s single-molecule real-time (SMRT) sequencing technology offers important advantages over the short-read DNA sequencing technologies that currently dominate the market. These include exceptionally long read lengths (20 kb or more), unparalleled consensus accuracy, and the ability to sequence native, non-amplified DNA molecules. From fungi to insects to humans, long reads are now used to create highly accurate reference genomes by de novo assembly of genomic DNA and to obtain a comprehensive view of transcriptomes through the sequencing of full-length cDNAs. Besides reducing biases, sequencing native DNA also permits the direct measurement of DNA base modifications. Therefore, SMRT sequencing has become an attractive technology in many fields, such as agriculture, basic science, and medical research. The boundaries of SMRT sequencing are continuously being pushed by developments in bioinformatics and sample preparation. This book contains a collection of articles showcasing the latest developments and the breadth of applications enabled by SMRT sequencing technology.
Simultaneous expression of the highly homologous RLN1 and RLN2 genes in the prostate impairs their accurate delineation. We used PacBio SMRT sequencing and RNA-Seq in LNCaP cells in order to dissect the expression of RLN1 and RLN2 variants. We identified a novel fusion transcript comprising the RLN1 and RLN2 genes and found evidence of its expression in normal and prostate cancer tissues. The RLN1-RLN2 fusion putatively encodes an RLN2 isoform with the deleted secretory signal peptide. The identification of the fusion transcript provided information to determine unique RLN1-RLN2 fusion and RLN1 regions. The RLN1-RLN2 fusion was co-expressed with RLN1 in LNCaP cells, but the two gene products were inversely regulated by androgens. We showed that RLN1 is underrepresented in common PCa cell lines in comparison to normal and PCa tissue. The current study brings a highly relevant update to the relaxin field, and will encourage further studies of RLN1 and RLN2 in PCa and beyond.
• A novel RLN1-RLN2 fusion was observed in LNCaP cells and normal and prostate cancer tissues.
• The fusion putatively encodes a RLN2 isoform with a deleted secretory signal peptide.
• RLN1 is underrepresented in PCa cell lines.
• Androgens inversely regulate RLN1 and the RLN1-RLN2 fusion transcript.
The salt‐reducing pickling method has been applied to the industrial production of zhacai. In order to reveal the succession of the microbial community structure and flavor components during the pickling process, this study used PacBio Sequel to sequence the full length of 16S rRNA (bacteria, 1400 bp) and ITS (fungi, 1200 bp) genes, and detected flavor components simultaneously, including organic acids, volatile flavor components (VFCs), monosaccharides, and amino acids. Eleven phyla and 148 genera were identified in the bacterial community, and 2 phyla and 60 genera in the fungal community. During the four stages of pickling, the dominant bacterial genera were Leuconostoc, Lactobacillus, Leuconostoc, and Lactobacillus, while the dominant fungal genera were Aspergillus, Kazachstania, Debaryomyces, and Debaryomyces, respectively. There were 32 main flavor components (5 organic acids, 19 VFCs, 3 monosaccharides, and 5 amino acids). Correlation heat mapping and bidirectional orthogonal partial least squares (O2PLS) analysis showed that the flora closely related to flavor components included 14 genera of bacteria (Leuconostoc, Clostridium, Devosia, Lactococcus, Pectobacterium, Sphingobacterium, Serratia, Stenotrophomonas, Halanaerobium, Tetragenococcus, Chromohalobacter, Klebsiella, Acidovorax, and Acinetobacter) and 3 genera of fungi (Filobasidium, Malassezia, and Aspergillus). This study provides detailed data regarding the microbial community and flavor components during the salt‐reducing pickling process of zhacai, which can be used as a reference for the development and improvement of salt‐reducing pickling methods.
This study investigated the whole process of the salt‐reducing pickling of zhacai. The microbial community structure during the four stages was analyzed by using the PacBio Sequel platform to sequence full‐length 16S rRNA and ITS genes; the flavor components (organic acids, VFCs, monosaccharides, and amino acids) were measured; the correlations between microbial communities and flavor components were analyzed using correlation heat mapping and O2PLS modeling; the core functional flora were inferred through integrated correlation analysis; and the functions of the microbial communities predicted by PICRUSt2 analysis were highlighted.
The white-backed planthopper Sogatella furcifera is an economically important rice pest distributed throughout Asia. It damages rice crops by sucking phloem sap, resulting in stunted growth and plant virus transmission. We aimed to obtain the full-length transcriptome data of S. furcifera using PacBio single-molecule real-time (SMRT) sequencing. Total RNA extracted from S. furcifera at various developmental stages (egg, larval, and adult stages) was mixed and used to generate a full-length transcriptome for SMRT sequencing. Long non-coding RNA (lncRNA) identification, full-length coding sequence prediction, full-length non-chimeric (FLNC) read detection, simple sequence repeat (SSR) analysis, transcription factor detection, and transcript functional annotation were performed. A total of 12,514,449 subreads (15.64 Gbp, clean reads) were generated, including 630,447 circular consensus sequences and 388,348 FLNC reads. Transcript cluster analysis of the FLNC reads revealed 251,109 consensus reads including 29,700 high-quality reads. Additionally, 100,360 SSRs and 121,395 coding sequences were identified using SSR analysis and ANGEL software, respectively. Furthermore, 44,324 lncRNAs were annotated using four tools and 1,288 transcription factors were identified. In total, 95,495 transcripts were functionally annotated based on searches of seven different databases. To the best of our knowledge, this is the first study of the full-length transcriptome of the white-backed planthopper obtained using SMRT sequencing. The acquired transcriptome data can facilitate further studies on the ecological and viral-host interactions of this agricultural pest.
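The abstract above reports an SSR analysis without naming the tool or the thresholds used. To make the concept concrete, the following is a minimal regex-based sketch of perfect SSR detection. The minimum repeat counts in `MIN_REPEATS` and the function name `find_ssrs` are assumptions chosen for illustration; they are not the settings of the study.

```python
import re

# Assumed thresholds: motif length -> minimum number of tandem repeats.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Find perfect simple sequence repeats in a nucleotide sequence.

    Returns a sorted list of (start, motif, repeat_count) tuples,
    with 0-based start positions.
    """
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip motifs that are repeats of a shorter motif ("AA", "ATAT").
            if any(unit == unit[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return sorted(hits)


# Toy transcript with a dinucleotide and a mononucleotide repeat.
example = "GG" + "AT" * 7 + "CCGT" + "A" * 12 + "TTG"
print(find_ssrs(example))
```

This sketch only finds perfect repeats and does not merge compound SSRs or handle ambiguous bases, which dedicated SSR tools typically do.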