Staphylococcus capitis is an opportunistic pathogen of the coagulase negative staphylococci (CoNS). Functional genomic studies of S. capitis have thus far been limited by a lack of available complete genome sequences. Here, we determined the closed S. capitis genome and methylome using Single Molecule Real Time (SMRT) sequencing. The strain, AYP1020, harbors a single circular chromosome of 2.44 Mb encoding 2304 predicted proteins, which is the smallest of all complete staphylococcal genomes sequenced to date. AYP1020 harbors two large mobile genetic elements: a plasmid designated pAYP1020 (59.6 Kb) and a prophage, ΦAYP1020 (48.5 Kb). Methylome analysis identified significant adenine methylation across the genome involving two distinct methylation motifs (1972 putative 6-methyladenine (m6A) residues identified). Putative adenine methyltransferases were also identified. Comparative analysis of AYP1020 and the closely related CoNS, S. epidermidis RP62a, revealed a host of virulence factors that likely contribute to S. capitis pathogenicity, most notably genes important for biofilm formation and a suite of phenol soluble modulins (PSMs); the expression/production of these factors was corroborated by functional assays. The complete S. capitis genome will aid future studies on the evolution and pathogenesis of the coagulase negative staphylococci.
Genomics in the long-read sequencing era van Dijk, Erwin L.; Naquin, Delphine; Gorrichon, Kévin ...
Trends in Genetics, 09/2023, Volume 39, Issue 9
Journal Article
Peer reviewed
Long-read sequencing (LRS) methods can now produce highly accurate (ultra)long reads and thus enable for the first time the production of truly complete telomere-to-telomere (T2T) assemblies of complex genomes. While the recently produced T2T assemblies were labor-intensive and required a combination of various techniques, in the near future T2T assemblies based on LRS alone will be produced. We have entered the era of population scale long-read sequencing, where graph-based pangenomes much better representing genomic variation will increasingly be used in the near future. New applications of LRS appear at a rapid pace, with examples ranging from higher order chromatin interaction studies to the quality control of mRNA vaccines.
Long-read sequencing (LRS) technologies have provided extremely powerful tools to explore genomes. While in the early years these methods suffered technical limitations, they have recently made significant progress in terms of read length, throughput, and accuracy and bioinformatics tools have strongly improved. Here, we aim to review the current status of LRS technologies, the development of novel methods, and the impact on genomics research. We will explore the most impactful recent findings made possible by these technologies focusing on high-resolution sequencing of genomes and transcriptomes and the direct detection of DNA and RNA modifications. We will also discuss how LRS methods promise a more comprehensive understanding of human genetic variation, transcriptomics, and epigenetics for the coming years.
•SMRT and SGS sequencing could provide comprehensive transcriptome data for Ginkgo.
•LncRNAs, AS, and fusion genes participate in the regulation of flavonoid metabolism.
•Revealing the regulation network associated with flavonoid biosynthesis in G. biloba.
•Revealing the synthetic transport of flavonoids in G. biloba.
Ginkgo biloba, which contains flavonoids as bioactive components, is widely used in traditional Chinese medicine. Increasing the flavonoid production of medicinal plants through genetic engineering generally focuses on the key genes involved in flavonoid biosynthesis. However, the molecular mechanisms underlying such biosynthesis are not yet well understood. To understand these mechanisms, a combination of second-generation sequencing (SGS) and single-molecule real-time (SMRT) sequencing was applied to G. biloba. Eight tissues were sampled for SMRT sequencing to generate a high-quality, full-length transcriptome database. From 23.36 Gb of clean reads, 12,954 alternative polyadenylation events, 12,290 alternative splicing events, 929 fusion transcripts, 2,286 novel transcripts, and 1,270 lncRNAs were predicted after removing redundant reads. Further analysis identified 7 AS, 5 lncRNA, and 6 fusion gene events in flavonoid biosynthesis. Co-expression network construction revealed a total of 12 gene modules involving flavonoid metabolism structural genes and transcription factors. Weighted gene co-expression network analysis (WGCNA) reveals that some hub genes, identified among transcription factors (TFs) and structural genes, operate during the biosynthesis. Seven key hub genes were also identified by analyzing the correlation between gene expression level and flavonoid content. The results highlight the importance of SMRT sequencing of the full-length transcriptome in improving genome annotation and elucidating the gene regulation of flavonoid biosynthesis in G. biloba by providing a comprehensive set of reference transcripts.
•We inoculated eight corn fields with an arbuscular mycorrhizal fungus.
•Abundance of native mycorrhizal fungi was negatively related to soil nutrient contents.
•Competitive effects on native fungal communities determined establishment success of the inoculant.
•Phosphorus modulated the competitive effects between native and introduced fungi.
•Crop growth response to inoculation was negatively correlated with the amount of P fertilization.
A major strategy to increase the sustainability of agricultural systems consists of enhancing internal ecosystem processes that support crop production and reduce external resource inputs. However, specific approaches to achieve this goal still need to be identified. Here, we investigated whether inoculation with a high dose of a well-characterized strain of a plant symbiotic arbuscular mycorrhizal (AM) fungus into Swiss corn fields leads to successful establishment of the fungus in plant roots and can generate agronomic benefits for maize production.
We used single-molecule real-time (SMRT) DNA sequencing to assess community composition of native AM fungi and identified environmental management and biological factors affecting AM fungal abundance, establishment success of the introduced fungus and effects of AMF inoculation on corn yield.
While native AM fungal abundance was negatively related to soil P contents, we found significantly positive relationships between soil P contents and establishment success of the inoculated fungus. There was a significantly negative relationship between inoculum establishment and abundance of native AM fungi. Although molecular quantification using strain-specific qPCR indicated that the inoculated strain strongly increased in abundance in roots from most soils investigated, total AM fungal root colonization was only significantly increased in one soil, indicating successful competition of the inoculant for root niche space against native AM fungi. Positive effects on corn yield were only observed when inoculation increased root colonization and were negatively correlated to P fertilization levels.
The results imply that phosphorus plays a major role in defining the abundance of native AM fungi and the composition of their communities and that these effects can determine establishment success of the inoculant. The results further indicate that positive effects on crop yield may only be expected when potentially achievable root colonization levels are not yet reached and AMF communities are not well developed.
Silage quality remains an important issue in farming, as do limitations in the range of products suitable for animal fodder. We therefore explored the microorganisms that are critical for the fermentation quality of paper mulberry silage. Low (unwilted) and high (wilted) dry matter (DM) paper mulberry were harvested at two cutting times. These were ensiled for 0, 3, 7, 14, and 56 days, respectively. Compared with unwilted silages, wilting significantly decreased (p < 0.05) silage pH value, ammonia‐N concentration, and yeast counts but increased (p < 0.05) lactic acid content. In addition, higher (p < 0.05) crude protein (CP) contents were also observed in wilted silages. Next‐generation sequencing (NGS) analysis revealed that wilting reduced the abundance of Enterobacter, while increasing that of Lactobacillus. Single‐molecule real‐time sequencing (SMRT) revealed that the silage was enriched in the lactic acid bacteria (LAB), Lactobacillus rhamnosus after wilting, which showed a positive correlation with CP and lactic acid content. We conclude that wilting may help preserve paper mulberry silage, facilitating its use as a new fodder resource. Moreover, L. rhamnosus has the potential to be developed as a new inoculant for the modulation in wilted silages, particularly paper mulberry silage.
The bacterial community was analyzed by a combination of NGS and SMRT sequencing. Wilting increased the abundance of Lactobacillus rhamnosus. Lactobacillus rhamnosus was the critical species, showing the strongest positive correlation with wilted silage quality, and has the potential to be developed as a new silage inoculant.
Summary
Alternative splicing (AS) is a key post‐transcriptional regulatory mechanism, yet little is known about its roles in fruit crops. Here, AS was globally analyzed in the wild strawberry Fragaria vesca genome with RNA‐seq data derived from different stages of fruit development. The AS landscape was characterized and compared between the single‐molecule, real‐time (SMRT) and Illumina RNA‐seq platforms. While SMRT has a lower sequencing depth, it identifies more genes undergoing AS (57.67% of detected multiexon genes) than Illumina (33.48%), illustrating the efficacy of SMRT in AS identification. We investigated different modes of AS in the context of fruit development; the percentage of intron retention (IR) is markedly reduced whereas that of alternative acceptor sites (AA) is significantly increased post‐fertilization when compared with pre‐fertilization. When all the identified transcripts were combined, a total of 66.43% of detected multiexon genes in strawberry undergo AS, some of which lead to a gain or loss of conserved domains in the gene products. The work demonstrates that SMRT sequencing is highly powerful in AS discovery and provides a rich data resource for later functional studies of different isoforms. Further, shifting AS modes may contribute to rapid changes of gene expression during fruit set.
Significance Statement
Alternative splicing is a key post‐transcriptional regulatory mechanism, yet little is known about its roles in fruit crops. Here we globally analyzed alternative splicing using transcriptome data from strawberry fruits, and show that single molecule real time (SMRT) RNA‐seq data is highly efficient in uncovering alternatively spliced variants.
In this study, we analyzed the fermentation quality, microbial community, and metabolome characteristics of ryegrass silage from different harvests (first harvest-AK, second harvest-BK, and third harvest-CK) and analyzed the correlation between fermentative bacteria and metabolites. The bacterial community and metabolomic characteristics were analyzed by single-molecule real-time (SMRT) sequencing and ultra-high-performance liquid chromatography-mass spectrometry (UHPLC-MS/MS), respectively. After 60 days of ensiling, the pH of BK was significantly lower than those of AK and CK, and its lactic acid content was significantly higher than those of AK and CK. Lactiplantibacillus and Enterococcus genera dominate the microbiota of silage obtained from ryegrass harvested at three different harvests. In addition, the BK group had the highest abundance of Lactiplantibacillus plantarum (58.66%), and the CK group had the highest abundance of Enterococcus faecalis (42.88%). The most annotated metabolites among the differential metabolites of different harvests were peptides, and eight amino acids were dominant in the composition of the identified peptides. In the ryegrass silage, arginine, alanine, aspartate, and glutamate biosynthesis had the highest enrichment ratio in the metabolic pathway of KEGG pathway enrichment analysis. Valyl-isoleucine and glutamylvaline were positively correlated with Lactiplantibacillus plantarum. D-Pipecolic acid and L-glutamic acid were positively correlated with Levilactobacillus brevis. L-phenylalanyl-L-proline, 3,4,5-trihydroxy-6-(2-methoxybenzoyloxy) oxane-2-carboxylic acid, and shikimic acid were negatively correlated with Levilactobacillus brevis. In conclusion, this study explains the effects of different harvest frequencies on the fermentation quality, microbial community, and metabolites of ryegrass, and improves our understanding of the ensiling mechanisms associated with different ryegrass harvesting frequencies.
The Great Himalayan Leaf-nosed bat is one of the most representative species of all echolocating bats and is an ideal model for studying the echolocation system of bats. An incomplete reference genome and limited availability of full-length cDNAs have hindered the identification of alternatively spliced transcripts, which has slowed related basic studies on bats' echolocation and evolution. In this study, we analyzed five organs from this species for the first time using PacBio single-molecule real-time (SMRT) sequencing. There were 120 GB of subreads generated, including 1,472,058 full-length non-chimeric (FLNC) sequences. A total of 34,611 alternative splicing (AS) events and 66,010 alternative polyadenylation (APA) sites were detected by transcriptome structural analysis. Moreover, a total of 110,611 isoforms were identified, consisting of 52% new isoforms of known genes and 5% of novel gene loci, as well as 2112 novel genes that have not been annotated before in the current reference genome of this species. Furthermore, several key novel genes were identified as being associated with nervous, signal transduction, and immune system processes, which may be involved in regulating the auditory nervous perception and immune system that help bats regulate echolocation. In conclusion, the full-length transcriptome results optimized and replenished the existing genome annotation in multiple ways and offer advantages for newly discovered or previously unrecognized protein-coding genes and isoforms, which can be used as a reference resource.
Next generation sequencing (NGS) technology has revolutionized genomic and genetic research. The pace of change in this area is rapid with three major new sequencing platforms having been released in 2011: Ion Torrent's PGM, Pacific Biosciences' RS and the Illumina MiSeq. Here we compare the results obtained with those platforms to the performance of the Illumina HiSeq, the current market leader. In order to compare these platforms, and get sufficient coverage depth to allow meaningful analysis, we have sequenced a set of 4 microbial genomes with mean GC content ranging from 19.3 to 67.7%. Together, these represent a comprehensive range of genome content. Here we report our analysis of that sequence data in terms of coverage distribution, bias, GC distribution, variant detection and accuracy.
Sequence generated by Ion Torrent, MiSeq and Pacific Biosciences technologies displays near perfect coverage behaviour on GC-rich, neutral and moderately AT-rich genomes, but a profound bias was observed upon sequencing the extremely AT-rich genome of Plasmodium falciparum on the PGM, resulting in no coverage for approximately 30% of the genome. We analysed the ability to call variants from each platform and found that we could call slightly more variants from Ion Torrent data compared to MiSeq data, but at the expense of a higher false positive rate. Variant calling from Pacific Biosciences data was possible but higher coverage depth was required. Context specific errors were observed in both PGM and MiSeq data, but not in that from the Pacific Biosciences platform.
All three fast turnaround sequencers evaluated here were able to generate usable sequence. However there are key differences between the quality of that data and the applications it will support.
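The GC content figures quoted in this abstract (19.3 to 67.7%) are simple base-composition percentages. As a minimal sketch, assuming a plain nucleotide string as input, the calculation could look like the following; the function name `gc_content` and the choice to skip ambiguous bases such as N are illustrative assumptions, not the authors' pipeline.

```python
def gc_content(sequence: str) -> float:
    """Return the GC percentage of a nucleotide sequence.

    Assumption: only unambiguous A/C/G/T bases are counted; anything
    else (for example N) is ignored rather than treated as an error.
    """
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


if __name__ == "__main__":
    # Toy AT-rich fragment: 2 of 10 bases are G or C.
    print(gc_content("AATTAATTGC"))
```

For whole genomes the same ratio would be accumulated across all records of a FASTA file rather than computed on a single string.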
We report the first whole genome sequence (WGS) assembly and annotation of a dwarf coconut variety, 'Catigan Green Dwarf' (CATD). The genome sequence was generated using the PacBio SMRT sequencing platform at 15X coverage of the expected genome size of 2.15 Gbp, which was corrected with assembled 50X Illumina paired-end MiSeq reads of the same genome. The draft genome was improved through Chicago sequencing to generate a scaffold assembly that results in a total genome size of 2.1 Gbp consisting of 7,998 scaffolds with N50 of 570,487 bp. The final assembly covers around 97.6% of the estimated genome size of coconut 'CATD' based on homozygous k-mer peak analysis. A total of 34,958 high-confidence gene models were predicted and functionally associated to various economically important traits, such as pest/disease resistance, drought tolerance, coconut oil biosynthesis, and putative transcription factors. The assembled genome was used to infer the evolutionary relationship within the palm family based on genomic variations and synteny of coding gene sequences. Data show that at least three (3) rounds of whole genome duplication occurred and are commonly shared by these members of the palm family. A total of 7,139 unique SSR markers were designed to be used as a resource in marker-based breeding. In addition, we discovered 58,503 variants in coconut by aligning the Hainan Tall (HAT) WGS reads to the non-repetitive regions of the assembled CATD genome. The gene markers and genome-wide SSR markers established here will facilitate the development of varieties with resilience to climate change, resistance to pests and diseases, and improved oil yield and quality.
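The scaffold N50 reported in this abstract is a standard assembly contiguity statistic: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. A minimal sketch under that common definition follows; the function name `n50` and the toy lengths are illustrative and are not taken from the coconut assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold or contig lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that pushes the running sum to at least
        # half of the total assembly length defines the N50.
        if running >= half_total:
            return length


if __name__ == "__main__":
    toy_scaffolds = [80, 70, 50, 40, 30, 20]  # hypothetical lengths
    print(n50(toy_scaffolds))

    # Arithmetic check on figures quoted in the abstract: 97.6% of the
    # 2.15 Gbp estimate is about 2.1 Gbp, matching the assembly size.
    print(round(0.976 * 2.15, 2))
```

N50 rewards a small number of long scaffolds, so it is usually read alongside the scaffold count and the total assembled length, both of which the abstract also reports.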