Bacteria and their viruses (phage) are fundamental drivers of many ecosystem processes including global biogeochemistry and horizontal gene transfer. While databases and resources for studying ...function in uncultured bacterial communities are relatively advanced, many fewer exist for their viral counterparts. The issue is largely technical in that the majority (often 90%) of viral sequences are functionally 'unknown' making viruses a virtually untapped resource of functional and physiological information. Here, we provide a community resource that organizes this unknown sequence space into 27 K high confidence protein clusters using 32 viral metagenomes from four biogeographic regions in the Pacific Ocean that vary by season, depth, and proximity to land, and include some of the first deep pelagic ocean viral metagenomes. These protein clusters more than double currently available viral protein clusters, including those from environmental datasets. Further, a protein cluster guided analysis of functional diversity revealed that richness decreased (i) from deep to surface waters, (ii) from winter to summer, (iii) and with distance from shore in surface waters only. These data provide a framework from which to draw on for future metadata-enabled functional inquiries of the vast viral unknown.
The gut microbiome profoundly affects human health and disease, and their infecting viruses are likely as important, but often missed because of reference database limitations. Here, we (1) built a ...human Gut Virome Database (GVD) from 2,697 viral particle or microbial metagenomes from 1,986 individuals representing 16 countries, (2) assess its effectiveness, and (3) report a meta-analysis that reveals age-dependent patterns across healthy Westerners. The GVD contains 33,242 unique viral populations (approximately species-level taxa) and improves average viral detection rates over viral RefSeq and IMG/VR nearly 182-fold and 2.6-fold, respectively. GVD meta-analyses show highly personalized viromes, reveal that inter-study variability from technical artifacts is larger than any “disease” effect at the population level, and document how viral diversity changes from human infancy into senescence. Together, this compact foundational resource, these standardization guidelines, and these meta-analysis findings provide a systematic toolkit to help maximize our understanding of viral roles in health and disease.
Display omitted
•Assembly of 2,697 gut metagenomes from 32 studies exposed 33,242 viral populations•Inter-study analyses reveal strong study biases at the viral population-level•Viral population detection was higher in bulk versus VLP-enriched metagenomes•Gut viral diversity is age-dependent across healthy, Western people
At least 32 studies to date have looked at the human gut virome but with limited consistency. Gregory and Zablocki et al. curate and aggregate these data to provide a systematic virome database; use it to assess study biases, global ecological patterns; and show how viromes evolve throughout the human lifespan.
Marine viruses have important roles in microbial mortality, gene transfer, metabolic reprogramming and biogeochemical cycling. In this Review, we discuss recent technological advances in marine ...virology including the use of near-quantitative, reproducible metagenomics for large-scale investigation of viral communities and the emergence of gene-based viral ecology. We also describe the reprogramming of microbially driven processes by viral metabolic genes, the identification of novel viruses using cultivation-dependent and cultivation-independent tools, and the potential for modelling studies to provide a framework for studying virus-host interactions. These transformative advances have set a rapid pace in exploring and predicting how marine viruses manipulate and respond to their environment.
Microbiomes from every environment contain a myriad of uncultivated archaeal and bacterial viruses, but studying these viruses is hampered by the lack of a universal, scalable taxonomic framework. We ...present vConTACT v.2.0, a network-based application utilizing whole genome gene-sharing profiles for virus taxonomy that integrates distance-based hierarchical clustering and confidence scores for all taxonomic predictions. We report near-identical (96%) replication of existing genus-level viral taxonomy assignments from the International Committee on Taxonomy of Viruses for National Center for Biotechnology Information virus RefSeq. Application of vConTACT v.2.0 to 1,364 previously unclassified viruses deposited in virus RefSeq as reference genomes produced automatic, high-confidence genus assignments for 820 of the 1,364. We applied vConTACT v.2.0 to analyze 15,280 Global Ocean Virome genome fragments and were able to provide taxonomic assignments for 31% of these data, which shows that our algorithm is scalable to very large metagenomic datasets. Our taxonomy tool can be automated and applied to metagenomes from any environment for virus classification.
Viruses that infect bacteria (phages) can influence bacterial community dynamics, bacterial genome evolution and ecosystem biogeochemistry. These influences differ depending on whether phages ...establish lytic, chronic or lysogenic infections. Although the first two produce virion progeny, with lytic infections resulting in cell destruction, phages undergoing lysogenic infections replicate with cells without producing virions. The impacts of lysogeny are numerous and well-studied at the cellular level, but ecosystem-level consequences remain underexplored compared to those of lytic infections. Here, we review lysogeny from molecular mechanisms to ecological patterns to emerging approaches of investigation. Our goal is to highlight both its diversity and importance in complex communities. Altogether, using a combined viral ecology toolkit that is applied across broad model systems and environments will help us understand more of the diverse lifestyles and ecological impacts of lysogens in nature.
Microbial and viral communities transform the chemistry of Earth's ecosystems, yet the specific reactions catalyzed by these biological engines are hard to decode due to the absence of a scalable, ...metabolically resolved, annotation software. Here, we present DRAM (Distilled and Refined Annotation of Metabolism), a framework to translate the deluge of microbiome-based genomic information into a catalog of microbial traits. To demonstrate the applicability of DRAM across metabolically diverse genomes, we evaluated DRAM performance on a defined, in silico soil community and previously published human gut metagenomes. We show that DRAM accurately assigned microbial contributions to geochemical cycles and automated the partitioning of gut microbial carbohydrate metabolism at substrate levels. DRAM-v, the viral mode of DRAM, established rules to identify virally-encoded auxiliary metabolic genes (AMGs), resulting in the metabolic categorization of thousands of putative AMGs from soils and guts. Together DRAM and DRAM-v provide critical metabolic profiling capabilities that decipher mechanisms underpinning microbiome function.
Viruses of microbes impact all ecosystems where microbes drive key energy and substrate transformations including the oceans, humans and industrial fermenters. However, despite this recognized ...importance, our understanding of viral diversity and impacts remains limited by too few model systems and reference genomes. One way to fill these gaps in our knowledge of viral diversity is through the detection of viral signal in microbial genomic data. While multiple approaches have been developed and applied for the detection of prophages (viral genomes integrated in a microbial genome), new types of microbial genomic data are emerging that are more fragmented and larger scale, such as Single-cell Amplified Genomes (SAGs) of uncultivated organisms or genomic fragments assembled from metagenomic sequencing. Here, we present VirSorter, a tool designed to detect viral signal in these different types of microbial sequence data in both a reference-dependent and reference-independent manner, leveraging probabilistic models and extensive virome data to maximize detection of novel viruses. Performance testing shows that VirSorter's prophage prediction capability compares to that of available prophage predictors for complete genomes, but is superior in predicting viral sequences outside of a host genome (i.e., from extrachromosomal prophages, lytic infections, or partially assembled prophages). Furthermore, VirSorter outperforms existing tools for fragmented genomic and metagenomic datasets, and can identify viral signal in assembled sequence (contigs) as short as 3kb, while providing near-perfect identification (>95% Recall and 100% Precision) on contigs of at least 10kb. Because VirSorter scales to large datasets, it can also be used in "reverse" to more confidently identify viral sequence in viral metagenomes by sorting away cellular DNA whether derived from gene transfer agents, generalized transduction or contamination. Finally, VirSorter is made available through the iPlant Cyberinfrastructure that provides a web-based user interface interconnected with the required computing resources. VirSorter thus complements existing prophage prediction softwares to better leverage fragmented, SAG and metagenomic datasets in a way that will scale to modern sequencing. Given these features, VirSorter should enable the discovery of new viruses in microbial datasets, and further our understanding of uncultivated viral communities across diverse ecosystems.
The ecological importance of viruses is now widely recognized, yet our limited knowledge of viral sequence space and virus-host interactions precludes accurate prediction of their roles and impacts. ...In this study, we mined publicly available bacterial and archaeal genomic data sets to identify 12,498 high-confidence viral genomes linked to their microbial hosts. These data augment public data sets 10-fold, provide first viral sequences for 13 new bacterial phyla including ecologically abundant phyla, and help taxonomically identify 7-38% of 'unknown' sequence space in viromes. Genome- and network-based classification was largely consistent with accepted viral taxonomy and suggested that (i) 264 new viral genera were identified (doubling known genera) and (ii) cross-taxon genomic recombination is limited. Further analyses provided empirical data on extrachromosomal prophages and coinfection prevalences, as well as evaluation of in silico virus-host linkage predictions. Together these findings illustrate the value of mining viral signal from microbial genomes.
A planetary-scale understanding of the ocean ecosystem, particularly in light of climate change, is crucial. Here, we review the work of Tara Oceans, an international, multidisciplinary project to ...assess the complexity of ocean life across comprehensive taxonomic and spatial scales. Using a modified sailing boat, the team sampled plankton at 210 globally distributed sites at depths down to 1,000 m. We describe publicly available resources of molecular, morphological and environmental data, and discuss how an ecosystems biology approach has expanded our understanding of plankton diversity and ecology in the ocean as a planetary, interconnected ecosystem. These efforts illustrate how global-scale concepts and data can help to integrate biological complexity into models and serve as a baseline for assessing ecosystem changes and the future habitability of our planet in the Anthropocene epoch.
Viruses are a significant player in many biosphere and human ecosystems, but most signals remain "hidden" in metagenomic/metatranscriptomic sequence datasets due to the lack of universal gene ...markers, database representatives, and insufficiently advanced identification tools.
Here, we introduce VirSorter2, a DNA and RNA virus identification tool that leverages genome-informed database advances across a collection of customized automatic classifiers to improve the accuracy and range of virus sequence detection. When benchmarked against genomes from both isolated and uncultivated viruses, VirSorter2 uniquely performed consistently with high accuracy (F1-score > 0.8) across viral diversity, while all other tools under-detected viruses outside of the group most represented in reference databases (i.e., those in the order Caudovirales). Among the tools evaluated, VirSorter2 was also uniquely able to minimize errors associated with atypical cellular sequences including eukaryotic genomes and plasmids. Finally, as the virosphere exploration unravels novel viral sequences, VirSorter2's modular design makes it inherently able to expand to new types of viruses via the design of new classifiers to maintain maximal sensitivity and specificity.
With multi-classifier and modular design, VirSorter2 demonstrates higher overall accuracy across major viral groups and will advance our knowledge of virus evolution, diversity, and virus-microbe interaction in various ecosystems. Source code of VirSorter2 is freely available ( https://bitbucket.org/MAVERICLab/virsorter2 ), and VirSorter2 is also available both on bioconda and as an iVirus app on CyVerse ( https://de.cyverse.org/de ). Video abstract.