Abstract
The small subunit ribosomal RNA gene (16S rRNA) has been successfully used to catalogue and study the diversity of prokaryotic species and communities, but it offers limited resolution at the species and finer levels, and cannot represent the whole-genome diversity and fluidity. To overcome these limitations, we introduced the Microbial Genomes Atlas (MiGA), a webserver that allows the classification of an unknown query genomic sequence, complete or partial, against all taxonomically classified taxa with available genome sequences, as well as comparisons to other related genomes including uncultivated ones, based on the genome-aggregate Average Nucleotide and Amino Acid Identity (ANI/AAI) concepts. MiGA integrates best practices in sequence quality trimming and assembly and allows input to be raw reads or assemblies from isolate genomes, single-cell sequences, and metagenome-assembled genomes (MAGs). Further, MiGA can take as input hundreds of closely related genomes of the same or closely related species (a so-called 'Clade Project') to assess their gene content diversity and evolutionary relationships, and calculate important clade properties such as the pangenome and core gene sets. Therefore, MiGA is expected to facilitate a range of genome-based taxonomic and diversity studies, and quality assessment across environmental and clinical settings. MiGA is available at http://microbial-genomes.org/.
Ribosomal RNA genes have become the standard molecular markers for microbial community analysis for good reasons, including universal occurrence in cellular organisms, availability of large databases, and ease of rRNA gene region amplification and analysis. As markers, however, rRNA genes have some significant limitations. The rRNA genes are often present in multiple copies, unlike most protein-coding genes. The slow rate of change in rRNA genes means that multiple species sometimes share identical 16S rRNA gene sequences, while many more species share identical sequences in the short 16S rRNA regions commonly analyzed. In addition, the genes involved in many important processes are not distributed in a phylogenetically coherent manner, potentially due to gene loss or horizontal gene transfer. While rRNA genes remain the most commonly used markers, key genes in ecologically important pathways, e.g., those involved in carbon and nitrogen cycling, can provide important insights into community composition and function not obtainable through rRNA analysis. However, working with ecofunctional gene data requires some tools beyond those required for rRNA analysis. To address this, our Functional Gene Pipeline and Repository (FunGene; http://fungene.cme.msu.edu/) offers databases of many common ecofunctional genes and proteins, as well as integrated tools that allow researchers to browse these collections and choose subsets for further analysis, build phylogenetic trees, test primers and probes for coverage, and download aligned sequences. Additional FunGene tools are specialized to process coding gene amplicon data. For example, FrameBot produces frameshift-corrected protein and DNA sequences from raw reads while finding the most closely related protein reference sequence. These tools can help provide better insight into microbial communities by directly studying key genes involved in important ecological processes.
The Ribosomal Database Project (RDP; http://rdp.cme.msu.edu/) provides the research community with aligned and annotated rRNA gene sequence data, along with tools to allow researchers to analyze their own rRNA gene sequences in the RDP framework. RDP data and tools are utilized in fields as diverse as human health, microbial ecology, environmental microbiology, nucleic acid chemistry, taxonomy and phylogenetics. In addition to aligned and annotated collections of bacterial and archaeal small subunit rRNA genes, RDP now includes a collection of fungal large subunit rRNA genes. RDP tools, including Classifier and Aligner, have been updated to work with this new fungal collection. The use of high-throughput sequencing to characterize environmental microbial populations has exploded in the past several years, and as sequence technologies have improved, the sizes of environmental datasets have increased. With release 11, RDP is providing an expanded set of tools to facilitate analysis of high-throughput data, including both single-stranded and paired-end reads. In addition, most tools are now available as open source packages for download and local use by researchers with high-volume needs or who would like to develop custom analysis pipelines.
Abstract
Motivation
Much global attention has been paid to antibiotic resistance in monitoring its emergence, accumulation and dissemination. For rapid characterization and quantification of antibiotic resistance genes (ARGs) in metagenomic datasets, an online analysis pipeline, ARGs-OAP, has been developed. It includes a database termed Structured Antibiotic Resistance Genes (SARG), which has a hierarchical structure (ARG type-subtype-reference sequence).
Results
The new release of the database, termed SARG version 2.0, contains sequences not only from the CARD and ARDB databases, but also carefully selected and curated sequences from the latest protein collection of the NCBI-NR database, to keep up to date with the increasing number of deposited ARG sequences. SARG v2.0 contains three times as many sequences as the first version and demonstrated improved coverage of ARG detection in metagenomes from various environmental samples. In addition to annotation of high-throughput raw reads using a similarity search strategy, ARGs-OAP v2.0 now provides model-based identification of assembled sequences using SARGfam, a high-quality set of profile Hidden Markov Models (HMMs) for ARG subtypes. Additionally, ARGs-OAP v2.0 improves cell number quantification by using the average coverage of essential single-copy marker genes, as an option in addition to the previous method based on the 16S rRNA gene.
Availability and implementation
ARGs-OAP can be accessed through http://smile.hku.hk/SARGs. The database can be downloaded from the same site. Source code for this study can be downloaded from https://github.com/xiaole99/ARGs-OAP-v2.0.
Supplementary information
Supplementary data are available at Bioinformatics online.
The Ribosomal Database Project (RDP) Classifier, a naïve Bayesian classifier, can rapidly and accurately classify bacterial 16S rRNA sequences into the new higher-order taxonomy proposed in Bergey's Taxonomic Outline of the Prokaryotes (2nd ed., release 5.0, Springer-Verlag, New York, NY, 2004). It provides taxonomic assignments from domain to genus, with confidence estimates for each assignment. The majority of classifications (98%) were of high estimated confidence (≥95%) and high accuracy (98%). In addition to being tested with the corpus of 5,014 type strain sequences from Bergey's outline, the RDP Classifier was tested with a corpus of 23,095 rRNA sequences as assigned by the NCBI into their alternative higher-order taxonomy. The results from leave-one-out testing on both corpora show that the overall accuracies at all levels of confidence for near-full-length and 400-base segments were 89% or above down to the genus level, and the majority of the classification errors appear to be due to anomalies in the current taxonomies. For shorter rRNA segments, such as those that might be generated by pyrosequencing, the error rate varied greatly over the length of the 16S rRNA gene, with segments around the V2 and V4 variable regions giving the lowest error rates. The RDP Classifier is suitable both for the analysis of single rRNA sequences and for the analysis of libraries of thousands of sequences. Another related tool, RDP Library Compare, was developed to facilitate microbial-community comparison based on 16S rRNA gene sequence libraries. It combines the RDP Classifier with a statistical test to flag taxa differentially represented between samples. The RDP Classifier and RDP Library Compare are available online at http://rdp.cme.msu.edu/.
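The idea of a naïve Bayesian classifier that reports a confidence estimate with each assignment can be illustrated with a minimal sketch. The abstract does not state the word size, the smoothing, or how confidence is computed, so everything below is an assumption for illustration: sequences are split into overlapping words, each taxon is scored by summed log word probabilities, and confidence is taken as the fraction of bootstrap trials (each on a random subset of words) that agree with the full-sequence call. The function names and toy sequences are hypothetical and this is not the RDP code.

```python
import math
import random
from collections import Counter


def words(seq, k):
    """Return the set of overlapping k-letter words in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def train(refs, k):
    """Build a word-count model from (taxon, sequence) training pairs."""
    word_count = Counter()   # number of training sequences containing each word
    taxon_word = {}          # taxon -> per-word count of sequences containing it
    taxon_size = Counter()   # taxon -> number of training sequences
    for taxon, seq in refs:
        ws = words(seq, k)
        word_count.update(ws)
        taxon_word.setdefault(taxon, Counter()).update(ws)
        taxon_size[taxon] += 1
    n = len(refs)
    # Assumed smoothing: a word-specific prior from the whole training set.
    prior = {w: (c + 0.5) / (n + 1) for w, c in word_count.items()}
    return {"k": k, "n": n, "prior": prior,
            "taxon_word": taxon_word, "taxon_size": taxon_size}


def score(model, taxon, ws):
    """Sum of log P(word | taxon) over the query words."""
    counts = model["taxon_word"][taxon]
    size = model["taxon_size"][taxon]
    unseen = 0.5 / (model["n"] + 1)  # prior for words absent from training data
    return sum(
        math.log((counts[w] + model["prior"].get(w, unseen)) / (size + 1))
        for w in ws
    )


def best_taxon(model, ws):
    """Taxon with the highest score; ties resolve alphabetically."""
    return max(sorted(model["taxon_word"]), key=lambda t: score(model, t, ws))


def classify(model, seq, trials=100, seed=0):
    """Return (taxon, confidence), confidence being bootstrap agreement."""
    rng = random.Random(seed)
    ws = sorted(words(seq, model["k"]))
    call = best_taxon(model, ws)
    subset = max(1, len(ws) // 8)  # assumed subset size per bootstrap trial
    hits = sum(
        best_taxon(model, rng.sample(ws, subset)) == call
        for _ in range(trials)
    )
    return call, hits / trials


# Toy training set (hypothetical sequences, far shorter than real rRNA genes).
refs = [
    ("GenusA", "ACGTACGTTTGACCA"),
    ("GenusA", "ACGTACGTTAGACCA"),
    ("GenusB", "GGCCTTAAGGCATGC"),
    ("GenusB", "GGCCTTAAGGCTTGC"),
]
model = train(refs, k=4)
taxon, confidence = classify(model, "ACGTACGTTTGACCA")
```

Under this scheme, a query whose words are split between taxa would see its bootstrap calls disagree, lowering the reported confidence; that is the behaviour a confidence cutoff relies on.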
The Canadian Earth System Model version 5 (CanESM5) is a global
model developed to simulate historical climate change and variability, to
make centennial-scale projections of future climate, and to produce
initialized seasonal and decadal predictions. This paper describes the model
components and their coupling, as well as various aspects of model
development, including tuning, optimization, and a reproducibility strategy.
We also document the stability of the model using a long control simulation,
quantify the model's ability to reproduce large-scale features of the
historical climate, and evaluate the response of the model to external
forcing. CanESM5 comprises three-dimensional atmosphere (T63 spectral
resolution equivalent roughly to 2.8°) and ocean (nominally 1°) general
circulation models, a sea-ice model, a land surface scheme, and explicit
land and ocean carbon cycle models. The model features relatively coarse
resolution and high throughput, which facilitates the production of large
ensembles. CanESM5 has a notably higher equilibrium climate sensitivity
(5.6 K) than its predecessor, CanESM2 (3.7 K), which we briefly discuss, along
with simulated changes over the historical period. CanESM5 simulations
contribute to the Coupled Model Intercomparison Project phase 6 (CMIP6)
and will be employed for climate science and service applications in Canada.
Soil is an important reservoir of antibiotic resistance genes (ARGs), but their potential risk in different ecosystems as well as response to anthropogenic land use change is unknown. We used a metagenomic approach and datasets with well-characterized metadata to investigate ARG types and amounts in soil DNA of three native ecosystems: Alaskan tundra, US Midwestern prairie, and Amazon rainforest, as well as the effect of conversion of the latter two to agriculture and pasture, respectively.
High diversity (242 ARG subtypes) and abundance (0.184-0.242 ARG copies per 16S rRNA gene copy) were observed irrespective of ecosystem, with multidrug resistance and efflux pump the dominant class and mechanism. Ten regulatory genes were identified, and they accounted for 13-35% of resistome abundances in soils; among them, arlR, cpxR, ompR, vanR, and vanS were dominant and observed in all studied soils. We identified 55 non-regulatory ARGs shared by all 26 soil metagenomes of the three ecosystems, which accounted for more than 81% of non-regulatory resistome abundance. Proteobacteria, Firmicutes, and Actinobacteria were the primary ARG hosts; 7 of the 10 most abundant ARGs were found in all of them. No significant differences in either ARG diversity or abundance were observed between native prairie soil and adjacent long-term cultivated agriculture soil. We chose 12 clinically important ARGs to evaluate at the sequence level and found them to be distinct from those in human pathogens, and when assembled they were even more dissimilar. Significant correlation was found between bacterial community structure and resistome profile, suggesting that variance in resistome profile was mainly driven by the bacterial community composition.
Our results identify candidate background ARGs (shared in all 26 soils), classify ARG hosts, quantify resistance classes, and provide quantitative and sequence information suggestive of very low risk but also revealing resistance gene variants that might emerge in the future.
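The abundance unit used above, ARG copies per 16S rRNA gene copy, can be made concrete with a small worked sketch. The abstract does not give the formula, so the calculation below is an assumption: read counts are converted to approximate gene copies by scaling with read length over reference gene length, and the ARG total is divided by the 16S rRNA gene total. The function name, reference identifiers, and all numbers are hypothetical.

```python
def arg_copies_per_16s(arg_hits, n_16s_reads, read_len, len_16s):
    """Estimate ARG copies per 16S rRNA gene copy from read counts.

    arg_hits maps a reference id to (reads_assigned, reference_length_nt).
    Each read covers read_len / gene_length of a gene, so
    reads * read_len / gene_length approximates the number of gene copies.
    """
    arg_copies = sum(
        n_reads * read_len / ref_len for n_reads, ref_len in arg_hits.values()
    )
    copies_16s = n_16s_reads * read_len / len_16s
    return arg_copies / copies_16s


# Illustrative numbers only; none are taken from the study.
example_hits = {"refA": (30, 900), "refB": (10, 1500)}
ratio = arg_copies_per_16s(
    example_hits, n_16s_reads=200, read_len=100, len_16s=1500
)
```

Here the ARG reads amount to about 4 gene copies and the 16S reads to about 13.3 copies, giving a ratio of about 0.3. Dividing by gene length keeps long reference genes from appearing more abundant simply because they attract more reads.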
Environmental dissemination of antibiotic resistance genes (ARGs) has become an increasing concern for public health. Metagenomics approaches can effectively detect broad profiles of ARGs in environmental samples; however, the detection and subsequent classification of ARG-like sequences are time consuming and have been severe obstacles in employing metagenomic methods. We sought to accelerate quantification of ARGs in metagenomic data from environmental samples.
A Structured ARG reference database (SARG) was constructed by integrating ARDB and CARD, the two most commonly used databases. SARG was curated to remove redundant sequences and optimized to facilitate query sequence identification by similarity. A database with a hierarchical structure (type-subtype-reference sequence) was then constructed to facilitate classification (assigning each ARG-like sequence to a type, subtype and reference sequence) of sequences identified through similarity search. Utilizing SARG and a previously proposed hybrid functional gene annotation pipeline, we developed an online pipeline called ARGs-OAP for fast annotation and classification of ARG-like sequences from metagenomic data. We also evaluated and proposed a set of criteria important for efficiently conducting metagenomic analysis of ARGs using ARGs-OAP.
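The hierarchical structure (type-subtype-reference sequence) described above lends itself to a simple lookup: once a similarity search has matched a read to a reference sequence, its subtype and type follow directly. The sketch below assumes a plain dictionary as the hierarchy and a precomputed best hit per read; the reference identifiers, the table contents, and the function name are hypothetical and do not reflect the actual SARG files or the ARGs-OAP scripts.

```python
from collections import Counter

# Hypothetical hierarchy: reference sequence id -> (ARG type, ARG subtype).
HIERARCHY = {
    "ref_0001": ("tetracycline", "tetM"),
    "ref_0002": ("tetracycline", "tetM"),
    "ref_0003": ("tetracycline", "tetW"),
    "ref_0004": ("beta-lactam", "TEM-1"),
}


def classify_hits(best_hits, hierarchy):
    """Roll best-hit reference ids up to subtype and type counts.

    best_hits maps a read id to the id of its best-matching reference
    sequence. Hits to references missing from the hierarchy are skipped.
    """
    type_counts = Counter()
    subtype_counts = Counter()
    for ref_id in best_hits.values():
        if ref_id not in hierarchy:
            continue
        arg_type, subtype = hierarchy[ref_id]
        type_counts[arg_type] += 1
        subtype_counts[(arg_type, subtype)] += 1
    return type_counts, subtype_counts


# Toy best hits, as a similarity search might report them (illustrative only).
best_hits = {
    "read1": "ref_0001",
    "read2": "ref_0002",
    "read3": "ref_0003",
    "read4": "ref_0004",
    "read5": "ref_9999",  # not in the hierarchy, so it is ignored
}
type_counts, subtype_counts = classify_hits(best_hits, HIERARCHY)
```

Keeping the subtype key as a (type, subtype) pair means the same subtype name under two types would never be merged by accident.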
The Perl script for ARGs-OAP can be downloaded from https://github.com/biofuture/Ublastx_stageone. ARGs-OAP can be accessed through http://smile.hku.hk/SARGs.
Contact: zhangt@hku.hk or tiedjej@msu.edu
Supplementary data are available at Bioinformatics online.
Antibiotics have been administered to agricultural animals for disease treatment, disease prevention, and growth promotion for over 50 y. The impact of such antibiotic use on the treatment of human diseases is hotly debated. We raised pigs in a highly controlled environment with one portion of the littermates receiving a diet containing performance-enhancing antibiotics chlortetracycline, sulfamethazine, and penicillin (known as ASP250) and the other portion receiving the same diet but without the antibiotics. We used phylogenetic, metagenomic, and quantitative PCR-based approaches to address the impact of antibiotics on the swine gut microbiota. Bacterial phylotypes shifted after 14 d of antibiotic treatment with the medicated pigs showing an increase in Proteobacteria (1-11%) compared with nonmedicated pigs at the same time point. This shift was driven by an increase in Escherichia coli populations. Analysis of the metagenomes showed that microbial functional genes relating to energy production and conversion were increased in the antibiotic-fed pigs. The results also indicate that antibiotic resistance genes increased in abundance and diversity in the medicated swine microbiome despite a high background of resistance genes in nonmedicated swine. Some enriched genes, such as aminoglycoside O-phosphotransferases, confer resistance to antibiotics that were not administered in this study, demonstrating the potential for indirect selection of resistance to classes of antibiotics not fed. The collateral effects of feeding subtherapeutic doses of antibiotics to agricultural animals are apparent and must be considered in cost-benefit analyses.