The perpetually increasing rate at which viral full-genome sequences are being determined is creating a pressing demand for computational tools that will aid the objective classification of these genome sequences. Taxonomic classification approaches that are based on pairwise genetic identity measures are potentially highly automatable and are progressively gaining favour with the International Committee on Taxonomy of Viruses (ICTV). There are, however, various issues with the calculation of such measures that could potentially undermine the accuracy and consistency with which they can be applied to virus classification. Firstly, pairwise sequence identities computed based on multiple sequence alignments rather than on multiple independent pairwise alignments can lead to the deflation of identity scores with increasing dataset sizes. Also, when gap-characters need to be introduced during sequence alignments to account for insertions and deletions, methodological variations in the way that these characters are introduced and handled during pairwise genetic identity calculations can cause high degrees of inconsistency in the way that different methods classify the same sets of sequences. Here we present Sequence Demarcation Tool (SDT), a free user-friendly computer program that aims to provide a robust and highly reproducible means of objectively using pairwise genetic identity calculations to classify any set of nucleotide or amino acid sequences. SDT can produce publication quality pairwise identity plots and colour-coded distance matrices to further aid the classification of sequences according to ICTV approved taxonomic demarcation criteria. Besides a graphical interface version of the program for Windows computers, command-line versions of the program are available for a variety of different operating systems (including a parallel version for cluster computing platforms).
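The sensitivity of identity scores to gap handling can be illustrated with a minimal Python sketch. It is not SDT's code: the function name `pairwise_identity`, the `gaps_as_mismatch` switch, and the two toy aligned sequences are assumptions made for illustration, and the input is taken to be a pair of sequences that have already been aligned to equal length.

```python
def pairwise_identity(a: str, b: str, gaps_as_mismatch: bool = False) -> float:
    """Identity between two already-aligned, equal-length sequences.

    Hypothetical helper for illustration only; not taken from SDT.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            continue  # column is a gap in both sequences: never counted
        if x == "-" or y == "-":
            if gaps_as_mismatch:
                compared += 1  # gap column counted as a mismatch
            continue  # otherwise the gap column is ignored
        compared += 1
        if x == y:
            matches += 1
    return matches / compared if compared else 0.0


# Toy alignment (assumed data): one gap column and one substitution.
aln_a = "ATGC-ATGCA"
aln_b = "ATGCTATGGA"

print(pairwise_identity(aln_a, aln_b))                         # gaps ignored
print(pairwise_identity(aln_a, aln_b, gaps_as_mismatch=True))  # gaps penalised
```

With gap columns ignored, the toy pair scores 8 matches over 9 compared positions; counting the gap column as a mismatch lowers the score to 8 over 10. The same alignment therefore yields two different identities depending on the convention chosen, which is the kind of inconsistency the abstract describes.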
The family Genomoviridae (phylum Cressdnaviricota, class Repensiviricetes, order Geplafuvirales) includes viruses with circular single-stranded DNA genomes encoding two proteins, the capsid protein and the rolling-circle replication initiation protein. The genomes of the vast majority of members in this family have been sequenced directly from diverse environmental or animal- and plant-associated samples, but two genomoviruses have been identified infecting fungi. Since the last taxonomic update of the Genomoviridae, a number of new members of this family have been sequenced. Here, we report on the most recent taxonomic update, including the creation of one new genus, Gemytripvirus, and classification of ~420 new genomoviruses into 164 new species. We also announce the adoption of the “Genus + freeform epithet” binomial system for the naming of all 236 officially recognized species in the family Genomoviridae. The updated taxonomy presented in this article has been accepted by the International Committee on Taxonomy of Viruses (ICTV).
Nucleotide-based intergenomic similarities are useful for understanding how viruses are related to each other and for classifying them. Here we have developed VIRIDIC, which implements the traditional algorithm used by the International Committee on Taxonomy of Viruses (ICTV), Bacterial and Archaeal Viruses Subcommittee, to calculate virus intergenomic similarities. When compared with other software, VIRIDIC gave the best agreement with the traditional algorithm, which is based on the percent identity between two genomes determined by BLASTN. Furthermore, VIRIDIC proved best at estimating the relatedness between more distantly related phages, relatedness that other tools can significantly overestimate. In addition to the intergenomic similarities, VIRIDIC also calculates three indicators of the alignment's ability to capture the relatedness between viruses: the aligned fractions for each genome in a pair and the length ratio between the two genomes. The main output of VIRIDIC is a heatmap integrating the intergenomic similarity values with information regarding the genome lengths and the aligned genome fraction. Additionally, VIRIDIC can group viruses into clusters, based on user-defined intergenomic similarity thresholds. The sensitivity of VIRIDIC is determined by BLASTN. Thus, it is able to capture relationships between viruses that share even short genomic regions, with as low as 65% similarity. Below this similarity level, protein-based analyses should be used, as they are best suited to capture distant relationships. VIRIDIC is available at viridic.icbm.de, both as a web service and as a stand-alone tool. It allows fast analysis of large phage genome datasets, especially in the stand-alone version, which can be run on the user's own servers and can be integrated in bioinformatics pipelines. VIRIDIC was developed having viruses of Bacteria and Archaea in mind; however, it could potentially be used for eukaryotic viruses as well, as long as they are monopartite.
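The similarity value and the three alignment indicators named above (two aligned fractions and a length ratio) can be sketched as follows. This is an illustrative Python sketch, not VIRIDIC's code: the `PairStats` fields, the symmetric similarity formula, and the shorter-over-longer length ratio are assumptions, and the numbers are invented. In practice the counts would come from BLASTN hits between the two genomes.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class PairStats:
    """Hypothetical per-pair summary of alignment hits (all counts in bases)."""
    len_a: int      # length of genome A
    len_b: int      # length of genome B
    ident_ab: int   # identical bases when A is aligned against B
    ident_ba: int   # identical bases when B is aligned against A
    aligned_a: int  # bases of A covered by alignments to B
    aligned_b: int  # bases of B covered by alignments to A


def indicators(p: PairStats) -> Dict[str, float]:
    """Assumed formulation: symmetric similarity plus three indicators."""
    similarity = 100.0 * (p.ident_ab + p.ident_ba) / (p.len_a + p.len_b)
    return {
        "similarity": similarity,
        "aligned_fraction_a": p.aligned_a / p.len_a,
        "aligned_fraction_b": p.aligned_b / p.len_b,
        # Assumption: ratio of the shorter to the longer genome, so it is <= 1.
        "length_ratio": min(p.len_a, p.len_b) / max(p.len_a, p.len_b),
    }


# Invented example: a 40 kb and a 50 kb genome sharing a 32 kb aligned region.
stats = indicators(PairStats(40_000, 50_000, 30_000, 30_000, 32_000, 32_000))
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

In this invented example the two genomes share the same aligned region, yet it covers 80% of the shorter genome and only 64% of the longer one. Reporting the indicators alongside the similarity value makes such asymmetries visible.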
Smacoviruses have small (∼2.3-2.9 kb), circular single-stranded DNA genomes encoding rolling circle replication-associated proteins (Rep) and unique capsid proteins. Although smacoviruses are prevalent in faecal matter of various vertebrates, including humans, none of these viruses have been cultured thus far. Smacoviruses display ∼45% genome-wide sequence diversity, which is very similar to that found within other families of circular Rep-encoding single-stranded (CRESS) DNA viruses, including members of the families Geminiviridae (46% diversity) and Genomoviridae (47% diversity). Here, we announce the creation of a new family Smacoviridae and describe a sequence-based taxonomic framework which was used to classify 83 smacovirus genomes into 43 species within six new genera, Bovismacovirus (n=3), Cosmacovirus (n=1), Dragsmacovirus (n=1), Drosmacovirus (n=3), Huchismacovirus (n=7), and Porprismacovirus (n=28). As in the case of genomoviruses, the species demarcation is based on the genome-wide pairwise identity, whereas genera are established based on the Rep amino acid sequence identity coupled with strong phylogenetic support. A similar sequence-based taxonomic framework should guide the classification of an astonishing diversity of other uncultured and currently unclassified CRESS DNA viruses discovered by metagenomic approaches.
The family Smacoviridae (order Cremevirales, class Arfiviricetes, phylum Cressdnaviricota) comprises viruses with small circular single-stranded DNA genomes of ~2.3-3 kb in length that have primarily been identified in fecal samples of various animals. Smacovirus genomes carry two genes in ambisense orientation encoding a capsid protein and a rolling-circle replication initiation protein, respectively. We have revised the taxonomy of the family by assigning 138 new genomic sequences deposited in GenBank to already established taxa as well as 41 new species and six new genera. Furthermore, we have adopted binomial species nomenclature, conforming to the “Genus + freeform epithet” format for all 84 species from 12 genera. The updated Smacoviridae taxonomy presented in this article has been ratified by the International Committee on Taxonomy of Viruses (ICTV).
Single-stranded (ss) DNA viruses are a major component of the Earth virome. In particular, the circular, Rep-encoding ssDNA (CRESS-DNA) viruses show high diversity and abundance in various habitats. By combining sequence similarity network and phylogenetic analyses of the replication proteins (Rep) belonging to the HUH endonuclease superfamily, we show that the replication machinery of the CRESS-DNA viruses evolved, on three independent occasions, from the Reps of bacterial rolling circle-replicating plasmids. The CRESS-DNA viruses emerged via recombination between such plasmids and cDNA copies of capsid genes of eukaryotic positive-sense RNA viruses. Similarly, the rep genes of prokaryotic DNA viruses appear to have evolved from HUH endonuclease genes of various bacterial and archaeal plasmids. Our findings also suggest that eukaryotic polyomaviruses and papillomaviruses with dsDNA genomes have evolved via parvoviruses from CRESS-DNA viruses. Collectively, our results shed light on the complex evolutionary history of a major class of viruses, revealing its polyphyletic origins.
The family Circoviridae comprises viruses with small, circular, single-stranded DNA (ssDNA) genomes, including the smallest known animal viruses. Members of this family are classified into two genera, Circovirus and Cyclovirus, which are distinguished by the position of the origin of replication relative to the coding regions and the length of the intergenic regions. Within each genus, the species demarcation threshold is 80 % genome-wide nucleotide sequence identity. This is a summary of the International Committee on Taxonomy of Viruses (ICTV) Report on the taxonomy of the Circoviridae, which is available at www.ictv.global/report/circoviridae.
Viruses and mobile genetic elements are molecular parasites or symbionts that coevolve with nearly all forms of cellular life. The route of virus replication and protein expression is determined by the viral genome type. Comparison of these routes led to the classification of viruses into seven "Baltimore classes" (BCs) that define the major features of virus reproduction. However, recent phylogenomic studies identified multiple evolutionary connections among viruses within each of the BCs as well as between different classes. Due to the modular organization of virus genomes, these relationships defy simple representation as lines of descent but rather form complex networks. Phylogenetic analyses of virus hallmark genes combined with analyses of gene-sharing networks show that replication modules of five BCs (three classes of RNA viruses and two classes of reverse-transcribing viruses) evolved from a common ancestor that encoded an RNA-directed RNA polymerase or a reverse transcriptase. Bona fide viruses evolved from this ancestor on multiple, independent occasions via the recruitment of distinct cellular proteins as capsid subunits and other structural components of virions. The single-stranded DNA (ssDNA) viruses are a polyphyletic class, with different groups evolving by recombination between rolling-circle-replicating plasmids, which contributed the replication protein, and positive-sense RNA viruses, which contributed the capsid protein. The double-stranded DNA (dsDNA) viruses are distributed among several large monophyletic groups and arose via the combination of distinct structural modules with equally diverse replication modules. Phylogenomic analyses reveal the finer structure of evolutionary connections among RNA viruses and reverse-transcribing viruses, ssDNA viruses, and large subsets of dsDNA viruses. Taken together, these analyses allow us to outline the global organization of the virus world. Here, we describe the key aspects of this organization and propose a comprehensive hierarchical taxonomy of viruses.
The family Circoviridae contains viruses with covalently closed, circular, single-stranded DNA (ssDNA) genomes, including the smallest known autonomously replicating, capsid-encoding animal pathogens. Members of this family are known to cause fatal diseases in birds and pigs and have been historically classified in one of two genera: Circovirus, which contains avian and porcine pathogens, and Gyrovirus, which includes a single species (Chicken anemia virus). However, over the course of the past six years, viral metagenomic approaches as well as degenerate PCR detection in unconventional hosts and environmental samples have elucidated a broader host range, including fish, a diversity of mammals, and invertebrates, for members of the family Circoviridae. Notably, these methods have uncovered a distinct group of viruses that are closely related to members of the genus Circovirus and comprise a new genus, Cyclovirus. The discovery of new viruses and a re-evaluation of genomic features that characterize members of the Circoviridae prompted a revision of the classification criteria used for this family of animal viruses. Here we provide details on an updated Circoviridae taxonomy ratified by the International Committee on Taxonomy of Viruses in 2016, which establishes the genus Cyclovirus and reassigns the genus Gyrovirus to the family Anelloviridae, a separate lineage of animal viruses that also contains circular ssDNA genomes. In addition, we provide a new species demarcation threshold of 80% genome-wide pairwise identity for members of the family Circoviridae, based on pairwise identity distribution analysis, and list guidelines to distinguish between members of this family and other eukaryotic viruses with circular, ssDNA genomes.
With the advent of metagenomics approaches, a large diversity of known and unknown viruses has been identified in various types of environmental, plant, and animal samples. One such widespread virus group is the recently established family Genomoviridae, which includes viruses with small (∼2–2.4 kb), circular ssDNA genomes encoding rolling-circle replication initiation proteins (Rep) and unique capsid proteins. Here, we propose a sequence-based taxonomic framework for classification of 121 new virus genomes within this family. Genomoviruses display ∼47% sequence diversity, which is very similar to that within the well-established and extensively studied family Geminiviridae (46% diversity). Based on our analysis, we establish a 78% genome-wide pairwise identity as a species demarcation threshold. Furthermore, using a Rep sequence phylogeny-based analysis coupled with the current knowledge on the classification of geminiviruses, we establish nine genera within the Genomoviridae family. These are Gemycircularvirus (n = 73), Gemyduguivirus (n = 1), Gemygorvirus (n = 9), Gemykibivirus (n = 29), Gemykolovirus (n = 3), Gemykrogvirus (n = 3), Gemykroznavirus (n = 1), Gemytondvirus (n = 1), and Gemyvongvirus (n = 1). The presented taxonomic framework offers rational classification of genomoviruses based on the sequence information alone and sets an example for future classification of other groups of uncultured viruses discovered using metagenomics approaches.
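One way to apply a species demarcation threshold such as the 78% value above to a set of genomes is sketched below in Python. The abstract does not say how genomes are grouped once pairwise identities are known, so the single-linkage grouping (via union-find), the function name `demarcate`, and the identity values are all assumptions made for illustration.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List


def demarcate(names: List[str],
              identity: Dict[FrozenSet[str], float],
              threshold: float) -> List[List[str]]:
    """Group genomes whose pairwise identity meets the threshold.

    Assumption: single linkage, i.e. two genomes share a group if a chain
    of pairs at or above the threshold connects them.
    """
    parent = {n: n for n in names}

    def find(n: str) -> str:
        # Follow parent links to the group representative, halving the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(names, 2):
        if identity[frozenset((a, b))] >= threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups: Dict[str, List[str]] = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Invented genome-wide pairwise identities for four hypothetical genomes.
names = ["g1", "g2", "g3", "g4"]
identity = {
    frozenset(("g1", "g2")): 0.91,
    frozenset(("g1", "g3")): 0.74,
    frozenset(("g1", "g4")): 0.60,
    frozenset(("g2", "g3")): 0.80,
    frozenset(("g2", "g4")): 0.62,
    frozenset(("g3", "g4")): 0.65,
}

print(demarcate(names, identity, threshold=0.78))
```

In this invented dataset, g1 and g3 share only 74% identity but fall into the same group because each is at least 78% identical to g2. Chaining of this kind is a known property of single linkage, and a stricter rule (for example, requiring every pair in a group to meet the threshold) would split them.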