Review
Microbiol Mol Biol Rev. 2020 Mar 4;84(2):e00061-19.
doi: 10.1128/MMBR.00061-19. Print 2020 May 20.

Global Organization and Proposed Megataxonomy of the Virus World

Eugene V Koonin et al. Microbiol Mol Biol Rev.

Abstract

Viruses and mobile genetic elements are molecular parasites or symbionts that coevolve with nearly all forms of cellular life. The route of virus replication and protein expression is determined by the viral genome type. Comparison of these routes led to the classification of viruses into seven "Baltimore classes" (BCs) that define the major features of virus reproduction. However, recent phylogenomic studies identified multiple evolutionary connections among viruses within each of the BCs as well as between different classes. Due to the modular organization of virus genomes, these relationships defy simple representation as lines of descent and instead form complex networks. Phylogenetic analyses of virus hallmark genes combined with analyses of gene-sharing networks show that replication modules of five BCs (three classes of RNA viruses and two classes of reverse-transcribing viruses) evolved from a common ancestor that encoded an RNA-directed RNA polymerase or a reverse transcriptase. Bona fide viruses evolved from this ancestor on multiple, independent occasions via the recruitment of distinct cellular proteins as capsid subunits and other structural components of virions. The single-stranded DNA (ssDNA) viruses are a polyphyletic class, with different groups evolving by recombination between rolling-circle-replicating plasmids, which contributed the replication protein, and positive-sense RNA viruses, which contributed the capsid protein. The double-stranded DNA (dsDNA) viruses are distributed among several large monophyletic groups and arose via the combination of distinct structural modules with equally diverse replication modules. Phylogenomic analyses reveal the finer structure of evolutionary connections among RNA viruses and reverse-transcribing viruses, ssDNA viruses, and large subsets of dsDNA viruses. Taken together, these analyses allow us to outline the global organization of the virus world. Here, we describe the key aspects of this organization and propose a comprehensive hierarchical taxonomy of viruses.

Keywords: ICTV; evolution; megataxonomy; phylogenomics; phylogeny; realm; virosphere; virus classification; virus nomenclature; virus taxonomy.


Figures

FIG 1
The seven Baltimore classes (BCs): information flow. For each BC, the processes of replication, transcription, translation, and virion assembly are shown by color-coded arrows (see the inset). Host enzymes that are involved in virus genome replication or transcription are prefixed with “h-,” and in cases when, in a given BC, one of these processes can be mediated by either a host- or a virus-encoded enzyme, the latter is prefixed with “v-.” Otherwise, virus-encoded enzymes are not prefixed. CP, capsid protein; DdDp, DNA-directed DNA polymerase; DdRp, DNA-directed RNA polymerase; gRNA, genomic RNA; RdRp, RNA-directed RNA polymerase; RT, reverse transcriptase; RCRE, rolling-circle replication (initiation) endonuclease.
FIG 2
Distribution of the seven BCs of viruses in the major divisions of prokaryotes and eukaryotes. The virus genera are from the ICTV report (https://talk.ictvonline.org/ictv-reports/ictv_online_report/).
FIG 3
Comparison of gene frequency distributions for prokaryotes and dsDNA viruses (Baltimore class I). The horizontal axis shows the fraction of genomes in which a given family of orthologous genes is represented. The vertical axis (logarithmic scale) shows the relative frequency of orthologous gene families that are represented in the given number of genomes. Thus, common genes are on the right of the plot, and rare genes are on the left.
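The two axes described in this caption can be illustrated with a minimal sketch. It assumes each genome is given as a set of orthologous gene family identifiers; the function name `gene_frequency_distribution` and the toy data are hypothetical and are not taken from the article, and the authors' actual procedure may differ.

```python
from collections import Counter


def gene_frequency_distribution(genomes):
    """Map 'fraction of genomes containing a family' (horizontal axis)
    to 'relative frequency of such families' (vertical axis).

    genomes: dict of genome name -> set of gene family ids (hypothetical input).
    """
    n_genomes = len(genomes)
    # For every gene family, count the genomes in which it is represented.
    genomes_per_family = Counter(
        family for families in genomes.values() for family in set(families)
    )
    # Count how many families are represented in exactly k genomes.
    families_per_k = Counter(genomes_per_family.values())
    n_families = len(genomes_per_family)
    return {
        k / n_genomes: families_per_k[k] / n_families
        for k in sorted(families_per_k)
    }


# Toy data (hypothetical): one universal family, one shared by two genomes,
# and three families found in a single genome each.
toy = {
    "g1": {"A", "B", "C"},
    "g2": {"A", "B"},
    "g3": {"A", "D"},
    "g4": {"A", "E"},
}
dist = gene_frequency_distribution(toy)
for fraction, freq in dist.items():
    print(f"fraction of genomes = {fraction:.2f}, relative frequency = {freq:.2f}")
```

In this toy case the rare families (left of the plot) dominate, and the single universal family sits at the far right; plotting the values on a logarithmic vertical axis would follow the layout the caption describes.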
FIG 4
Representation of the six “superviral hallmark genes” in virus genomes of the seven Baltimore classes. The “superviral hallmark proteins” are shown by ribbon diagrams of the representative protein structures. The lines connect the proteins with the viruses of BCs in which they are present. The thickness of each connecting line roughly reflects the abundance of a given “superhallmark” gene in a given BC. DJR-CP, double-jelly-roll capsid protein; RCRE, rolling-circle replication (initiation) endonuclease; RdRp, RNA-directed RNA polymerase; RT, reverse transcriptase; S3H, superfamily 3 helicase; SJR-CP, single-jelly-roll capsid protein.
FIG 5
Schematic representation of the phylogenetic tree of the RNA-directed RNA polymerases (RdRps) of RNA viruses (realm Riboviria). The five major branches discussed in the text are labeled. Only selected clades including the best-characterized virus groups are shown within each major branch. Each branch represents collapsed sequences of the respective set of RdRps. The reverse transcriptase sequences from group II introns and non-LTR retrotransposons were used as the outgroup to root the tree. The three BCs included in the tree are color-coded (orange, BC III; blue, BC IV; purple, BC V). (Adapted from reference .)
FIG 6
Schematic phylogenetic subtree of the reverse transcriptases (RTs) of reverse-transcribing viruses (Caulimoviridae, Hepadnaviridae, and Retroviridae) and long terminal repeat (LTR) retrotransposons (Belpaoviridae, Metaviridae, and Pseudoviridae). The tree is rooted using sequences from nonviral retroelements (prokaryotic group II introns and non-LTR retrotransposons). Genome organizations of selected representatives of reverse-transcribing viruses and LTR retrotransposons are shown near the corresponding clades. LTRs are shown as black triangles. 6, 6-kDa protein; ATF, aphid transmission factor; CA/CP, capsid protein; CHR, chromodomain (contained only in the integrases of certain clades of metavirids from plants, fungi, and several vertebrates); gag, group-specific antigen; env, envelope gene; INT, integrase; MA, matrix protein; MP, movement protein; NC, nucleocapsid; nef, tat, rev, vif, vpr, and vpu, genes expressing regulatory proteins via spliced mRNAs; P, polymerase; pol, polymerase gene; PR, protease; PreS, pre-surface protein (envelope); PX/TA, protein X/transcription activator; RH, RNase H; SU, surface glycoprotein; TM, transmembrane glycoprotein; TP, terminal protein domain; TT/SR, translation trans-activator/suppressor of RNA interference; VAP, virion-associated protein. The two BCs included in the tree are color-coded (green, BC VI; brown, BC VII). (Adapted from reference .)
FIG 7
Multiple, chimeric origins of ssDNA viruses (and two groups of dsDNA viruses [Papillomaviridae and Polyomaviridae]). The genomes are shown with two colors each, to emphasize the distinct origins of the genes encoding proteins involved in replication (Reps) and those encoding capsid proteins (CPs). JRC, jelly roll capsid protein (nonorthologous JRC genes are shown with different colors); fCP, filamentous capsid protein; pCP, polymorphic capsid protein; DdDp, DNA-directed DNA polymerase; RCRE, rolling-circle replication (initiation) endonuclease; S3H, superfamily 3 helicase. (Adapted from reference .)
FIG 8
Bipartite gene-genome network of dsDNA viruses. (A) Complete network. The larger circles show nodes corresponding to genomes, and dots show nodes corresponding to core gene families. An edge connecting a circle and a dot indicates that a genome contains a representative of a core gene family. Virus genomes that belong to different modules of the network are shown in different colors. (B and C) Internal organizations of the DJR-MCP (B) and HK97-MCP (C) supermodules. The individual modules within each supermodule are shown by circles. The positions of some major groups of viruses are indicated. In panel C, the numbers correspond to the module numbering described previously (79); the unnamed modules are small groups of tailed bacteriophages. DNAP, DNA polymerase. (Adapted from reference .)
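The structure of such a bipartite network can be sketched in a few lines. This is only an illustration under stated assumptions: genome and gene family names are hypothetical placeholders, and connected components are used as a crude stand-in for the network modules, which is not the module detection method behind the figure.

```python
from collections import defaultdict


def build_bipartite(genome_to_families):
    """Build an adjacency map with two node types: genomes and core gene families.

    An edge links a genome node to a family node when the genome contains
    a representative of that family.
    """
    adjacency = defaultdict(set)
    for genome, families in genome_to_families.items():
        genome_node = ("genome", genome)
        adjacency[genome_node]  # register genomes that have no core genes
        for family in families:
            family_node = ("family", family)
            adjacency[genome_node].add(family_node)
            adjacency[family_node].add(genome_node)
    return dict(adjacency)


def connected_components(adjacency):
    """Group nodes linked by any path (a simplification of module detection)."""
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return components


# Toy input (hypothetical names): two genomes share one capsid gene family,
# and a third genome carries a different one.
toy = {
    "virus_a": {"mcp_x", "atpase"},
    "virus_b": {"mcp_x"},
    "virus_c": {"mcp_y"},
}
network = build_bipartite(toy)
groups = sorted(
    sorted(name for kind, name in component if kind == "genome")
    for component in connected_components(network)
)
print(groups)
```

Genomes that share at least one core gene family end up in the same group here; real gene-sharing networks are far denser, which is why dedicated module detection is needed in practice.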
FIG 9
Proposed evolutionary scenario for DJR-MCP supermodule viruses. The host ranges of viruses are shown by colored circles (see the color code on the left). For nucleocytoplasmic large DNA viruses (NCLDVs), schematic virion shapes are shown. (Adapted from reference .)
FIG 10
Comparison of the structures of the major capsid proteins of bacterial, archaeal, and eukaryotic viruses of the HK97-MCP supermodule. For HCMV and HK97, the Protein Data Bank (PDB) accession numbers are indicated in parentheses (5VKU and 1OHG). HCMV, human cytomegalovirus; HK97, Escherichia coli phage HK97; HSTV-1, Haloarcula sinaiiensis tailed virus 1.
FIG 11
Proposed megataxonomy of the virus world. Officially proposed taxon names are printed in boldface type; names of taxa that have already been accepted by the ICTV are rendered in normal font. The names of the virus groups that have not been officially proposed or formally accepted are shown in quotation marks. Taxon and virus group positions in the hierarchy, when uncertain, are indicated by fading curved dotted lines. The etymology of newly proposed taxon names associated with this article is outlined in File S1 in the supplemental material. Official taxa or officially proposed taxa that can be assigned only to top ranks at this time are indicated by curved dashed lines in the absence of quotation marks. All taxa/virus groups are to be considered sensu lato; i.e., they may include numerous “-like” viruses not yet named and/or classified. Proposed taxon names should be seen as provisional. Definitive names will be approved and ratified by the ICTV and will be found at the ICTV website (https://talk.ictvonline.org/).
FIG 12
Virus realms, Baltimore classes, and the global network of the virus world. Gray shapes denote the four virus realms, and white rectangles denote the seven Baltimore classes. The colored circles contain ribbon diagram representations of the structures of hallmark virus proteins and the RNA recognition motif (RRM) domain. Circles are connected to groups of viruses, in which a given viral hallmark protein or domain is represented, and to BCs by lines of the corresponding color. Abbreviations: DJR, double jelly roll; PolB, family B DNA-directed DNA polymerase; RCRE, rolling-circle replication (initiation) endonuclease; RdRp, RNA-directed RNA polymerase; RT, reverse transcriptase; S3H, superfamily 3 helicase; SJR, single jelly roll.


