The Betacoronavirus genus of mammal-infecting viruses includes three subgenera (Sarbecovirus, Embecovirus, and Merbecovirus), in which most known human coronaviruses, including SARS-CoV-2, cluster. Coronaviruses are prone to host shifts, with recombination and positive selection possibly contributing to their high zoonotic potential. We analyzed the role of these two forces in the evolution of viruses belonging to the Betacoronavirus genus. The results showed that recombination has been pervasive during sarbecovirus evolution, and it is more widespread in this subgenus than in the other two. In both sarbecoviruses and merbecoviruses, recombination hotspots are clearly observed. Conversely, positive selection was a less prominent force in sarbecoviruses than in embecoviruses and merbecoviruses and targeted distinct genomic regions in the three subgenera, with S being the major target in sarbecoviruses alone. Overall, the results herein indicate that Betacoronavirus subgenera evolved along different trajectories, which might recapitulate their host preferences or reflect the origins of the presently available coronavirus sequences.
The Flavivirus genus comprises several human pathogens such as dengue virus (DENV), Japanese encephalitis virus (JEV), and Zika virus (ZIKV). Although ZIKV usually causes mild symptoms, growing evidence links it to congenital birth defects and to an increased risk of Guillain-Barré syndrome. ZIKV encodes a polyprotein that is processed to produce three structural and seven nonstructural (NS) proteins. We investigated the evolution of the viral polyprotein in ZIKV and in related flaviviruses (DENV, Spondweni virus, and Kedougou virus). After accounting for saturation issues, alignment uncertainties, and recombination, we found evidence of episodic positive selection on the branch that separates DENV from the other flaviviruses. NS1 emerged as the major selection target, and selected sites were located in immune epitopes or in functionally important protein regions. Three of these sites are located in an NS1 region that interacts with structural proteins and is essential for virion biogenesis. Analysis of the more recent evolutionary history of ZIKV lineages indicated that positive selection acted on NS5 and NS4B, the latter representing the preferential target. All selected sites were located in the N-terminal portion of NS4B, which inhibits the interferon response. One of the positively selected sites (26M/I/T/V) in ZIKV also represents a selection target in sylvatic DENV2 isolates, and a nearby residue evolves adaptively in JEV. Two additional positively selected sites are within a protein region that interacts with host (e.g. STING) and viral (i.e. NS1, NS4A) proteins. Notably, mutations in the NS4B region of other flaviviruses modulate neurovirulence and/or neuroinvasiveness. These results suggest that the positively selected sites we identified modulate viral replication and contribute to immune evasion. These sites should be prioritized in future experimental studies.
However, analyses herein detected no selective events associated with the spread of the Asian/American ZIKV lineage.
Many human genes have adapted to the constant threat of exposure to infectious agents; according to the "hygiene hypothesis," lack of exposure to parasites in modern settings results in immune imbalances, augmenting susceptibility to the development of autoimmune and allergic conditions. Here, by estimating the number of pathogen species/genera in a specific geographic location (pathogen richness) for 52 human populations and analyzing 91 interleukin (IL)/IL receptor genes (IL genes), we show that helminths have been a major selective force on a subset of these genes. A population genetics analysis revealed that five IL genes, including IL7R and IL18RAP, have been a target of balancing selection, a selection process that maintains genetic variability within a population. Previous identification of polymorphisms in some of these loci, and their association with autoimmune conditions, prompted us to investigate the relationship between adaptation and disease. By searching for variants in IL genes identified in genome-wide association studies, we verified that six risk alleles for inflammatory bowel disease (IBD) or celiac disease are significantly correlated with micropathogen richness. These data support the hygiene hypothesis for IBD and provide a large set of putative targets for susceptibility to helminth infections.
The zinc-finger antiviral protein (ZAP) is an innate immunity sensor of non-self nucleic acids. Its antiviral activity is exerted through physical interaction with different cofactors, including TRIM25, Riplet and KHNYN. Cellular proteins that interact with infectious agents are expected to be engaged in genetic conflicts that often result in their rapid evolution. To test this possibility and to identify the regions most strongly targeted by natural selection, we applied in silico molecular evolution tools to analyze the evolutionary history of ZAP and cofactors in four mammalian groups. We report evidence of positive selection in all genes and in most mammalian groups. On average, the intrinsically disordered regions (IDRs) embedded in the four proteins evolve significantly faster than folded domains, and most positively selected sites fall within IDRs. In ZAP, the PARP domain also shows abundant signals of selection, and independent evolution in different mammalian groups suggests modulation of its ADP-ribose binding ability. Detailed analyses of the biophysical properties of IDRs revealed that chain compaction and conformational entropy are conserved across mammals. The IDRs in ZAP and KHNYN are particularly compact, indicating that they may promote phase separation (PS). In line with this hypothesis, we predicted several PS-promoting regions in ZAP and KHNYN, as well as in TRIM25. Positively selected sites are abundant in these regions, suggesting that PS may be important for the antiviral functions of these proteins and the evolutionary arms race with viruses. Our data shed light on the evolution of ZAP and cofactors and indicate that IDRs represent central elements in host-pathogen interactions.
Fusobacteria have been associated with different diseases, including colorectal cancer (CRC), but knowledge of which taxonomic groups contribute to specific conditions is incomplete. We analyzed the genetic diversity and relationships within the Fusobacterium genus. We report recent and ancestral recombination in core genes, indicating that fusobacteria have mosaic genomes and emphasizing that taxonomic demarcation should not rely on single genes/gene regions. Across databases, we found ample evidence of species misclassification and of undescribed species, which are both expected to complicate disease association. By focusing on a lineage that includes F. periodonticum/pseudoperiodonticum and F. nucleatum, we show that genomes belong to four modern populations, but most known species/subspecies emerged from individual ancestral populations. Of these, the F. periodonticum/pseudoperiodonticum population experienced the lowest drift and displays the highest genetic diversity, in line with the less specialized distribution of these bacteria in oral sites. A highly drifted ancestral population instead contributed genetic ancestry to a new species, which includes genomes classified within the F. nucleatum animalis diversity in a recent CRC study. Thus, evidence herein calls for a re-analysis of F. nucleatum animalis features associated with CRC. More generally, our data inform future molecular profiling approaches to investigate the epidemiology of Fusobacterium-associated diseases.
Cytomegaloviruses (order Herpesvirales) display remarkable species-specificity as a result of long-term co-evolution with their mammalian hosts. Human cytomegalovirus (HCMV) is exquisitely adapted to our species and displays high genetic diversity. We leveraged information on inter-species divergence of primate-infecting cytomegaloviruses and intra-species diversity of clinical isolates to provide a genome-wide picture of HCMV adaptation across different time-frames. During adaptation to the human host, core viral genes were commonly targeted by positive selection. Functional characterization of adaptive mutations in the primase gene (UL70) indicated that selection favored amino acid replacements that decrease viral replication in human fibroblasts, suggesting evolution towards viral temperance. HCMV intra-species diversity was largely governed by immune system-driven selective pressure, with several adaptive variants located in antigenic domains. A significant excess of positively selected sites was also detected in the signal peptides (SPs) of viral proteins, indicating that, although they are removed from mature proteins, SPs can contribute to viral adaptation. Functional characterization of one of these SPs indicated that adaptive variants modulate the timing of cleavage by the signal peptidase and the dynamics of glycoprotein intracellular trafficking. We thus used evolutionary information to generate experimentally-testable hypotheses on the functional effect of HCMV genetic diversity and we define modulators of viral phenotypes.
The ongoing worldwide monkeypox outbreak is caused by viral lineages (globally referred to as hMPXV1) that are related to but distinct from clade IIb MPXV viruses transmitted within Nigeria. Analysis of the genetic differences has indicated that APOBEC-mediated editing might be responsible for the unexpectedly high number of mutations observed in hMPXV1 genomes. Here, using 1,624 publicly available hMPXV1 sequences, we analyzed the mutations that accrued between 2017 and the emergence of the current predominant variant (B.1), as well as those that have been accumulating during the 2022 outbreak. We confirmed an overwhelming prevalence of C-to-T and G-to-A mutations, with a sequence context (5'-TC-3') consistent with the preferences of several human APOBEC3 enzymes. We also found that mutations preferentially occur in highly expressed viral genes, although no transcriptional asymmetry was observed. A comparison of the mutation spectrum and context was also performed against the human-specific variola virus (VARV) and the zoonotic cowpox virus (CPXV), as well as fowlpox virus (FWPV). The results indicated that in VARV genomes, C-to-T and G-to-A changes were more common than the opposite substitutions, although the effect was less marked than for hMPXV1. Conversely, no preference toward C-to-T and G-to-A changes was observed in CPXV and FWPV. Consistently, the sequence context of C-to-T changes confirmed a preference for a T in the -1 position for VARV, but not for CPXV or FWPV. Overall, our results strongly support the view that, irrespective of the transmission route, orthopoxviruses infecting humans are edited by the host APOBEC3 enzymes.
Analysis of the viral lineages responsible for the 2022 monkeypox outbreak suggested that APOBEC enzymes are driving hMPXV1 evolution. Using 1,624 public sequences, we analyzed the mutations that accumulated between 2017 and the emergence of the predominant variant and those that characterize the last outbreak. We found that the mutation spectrum of hMPXV1 has been dominated by TC-to-TT and GA-to-AA changes, consistent with the editing activity of human APOBEC3 proteins. We also found that mutations preferentially affect highly expressed viral genes, possibly because transcription exposes single-stranded DNA (ssDNA), a target of APOBEC3 editing. Notably, analysis of the human-specific variola virus (VARV) and the zoonotic cowpox virus (CPXV) indicated that in VARV genomes, TC-to-TT and GA-to-AA changes are likewise extremely frequent. Conversely, no preference toward TC-to-TT and GA-to-AA changes is observed in CPXV. These results suggest that APOBEC3 proteins have an impact on the evolution of different human-infecting orthopoxviruses.
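The mutation-context tally described in the two hMPXV1 abstracts above can be illustrated with a minimal sketch. It assumes a reference and a derived sequence that are already aligned, gap-free, and of equal length; the function name `apobec_context_counts` and the three output categories are hypothetical and are not taken from the study. A G-to-A change followed by an A is counted as the reverse-complement equivalent of TC-to-TT.

```python
def apobec_context_counts(ref: str, derived: str) -> dict:
    """Tally substitutions by APOBEC3-like context (illustrative only).

    Assumes `ref` and `derived` are aligned, gap-free, equal-length strings.
    """
    if len(ref) != len(derived):
        raise ValueError("sequences must have the same length")
    ref, derived = ref.upper(), derived.upper()
    counts = {"TC>TT": 0, "GA>AA": 0, "other": 0}
    for i, (r, d) in enumerate(zip(ref, derived)):
        # Skip identical positions and ambiguous bases.
        if r == d or r not in "ACGT" or d not in "ACGT":
            continue
        if r == "C" and d == "T" and i > 0 and ref[i - 1] == "T":
            # C-to-T with a T at the -1 position (5'-TC-3' context).
            counts["TC>TT"] += 1
        elif r == "G" and d == "A" and i + 1 < len(ref) and ref[i + 1] == "A":
            # Same edit seen from the opposite strand (5'-GA-3' context).
            counts["GA>AA"] += 1
        else:
            counts["other"] += 1
    return counts


if __name__ == "__main__":
    # Toy sequences, not real viral data.
    print(apobec_context_counts("ATCGGACA", "ATTGAATA"))
```

A real analysis would also need to polarize mutations on a phylogeny and handle indels, which this sketch deliberately leaves out.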
Four endemic coronaviruses infect humans and cause mild symptoms. Because previous analyses were based on a limited number of sequences and did not control for factors that affect molecular dating, we re-assessed the timing of endemic coronavirus emergence. After controlling for recombination, selective pressure, and molecular clock model, we obtained similar tMRCA (time to the most recent common ancestor) estimates for the four coronaviruses, ranging from 72 (HCoV-229E) to 54 (HCoV-NL63) years ago. The split times of HCoV-229E and HCoV-OC43 from camel alphacoronavirus and bovine coronavirus were dated ~268 and ~99 years ago. The split times of HCoV-HKU1 and HCoV-NL63 could not be calculated, as their zoonotic sources are unknown. To compare the timing of coronavirus emergence to that of another respiratory virus, we recorded the occurrence of influenza pandemics since 1500. Although there is no clear relationship between pandemic occurrence and human population size, the frequency of influenza pandemics seems to intensify starting around 1700, which corresponds with the initial phase of exponential increase of human population and to the emergence of HCoV-229E. The frequency of flu pandemics in the 19th century also suggests that the concurrence of HCoV-OC43 emergence and the Russian flu pandemic may be due to chance.
Akin to a molecular signature, dinucleotide composition can be exploited by the zinc-finger antiviral protein (ZAP) to restrict CpG-rich (and UpA-rich) RNA viruses. ZAP evolved in tetrapods, and it is not encoded by invertebrates and fish. Because a systematic analysis is missing, we analyzed the genomes of RNA viruses that infect vertebrates or invertebrates. We show that vertebrate single-stranded (ss) RNA(+) viruses and, to a lesser extent, double-stranded RNA viruses tend to have stronger CpG bias than invertebrate viruses. Conversely, ssRNA(-) viruses have similar dinucleotide composition whether they infect vertebrates or invertebrates. Analysis of ssRNA(+) viruses that infect mammals, reptiles, and fish indicated that ZAP is unlikely to be a major driver of CpG depletion. We also show that, compared to other coronaviruses, the genome of SARS-CoV-2 is not homogeneously CpG-depleted. Our study provides new insights into virus evolution and strategies for recoding RNA virus genomes.
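As an illustration of how a dinucleotide bias such as CpG depletion can be quantified, the sketch below computes an observed/expected ratio from single-base frequencies. This is a commonly used normalization and is an assumption here: the abstract does not state which measure the study used, and the function name `dinucleotide_oe` is hypothetical.

```python
from collections import Counter


def dinucleotide_oe(seq: str, dinuc: str = "CG") -> float:
    """Observed/expected ratio of a dinucleotide (illustrative only).

    Expected count = f(X) * f(Y) * (N - 1), where f is the frequency of
    each base in the sequence and N is the sequence length. RNA input is
    accepted by mapping U to T.
    """
    seq = seq.upper().replace("U", "T")
    dinuc = dinuc.upper().replace("U", "T")
    n = len(seq)
    if n < 2 or len(dinuc) != 2:
        raise ValueError("need a sequence of length >= 2 and a 2-letter dinucleotide")
    base_counts = Counter(seq)
    # Count overlapping occurrences of the dinucleotide.
    observed = sum(1 for i in range(n - 1) if seq[i:i + 2] == dinuc)
    expected = (base_counts[dinuc[0]] / n) * (base_counts[dinuc[1]] / n) * (n - 1)
    if expected == 0:
        return float("nan")  # one of the two bases is absent
    return observed / expected


if __name__ == "__main__":
    # Toy sequence, not a real viral genome.
    print(dinucleotide_oe("AUGCGCUAACG", "CG"))
    print(dinucleotide_oe("AUGCGCUAACG", "UA"))
```

Under this measure, values below 1 indicate that the dinucleotide is rarer than base composition alone would predict, and values above 1 indicate enrichment.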
Historically, allelic variations in blood group antigen (BGA) genes have been regarded as possible susceptibility factors for infectious diseases. Since host-pathogen interactions are major determinants in evolution, BGAs can be thought of as selection targets. In order to verify this hypothesis, we obtained an estimate of pathogen richness for geographic locations corresponding to 52 populations distributed worldwide; after correction for multiple tests and for variables other than selective forces, significant correlations with pathogen richness were obtained for multiple variants at 11 BGA loci out of 26. In line with this finding, we demonstrate that three BGA genes, namely CD55, CD151, and SLC14A1, have been subjected to balancing selection, a process, rare outside MHC genes, which maintains variability at a locus. Moreover, we identified a gene region immediately upstream of the transcription start site of FUT2 which has undergone non-neutral evolution independently from the coding region. Finally, in the case of BSG, we describe a highly divergent haplotype clade and discuss possible reasons for its maintenance, including frequency-dependent balancing selection. These data indicate that BGAs have been playing a central role in the host-pathogen arms race during human evolutionary history and no other gene category shows similar levels of widespread selection, with the only exception of loci involved in antigen recognition.
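The idea of correlating allele frequencies with pathogen richness across populations can be sketched with a rank correlation. The abstract does not name the statistic, so Kendall's tau-a is used here purely as an assumption; the function name `kendall_tau_a` and the toy values are hypothetical, and corrections for multiple tests and for demography are omitted.

```python
from itertools import combinations


def kendall_tau_a(x: list, y: list) -> float:
    """Kendall's tau-a between two equal-length samples (illustrative only).

    tau-a = (concordant pairs - discordant pairs) / total pairs.
    Tied pairs count as neither concordant nor discordant.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


if __name__ == "__main__":
    # Toy values for five hypothetical populations, not real data.
    pathogen_richness = [12, 30, 45, 51, 70]
    allele_frequency = [0.10, 0.22, 0.20, 0.41, 0.55]
    print(kendall_tau_a(pathogen_richness, allele_frequency))
```

A rank-based statistic is a natural choice in this setting because it does not assume a linear relationship between richness and allele frequency.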