The evolutionary dynamics of HIV during the chronic phase of infection is driven by the host immune response and by selective pressures exerted through drug treatment. To understand and model the ...evolution of HIV quantitatively, the parameters governing genetic diversification and the strength of selection need to be known. While mutation rates can be measured in single replication cycles, the relevant effective recombination rate depends on the probability of coinfection of a cell with more than one virus and can only be inferred from population data. However, most population genetic estimators for recombination rates assume absence of selection and are hence of limited applicability to HIV, since positive and purifying selection are important in HIV evolution. Yet, little is known about the distribution of selection differentials between individual viruses and the impact of single polymorphisms on viral fitness. Here, we estimate the rate of recombination and the distribution of selection coefficients from time series sequence data tracking the evolution of HIV within single patients. By examining temporal changes in the genetic composition of the population, we estimate the effective recombination rate to be $\rho = (1.4 \pm 0.6) \times 10^{-5}$ recombinations per site and generation. Furthermore, we provide evidence that the selection coefficients of at least 15% of the observed non-synonymous polymorphisms exceed 0.8% per generation. These results provide a basis for a more detailed understanding of the evolution of HIV. A particularly interesting case is evolution in response to drug treatment, where recombination can facilitate the rapid acquisition of multiple resistance mutations. With the methods developed here, more precise and more detailed studies will be possible as soon as data with higher time resolution and greater sample sizes are available.
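To give a feel for the scale of the recombination estimate quoted above, the following minimal sketch converts a per-site, per-generation rate into the expected number of crossovers between two sites. It assumes crossovers are Poisson distributed and uniform along the genome, which is a simplification introduced here and not a statement about the paper's method; the function names are hypothetical.

```python
import math

RHO = 1.4e-5  # recombinations per site and generation, central estimate from the abstract


def expected_crossovers(distance_bp, generations=1, rho=RHO):
    """Expected number of crossovers between two sites `distance_bp` apart."""
    return rho * distance_bp * generations


def prob_at_least_one(distance_bp, generations=1, rho=RHO):
    """Probability of one or more crossovers, assuming Poisson-distributed events."""
    return 1.0 - math.exp(-expected_crossovers(distance_bp, generations, rho))


if __name__ == "__main__":
    for d in (100, 1000, 9000):
        print(d, expected_crossovers(d), round(prob_at_least_one(d), 4))
```

Under these assumptions, two sites 1,000 bp apart are separated by a crossover in roughly 1.4% of genomes per generation.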
When two influenza viruses co-infect the same cell, they can exchange genome segments in a process known as reassortment. Reassortment is an important source of genetic diversity and is known to have ...been involved in the emergence of most pandemic influenza strains. However, because of the difficulty in identifying reassortment events from viral sequence data, little is known about their role in the evolution of the seasonal influenza viruses. Here we introduce TreeKnit, a method that infers ancestral reassortment graphs (ARGs) from two segment trees. It is based on topological differences between trees, and proceeds in a greedy fashion by finding regions that are compatible in the two trees. Using simulated genealogies with reassortments, we show that TreeKnit performs well in a wide range of settings and that it is as accurate as a more principled Bayesian method, while being orders of magnitude faster. Finally, we show that it is possible to use the inferred ARG to better resolve segment trees and to construct more informative visualizations of reassortments.
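The idea of comparing the topology of two segment trees can be illustrated with a toy example. The sketch below is not TreeKnit's algorithm; it only finds clades (sets of leaves) that appear in both trees, which is one simple notion of a region on which the two trees agree. The nested-tuple tree encoding, the function names, and the example trees are all hypothetical.

```python
def clades(tree):
    """Return (leaves of `tree`, set of leaf sets of all its internal clades).

    A tree is either a leaf name (str) or a tuple of subtrees.
    """
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves = leaves | child_leaves
        found |= child_clades
    found.add(leaves)  # the clade rooted at this internal node
    return leaves, found


def shared_clades(tree1, tree2):
    """Clades whose leaf sets occur in both trees."""
    _, c1 = clades(tree1)
    _, c2 = clades(tree2)
    return c1 & c2


# Two hypothetical segment trees over the same five strains.
tree_seg1 = ((("A", "B"), "C"), ("D", "E"))
tree_seg2 = ((("A", "B"), "D"), ("C", "E"))

if __name__ == "__main__":
    for clade in sorted(shared_clades(tree_seg1, tree_seg2), key=len):
        print(sorted(clade))
```

In this toy case only the pair A, B and the full leaf set are shared, so the placement of C, D, and E differs between the two trees; such disagreements are the kind of topological signal a reassortment-inference method has to explain.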
Human seasonal influenza viruses evolve rapidly, enabling the virus population to evade immunity and reinfect previously infected individuals. Antigenic properties are largely determined by the ...surface glycoprotein hemagglutinin (HA), and amino acid substitutions at exposed epitope sites in HA mediate loss of recognition by antibodies. Here, we show that antigenic differences measured through serological assay data are well described by a sum of antigenic changes along the path connecting viruses in a phylogenetic tree. This mapping onto the tree allows prediction of antigenicity from HA sequence data alone. The mapping can further be used to make predictions about the makeup of the future A(H3N2) seasonal influenza virus population, and we compare predictions between models with serological and sequence data. To make timely model output readily available, we developed a web browser-based application that visualizes antigenic data on a continuously updated phylogeny.
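The statement that antigenic differences are described by a sum of changes along the tree path between two viruses can be made concrete with a small sketch. The tree, the per-branch antigenic contributions, and all names below are hypothetical toy values, and the symmetric distance used here leaves out any virus- or serum-specific effects that a full model might include.

```python
# Hypothetical toy tree: child -> (parent, antigenic change on the branch above the child)
PARENT = {
    "A": ("n1", 0.5),
    "B": ("n1", 1.0),
    "n1": ("root", 2.0),
    "C": ("root", 0.25),
}


def path_to_root(node):
    """Return the nodes from `node` up to and including the root."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]][0])
    return path


def antigenic_distance(a, b):
    """Sum the branch contributions along the tree path connecting a and b."""
    ancestors_b = set(path_to_root(b))
    # Most recent common ancestor: first ancestor of a that is also an ancestor of b.
    mrca = next(n for n in path_to_root(a) if n in ancestors_b)
    total = 0.0
    for start in (a, b):
        node = start
        while node != mrca:
            parent, change = PARENT[node]
            total += change
            node = parent
    return total


if __name__ == "__main__":
    print(antigenic_distance("A", "B"))
    print(antigenic_distance("A", "C"))
```

Once each branch carries an inferred contribution, the predicted distance between any two viruses on the tree follows from their positions alone, which is what makes prediction from sequence data possible in this picture.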
Crop disease outbreaks are often associated with clonal expansions of single pathogenic lineages. To determine whether similar boom-and-bust scenarios hold for wild pathosystems, we carried out a ...multi-year, multi-site survey of Pseudomonas in its natural host Arabidopsis thaliana. The most common Pseudomonas lineage corresponded to a ubiquitous pathogenic clade. Sequencing of 1,524 genomes revealed this lineage to have diversified approximately 300,000 years ago, containing dozens of genetically identifiable pathogenic sublineages. There is differentiation at the level of both gene content and disease phenotype, although the differentiation may not provide fitness advantages to specific sublineages. The coexistence of sublineages indicates that in contrast to crop systems, no single strain has been able to overtake the studied A. thaliana populations in the recent past. Our results suggest that selective pressures acting on a plant pathogen in wild hosts are likely to be much more complex than those in agricultural systems.
• Wild A. thaliana is regularly colonized by a single Pseudomonas OTU
• Strains within this OTU diverged from one another at least 300,000 years ago
• Many strains can cause disease and are classified as pathogenic
• In contrast to agriculture, no single pathogenic strain dominates host populations
Disease outbreaks in agriculture are often associated with clonal expansions of single pathogenic lineages. In this study, Karasov et al. show that in populations of a wild plant, no single lineage of an abundant pathogen takes over the host population. Genetic and species diversity may prevent clonal expansions in nature.
Many microbial populations rapidly adapt to changing environments with multiple variants competing for survival. To quantify such complex evolutionary dynamics in vivo, time-resolved and genome-wide ...data including rare variants are essential. We performed whole-genome deep sequencing of HIV-1 populations in 9 untreated patients, with 6-12 longitudinal samples per patient spanning 5-8 years of infection. The data can be accessed and explored via an interactive web application. We show that patterns of minor diversity are reproducible between patients and mirror global HIV-1 diversity, suggesting a universal landscape of fitness costs that control diversity. Reversions towards the ancestral HIV-1 sequence are observed throughout infection and account for almost one third of all sequence changes. Reversion rates depend strongly on conservation. Frequent recombination limits linkage disequilibrium to about 100 bp in most of the genome, but strong hitch-hiking due to short-range linkage limits diversity.
Continued evolution and adaptation of SARS-CoV-2 has led to more transmissible and immune-evasive variants with profound impacts on the course of the pandemic. Here I analyze the evolution ...of the virus over 2.5 years since its emergence and estimate the rates of evolution for synonymous and non-synonymous changes separately for evolution within clades—well-defined monophyletic groups with gradual evolution—and for the pandemic overall. The rate of synonymous mutation is found to be around 6 changes per year. Synonymous rates within variants vary little from variant to variant and are compatible with the overall rate of 7 changes per year (or $7.5 \times 10^{-4}$ per year and codon). In contrast, the rate at which variants accumulate amino acid changes (non-synonymous mutations) was initially around 12-16 changes per year, but in 2021 and 2022 it dropped to 6-9 changes per year. The overall rate of non-synonymous evolution, that is across variants, is estimated to be about 26 amino acid changes per year (or $2.7 \times 10^{-3}$ per year and codon). This strong acceleration of the overall rate compared to within clade evolution indicates that the evolutionary process that gave rise to the different variants is qualitatively different from that in typical transmission chains and likely dominated by adaptive evolution. I further quantify the spectrum of mutations and purifying selection in different SARS-CoV-2 proteins and show that the massive global sampling of SARS-CoV-2 is sufficient to estimate site-specific fitness costs across the entire genome. Many accessory proteins evolve under limited evolutionary constraints with little short-term purifying selection. About half of the mutations in other proteins are strongly deleterious.
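The acceleration described above can be checked with simple arithmetic on the rates quoted in the abstract. The sketch below only divides the overall non-synonymous rate by the within-clade ranges; the variable and function names are hypothetical, and no values beyond those in the text are used.

```python
OVERALL_NONSYN = 26            # amino acid changes per year across variants (from the text)
WITHIN_CLADE_EARLY = (12, 16)  # changes per year within variants, initially (from the text)
WITHIN_CLADE_LATER = (6, 9)    # changes per year within variants in 2021 and 2022 (from the text)


def acceleration(overall, within_range):
    """Return (min, max) fold increase of the overall rate over a within-clade range."""
    low, high = within_range
    return overall / high, overall / low


if __name__ == "__main__":
    for label, rng in (("early", WITHIN_CLADE_EARLY), ("2021-2022", WITHIN_CLADE_LATER)):
        lo, hi = acceleration(OVERALL_NONSYN, rng)
        print(f"{label}: overall rate is {lo:.1f}x to {hi:.1f}x the within-clade rate")
```

With these numbers the overall rate is roughly 1.6 to 2.2 times the early within-clade rate and roughly 2.9 to 4.3 times the later one.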
HIV-1 infection cannot be cured because the virus persists as integrated proviral DNA in long-lived cells despite years of suppressive antiretroviral therapy (ART). In a previous paper (Zanini et al., ...2015) we documented HIV-1 evolution in 10 untreated patients. Here we characterize establishment, turnover, and evolution of viral DNA reservoirs in the same patients after 3-18 years of suppressive ART. A median of 14% (range 0-42%) of the DNA sequences were defective due to G-to-A hypermutation. Remaining DNA sequences showed no evidence of evolution over years of suppressive ART. Most sequences from the DNA reservoirs were very similar to viruses actively replicating in plasma (RNA sequences) shortly before start of ART. The results do not support persistent HIV-1 replication as a mechanism to maintain the HIV-1 reservoir during suppressive therapy. Rather, the data indicate that DNA variants are turning over as long as patients are untreated and that suppressive ART halts this turnover.
The rapid development of sequencing technologies has led to an explosion of pathogen sequence data, which are increasingly collected as part of routine surveillance or clinical diagnostics. In ...public health, sequence data are used to reconstruct the evolution of pathogens, to anticipate future spread, and to target interventions. In clinical settings, whole-genome sequencing can identify pathogens at the strain level, can be used to predict phenotypes such as drug resistance and virulence, and can inform treatment by linking closely related cases. While sequencing has become cheaper, the analysis of sequence data has become an important bottleneck. Deriving interpretable and actionable results for a large variety of pathogens, each with its own complexity, from continuously updated data is a daunting task that requires flexible bioinformatic workflows and dissemination platforms. Here, we review recent developments in real-time analyses of pathogen sequence data, with a particular focus on the visualization and integration of sequence and phenotype data.
Given a sample of genome sequences from an asexual population, can one predict its evolutionary future? Here we demonstrate that the branching patterns of reconstructed genealogical trees contain ...information about the relative fitness of the sampled sequences and that this information can be used to predict successful strains. Our approach is based on the assumption that evolution proceeds by accumulation of small-effect mutations, does not require species-specific input, and can be applied to any asexual population under persistent selection pressure. We demonstrate its performance using historical data on seasonal influenza A/H3N2 virus. We predict the progenitor lineage of the upcoming influenza season with near-optimal performance in 30% of cases and make informative predictions in 16 out of 19 years. Beyond providing a tool for prediction, our ability to make informative predictions implies persistent fitness variation among circulating influenza A/H3N2 viruses.
Seasonal influenza is controlled through vaccination campaigns. Evolution of influenza virus antigens means that vaccines must be updated to match novel strains, and vaccine effectiveness depends on ...the ability of scientists to predict nearly a year in advance which influenza variants will dominate in upcoming seasons. In this review, we highlight a promising new surveillance tool: predictive models. Based on data-sharing and close collaboration between the World Health Organization and academic scientists, these models use surveillance data to make quantitative predictions regarding influenza evolution. Predictive models demonstrate the potential of applied evolutionary biology to improve public health and disease control. We review the state of influenza predictive modeling and discuss next steps and recommendations to ensure that these models deliver upon their considerable biomedical promise.
Seasonal influenza evolves to evade immune recognition, necessitating regular vaccine updates. The World Health Organization has collaborated with academic institutions and national public health organizations to build a global surveillance program for monitoring influenza evolution.
Scientists have built predictive models grounded in evolutionary theory that use surveillance data to forecast which viral strains or clades will predominate in the coming months.
Output from these models is already being used to inform influenza vaccine strain selection.
This modeling sheds light on basic science questions: the degree to which evolution is directed and the phylogenetic and genomic signatures of fitness.
This is a success story for large-scale collaborative science.