Many organisms that cause infectious diseases, particularly RNA viruses, mutate so rapidly that their evolutionary and ecological behaviours are inextricably linked. Consequently, aspects of the transmission and epidemiology of these pathogens are imprinted on the genetic diversity of their genomes. Large-scale empirical analyses of the evolutionary dynamics of important pathogens are now feasible owing to the increasing availability of pathogen sequence data and the development of new computational and statistical methods of analysis. In this Review, we outline the questions that can be answered using viral evolutionary analysis across a wide range of biological scales.
Responding to an outbreak of a novel coronavirus agent of coronavirus disease 2019 (COVID-19) in December 2019, China banned travel to and from Wuhan city on 23 January 2020 and implemented a national emergency response. We investigated the spread and control of COVID-19 using a data set that included case reports, human movement, and public health interventions. The Wuhan shutdown was associated with the delayed arrival of COVID-19 in other cities by 2.91 days. Cities that implemented control measures preemptively reported fewer cases on average (13.0) in the first week of their outbreaks compared with cities that started control later (20.6). Suspending intracity public transport, closing entertainment venues, and banning public gatherings were associated with reductions in case incidence. The national emergency response appears to have delayed the growth and limited the size of the COVID-19 epidemic in China, averting hundreds of thousands of cases by 19 February (day 50).
Information on global human movement patterns is central to spatial epidemiological models used to predict the behavior of influenza and other infectious diseases. Yet it remains difficult to test which modes of dispersal drive pathogen spread at various geographic scales using standard epidemiological data alone. Evolutionary analyses of pathogen genome sequences increasingly provide insights into the spatial dynamics of influenza viruses, but to date they have largely neglected the wealth of information on human mobility, mainly because no statistical framework exists within which viral gene sequences and empirical data on host movement can be combined. Here, we address this problem by applying a phylogeographic approach to elucidate the global spread of human influenza subtype H3N2 and assess its ability to predict the spatial spread of human influenza A viruses worldwide. Using a framework that estimates the migration history of human influenza while simultaneously testing and quantifying a range of potential predictive variables of spatial spread, we show that the global dynamics of influenza H3N2 are driven by air passenger flows, whereas at more local scales spread is also determined by processes that correlate with geographic distance. Our analyses further confirm a central role for mainland China and Southeast Asia in maintaining a source population for global influenza diversity. By comparing model output with the known pandemic expansion of H1N1 during 2009, we demonstrate that predictions of influenza spatial spread are most accurate when data on human mobility and viral evolution are integrated. In conclusion, the global dynamics of influenza viruses are best explained by combining human mobility data with the spatial information inherent in sampled viral genomes.
The integrated approach introduced here offers great potential for epidemiological surveillance through phylogeographic reconstructions and for improving predictive models of disease control.
The incidence of dengue has grown dramatically in recent decades worldwide, especially in Southeast Asia and the Americas with substantial transmission in 2014-2015. Yet the mechanisms underlying the spatio-temporal circulation of dengue virus (DENV) serotypes at large geographical scales remain elusive. Here we investigate the co-circulation in Asia of DENV serotypes 1-3 from 1956 to 2015, using a statistical framework that jointly estimates migration history and quantifies potential predictors of viral spatial diffusion, including socio-economic, air transportation and maritime mobility data. We find that the spread of DENV-1, -2 and -3 lineages in Asia is significantly associated with air traffic. Our analyses suggest the network centrality of air traffic hubs such as Thailand and India contributes to seeding dengue epidemics, whilst China, Cambodia, Indonesia, and Singapore may establish viral diffusion links with multiple countries in Asia. Phylogeographic reconstructions help to explain how growing air transportation networks could influence the dynamics of DENV circulation.
The ongoing coronavirus disease 2019 (COVID-19) outbreak expanded rapidly throughout China. Major behavioral, clinical, and state interventions were undertaken to mitigate the epidemic and prevent the persistence of the virus in human populations in China and worldwide. It remains unclear how these unprecedented interventions, including travel restrictions, affected COVID-19 spread in China. We used real-time mobility data from Wuhan and detailed case data including travel history to elucidate the role of case importation in transmission in cities across China and to ascertain the impact of control measures. Early on, the spatial distribution of COVID-19 cases in China was explained well by human mobility data. After the implementation of control measures, this correlation dropped and growth rates became negative in most locations, although shifts in the demographics of reported cases were still indicative of local chains of transmission outside of Wuhan. This study shows that the drastic control measures implemented in China substantially mitigated the spread of COVID-19.
Hepatitis C virus (HCV) exhibits high genetic diversity, characterized by regional variations in genotype prevalence. This poses a challenge to the improved development of vaccines and pan‐genotypic treatments, which require the consideration of global trends in HCV genotype prevalence. Here we provide the first comprehensive survey of these trends. To approximate national HCV genotype prevalence, studies published between 1989 and 2013 reporting HCV genotypes are reviewed and combined with overall HCV prevalence estimates from the Global Burden of Disease (GBD) project. We also generate regional and global genotype prevalence estimates, inferring data for countries lacking genotype information. We include 1,217 studies in our analysis, representing 117 countries and 90% of the global population. We calculate that HCV genotype 1 is the most prevalent worldwide, comprising 83.4 million cases (46.2% of all HCV cases), approximately one‐third of which are in East Asia. Genotype 3 is the next most prevalent globally (54.3 million, 30.1%); genotypes 2, 4, and 6 are responsible for a total 22.8% of all cases; genotype 5 comprises the remaining <1%. While genotypes 1 and 3 dominate in most countries irrespective of economic status, the largest proportions of genotypes 4 and 5 are in lower‐income countries. Conclusion: Although genotype 1 is most common worldwide, nongenotype 1 HCV cases—which are less well served by advances in vaccine and drug development—still comprise over half of all HCV cases. Relative genotype proportions are needed to inform healthcare models, which must be geographically tailored to specific countries or regions in order to improve access to new treatments. Genotype surveillance data are needed from many countries to improve estimates of unmet need. (Hepatology 2015;61:77–87)
Understanding the causes and consequences of the emergence of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variants of concern is crucial to pandemic control yet difficult to achieve because they arise in the context of variable human behavior and immunity. We investigated the spatial invasion dynamics of lineage B.1.1.7 by jointly analyzing UK human mobility, virus genomes, and community-based polymerase chain reaction data. We identified a multistage spatial invasion process in which early B.1.1.7 growth rates were associated with mobility and asymmetric lineage export from a dominant source location, enhancing the effects of B.1.1.7's increased intrinsic transmissibility. We further explored how B.1.1.7 spread was shaped by nonpharmaceutical interventions and spatial variation in previous attack rates. Our findings show that careful accounting of the behavioral and epidemiological context within which variants of concern emerge is necessary to interpret correctly their observed relative growth rates.
B cells undergo rapid mutation and selection for antibody binding affinity when producing antibodies capable of neutralizing pathogens. This evolutionary process can be intermixed with migration between tissues, differentiation between cellular subsets, and switching between functional isotypes. B cell receptor (BCR) sequence data has the potential to elucidate important information about these processes. However, there is currently no robust, generalizable framework for making such inferences from BCR sequence data. To address this, we develop three parsimony-based summary statistics to characterize migration, differentiation, and isotype switching along B cell phylogenetic trees. We use simulations to demonstrate the effectiveness of this approach. We then use this framework to infer patterns of cellular differentiation and isotype switching from high throughput BCR sequence datasets obtained from patients in a study of HIV infection and a study of food allergy. These methods are implemented in the R package dowser, available at https://dowser.readthedocs.io.
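As a rough illustration of what a parsimony-based count of state changes along a tree involves, the sketch below applies Fitch's small-parsimony algorithm to a rooted binary tree whose tips carry a discrete label (for example a tissue or an isotype). It is not the dowser implementation and does not reproduce the three summary statistics described above; the nested-tuple tree encoding, the function name `fitch_changes`, and the tip labels are all assumptions made for this example.

```python
def fitch_changes(tree, tip_states):
    """Return (candidate_states, min_changes) for a rooted binary tree.

    tree       -- a tip name (str) or a 2-tuple of subtrees (hypothetical encoding)
    tip_states -- dict mapping each tip name to its discrete state
    """
    if isinstance(tree, str):
        # A tip: its state is observed, no changes needed below it.
        return {tip_states[tree]}, 0

    left_set, left_cost = fitch_changes(tree[0], tip_states)
    right_set, right_cost = fitch_changes(tree[1], tip_states)

    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change at this node.
        return common, left_cost + right_cost
    # Children disagree: take the union and count one state change.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical example: four sequences sampled from two tissues.
states = {"a": "blood", "b": "blood", "c": "spleen", "d": "spleen"}
clustered = (("a", "b"), ("c", "d"))   # tissues form separate clades
mixed = (("a", "c"), ("b", "d"))       # tissues interleaved on the tree

print(fitch_changes(clustered, states)[1])  # 1
print(fitch_changes(mixed, states)[1])      # 2
```

Fewer changes than expected by chance suggest that the trait is clustered on the tree. Judging what counts as "expected" requires comparison with a null distribution, for instance by permuting tip labels, which this sketch leaves out.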
Phylogenetic analysis is now an important tool in the study of viral outbreaks. It can reconstruct epidemic history when surveillance epidemiology data are sparse, and can indicate transmission linkages among infections that may not otherwise be evident. However, a remaining challenge is to develop an analytical framework that can test hypotheses about the effect of environmental variables on pathogen spatial spread. Recent phylogeographic approaches can reconstruct the history of virus dispersal from sampled viral genomes and infer the locations of ancestral infections. Such methods provide a unique source of spatio-temporal information, and are exploited here.
We present and apply a new statistical framework that combines genomic and geographic data to test the impact of environmental variables on the mode and tempo of pathogen dispersal during emerging epidemics. First, the spatial history of an emerging pathogen is estimated using standard phylogeographic methods. The inferred dispersal path for each phylogenetic lineage is then assigned a "weight" using environmental data (e.g. altitude, land cover). Next, tests measure the association between each environmental variable and lineage movement. A randomisation procedure is used to assess statistical confidence and we validate this approach using simulated data. We apply our new framework to a set of gene sequences from an epidemic of rabies virus in North American raccoons. We test the impact of six different environmental variables on this epidemic and demonstrate that elevation is associated with a slower rabies spread in a natural population.
This study shows that it is possible to integrate genomic and environmental data in order to test hypotheses concerning the mode and tempo of virus dispersal during emerging epidemics.
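The randomisation idea in the framework above can be sketched in a much simplified form. The code below assumes each phylogenetic lineage has already been reduced to two numbers: a dispersal duration and an environmental "weight" along its inferred path (for example mean elevation). It then asks whether the observed correlation between the two exceeds what arises when weights are shuffled among lineages. This is a plain permutation test standing in for the authors' randomisation procedure, which is not specified in enough detail here to reproduce. The function names, the one-sided test, and the example values are all hypothetical.

```python
import random
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def permutation_p_value(durations, env_weights, n_perm=999, seed=1):
    """One-sided permutation test for a positive duration/weight association.

    Returns (observed_r, p_value). The +1 terms count the observed
    arrangement as one of the permutations, so p is never exactly zero.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = pearson_r(durations, env_weights)
    shuffled = list(env_weights)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between lineage and weight
        if pearson_r(durations, shuffled) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


# Hypothetical lineages: longer dispersal durations on higher-elevation paths.
durations = [1.0, 1.4, 2.1, 2.9, 3.2, 4.0, 4.8, 5.5]
elevation = [100, 150, 220, 300, 310, 420, 500, 540]
r, p = permutation_p_value(durations, elevation)
print(f"r = {r:.3f}, p = {p:.3f}")
```

Shuffling weights among lineages ignores the spatial autocorrelation of real landscapes, so a test of this kind is likely to be anti-conservative compared with a randomisation that respects spatial structure.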
Gene sequences sampled at different points in time can be used to infer molecular phylogenies on a natural timescale of months or years, provided that the sequences in question undergo measurable amounts of evolutionary change between sampling times. Data sets with this property are termed heterochronous and have become increasingly common in several fields of biology, most notably the molecular epidemiology of rapidly evolving viruses. Here we introduce the cross-platform software tool, TempEst (formerly known as Path-O-Gen), for the visualization and analysis of temporally sampled sequence data. Given a molecular phylogeny and the dates of sampling for each sequence, TempEst uses an interactive regression approach to explore the association between genetic divergence through time and sampling dates. TempEst can be used to (1) assess whether there is sufficient temporal signal in the data to proceed with phylogenetic molecular clock analysis, and (2) identify sequences whose genetic divergence and sampling date are incongruent. Examination of the latter can help identify data quality problems, including errors in data annotation, sample contamination, sequence recombination, or alignment error. We recommend that all users of the molecular clock models implemented in BEAST first check their data using TempEst prior to analysis.
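The regression of genetic divergence against sampling date can be illustrated with an ordinary least-squares fit. The sketch below is not TempEst code; it assumes the root-to-tip distances have already been read off a rooted tree, and the function name and example values are hypothetical. Under those assumptions the slope approximates the evolutionary rate, the x-intercept approximates the date of the root, and large residuals flag sequences whose divergence and sampling date disagree.

```python
from statistics import mean


def root_to_tip_regression(dates, distances):
    """Least-squares fit of root-to-tip distance on sampling date.

    dates     -- sampling dates as decimal years
    distances -- root-to-tip genetic distances (substitutions per site)
    """
    if len(dates) != len(distances) or len(dates) < 3:
        raise ValueError("need at least three (date, distance) pairs")

    mx, my = mean(dates), mean(distances)
    sxx = sum((x - mx) ** 2 for x in dates)
    sxy = sum((x - mx) * (y - my) for x, y in zip(dates, distances))
    syy = sum((y - my) ** 2 for y in distances)
    if sxx == 0:
        raise ValueError("all sampling dates are identical; no temporal signal")

    slope = sxy / sxx                      # approximate substitutions/site/year
    intercept = my - slope * mx
    r_squared = (sxy ** 2) / (sxx * syy) if syy > 0 else 0.0
    residuals = [y - (intercept + slope * x) for x, y in zip(dates, distances)]
    # The x-intercept is only meaningful as a root date when the slope is positive.
    root_date = -intercept / slope if slope > 0 else None
    return {
        "slope": slope,
        "intercept": intercept,
        "r_squared": r_squared,
        "root_date": root_date,
        "residuals": residuals,
    }


# Hypothetical clock-like data: 0.001 substitutions/site/year since 1990.
dates = [2000.0, 2001.0, 2002.0, 2003.0]
distances = [0.010, 0.011, 0.012, 0.013]
fit = root_to_tip_regression(dates, distances)
print(fit["slope"], fit["root_date"], fit["r_squared"])
```

Because tips on a tree share ancestry, the points are not independent, so the fit is best read as an exploratory check of temporal signal rather than a formal test.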