Genome-Wide Association Studies (GWAS) in microbial organisms have the potential to vastly improve the way we understand, manage, and treat infectious diseases. Yet, microbial GWAS methods established thus far remain insufficiently able to capitalise on the growing wealth of bacterial and viral genetic sequence data. Facing clonal population structure and homologous recombination, existing GWAS methods struggle to achieve both the precision necessary to reject spurious findings and the power required to detect associations in microbes. In this paper, we introduce a novel phylogenetic approach that has been tailor-made for microbial GWAS, which is applicable to organisms ranging from purely clonal to frequently recombining, and to both binary and continuous phenotypes. Our approach is robust to the confounding effects of both population structure and recombination, while maintaining high statistical power to detect associations. Thorough testing via application to simulated data provides strong support for the power and specificity of our approach and demonstrates the advantages offered over alternative cluster-based and dimension-reduction methods. Two applications to Neisseria meningitidis illustrate the versatility and potential of our method, confirming previously identified penicillin resistance loci and resulting in the identification of both well-characterised and novel drivers of invasive disease. Our method is implemented as an open-source R package called treeWAS which is freely available at https://github.com/caitiecollins/treeWAS.
Recombination is an important evolutionary force in bacteria, but it remains challenging to reconstruct the imports that occurred in the ancestry of a genomic sample. Here we present ClonalFrameML, which uses maximum likelihood inference to simultaneously detect recombination in bacterial genomes and account for it in phylogenetic reconstruction. ClonalFrameML can analyse hundreds of genomes in a matter of hours, and we demonstrate its usefulness on simulated and real datasets. We find evidence for recombination hotspots associated with mobile elements in Clostridium difficile ST6 and a previously undescribed 310 kb chromosomal replacement in Staphylococcus aureus ST582. ClonalFrameML is freely available at http://clonalframeml.googlecode.com/.
It is standard practice to test for the signature of homologous recombination in studies examining the genetic diversity of bacterial populations. Although it has emerged that homologous recombination rates can vary widely between species, comparing the results from different studies is made difficult by the diversity of estimation methods used. Here, Multi Locus Sequence Typing (MLST) datasets from a wide variety of bacteria and archaea are analyzed using the ClonalFrame method. This enables a direct comparison between species and allows for a first exploration of the question of whether phylogeny or ecology is the primary determinant of homologous recombination rate.
We describe a model-based method for using multilocus sequence data to infer the clonal relationships of bacteria and the chromosomal position of homologous recombination events that disrupt a clonal pattern of inheritance. The key assumption of our model is that recombination events introduce a constant rate of substitutions to a contiguous region of sequence. The method is applicable both to multilocus sequence typing (MLST) data from a few loci and to alignments of multiple bacterial genomes. It can be used to decide whether a subset of isolates share common ancestry, to estimate the age of the common ancestor, and hence to address a variety of epidemiological and ecological questions that hinge on the pattern of bacterial spread. It should also be useful in associating particular genetic events with the changes in phenotype that they cause. We show that the model outperforms existing methods of subdividing recombinogenic bacteria using MLST data and provide examples from Salmonella and Bacillus. The software used in this article, ClonalFrame, is available from http://bacteria.stats.ox.ac.uk/.
The human pathogen Helicobacter pylori displays extensive genetic diversity. While H. pylori is known to evolve during infection, population dynamics inside the gastric environment have not been extensively investigated. Here we obtained gastric biopsies from multiple stomach regions of 16 H. pylori-infected adults, and analyzed the genomes of 10 H. pylori isolates from each biopsy. Phylogenetic analyses suggest location-specific evolution and bacterial migration between gastric regions. Migration is significantly more frequent between the corpus and the fundus than with the antrum, suggesting that physiological differences between antral and oxyntic mucosa contribute to spatial partitioning of H. pylori populations. Associations between H. pylori gene polymorphisms and stomach niches suggest that chemotaxis, regulatory functions and outer membrane proteins contribute to specific adaptation to the antral and oxyntic mucosa. Moreover, we show that antibiotics can induce severe population bottlenecks and likely play a role in shaping the population structure of H. pylori.
The burial rates of males and females in early modern central London were compared to investigate a possible bias towards male mortality in the plague years of 1563, 1593, 1603, 1625 and 1665. The burial records of sixteen parishes were examined and compared with the five-year periods immediately preceding each plague year, when recorded burials were substantially fewer. A markedly higher burial rate for males was detected in each plague year, but this can be partly attributed to a general preponderance of males in the central London population, since there was a similar but lesser bias in non-plague years. In the plague years the difference between the frequency of male and female adult burials appears to have been enhanced by the preferential migration of women of childbearing age out of the city, since fewer births were recorded in months when plague was rife. Furthermore, when a sample of households was investigated, husbands were significantly more likely to have been buried than their wives. These findings were largely applicable to the plague years of 1603, 1625 and 1665 but were far less apparent in 1563 and 1593. In general, there were more burials of boys than girls in non-plague years, which is the expected consequence of their greater vulnerability to childhood diseases. This difference diminished in plague years so that the burials of girls and boys approached parity at a time when burials of children of both sexes were significantly increased. Possibly, plague did not discriminate between the sexes and this characteristic tended to mask the usual vulnerability of boys.
Coronavirus disease 2019 (COVID-19) was first identified in late 2019 in Wuhan, Hubei Province, China and spread globally in months, sparking worldwide concern. However, it is unclear whether super-spreading events occurred during the early outbreak phase, as has been observed for other emerging viruses. Here, we analyse 208 publicly available SARS-CoV-2 genome sequences collected during the early outbreak phase. We combine phylogenetic analysis with Bayesian inference under an epidemiological model to trace person-to-person transmission. The dispersion parameter of the offspring distribution in the inferred transmission chain was estimated to be 0.23 (95% CI: 0.13-0.38), indicating there are individuals who directly infected a disproportionately large number of people. Our results showed that super-spreading events played an important role in the early stage of the COVID-19 outbreak.
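To illustrate why a dispersion parameter of 0.23 points to super-spreading, the following is a minimal sketch that assumes the offspring distribution is negative binomial in its mean/dispersion form, a common choice for such analyses but not confirmed by the abstract. The mean number of secondary cases (`R_ASSUMED = 2.5`) is a hypothetical value chosen purely for illustration, and the function names are ours.

```python
import math

def nb_pmf(n, mean, k):
    """Probability of exactly n secondary cases under a negative binomial
    offspring distribution with the given mean and dispersion k."""
    log_p = (math.lgamma(n + k) - math.lgamma(k) - math.lgamma(n + 1)
             + k * math.log(k / (k + mean))
             + n * math.log(mean / (k + mean)))
    return math.exp(log_p)

def prob_no_onward_transmission(mean, k):
    """Closed form for P(0 secondary cases) = (1 + mean/k)^(-k)."""
    return (1.0 + mean / k) ** (-k)

K = 0.23          # point estimate of the dispersion parameter quoted above
R_ASSUMED = 2.5   # hypothetical mean number of secondary cases (assumption)

# Share of infected individuals expected to infect nobody.
p_zero = prob_no_onward_transmission(R_ASSUMED, K)

# Share expected to infect ten or more people (the heavy right tail).
p_ten_plus = 1.0 - sum(nb_pmf(n, R_ASSUMED, K) for n in range(10))

print(f"P(no secondary cases)  = {p_zero:.3f}")
print(f"P(>= 10 secondary cases) = {p_ten_plus:.3f}")
```

Under these assumptions, a small dispersion value means that most infected individuals transmit to nobody while a minority account for a large share of transmission; rerunning with a larger `K` moves the distribution towards a Poisson-like shape with a much thinner tail.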
The sequencing and comparative analysis of a collection of bacterial genomes from a single species or lineage of interest can lead to key insights into its evolution, ecology or epidemiology. The tool of choice for such a study is often to build a phylogenetic tree, and more specifically when possible a dated phylogeny, in which the dates of all common ancestors are estimated. Here, we propose a new Bayesian methodology to construct dated phylogenies which is specifically designed for bacterial genomics. Unlike previous Bayesian methods aimed at building dated phylogenies, we consider that the phylogenetic relationships between the genomes have been previously evaluated using a standard phylogenetic method, which makes our methodology much faster and more scalable. This two-step approach also allows us to directly exploit existing phylogenetic methods that detect bacterial recombination, and therefore to account for the effect of recombination in the construction of a dated phylogeny. We analysed many simulated datasets in order to benchmark the performance of our approach in a wide range of situations. Furthermore, we present applications to three different real datasets from recent bacterial genomic studies. Our methodology is implemented in an R package called BactDating which is freely available for download at https://github.com/xavierdidelot/BactDating.
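As a rough illustration of what dating a phylogeny involves, the sketch below uses root-to-tip regression, a much simpler technique than the Bayesian method described above and not the BactDating implementation. It regresses each tip's distance from the root on its sampling date: the slope estimates the substitution rate and the x-intercept estimates the date of the root. The function name and the toy data are invented for this example.

```python
def root_to_tip_regression(dates, distances):
    """Ordinary least-squares fit of root-to-tip distance on sampling date.

    Returns (rate, root_date): the slope in substitutions per unit time and
    the date at which the fitted line crosses zero distance.
    """
    n = len(dates)
    mean_t = sum(dates) / n
    mean_d = sum(distances) / n
    sxx = sum((t - mean_t) ** 2 for t in dates)
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(dates, distances))
    rate = sxy / sxx
    intercept = mean_d - rate * mean_t
    root_date = -intercept / rate
    return rate, root_date

# Made-up example: four genomes sampled five years apart, with root-to-tip
# distances (in substitutions) that grow by exactly 2 per year.
sampling_dates = [2000.0, 2005.0, 2010.0, 2015.0]
root_to_tip = [20.0, 30.0, 40.0, 50.0]

rate, root_date = root_to_tip_regression(sampling_dates, root_to_tip)
print(f"estimated rate: {rate:.2f} substitutions per year")
print(f"estimated root date: {root_date:.1f}")
```

A regression of this kind is usually only a first check for temporal signal; it yields no dates for internal nodes and no credible intervals, which is what a full dated-phylogeny method provides.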
Colistin represents one of the few available drugs for treating infections caused by carbapenem-resistant Enterobacteriaceae. As such, the recent plasmid-mediated spread of the colistin resistance gene mcr-1 poses a significant public health threat, requiring global monitoring and surveillance. Here, we characterize the global distribution of mcr-1 using a data set of 457 mcr-1-positive sequenced isolates. We find mcr-1 in various plasmid types but identify an immediate background common to all mcr-1 sequences. Our analyses establish that all mcr-1 elements in circulation descend from the same initial mobilization of mcr-1 by an ISApl1 transposon in the mid-2000s (2002-2008; 95% highest posterior density), followed by a marked demographic expansion, which led to its current global distribution. Our results provide the first systematic phylogenetic analysis of the origin and spread of mcr-1, and emphasize the importance of understanding the movement of antibiotic resistance genes across multiple levels of genomic organization.
Recent advances in bacterial whole-genome sequencing have resulted in a comprehensive catalog of antibiotic resistance genomic signatures in Mycobacterium tuberculosis. With a view to pre-empting the emergence of resistance, we hypothesized that pre-existing polymorphisms in susceptible genotypes (pre-resistance mutations) could increase the risk of becoming resistant in the future. We sequenced whole genomes from 3135 isolates sampled over a 17-year period. After reconstructing ancestral genomes on time-calibrated phylogenetic trees, we developed and applied a genome-wide survival analysis to determine the hazard of resistance acquisition. We demonstrate that M. tuberculosis lineage 2 has a higher risk of acquiring resistance than lineage 4, and estimate a higher hazard of rifampicin resistance evolution following isoniazid mono-resistance. Furthermore, we describe loci and genomic polymorphisms associated with a higher risk of resistance acquisition. Identifying markers of future antibiotic resistance could enable targeted therapy to prevent resistance emergence in M. tuberculosis and other pathogens.