The molecular clock and its phylogenetic applications to genomic data have changed how we study and understand one of the major human pathogens, Mycobacterium tuberculosis (MTB), the etiologic agent of tuberculosis. Genome sequences of MTB strains sampled at different times are increasingly used to infer when a particular outbreak began, when a drug-resistant clone appeared and expanded, or when a strain was introduced into a specific region. Despite the growing importance of the molecular clock in tuberculosis research, there is a lack of consensus as to whether MTB displays clocklike behavior and about its rate of evolution. Here we performed a systematic study of the molecular clock of MTB on a large genomic data set (6,285 strains), covering different epidemiological settings and most of the known global diversity. We found that sampling times below 15-20 years were often insufficient to calibrate the clock of MTB. For data sets where such calibration was possible, we obtained a clock rate between 1×10⁻⁸ and 5×10⁻⁷ nucleotide changes per site per year (0.04-2.2 SNPs per genome per year), with substantial differences between clades. These estimates were not strongly dependent on the time of the calibration points, as they changed only marginally when we used epidemiological isolates (sampled in the last 40 years) or three ancient DNA samples (about 1,000 years old) to calibrate the tree. Additionally, the uncertainty and the discrepancies in the results of different methods were sometimes large, highlighting the importance of using different methods and of carefully considering their assumptions and limitations.
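The two rate ranges quoted in this abstract are related by the genome length. A minimal sketch of that conversion follows, assuming the length of the H37Rv reference genome (about 4.41 Mb); the abstract does not state which genome length the authors used, and the function name is illustrative only.

```python
# Assumption: length of the M. tuberculosis H37Rv reference genome in base pairs.
# The abstract does not say which genome length was used for its conversion.
GENOME_LENGTH_BP = 4_411_532


def snps_per_genome_per_year(rate_per_site_per_year, genome_length_bp=GENOME_LENGTH_BP):
    """Convert a per-site clock rate into expected SNPs per genome per year."""
    return rate_per_site_per_year * genome_length_bp


# Lower and upper clock rates reported in the abstract (changes per site per year).
low = snps_per_genome_per_year(1e-8)
high = snps_per_genome_per_year(5e-7)

print(f"{low:.2f} to {high:.1f} SNPs per genome per year")
```

Under this assumed genome length the result agrees with the 0.04-2.2 SNPs per genome per year given in the abstract.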
Full text
Available at:
DOBA, IZUM, KILJ, NUK, PILJ, PNG, SAZU, SIK, UILJ, UKNU, UL, UM, UPUK
Bayesian molecular dating: opening up the black box
Bromham, Lindell; Duchêne, Sebastián; Hua, Xia ...
Biological reviews of the Cambridge Philosophical Society, 20/May, Volume: 93, Issue: 2
Journal Article
Peer reviewed
Open access
ABSTRACT
Molecular dating analyses allow evolutionary timescales to be estimated from genetic data, offering an unprecedented capacity for investigating the evolutionary past of all species. These methods require us to make assumptions about the relationship between genetic change and evolutionary time, often referred to as a ‘molecular clock’. Although initially regarded with scepticism, molecular dating has now been adopted in many areas of biology. This broad uptake has been due partly to the development of Bayesian methods that allow complex aspects of molecular evolution, such as variation in rates of change across lineages, to be taken into account. But in order to do this, Bayesian dating methods rely on a range of assumptions about the evolutionary process, which vary in their degree of biological realism and empirical support. These assumptions can have substantial impacts on the estimates produced by molecular dating analyses. The aim of this review is to open the ‘black box’ of Bayesian molecular dating and have a look at the machinery inside. We explain the components of these dating methods, the important decisions that researchers must make in their analyses, and the factors that need to be considered when interpreting results. We illustrate the effects that the choices of different models and priors can have on the outcome of the analysis, and suggest ways to explore these impacts. We describe some major research directions that may improve the reliability of Bayesian dating. The goal of our review is to help researchers to make informed choices when using Bayesian phylogenetic methods to estimate evolutionary rates and timescales.
Severe liver abscess infections caused by hypervirulent clonal-group CG23 Klebsiella pneumoniae have been increasingly reported since the mid-1980s. Strains typically possess several virulence factors including an integrative, conjugative element ICEKp encoding the siderophore yersiniabactin and genotoxin colibactin. Here we investigate CG23's evolutionary history, showing several deep-branching sublineages associated with distinct ICEKp acquisitions. Over 80% of liver abscess isolates belong to sublineage CG23-I, which emerged in ~1928 following acquisition of ICEKp10 (encoding yersiniabactin and colibactin), and then disseminated globally within the human population. CG23-I's distinguishing feature is the colibactin synthesis locus, which reportedly promotes gut colonisation and metastatic infection in murine models. These data show circulation of CG23 K. pneumoniae decades before the liver abscess epidemic was first recognised, and provide a framework for future epidemiological and experimental studies of hypervirulent K. pneumoniae. To support such studies we present an open access, completely sequenced CG23-I human liver abscess isolate, SGH10.
Klebsiella pneumoniae has emerged as an important cause of two distinct public health threats: multi-drug resistant (MDR) healthcare-associated infections and drug susceptible community-acquired invasive infections. These pathotypes are generally associated with two distinct subsets of K. pneumoniae lineages or 'clones' that are distinguished by the presence of acquired resistance genes and several key virulence loci. Genomic evolutionary analyses of the most notorious MDR and invasive community-associated ('hypervirulent') clones indicate differences in terms of chromosomal recombination dynamics and capsule polysaccharide diversity, but it remains unclear if these differences represent generalised trends. Here we leverage a collection of >2200 K. pneumoniae genomes to identify 28 common clones (n ≥ 10 genomes each), and perform the first genomic evolutionary comparison. Eight MDR and 6 hypervirulent clones were identified on the basis of acquired resistance and virulence gene prevalence. Chromosomal recombination, surface polysaccharide locus diversity, pan-genome, plasmid and phage dynamics were characterised and compared. The data showed that MDR clones were highly diverse, with frequent chromosomal recombination generating extensive surface polysaccharide locus diversity. Additional pan-genome diversity was driven by frequent acquisition/loss of both plasmids and phage. In contrast, chromosomal recombination was rare in the hypervirulent clones, which also showed a significant reduction in pan-genome diversity, largely driven by a reduction in plasmid diversity. Hence the data indicate that hypervirulent clones may be subject to some sort of constraint for horizontal gene transfer that does not apply to the MDR clones. Our findings are relevant for understanding the risk of emergence of individual K. pneumoniae strains carrying both virulence and acquired resistance genes, which have been increasingly reported and cause highly virulent infections that are extremely difficult to treat. Specifically, our data indicate that MDR clones pose the greatest risk, because they are more likely to acquire virulence genes than hypervirulent clones are to acquire resistance genes.
In Australia, we have relied on genomic epidemiology (and associated derived parameters such as viral growth rate, reproductive number, and estimated sampling proportion) to inform public health policy changes 1. Genome surveillance alone may not be sufficiently informative to produce meaningful epidemiological estimates. ...we argue here that more genomes do not necessarily mean better results.
Sustainable sequencing to track the COVID-19 pandemic
Whilst public health and social measures, quarantine restrictions and vaccination have all been utilised in past and current pandemics, the COVID-19 pandemic is the first to employ genomic sequencing on a massive global scale. In both Australia and New Zealand, where the proportion of cases sequenced has been substantial, we have been able to use the data at the “macro” level (studying global evolution of the virus, emergence of VOCs and informing public health policies) and at the “micro” level (inferring local transmission networks and the impact of public health interventions on genomic clusters) 1,6–9.
A fundamental challenge in resolving evolutionary relationships across the tree of life is to account for heterogeneity in the evolutionary signal across loci. Studies of marsupial mammals have demonstrated that this heterogeneity can be substantial, leaving considerable uncertainty in the evolutionary timescale and relationships within the group. Using simulations and a new phylogenomic data set comprising nucleotide sequences of 1550 loci from 18 of the 22 extant marsupial families, we demonstrate the power of a method for identifying clusters of loci that support different phylogenetic trees. We find two distinct clusters of loci, each providing an estimate of the species tree that matches previously proposed resolutions of the marsupial phylogeny. We also identify a well-supported placement for the enigmatic marsupial moles (Notoryctes) that contradicts previous molecular estimates but is consistent with morphological evidence. The pattern of gene-tree variation across tree-space is characterized by changes in information content, GC content, substitution-model adequacy, and signatures of purifying selection in the data. In a simulation study, we show that incomplete lineage sorting can explain the division of loci into the two tree-topology clusters, as found in our phylogenomic analysis of marsupials. We also demonstrate the potential benefits of minimizing uncertainty from phylogenetic conflict for molecular dating. Our analyses reveal that Australasian marsupials appeared in the early Paleocene, whereas the diversification of present-day families occurred primarily during the late Eocene and early Oligocene. Our methods provide an intuitive framework for improving the accuracy and precision of phylogenetic inference and molecular dating using genome-scale data.
We are writing in response to a recent critique by Emerson & Hickerson (), who challenge the evidence of a time‐dependent bias in molecular rate estimates. This bias takes the form of a negative relationship between inferred evolutionary rates and the ages of the calibrations on which these estimates are based. Here, we present a summary of the evidence obtained from a broad range of taxa that supports a time‐dependent bias in rate estimates, with a consideration of the potential causes of these observed trends. We also describe recent progress in improving the reliability of evolutionary rate estimation and respond to the concerns raised by Emerson & Hickerson () about the validity of rates estimated from time‐structured sequence data. In doing so, we hope to dispel some misconceptions and to highlight several research directions that will improve our understanding of time‐dependent biases in rate estimates.
Phylodynamic inference is a pivotal tool in understanding transmission dynamics of viral outbreaks. These analyses are strongly guided by the input of an epidemiological model as well as sequence data that must contain sufficient intersequence variability in order to be informative. These criteria, however, may not be met during the early stages of an outbreak. Here we investigate the impact of low diversity sequence data on phylodynamic inference using the birth-death and coalescent exponential models. Our simulation study shows that estimating the molecular evolutionary rate requires sufficient sequence diversity and is an essential first step for any phylodynamic inference. Following this, the birth-death model outperforms the coalescent exponential model in estimating epidemiological parameters when faced with low diversity sequence data, because it explicitly exploits the sampling times. In contrast, the coalescent model requires additional samples, and therefore more variability in the sequence data, before accurate estimates can be obtained. These findings were also supported by our empirical analyses of Australian and New Zealand cluster outbreaks of SARS-CoV-2. Overall, the birth-death model is more robust when applied to data sets with low sequence diversity, provided that sampling is specified, and this should be considered in future viral outbreak investigations.
Determining the time scale of virus evolution is central to understanding their origins and emergence. The phylogenetic methods commonly used for this purpose can be misleading if the substitution model makes incorrect assumptions about the data. Empirical studies consider a pool of models and select that with the highest statistical fit. However, this does not allow the rejection of all models, even if they poorly describe the data. An alternative is to use model adequacy methods that evaluate the ability of a model to predict hypothetical future observations. This can be done by comparing the empirical data with data generated under the model in question. We conducted simulations to evaluate the sensitivity of such methods with nucleotide, amino acid, and codon data. These effectively detected underparameterized models, but failed to detect mutational saturation and some instances of nonstationary base composition, which can lead to biases in estimates of tree topology and length. To test the applicability of these methods with real data, we analyzed nucleotide and amino acid data sets from the genus Flavivirus of RNA viruses. In most cases these models were inadequate, with the exception of a data set of relatively closely related sequences of Dengue virus, for which the GTR+Γ nucleotide and LG+Γ amino acid substitution models were adequate. Our results partly explain the lack of consensus over estimates of the long-term evolutionary time scale of these viruses, and indicate that assessing the adequacy of substitution models should be routinely used to determine whether estimates are reliable.
Genomic sequencing has significant potential to inform public health management for SARS-CoV-2. Here we report high-throughput genomics for SARS-CoV-2, sequencing 80% of cases in Victoria, Australia (population 6.24 million) between 6 January and 14 April 2020 (total 1,333 COVID-19 cases). We integrate epidemiological, genomic and phylodynamic data to identify clusters and impact of interventions. The global diversity of SARS-CoV-2 is represented, consistent with multiple importations. Seventy-six distinct genomic clusters were identified, including large clusters associated with social venues, healthcare and cruise ships. Sequencing sequential samples from 98 patients reveals minimal intra-patient SARS-CoV-2 genomic diversity. Phylodynamic modelling indicates a significant reduction in the effective viral reproductive number (Rₑ) from 1.63 to 0.48 after implementing travel restrictions and physical distancing. Our data provide a concrete framework for the use of SARS-CoV-2 genomics in public health responses, including its use to rapidly identify SARS-CoV-2 transmission chains, increasingly important as social restrictions ease globally.
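The size of the reported change in the effective reproductive number can be expressed as a relative reduction. The short sketch below does only that arithmetic on the two values quoted in the abstract; the function name is illustrative and this is not the phylodynamic model the authors fitted.

```python
def relative_reduction(before, after):
    """Fractional decrease from `before` to `after`."""
    return (before - after) / before


# Effective reproductive numbers quoted in the abstract,
# before and after travel restrictions and physical distancing.
r_before, r_after = 1.63, 0.48

reduction = relative_reduction(r_before, r_after)
print(f"Relative reduction: {reduction:.1%}")

# A value below 1 means each case produces, on average, fewer than one new case.
print("Below epidemic threshold after interventions:", r_after < 1)
```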