The package adegenet for the R software is dedicated to the multivariate analysis of genetic markers. It extends the ade4 package of multivariate methods by implementing formal classes and functions to manipulate and analyse genetic markers. Data can be imported from common population genetics software and exported to other software and R packages. adegenet also implements standard population genetics tools along with more original approaches for spatial genetics and hybridization. Availability: The stable version is available from CRAN: http://cran.r-project.org/mirrors.html. The development version is available from the adegenet website: http://adegenet.r-forge.r-project.org/. Both versions can be installed directly from R. adegenet is distributed under the GNU General Public Licence (v.2). Contact: jombart@biomserv.univ-lyon1.fr Supplementary information: Supplementary data are available at Bioinformatics online.
While the R software is becoming a standard for the analysis of genetic data, classical population genetics tools are being challenged by the increasing availability of genomic sequences. Dedicated tools are needed for harnessing the large amount of information generated by next-generation sequencing technologies. We introduce new tools implemented in the adegenet 1.3-1 package for handling and analyzing genome-wide single nucleotide polymorphism (SNP) data. Using a bit-level coding scheme for SNP data and parallelized computation, adegenet enables the analysis of large genome-wide SNP datasets using standard personal computers.
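To illustrate what a bit-level coding scheme for SNP data can look like, here is a minimal Python sketch that stores eight binary allele calls per byte. It is an assumption-laden illustration, not adegenet's actual implementation: the function names are hypothetical, only biallelic haploid calls coded 0/1 are handled, and missing data are ignored.

```python
def pack_snps(alleles):
    """Pack a sequence of 0/1 allele calls into bytes, 8 SNPs per byte."""
    packed = bytearray((len(alleles) + 7) // 8)
    for i, allele in enumerate(alleles):
        if allele not in (0, 1):
            raise ValueError("this sketch only supports binary allele calls")
        if allele:
            # SNP i lives in byte i // 8, at bit position i % 8.
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed)


def unpack_snps(packed, n_snps):
    """Recover the first n_snps allele calls from a packed byte string."""
    return [(packed[i // 8] >> (i % 8)) & 1 for i in range(n_snps)]


def count_second_alleles(packed):
    """Count SNPs carrying allele 1 without unpacking (population count)."""
    return sum(bin(byte).count("1") for byte in packed)


if __name__ == "__main__":
    calls = [1, 0, 0, 1, 1, 0, 1, 0, 1, 1]  # hypothetical genotype
    packed = pack_snps(calls)
    print(len(calls), "SNPs stored in", len(packed), "bytes")
    print(unpack_snps(packed, len(calls)) == calls)
```

Storing one SNP per bit rather than one per integer is where the memory saving comes from; summary statistics such as allele counts can then be computed directly on the packed bytes.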
Availability:
adegenet 1.3-1 is available from CRAN: http://cran.r-project.org/web/packages/adegenet/. Information and support including a dedicated forum of discussion can be found on the adegenet website: http://adegenet.r-forge.r-project.org/. adegenet is released with a manual and four tutorials totalling over 300 pages of documentation, and distributed under the GNU General Public Licence (≥2).
Contact:
t.jombart@imperial.ac.uk
Supplementary Information:
Supplementary data are available at Bioinformatics online.
The COVID-19 pandemic has placed an unprecedented strain on health systems, with rapidly increasing demand for healthcare in hospitals and intensive care units (ICUs) worldwide. As the pandemic escalates, determining the resulting needs for healthcare resources (beds, staff, equipment) has become a key priority for many countries. Projecting future demand requires estimates of how long patients with COVID-19 need different levels of hospital care.
We performed a systematic review of early evidence on length of stay (LoS) of patients with COVID-19 in hospital and in ICU. We subsequently developed a method to generate LoS distributions which combines summary statistics reported in multiple studies, accounting for differences in sample sizes. Applying this approach, we provide distributions for total hospital and ICU LoS from studies in China and elsewhere, for use by the community.
We identified 52 studies, the majority from China (46/52). Median hospital LoS ranged from 4 to 53 days within China, and 4 to 21 days outside of China, across 45 studies. ICU LoS was reported by eight studies (four each within and outside China), with median values ranging from 6 to 12 and 4 to 19 days, respectively. Our summary distributions have a median hospital LoS of 14 (IQR 10-19) days for China, compared with 5 (IQR 3-9) days outside of China. For ICU, the summary distributions are more similar (median (IQR) of 8 (5-13) days for China and 7 (4-11) days outside of China). There was a visible difference by discharge status, with patients who were discharged alive having longer LoS than those who died during their admission, but no trend associated with study date.
Patients with COVID-19 in China appeared to remain in hospital for longer than elsewhere. This may be explained by differences in criteria for admission and discharge between countries, and different timing within the pandemic. In the absence of local data, the combined summary LoS distributions provided here can be used to model bed demands for contingency planning and then updated, with the novel method presented here, as more studies with aggregated statistics emerge outside China.
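One way to combine per-study summary statistics into a single LoS distribution, weighting by sample size, can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' procedure: it assumes each study reports a median and interquartile range, assumes a Weibull shape for LoS, and uses made-up study values and hypothetical function names.

```python
import math
import random


def weibull_from_quartiles(q1, median, q3):
    """Return (shape, scale) of a Weibull matching the reported quartiles.

    The shape follows from the Q3/Q1 ratio; the scale is then chosen so
    that the Weibull median equals the reported median.
    """
    if not 0 < q1 < q3:
        raise ValueError("need 0 < q1 < q3")
    shape = math.log(math.log(4) / math.log(4 / 3)) / math.log(q3 / q1)
    scale = median / math.log(2) ** (1 / shape)
    return shape, scale


def pooled_los_sample(studies, n_draws=10000, seed=1):
    """Draw LoS values from a mixture of per-study Weibulls.

    Each draw first picks a study with probability proportional to its
    sample size, then samples from that study's fitted distribution.
    """
    rng = random.Random(seed)
    params = [weibull_from_quartiles(s["q1"], s["median"], s["q3"]) for s in studies]
    weights = [s["n"] for s in studies]
    picks = rng.choices(params, weights=weights, k=n_draws)
    # random.weibullvariate takes (scale, shape) in that order.
    return [rng.weibullvariate(scale, shape) for shape, scale in picks]


if __name__ == "__main__":
    # Hypothetical studies, for illustration only.
    studies = [
        {"n": 120, "q1": 9, "median": 13, "q3": 18},
        {"n": 40, "q1": 4, "median": 6, "q3": 10},
    ]
    draws = sorted(pooled_los_sample(studies))
    print("pooled median LoS (days):", round(draws[len(draws) // 2], 1))
```

Because the pooled sample is built from per-study summaries only, it can be regenerated whenever a new study is added to the list.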
There exists significant interest in developing statistical and computational tools for inferring 'who infected whom' in an infectious disease outbreak from densely sampled case data, with most recent studies focusing on the analysis of whole genome sequence data. However, genomic data can be poorly informative of transmission events if mutations accumulate too slowly to resolve individual transmission pairs or if there exist multiple pathogen lineages within-host, and there has been little focus on incorporating other types of outbreak data. We present here a methodology that uses contact data for the inference of transmission trees in a statistically rigorous manner, alongside genomic data and temporal data. Contact data is frequently collected in outbreaks of pathogens spread by close contact, including Ebola virus (EBOV), severe acute respiratory syndrome coronavirus (SARS-CoV) and Mycobacterium tuberculosis (TB), and routinely used to reconstruct transmission chains. As an improvement over previous, ad-hoc approaches, we developed a probabilistic model that relates a set of contact data to an underlying transmission tree and integrated this in the outbreaker2 inference framework. By analyzing simulated outbreaks under various contact tracing scenarios, we demonstrate that contact data significantly improves our ability to reconstruct transmission trees, even under realistic limitations on the coverage of the contact tracing effort and the amount of non-infectious mixing between cases. Indeed, contact data is equally or more informative than fully sampled whole genome sequence data in certain scenarios. We then use our method to analyze the early stages of the 2003 SARS outbreak in Singapore and describe the range of transmission scenarios consistent with contact data and genetic sequence in a probabilistic manner for the first time.
This simple yet flexible model can easily be incorporated into existing tools for outbreak reconstruction and should permit a better integration of genomic and epidemiological data for inferring transmission chains.
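The idea of a probabilistic model linking contact data to a transmission tree can be sketched as a simple likelihood in Python. The parameterisation below is a simplified assumption and should not be read as the outbreaker2 model: `eps` is taken to be the probability that a true transmission pair is reported as a contact, `lam` the probability that a pair not linked by transmission is reported as a contact, and all names are hypothetical.

```python
import math
from itertools import combinations


def contact_log_likelihood(infector_of, contacts, eps, lam):
    """Log-likelihood of reported contacts given a proposed transmission tree.

    infector_of maps each case to its infector (None for an index case).
    contacts is an iterable of reported (case, case) pairs, unordered.
    """
    transmission_pairs = {
        frozenset((case, infector))
        for case, infector in infector_of.items()
        if infector is not None
    }
    reported = {frozenset(pair) for pair in contacts}
    log_lik = 0.0
    for pair in combinations(sorted(infector_of), 2):
        pair = frozenset(pair)
        # Reporting probability depends on whether the tree links the pair.
        p = eps if pair in transmission_pairs else lam
        log_lik += math.log(p) if pair in reported else math.log(1 - p)
    return log_lik


if __name__ == "__main__":
    contacts = [("a", "b"), ("b", "c")]          # hypothetical contact tracing data
    chain = {"a": None, "b": "a", "c": "b"}      # a -> b -> c
    star = {"a": None, "b": "a", "c": "a"}       # a -> b, a -> c
    for name, tree in (("chain", chain), ("star", star)):
        print(name, round(contact_log_likelihood(tree, contacts, 0.8, 0.05), 3))
```

In this toy example the chain, which agrees with both reported contacts, receives a higher log-likelihood than the star. A term of this kind could be added to the genetic and temporal likelihoods when scoring candidate trees.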
We present a global analysis of the spread of recently emerged SARS-CoV-2 variants and estimate changes in effective reproduction numbers at country-specific level using sequence data from GISAID. Nearly all investigated countries demonstrated rapid replacement of previously circulating lineages by the World Health Organization-designated variants of concern, with estimated transmissibility increases of 29% (95% CI: 24-33), 25% (95% CI: 20-30), 38% (95% CI: 29-48) and 97% (95% CI: 76-117), respectively, for B.1.1.7, B.1.351, P.1 and B.1.617.2.
Recent years have seen the development of numerous methodologies for reconstructing transmission trees in infectious disease outbreaks from densely sampled whole genome sequence data. However, a fundamental and as yet poorly addressed limitation of such approaches is the requirement for genetic diversity to arise on epidemiological timescales. Specifically, the position of infected individuals in a transmission tree can only be resolved by genetic data if mutations have accumulated between the sampled pathogen genomes. To quantify and compare the useful genetic diversity expected from genetic data in different pathogen outbreaks, we introduce here the concept of 'transmission divergence', defined as the number of mutations separating whole genome sequences sampled from transmission pairs. Using parameter values obtained by literature review, we simulate outbreak scenarios alongside sequence evolution using two models described in the literature to describe transmission divergence of ten major outbreak-causing pathogens. We find that while mean values vary significantly between the pathogens considered, their transmission divergence is generally very low, with many outbreaks characterised by large numbers of genetically identical transmission pairs. We describe the impact of transmission divergence on our ability to reconstruct outbreaks using two outbreak reconstruction tools, the R packages outbreaker and phybreak, and demonstrate that, in agreement with previous observations, genetic sequence data of rapidly evolving pathogens such as RNA viruses can provide valuable information on individual transmission events. Conversely, sequence data of pathogens with lower mean transmission divergence, including Streptococcus pneumoniae, Shigella sonnei and Clostridium difficile, provide little to no information about individual transmission events.
Our results highlight the informational limitations of genetic sequence data in certain outbreak scenarios, and demonstrate the need to expand the toolkit of outbreak reconstruction tools to integrate other types of epidemiological data.
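Transmission divergence, as defined above, is simply the number of mutations between the genomes sampled from an infector and the case they infected. A minimal Python sketch of that computation follows; it assumes aligned sequences of equal length with no ambiguous bases, and the sequences, tree and function names are hypothetical.

```python
def hamming(seq_a, seq_b):
    """Number of positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def transmission_divergence(sequences, infector_of):
    """Mutations separating each case's genome from its infector's genome.

    sequences maps case id to sequence; infector_of maps case id to
    infector id (None for index cases, which are skipped).
    """
    return [
        hamming(sequences[case], sequences[infector])
        for case, infector in infector_of.items()
        if infector is not None
    ]


if __name__ == "__main__":
    # Toy outbreak, for illustration only.
    sequences = {"a": "ACGTACGT", "b": "ACGTACGT", "c": "ACGTACGA", "d": "ACGAACGA"}
    infector_of = {"a": None, "b": "a", "c": "b", "d": "c"}
    div = transmission_divergence(sequences, infector_of)
    print("divergence per pair:", div)  # [0, 1, 1]
    print("identical pairs:", sum(d == 0 for d in div), "of", len(div))
```

Pairs with a divergence of zero are exactly the cases where sequence data alone cannot distinguish between candidate infectors.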
adephylo is a package for the R software dedicated to the analysis of comparative evolutionary data. Phylogenetic comparative methods initially aimed at accounting for or removing the effects of phylogenetic signal in the analysis of biological traits. However, recent approaches have shown that considerable information can be gathered from the study of the phylogenetic signal. In particular, close examination of phylogenetic structures can unveil interesting evolutionary patterns. For this purpose, we developed the package adephylo that provides tools for quantifying and describing the phylogenetic structures of biological traits. adephylo implements tests of phylogenetic signal, phylogenetic distances and proximities, and novel methods for describing further univariate and multivariate phylogenetic structures. These tools open up new perspectives in the analysis of evolutionary comparative data. Availability: The stable version is available from CRAN: http://cran.r-project.org/web/packages/adephylo/. The development version is hosted by R-Forge: http://r-forge.r-project.org/projects/adephylo/. Both versions can be installed directly from R. adephylo is distributed under the GNU General Public Licence (≥2). Contact: t.jombart@imperial.ac.uk; dray@biomserv.univ-lyon1.fr Supplementary information: Supplementary data are available at Bioinformatics online.
The increasing availability of large genomic data sets as well as the advent of Bayesian phylogenetics facilitates the investigation of phylogenetic incongruence, which can result in the impossibility of representing phylogenetic relationships using a single tree. While sometimes considered as a nuisance, phylogenetic incongruence can also reflect meaningful biological processes as well as relevant statistical uncertainty, both of which can yield valuable insights in evolutionary studies. We introduce a new tool for investigating phylogenetic incongruence through the exploration of phylogenetic tree landscapes. Our approach, implemented in the R package treespace, combines tree metrics and multivariate analysis to provide low‐dimensional representations of the topological variability in a set of trees, which can be used for identifying clusters of similar trees and group‐specific consensus phylogenies. treespace also provides a user‐friendly web interface for interactive data analysis and is integrated alongside existing standards for phylogenetics. It fills a gap in the current phylogenetics toolbox in R and will facilitate the investigation of phylogenetic results.
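The first step of such an approach, computing a tree metric for every pair of trees, can be sketched in Python. The example below uses a rooted Robinson-Foulds distance (the number of clades found in one tree but not the other) purely as a stand-in; the abstract does not say which metrics treespace uses, and the nested-tuple tree encoding and function names are assumptions made for this illustration.

```python
def clades(tree):
    """Return the clades of a rooted tree given as nested tuples of leaf labels.

    Each clade is a frozenset of the leaf labels below one internal node.
    """
    found = set()

    def leaves(node):
        if not isinstance(node, tuple):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    leaves(tree)
    return found


def rf_distance(tree_a, tree_b):
    """Rooted Robinson-Foulds distance: clades present in exactly one tree."""
    return len(clades(tree_a) ^ clades(tree_b))


def distance_matrix(trees):
    """Pairwise distance matrix over a list of trees."""
    return [[rf_distance(a, b) for b in trees] for a in trees]


if __name__ == "__main__":
    trees = [
        ((("A", "B"), "C"), "D"),
        ((("A", "B"), "D"), "C"),
        (("A", "C"), ("B", "D")),
    ]
    for row in distance_matrix(trees):
        print(row)
    # [0, 2, 4]
    # [2, 0, 4]
    # [4, 4, 0]
```

The resulting matrix is the kind of input that an ordination method such as multidimensional scaling would take to place the trees in a low-dimensional space; that step is not shown here.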
Recent years have seen progress in the development of statistically rigorous frameworks to infer outbreak transmission trees ("who infected whom") from epidemiological and genetic data. Making use of pathogen genome sequences in such analyses remains a challenge, however, with a variety of heuristic approaches having been explored to date. We introduce a statistical method exploiting both pathogen sequences and collection dates to unravel the dynamics of densely sampled outbreaks. Our approach identifies likely transmission events and infers dates of infections, unobserved cases and separate introductions of the disease. It also proves useful for inferring numbers of secondary infections and identifying heterogeneous infectivity and super-spreaders. After testing our approach using simulations, we illustrate the method with the analysis of the beginning of the 2003 Singaporean outbreak of Severe Acute Respiratory Syndrome (SARS), providing new insights into the early stage of this epidemic. Our approach is the first tool for disease outbreak reconstruction from genetic data widely available as free software, the R package outbreaker. It is applicable to various densely sampled epidemics, and improves previous approaches by detecting unobserved and imported cases, as well as allowing multiple introductions of the pathogen. Because of its generality, we believe this method will become a tool of choice for the analysis of densely sampled disease outbreaks, and will form a rigorous framework for subsequent methodological developments.
How to measure and test phylogenetic signal. Münkemüller, Tamara; Lavergne, Sébastien; Bzeznik, Bruno ...
Methods in Ecology and Evolution, August 2012, Volume 3, Issue 4
Journal Article
Peer reviewed
Open access
Summary
1. Phylogenetic signal is the tendency of related species to resemble each other more than species drawn at random from the same tree. This pattern is of considerable interest in a range of ecological and evolutionary research areas, and various indices have been proposed for quantifying it. Unfortunately, these indices often lead to contrasting results, and guidelines for choosing the most appropriate index are lacking.
2. Here, we compare the performance of four commonly used indices using simulated data. Data were generated with numerical simulations of trait evolution along phylogenetic trees under a variety of evolutionary models. We investigated the sensitivity of the approaches to the size of phylogenies, the resolution of tree structure and the availability of branch length information, examining both the response of the selected indices and the power of the associated statistical tests.
3. We found that under a Brownian motion (BM) model of trait evolution, Abouheif’s Cmean and Pagel’s λ performed well and substantially better than Moran’s I and Blomberg’s K. Pagel’s λ provided a reliable effect size measure and performed better for discriminating between more complex models of trait evolution, but was computationally more demanding than Abouheif’s Cmean. Blomberg’s K was most suitable to capture the effects of changing evolutionary rates in simulation experiments.
4. Interestingly, sample size influenced not only the uncertainty but also the expected values of most indices, while polytomies and missing branch length information had only negligible impacts.
5. We propose guidelines for choosing among indices, depending on (a) their sensitivity to true underlying patterns of phylogenetic signal, (b) whether a test or a quantitative measure is required and (c) their sensitivities to different topologies of phylogenies.
6. These guidelines aim to better assess phylogenetic signal and distinguish it from random trait distributions. They were developed under the assumption of BM, and additional simulations with more complex trait evolution models show that they are to a certain degree generalizable. They are particularly useful in comparative analyses, when requiring a proxy for niche similarity, and in conservation studies that explore phylogenetic loss associated with extinction risks of specific clades.
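As a concrete example of one of the indices compared above, Moran's I can be computed from trait values and a matrix of phylogenetic proximities between species. The Python sketch below uses the standard Moran's I formula; the choice of proximity matrix (here 1 for sister species and 0 otherwise) and the trait values are assumptions made for illustration, not the settings used in the study.

```python
def morans_i(values, weights):
    """Moran's I for trait values under a species-by-species proximity matrix.

    weights[i][j] is the proximity between species i and j; the diagonal
    is ignored.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    s0 = sum(weights[i][j] for i, j in pairs)
    num = sum(weights[i][j] * dev[i] * dev[j] for i, j in pairs)
    den = sum(d * d for d in dev)
    return (n / s0) * num / den


if __name__ == "__main__":
    # Four species forming two sister pairs: (0, 1) and (2, 3).
    proximity = [
        [0, 1, 0, 0],
        [1, 0, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0],
    ]
    print(morans_i([1, 1, 5, 5], proximity))  # 1.0: sisters share trait values
    print(morans_i([1, 5, 1, 5], proximity))  # -1.0: sisters differ maximally
```

Values near 1 indicate that closely related species have similar trait values, whereas values near or below 0 indicate no such resemblance. A significance test would typically compare the observed value with values obtained after randomly permuting traits across the tips.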