is a gut commensal of humans and animals but is also listed on the WHO global priority list of multidrug-resistant pathogens. Many of its antibiotic resistance traits reside on plasmids and have the potential to be disseminated by horizontal gene transfer. Here, we present the first comprehensive population-wide analysis of the pan-plasmidome of a clinically important bacterium, by whole-genome sequence analysis of 1,644 isolates from hospital, commensal, and animal sources of
Long-read sequencing on a selection of isolates resulted in the completion of 305 plasmids that exhibited high levels of sequence modularity. We further investigated the entirety of all plasmids of each isolate (plasmidome) using a combination of short-read sequencing and machine-learning classifiers. Clustering of the plasmid sequences unraveled different
populations with a clear association with hospitalized patient isolates, suggesting different optimal configurations of plasmids in the hospital environment. The characterization of these populations allowed us to identify common mechanisms of plasmid stabilization such as toxin-antitoxin systems and genes exclusively present in particular plasmidome populations exemplified by copper resistance, phosphotransferase systems, or bacteriocin genes potentially involved in niche adaptation. Based on the distribution of k-mer distances between isolates, we concluded that plasmidomes rather than chromosomes are most informative for source specificity of
is one of the most frequent nosocomial pathogens of hospital-acquired infections.
has gained resistance against most commonly available antibiotics, most notably, against ampicillin, gentamicin, and vancomycin, which renders infections difficult to treat. Many antibiotic resistance traits, in particular, vancomycin resistance, can be encoded in autonomous and extrachromosomal elements called plasmids. These sequences can be disseminated to other isolates by horizontal gene transfer and confer novel mechanisms to source specificity. In our study, we elucidated the total plasmid content, referred to as the plasmidome, of 1,644
isolates by using short- and long-read whole-genome technologies with the combination of a machine-learning classifier. This was fundamental to investigate the full collection of plasmid sequences present in our collection (pan-plasmidome) and to observe the potential transfer of plasmid sequences between
hosts. We observed that
isolates from hospitalized patients carried a larger number of plasmid sequences than isolates from other sources and revealed different configurations of plasmidome populations in the hospital environment. We assessed the contribution of different genomic components and observed that plasmid sequences have the highest contribution to source specificity. Our study suggests that
plasmids are regulated by complex ecological constraints rather than physical interaction between hosts.
Monte Carlo methods represent the de facto standard for approximating complicated integrals involving multidimensional target distributions. In order to generate random realizations from the target distribution, Monte Carlo techniques use simpler proposal probability densities to draw candidate samples. The performance of any such method is strictly related to the specification of the proposal distribution, such that unfortunate choices easily wreak havoc on the resulting estimators. In this work, we introduce a layered (i.e., hierarchical) procedure to generate samples employed within a Monte Carlo scheme. This approach ensures that an appropriate equivalent proposal density is always obtained automatically (thus eliminating the risk of a catastrophic performance), although at the expense of a moderate increase in the complexity. Furthermore, we provide a general unified importance sampling (IS) framework, where multiple proposal densities are employed and several IS schemes are introduced by applying the so-called deterministic mixture approach. Finally, given these schemes, we also propose a novel class of adaptive importance samplers using a population of proposals, where the adaptation is driven by independent parallel or interacting Markov chain Monte Carlo (MCMC) chains. The resulting algorithms efficiently combine the benefits of both IS and MCMC methods.
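The deterministic mixture idea mentioned above can be illustrated with a minimal sketch. It is not the authors' layered or adaptive algorithm: the one-dimensional target, the fixed Gaussian proposals, and the function names are all assumptions chosen for illustration. Each sample is drawn from one proposal but weighted against the equal-weight mixture of all proposals.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a univariate Gaussian."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def target(x):
    """Hypothetical bimodal target (already normalized; true mean is 1.0)."""
    return 0.5 * normal_pdf(x, -2.0, 1.0) + 0.5 * normal_pdf(x, 4.0, 1.0)


def dm_importance_sampling(means, std, samples_per_proposal, rng):
    """Multiple importance sampling with deterministic mixture weights.

    Every sample x drawn from proposal n receives the weight
    target(x) / ((1/N) * sum_j q_j(x)), i.e. the whole set of proposals
    is treated as one equal-weight mixture in the denominator.
    """
    n = len(means)
    xs, ws = [], []
    for mu in means:
        for _ in range(samples_per_proposal):
            x = rng.gauss(mu, std)
            mixture = sum(normal_pdf(x, m, std) for m in means) / n
            xs.append(x)
            ws.append(target(x) / mixture)
    total = sum(ws)
    mean_est = sum(w * x for w, x in zip(ws, xs)) / total  # self-normalized
    z_est = total / len(ws)  # estimate of the normalizing constant
    return mean_est, z_est


rng = random.Random(1)
mean_est, z_est = dm_importance_sampling([-4.0, 0.0, 3.0, 6.0], 2.0, 4000, rng)
print(mean_est, z_est)
```

Because the mixture covers both modes, the weights stay bounded even though some individual proposals sit far from the target mass, which is the robustness property the abstract attributes to mixture-based weighting.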
Enterotoxigenic Escherichia coli (ETEC), a major cause of infectious diarrhea, produce heat-stable and/or heat-labile enterotoxins and at least 25 different colonization factors that target the intestinal mucosa. The genes encoding the enterotoxins and most of the colonization factors are located on plasmids found across diverse E. coli serogroups. Whole-genome sequencing of a representative collection of ETEC isolated between 1980 and 2011 identified globally distributed lineages characterized by distinct colonization factor and enterotoxin profiles. Contrary to current notions, these relatively recently emerged lineages might harbor chromosome and plasmid combinations that optimize fitness and transmissibility. These data have implications for understanding, tracking and possibly preventing ETEC disease.
Retrotransposon segments were characterized and inter-retrotransposon amplified polymorphism (IRAP) markers developed for cultivated flax (Linum usitatissimum L.) and the Linum genus. Over 75 distinct long terminal repeat retrotransposon segments were cloned, the first set for Linum, and specific primers designed for them. IRAP was then used to evaluate genetic diversity among 708 accessions of cultivated flax comprising 143 landraces, 387 varieties, and 178 breeding lines. These included both traditional and modern, oil (86), fiber (351), and combined-use (271) accessions, originating from 36 countries, and 10 wild Linum species. The set of 10 most polymorphic primers yielded 141 reproducible informative data points per accession, with 52% polymorphism and a 0.34 Shannon diversity index. The maximal genetic diversity was detected among wild Linum species (100% IRAP polymorphism and 0.57 Jaccard similarity), while diversity within cultivated germplasm decreased from landraces (58%, 0.63) to breeding lines (48%, 0.85) and cultivars (50%, 0.81). Application of Bayesian methods for clustering resulted in the robust identification of 20 clusters of accessions, which were unstratified according to origin or user type. This indicates an overlap in genetic diversity despite disruptive selection for fiber versus oil types. Nevertheless, eight clusters contained high proportions (70-100%) of commercial cultivars, whereas two clusters were rich (60%) in landraces. These findings provide a basis for better flax germplasm management, core collection establishment, and exploration of diversity in breeding, as well as for exploration of the role of retrotransposons in flax genome dynamics.
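The three summary statistics quoted above (percent polymorphism, Shannon diversity index, Jaccard similarity) can be sketched for presence/absence band data as follows. The band profiles are made-up toy values, and the per-locus Shannon formula used here is one common convention for dominant markers; the abstract does not state which variant the authors applied.

```python
import math


def jaccard_similarity(a, b):
    """Shared bands divided by bands present in at least one profile."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 1.0


def polymorphic_fraction(profiles):
    """Fraction of loci at which the band is neither always present nor always absent."""
    n_loci = len(profiles[0])
    polymorphic = sum(
        1 for j in range(n_loci) if len({p[j] for p in profiles}) > 1
    )
    return polymorphic / n_loci


def shannon_index(profiles):
    """Mean over loci of -(p ln p + (1 - p) ln(1 - p)), p = band frequency."""
    n_loci = len(profiles[0])
    total = 0.0
    for j in range(n_loci):
        p = sum(prof[j] for prof in profiles) / len(profiles)
        for freq in (p, 1.0 - p):
            if freq > 0.0:
                total -= freq * math.log(freq)
    return total / n_loci


# Hypothetical presence (1) / absence (0) band profiles for three accessions.
profiles = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 0],
]
print(polymorphic_fraction(profiles))
print(shannon_index(profiles))
print(jaccard_similarity(profiles[0], profiles[1]))
```

Note that a higher Jaccard similarity means lower diversity, which is why the wild species combine 100% polymorphism with the lowest similarity value in the abstract.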
It is known that directed acyclic graphs (DAGs) may hide several local features of the joint probability distribution that can be essential for some applications. To remedy this, more expressive model classes have been introduced. In addition to the restrictions implied by conditional independence, these model classes typically include some form of local structure that implies equality constraints on the node-wise conditional distribution. In particular, the concept of context-specific independence (CSI) was introduced to increase the flexibility of traditional Bayesian networks. Furthermore, in the most expressive class of generalized Bayesian networks, decision graphs were used to model arbitrary parameter restrictions. Here we formulate an alternative representation of such models called a partition DAG (PDAG), which defines the parameter restrictions using a partition-based representation of the parent outcome spaces. We establish a criterion that can identify whether an arbitrary PDAG has a CSI-consistent representation using an efficient basic graph theoretic algorithm. Based on a recursive inference algorithm for partition posteriors, an exact Bayesian learning method is introduced. We demonstrate on real data that exact learning of PDAGs can identify important relationships between variables that have not been discovered by previous graphical model learning methods.
The Glanville fritillary butterfly (Melitaea cinxia) has been studied in the Åland Islands in Finland since 1991, where it occurs as a classic metapopulation in a large network of 4000 dry meadows. Much ecological work has been conducted on this species, but population genetic studies have been hampered by paucity of suitable genetic markers. Here, using single nucleotide polymorphisms and microsatellites developed for the Glanville fritillary, we examine the correspondence between the demographic and genetic spatial structures. Given the dynamic nature of the metapopulation, the current genetic spatial structure may bear a signal of past changes in population sizes and past patterns of gene flow rather than reflect the current demographic structure or landscape structure. We analyse this question with demographic data for 10 years, using the Rand index to assess the similarity between the genetic, demographic, and landscape spatial structures. Our results show that the current genetic spatial structure is better explained by the past rather than by the current demographic spatial structure or by the spatial configuration of the habitat in the landscape. Furthermore, current genetic diversity is significantly explained by past metapopulation sizes. The time lag between major demographic events and change in the genetic spatial structure and diversity has implications for the study of spatial dynamics.
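The Rand index used above compares two groupings of the same units by counting the pairs on which they agree. A minimal sketch follows; the cluster labels are invented for illustration and do not come from the study.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of unit pairs on which two partitions agree.

    A pair counts as an agreement when both partitions place the two
    units in the same cluster, or both place them in different clusters.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("partitions must label the same units")
    if len(labels_a) < 2:
        raise ValueError("need at least two units to form a pair")
    agreements = 0
    pairs = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        agreements += same_a == same_b
        pairs += 1
    return agreements / pairs


# Hypothetical cluster assignments of six habitat patches.
genetic = [0, 0, 1, 1, 2, 2]
demographic = [0, 0, 1, 1, 1, 2]
print(rand_index(genetic, demographic))  # 12 of 15 pairs agree -> 0.8
```

A value of 1 means the two spatial structures are identical; the index depends only on co-membership, so the cluster labels themselves need not match.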
The prevalence of ampicillin- and/or vancomycin-resistant Enterococcus faecium (AREf and VREf) has increased in hospitalized patients in the Netherlands.
To quantify the prevalence, risk factors and co-carriage of AREf and VREf in humans, cats and dogs in the Dutch population.
From 2014 to 2015, ∼2000 inhabitants of the Netherlands each month were randomly invited to complete a questionnaire and provide a faecal sample. Subjects owning pets were also asked to submit one dog or cat sample. Faecal samples were screened for AREf and VREf. The genetic relatedness of isolates was determined using core genome MLST. Logistic regression analysis was used to determine risk factors.
Of 25 365 subjects, 4721 (18.6%) completed the questionnaire and 1992 (42.2%) human, 277 dog and 118 cat samples were submitted. AREf was detected in 29 human (1.5%), 71 dog (25.6%) and 6 cat (5.1%) samples. VREf (vanA) was detected in one human and one dog. AREf/VREf co-carriage was not detected in 388 paired samples. The use of antibiotics (OR 4.2, 95% CI 1.7-11.2) and proton pump inhibitors (OR 2.7, 95% CI 1.1-6.3) were risk factors for AREf carriage in humans. In dogs, these were the use of antibiotics (OR 2.3, 95% CI 1.1-4.6) and eating raw meat (OR 3.2, 95% CI 1.4-6.6). Core genome MLST-based phylogenetic linkage indicated clonal relatedness for a minority of human (16.7%) and pet AREf isolates (23.8%) in three clusters.
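The percentages reported above follow directly from the stated counts. The short check below recomputes them; the dictionary keys are descriptive labels added here, while every count and percentage is taken from the text.

```python
def percentage(count, total):
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100.0 * count / total, 1)


# (numerator, denominator, percentage stated in the abstract)
reported = {
    "questionnaires completed": (4721, 25365, 18.6),
    "human samples submitted": (1992, 4721, 42.2),
    "AREf in human samples": (29, 1992, 1.5),
    "AREf in dog samples": (71, 277, 25.6),
    "AREf in cat samples": (6, 118, 5.1),
}

for label, (count, total, stated) in reported.items():
    print(f"{label}: computed {percentage(count, total)}%, stated {stated}%")
```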
Intestinal carriage with AREf or VREf is rare in the Dutch general population. Although AREf carriage is high in dogs, phylogenetic linkage between human and pet AREf isolates was limited.
The control of the human body sway by the central nervous system, muscles, and conscious brain is of interest since body sway carries information about the physiological status of a person. Several models have been proposed to describe body sway in an upright standing position; however, due to the statistical intractability of the more realistic models, no formal parameter inference has previously been conducted and the expressive power of such models for real human subjects remains unknown. Using the latest advances in Bayesian statistical inference for intractable models, we fitted a nonlinear control model to posturographic measurements, and we showed that it can accurately predict the sway characteristics of both simulated and real subjects. Our method provides a full statistical characterization of the uncertainty related to all model parameters as quantified by posterior probability density functions, which is useful for comparisons across subjects and test settings. The ability to infer intractable control models from sensor data opens new possibilities for monitoring and predicting body status in health applications.
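The abstract does not name the inference algorithm, so the following is only a generic illustration of likelihood-free inference: plain rejection approximate Bayesian computation (ABC) on a toy Gaussian simulator. The simulator, the uniform prior, the summary statistic, and the tolerance are all assumptions and have nothing to do with the actual sway model.

```python
import random
import statistics


def simulate(theta, n_obs, rng):
    """Toy simulator standing in for an intractable model: Gaussian noise around theta."""
    return [rng.gauss(theta, 1.0) for _ in range(n_obs)]


def abc_rejection(observed_summary, n_draws, n_obs, tolerance, rng):
    """Keep prior draws whose simulated summary lands close to the observed one."""
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)  # assumed uniform prior
        summary = statistics.fmean(simulate(theta, n_obs, rng))
        if abs(summary - observed_summary) < tolerance:
            accepted.append(theta)
    return accepted


rng = random.Random(0)
observed = simulate(2.0, 30, rng)  # synthetic "measurement", true theta = 2.0
s_obs = statistics.fmean(observed)
posterior = abc_rejection(s_obs, 5000, 30, 0.2, rng)
print(len(posterior), statistics.fmean(posterior), statistics.stdev(posterior))
```

The accepted draws approximate the posterior without ever evaluating a likelihood, and their spread gives the kind of per-parameter uncertainty summary the abstract describes. Rejection sampling is the simplest variant; more sample-efficient schemes exist for expensive simulators.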
Models on how bacterial lineages differentiate increase our understanding of early bacterial speciation events and the genetic loci involved. Here, we analyze the population genomics events leading to the emergence of the tuberculosis pathogen. The emergence is characterized by a combination of recombination events involving core pathogenesis functions and purifying selection on early diverging loci. We identify the
gene, the sensor kinase of a two-component system involved in virulence, as a key functional player subject to pervasive positive selection after the divergence of the
complex from its ancestor. Previous evidence showed that
mutations played a central role in the adaptation of the pathogen to different host species. Now, we show that
mutations have been under selection during the early spread of human tuberculosis, during later expansions, and in ongoing transmission events. Our results show that linking pathogen evolution across evolutionary and epidemiological time scales points to past and present virulence determinants.