Summary
Background Tuberculosis incidence in the UK has risen in the past decade. Disease control depends on epidemiological data, which can be difficult to obtain. Whole-genome sequencing can detect microevolution within Mycobacterium tuberculosis strains. We aimed to estimate the genetic diversity of related M tuberculosis strains in the UK Midlands and to assess how this measurement might be used to investigate community outbreaks.
Methods In a retrospective observational study, we used Illumina technology to sequence M tuberculosis genomes from an archive of frozen cultures. We characterised isolates into four groups: cross-sectional, longitudinal, household, and community. We measured pairwise nucleotide differences within hosts and between hosts in household outbreaks and estimated the rate of change in DNA sequences. We used the findings to interpret network diagrams constructed from 11 community clusters derived from mycobacterial interspersed repetitive-unit–variable-number tandem-repeat data.
Findings We sequenced 390 separate isolates from 254 patients, including representatives from all five major lineages of M tuberculosis. The estimated rate of change in DNA sequences was 0·5 single nucleotide polymorphisms (SNPs) per genome per year (95% CI 0·3–0·7) in longitudinal isolates from 30 individuals and 25 families. Divergence is rarely higher than five SNPs in 3 years. 109 (96%) of 114 paired isolates from individuals and households differed by five or fewer SNPs. More than five SNPs separated isolates from none of 69 epidemiologically linked patients, two (15%) of 13 possibly linked patients, and 13 (17%) of 75 epidemiologically unlinked patients (three-way comparison exact p<0·0001). Genetic trees and clinical and epidemiological data suggest that super-spreaders were present in two community clusters.
Interpretation Whole-genome sequencing can delineate outbreaks of tuberculosis and allows inference about direction of transmission between cases. The technique could identify super-spreaders and predict the existence of undiagnosed cases, potentially leading to early treatment of infectious patients and their contacts.
Funding Medical Research Council, Wellcome Trust, National Institute for Health Research, and the Health Protection Agency.
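The pairwise comparison described in the Methods can be illustrated with a minimal Python sketch. It assumes the isolates have already been mapped to a common reference and are available as equal-length aligned strings; the isolate names, the toy sequences, and the helper names `snp_distance` and `pairwise_links` are hypothetical and are not taken from the study. Only the five-SNP cut-off comes from the abstract.

```python
from itertools import combinations

# Cut-off reported in the abstract: related isolates rarely differ by more than five SNPs.
SNP_THRESHOLD = 5


def snp_distance(seq_a, seq_b):
    """Count aligned positions where both sequences have an unambiguous, different base."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")  # positions with N, gaps, etc. are skipped (an assumption)
    return sum(
        1 for a, b in zip(seq_a, seq_b)
        if a in valid and b in valid and a != b
    )


def pairwise_links(isolates, threshold=SNP_THRESHOLD):
    """Map each isolate pair to (SNP distance, whether it is within the threshold)."""
    links = {}
    for (id_a, seq_a), (id_b, seq_b) in combinations(sorted(isolates.items()), 2):
        distance = snp_distance(seq_a, seq_b)
        links[(id_a, id_b)] = (distance, distance <= threshold)
    return links


# Toy data for illustration only; real genomes are about 4.4 million bases long.
isolates = {
    "P1": "ACGTACGTACGT",
    "P2": "ACGTACGTACGA",
    "P3": "TGCAACGTTTTT",
}
links = pairwise_links(isolates)
for pair, (distance, linked) in links.items():
    print(pair, distance, "within threshold" if linked else "above threshold")
```

In this toy example only the P1 and P2 pair falls within the threshold. A real pipeline would also need variant calling and quality filtering, which this sketch leaves out.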
Summary
Background Diagnosing drug resistance remains an obstacle to the elimination of tuberculosis. Phenotypic drug-susceptibility testing is slow and expensive, and commercial genotypic assays screen only common resistance-determining mutations. We used whole-genome sequencing to characterise common and rare mutations predicting drug resistance, or consistency with susceptibility, for all first-line and second-line drugs for tuberculosis.
Methods Between Sept 1, 2010, and Dec 1, 2013, we sequenced a training set of 2099 Mycobacterium tuberculosis genomes. For 23 candidate genes identified from the drug-resistance scientific literature, we algorithmically characterised genetic mutations as not conferring resistance (benign), resistance determinants, or uncharacterised. We then assessed the ability of these characterisations to predict phenotypic drug-susceptibility testing for an independent validation set of 1552 genomes. We sought mutations under similar selection pressure to those characterised as resistance determinants outside candidate genes to account for residual phenotypic resistance.
Findings We characterised 120 training-set mutations as resistance determining, and 772 as benign. With these mutations, we could predict 89·2% of the validation-set phenotypes with a mean 92·3% sensitivity (95% CI 90·7–93·7) and 98·4% specificity (98·1–98·7). 10·8% of validation-set phenotypes could not be predicted because uncharacterised mutations were present. In an in-silico comparison, characterised resistance determinants had higher sensitivity than the mutations from three line-probe assays (85·1% vs 81·6%). No additional resistance determinants were identified among mutations under selection pressure in non-candidate genes.
Interpretation A broad catalogue of genetic mutations enables data from whole-genome sequencing to be used clinically to predict drug resistance or drug susceptibility, or to identify drug phenotypes that cannot yet be genetically predicted. This approach could be integrated into routine diagnostic workflows, phasing out phenotypic drug-susceptibility testing while reporting drug resistance early.
Funding Wellcome Trust, National Institute for Health Research, Medical Research Council, and the European Union.
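The three-way classification of mutations and the resulting prediction can be sketched as follows. This is an illustrative Python sketch, not the study's algorithm: the decision order (any resistance determinant gives a resistant call, otherwise any uncharacterised mutation gives no prediction, otherwise susceptible) is an assumption inferred from the abstract, and the mutation identifiers and the toy validation set are hypothetical placeholders.

```python
def predict(mutations, determinants, benign):
    """Return 'R' (resistant), 'S' (susceptible), or 'U' (no prediction)."""
    if any(m in determinants for m in mutations):
        return "R"
    if any(m not in benign for m in mutations):
        return "U"  # an uncharacterised mutation blocks a susceptible call
    return "S"


def evaluate(predictions, phenotypes):
    """Sensitivity and specificity over predicted isolates, plus the fraction predicted."""
    tp = fn = tn = fp = 0
    for pred, truth in zip(predictions, phenotypes):
        if pred == "U":
            continue  # unpredicted isolates are excluded from both measures
        if truth == "R":
            tp += pred == "R"
            fn += pred == "S"
        else:
            tn += pred == "S"
            fp += pred == "R"
    sensitivity = tp / (tp + fn) if tp + fn else None
    specificity = tn / (tn + fp) if tn + fp else None
    predicted = (tp + fn + tn + fp) / len(predictions) if predictions else None
    return sensitivity, specificity, predicted


# Hypothetical catalogue and validation set for one drug.
determinants = {"det_1", "det_2"}
benign = {"ben_1", "ben_2"}
validation = [
    (["det_1"], "R"),
    (["ben_1"], "S"),
    ([], "S"),
    (["ben_1", "unk_1"], "R"),
    (["ben_2"], "R"),
    (["det_2", "unk_1"], "R"),
    (["det_1"], "S"),
]
predictions = [predict(muts, determinants, benign) for muts, _ in validation]
phenotypes = [truth for _, truth in validation]
sensitivity, specificity, predicted = evaluate(predictions, phenotypes)
print(predictions, sensitivity, specificity, predicted)
```

Keeping "no prediction" as a separate outcome mirrors the abstract's reporting, in which 10·8% of validation-set phenotypes were left unpredicted rather than being counted as susceptible.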
The genomic relationships among Enterococcus faecium isolates are the subject of ongoing research that seeks to clarify the origins of observed lineages and the extent of horizontal gene transfer between them, and to robustly identify links between genotypes and phenotypes. E faecium is considered to form two distinct groups, A and B, corresponding to isolates derived from patients who were hospitalised (A) and isolates from humans in the community (B). The additional separation of A into the so-called clades A1 and A2 remains an area of uncertainty. We aimed to investigate the relationships between A1 and non-A1 groups and explore the potential role of non-A1 isolates in shaping the population structure of hospital E faecium.
We collected short-read sequence data from invited groups that had previously published E faecium genome data. This hospital-based isolate collection could be separated into three groups (or clades, A1, A2, and B) by augmenting the study genomes with published sequences derived from human samples representing the previously defined genomic clusters. We performed phylogenetic analyses, by constructing maximum-likelihood phylogenetic trees, and identified historical recombination events. We assessed the pan-genome, did resistome analysis, and examined the genomic data to identify mobile genetic elements. Each genome underwent chromosome painting by use of ChromoPainter within FineSTRUCTURE software to assess ancestry and identify hybrid groups. We further assessed highly admixed regions to infer recombination directionality.
We assembled a collection of 1095 hospital E faecium sequences from 34 countries, further augmented by 33 published sequences. 997 (88%) of 1128 genomes clustered as A1, 92 (8%) as A2, and 39 (4%) as B. We showed that A1 probably emerged as a clone from within A2 and that, because of ongoing gene flow, hospital isolates currently identified as A2 represent a genetic continuum between A1 and community E faecium. This interchange of genetic material between isolates from different groups results in the emergence of hybrid genomes between clusters. Of the 1128 genomes, 49 (4%) hybrid genomes were identified: 33 previously labelled as A2 and 16 previously labelled as A1. These interactions were fuelled by a directional pattern of recombination mediated by mobile genetic elements. By contrast, the contribution of B group genetic material to A1 was limited to a few small regions of the genome and appeared to be driven by genomic sweep events.
A2 and B isolates coming into the hospital form an important reservoir for ongoing A1 adaptation, suggesting that effective long-term control of the effect of E faecium could benefit from strategies to reduce these genomic interactions, such as a focus on reducing the acquisition of hospital A1 strains by patients entering the hospital.
Wellcome Trust.
Abstract
Background Epidemiological investigations into Mycobacterium tuberculosis outbreaks use 24-locus genotyping (MIRU-VNTR typing). Where no epidemiological link can be found between patients, the importance of shared genotypes remains unclear. This issue is especially problematic and time-consuming when tracing contacts within some social groups at high tuberculosis risk, in which unwillingness to volunteer information is common. We investigated whether whole-genome sequencing (WGS) could delineate outbreaks with greater resolution than MIRU-VNTR typing has done.
Methods We sequenced 390 M tuberculosis isolates from 254 patients from the UK Midlands (1994–2011) using Illumina technology (appendix). We estimated the expected genomic diversity between isolates within a transmission chain by measuring pairwise nucleotide differences between genomes within hosts (79 individuals with pulmonary and extrapulmonary disease, or multiple pulmonary episodes) and between hosts within 25 household outbreaks (63 individuals). We then investigated 11 MIRU-VNTR-based community clusters (168 patients, 157 transmission events) to assess whether WGS could delineate outbreaks more effectively. For each cluster we reconstructed the most plausible transmission chain based on epidemiological data collected by tuberculosis nurses, pairwise nucleotide distances, and times of diagnosis, and compared the genomic diversity across these constructed links with that within individuals and within household outbreaks.
Findings 109 (96%) of 114 isolates were within five SNPs of another isolate taken from the same individual or from an individual in the same household outbreak. On the basis of longitudinal isolates from individuals or households, we estimated an evolutionary rate of 0·5 SNPs per genome per year, consistent with a maximum of five SNPs between related isolates 3 years or less apart. Using a threshold of more than five SNPs to assess 11 MIRU-VNTR-based community clusters, we found that none of 69 epidemiologically related pairs of MIRU-VNTR-matched cases plausibly related by transmission, two of 13 possibly related pairs, and 13 of 75 pairs with no known epidemiological relation were separated by more than five SNPs (p<0·0001). Seven MIRU-VNTR-matched pairs with no epidemiological relation had more than 30 SNPs, five of seven belonging to the same immigrant community cluster. WGS also showed that 62 of 75 MIRU-VNTR-matched pairs for which no epidemiological relation had been identified from contact tracing were highly likely to indicate transmission: in one substance misuse cluster, 38 individuals were linked by five or fewer SNPs without a single epidemiological link having been established previously. Further analysis suggested that microevolutionary divergence of lineages within outbreaks could signal possible super-spreaders, corroborated by clinical and epidemiological data in two clusters.
Interpretation WGS can delineate tuberculosis outbreaks with greater resolution than has previously been possible. These findings offer public health teams the potential to limit outbreak investigations to patients who are likely to be linked by recent transmission, irrespective of whether it has been possible to identify epidemiological links, and to save resources where they are not, even in the context of matched MIRU-VNTR genotypes. Uniquely, WGS also provides information about the genetic structure of outbreak clusters, thereby providing the potential to direct public health resources towards individuals most likely to have infected the largest number of secondary cases. As a consequence, the Health Protection Agency is considering introduction of WGS technology for routine tuberculosis public health practice in England.
Funding NIHR Oxford Biomedical Research Centre and the UKCRC Modernising Medical Microbiology Consortium (UKCRC Translational Infection Research Initiative supported by MRC, Biotechnology and Biological Sciences Research Council, and NIHR on behalf of the Department of Health grant G0800778 and the Wellcome Trust 087646/Z/08/Z).
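The relation between the estimated rate of 0·5 SNPs per genome per year and the five-SNP limit over 3 years can be checked with a short Python calculation. The sketch below assumes that SNPs accumulate along a single lineage as a Poisson process at the point-estimate rate; that model is a common simplification and an assumption made here, not a method stated in the abstract, and the function names are hypothetical.

```python
import math

# Point estimate from the abstract (SNPs per genome per year).
RATE_PER_YEAR = 0.5


def poisson_cdf(k, mean):
    """P(X <= k) for a Poisson-distributed count X with the given mean."""
    return sum(
        math.exp(-mean) * mean ** i / math.factorial(i)
        for i in range(k + 1)
    )


def prob_within_threshold(years, threshold=5, rate=RATE_PER_YEAR):
    """Probability of accumulating at most `threshold` SNPs over `years` (Poisson assumption)."""
    return poisson_cdf(threshold, rate * years)


# Expected SNPs over 3 years and the chance of staying at or below five.
expected_snps = RATE_PER_YEAR * 3
p_three_years = prob_within_threshold(3)
print(expected_snps, round(p_three_years, 4))
```

Under these assumptions the expected divergence over 3 years is 1·5 SNPs, and more than five SNPs would be uncommon, which agrees qualitatively with the abstract's statement. The calculation ignores uncertainty in the rate estimate and divergence along both branches of a transmission pair.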