Sequencing of target-enriched libraries is an efficient and cost-effective method for obtaining DNA sequence data from hundreds of nuclear loci for phylogeny reconstruction. Much of the cost of developing targeted sequencing approaches is associated with the generation of preliminary data needed for the identification of orthologous loci for probe design. In plants, identifying orthologous loci has proven difficult due to a large number of whole-genome duplication events, especially in the angiosperms (flowering plants). We used multiple sequence alignments from over 600 angiosperms for 353 putatively single-copy protein-coding genes identified by the One Thousand Plant Transcriptomes Initiative to design a set of targeted sequencing probes for phylogenetic studies of any angiosperm group. To maximize the phylogenetic potential of the probes, while minimizing the cost of production, we introduce a k-medoids clustering approach to identify the minimum number of sequences necessary to represent each coding sequence in the final probe set. Using this method, 5–15 representative sequences were selected per orthologous locus, representing the sequence diversity of angiosperms more efficiently than if probes were designed using available sequenced genomes alone. To test our approximately 80,000 probes, we hybridized libraries from 42 species spanning all higher-order groups of angiosperms, with a focus on taxa not present in the sequence alignments used to design the probes. Out of a possible 353 coding sequences, we recovered an average of 283 per species and at least 100 in all species. Differences among taxa in sequence recovery could not be explained by relatedness to the representative taxa selected for probe design, suggesting that there is no phylogenetic bias in the probe set.
Our probe set, which targeted 260 kbp of coding sequence, achieved a median recovery of 137 kbp per taxon in coding regions, a maximum recovery of 250 kbp, and an additional median of 212 kbp per taxon in flanking non-coding regions across all species. These results suggest that the Angiosperms353 probe set described here is effective for any group of flowering plants and would be useful for phylogenetic studies from the species level to higher-order groups, including the entire angiosperm clade itself.
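The idea of choosing a few representative sequences per locus by k-medoids clustering can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' pipeline: the distance is a simple p-distance over aligned sites, the clustering is a basic alternating k-medoids with a deterministic start, and the function names and toy alignment are hypothetical.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences (gap sites skipped)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 1.0
    return sum(x != y for x, y in pairs) / len(pairs)


def k_medoids(dist, k, max_iter=100):
    """Alternating k-medoids on a precomputed distance matrix.

    Starts from the first k items (an arbitrary, assumed initialisation) and
    assumes the input sequences are distinct. Returns sorted medoid indices.
    """
    n = len(dist)
    medoids = list(range(k))
    for _ in range(max_iter):
        # Assignment step: each item joins its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i in range(n):
            clusters[min(medoids, key=lambda m: dist[i][m])].append(i)
        # Update step: the new medoid minimises total distance within its cluster.
        updated = sorted(
            min(members, key=lambda c: sum(dist[c][j] for j in members)) if members else m
            for m, members in clusters.items()
        )
        if updated == medoids:
            break
        medoids = updated
    return medoids


def pick_representatives(alignment, k):
    """Return the names of k medoid sequences from a {name: aligned_sequence} dict."""
    names = list(alignment)
    dist = [[p_distance(alignment[a], alignment[b]) for b in names] for a in names]
    return [names[m] for m in k_medoids(dist, k)]


# Toy alignment (invented data): two clearly separated groups of three sequences.
toy_alignment = {
    "g1a": "AAAAAAAAAA", "g1b": "AAAAAAAAAC", "g1c": "AAAAAAAACC",
    "g2a": "GGGGGGGGGG", "g2b": "GGGGGGGGGT", "g2c": "GGGGGGGGTT",
}
representatives = pick_representatives(toy_alignment, k=2)
print(representatives)
```

Because a medoid is always one of the input sequences, each representative is a real sequence that probes could be designed from, which is the property that makes k-medoids a natural fit here compared with a centroid-based method.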
We report metrics from complete genome capture of nuclear DNA from extinct mammoths using biotinylated RNAs transcribed from an Asian elephant DNA extract. Enrichment of the nuclear genome ranged from 1.06- to 18.65-fold, to an apparent maximum threshold of ∼80% on-target. This projects an order of magnitude less costly complete genome sequencing from long-dead organisms, even when a reference genome is unavailable for bait design.
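One common way to express fold enrichment is the ratio of the on-target read fraction after capture to the fraction before capture. The sketch below assumes that definition, which the abstract does not spell out; the read counts are hypothetical and chosen only to illustrate the arithmetic, and the function names are not from the paper.

```python
def on_target_fraction(on_target_reads, total_reads):
    """Fraction of reads that map to the target genome."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return on_target_reads / total_reads


def fold_enrichment(pre_on, pre_total, post_on, post_total):
    """On-target fraction after capture divided by the fraction before capture."""
    pre = on_target_fraction(pre_on, pre_total)
    if pre == 0:
        raise ValueError("no on-target reads before capture; fold change is undefined")
    return on_target_fraction(post_on, post_total) / pre


def max_fold_given_ceiling(pre_on, pre_total, ceiling=0.80):
    """Largest fold enrichment attainable if the on-target fraction saturates at `ceiling`."""
    return ceiling / on_target_fraction(pre_on, pre_total)


# Hypothetical library: 4% on-target before capture, 74.6% after.
fold = fold_enrichment(40_000, 1_000_000, 746_000, 1_000_000)
limit = max_fold_given_ceiling(40_000, 1_000_000)
print(round(fold, 2), round(limit, 2))
```

Under this definition, a ceiling near 80% on-target means that libraries with a high starting endogenous fraction can show only modest fold enrichment, while libraries with a low starting fraction have more room to improve.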
Smallpox holds a unique position in the history of medicine. It was the first disease for which a vaccine was developed and remains the only human disease eradicated by vaccination. Although there have been claims of smallpox in Egypt, India, and China dating back millennia [1–4], the timescale of emergence of the causative agent, variola virus (VARV), and how it evolved in the context of increasingly widespread immunization, have proven controversial [4–9]. In particular, some molecular-clock-based studies have suggested that key events in VARV evolution only occurred during the last two centuries [4–6] and hence in apparent conflict with anecdotal historical reports, although it is difficult to distinguish smallpox from other pustular rashes by description alone. To address these issues, we captured, sequenced, and reconstructed a draft genome of an ancient strain of VARV, sampled from a Lithuanian child mummy dating between 1643 and 1665 and close to the time of several documented European epidemics [1, 2, 10]. When compared to vaccinia virus, this archival strain contained the same pattern of gene degradation as 20th century VARVs, indicating that such loss of gene function had occurred before ca. 1650. Strikingly, the mummy sequence fell basal to all currently sequenced strains of VARV on phylogenetic trees. Molecular-clock analyses revealed a strong clock-like structure and that the timescale of smallpox evolution is more recent than often supposed, with the diversification of major viral lineages only occurring within the 18th and 19th centuries, concomitant with the development of modern vaccination.
• Variola virus genome was reconstructed from a 17th century mummified child
• The archival strain is basal to all 20th century strains, with same gene degradation
• Molecular-clock analyses show that much of variola virus evolution occurred recently
Using ancient DNA sequences of variola virus recovered from the mummified remains of a 17th century child, Duggan et al. reconstruct the evolutionary history of smallpox. With the ancient strain, the genetic diversification of the smallpox virus is found to be more recent than previously supposed and concurrent with the onset of widespread vaccination.
Summary
Background Yersinia pestis has caused at least three human plague pandemics. The second (Black Death, 14–17th centuries) and third (19–20th centuries) have been genetically characterised, but there is only a limited understanding of the first pandemic, the Plague of Justinian (6–8th centuries). To address this gap, we sequenced and analysed draft genomes of Y pestis obtained from two individuals who died in the first pandemic.
Methods Teeth were removed from two individuals (known as A120 and A76) from the early medieval Aschheim-Bajuwarenring cemetery (Aschheim, Bavaria, Germany). We isolated DNA from the teeth using a modified phenol-chloroform method. We screened DNA extracts for the presence of the Y pestis-specific pla gene on the pPCP1 plasmid using primers and standards from an established assay, enriched the DNA, and then sequenced it. We reconstructed draft genomes of the infectious Y pestis strains, compared them with a database of genomes from 131 Y pestis strains from the second and third pandemics, and constructed a maximum likelihood phylogenetic tree.
Findings Radiocarbon dating of both individuals (A120 to 533 AD plus or minus 98 years; A76 to 504 AD plus or minus 61 years) places them in the timeframe of the first pandemic. Our phylogeny contains a novel branch (100% bootstrap at all relevant nodes) leading to the two Justinian samples. This branch has no known contemporary representatives, and thus is either extinct or unsampled in wild rodent reservoirs. The Justinian branch is interleaved between two extant groups, 0.ANT1 and 0.ANT2, and is distant from strains associated with the second and third pandemics.
Interpretation We conclude that the Y pestis lineages that caused the Plague of Justinian and the Black Death 800 years later were independent emergences from rodents into human beings. These results show that rodent species worldwide represent important reservoirs for the repeated emergence of diverse lineages of Y pestis into human populations.
Funding McMaster University, Northern Arizona University, Social Sciences and Humanities Research Council of Canada, Canada Research Chairs Program, US Department of Homeland Security, US National Institutes of Health, Australian National Health and Medical Research Council.
In the 19th century, there were several major cholera pandemics in the Indian subcontinent, Europe, and North America. The causes of these outbreaks and the genomic strain identities remain a mystery. We used targeted high-throughput sequencing to reconstruct the Vibrio cholerae genome from the preserved intestine of a victim of the 1849 cholera outbreak in Philadelphia, part of the second cholera pandemic. This O1 biotype strain has 95 to 97% similarity with the classical O395 genome, differing by 203 single-nucleotide polymorphisms (SNPs), lacking three genomic islands, and probably having one or more tandem cholera toxin prophage (CTX) arrays, which potentially affected its virulence. This result highlights archived medical remains as a potential resource for investigations into the genomic origins of past pandemics.
Archival formalin-fixed paraffin-embedded (FFPE) human tissue collections are typically in poor states of storage across the developing world. With advances in biomolecular techniques, these extraordinary and virtually untapped resources have become an essential part of retrospective epidemiological studies. To successfully use such tissues in genomic studies, scientists require high nucleic acid yields and purity. In spite of the increasing number of FFPE tissue kits available, few studies have analyzed their applicability in recovering high-quality nucleic acids from archived human autopsy samples. Here we provide a study involving 10 major extraction methods used to isolate total nucleic acid from FFPE tissues ranging in age from 3 to 13 years. Although all 10 methods recovered quantifiable amounts of DNA, only 6 recovered quantifiable RNA, varying considerably and generally yielding lower DNA concentrations. Overall, we show quantitatively that TrimGen’s WaxFree method and our in-house phenol–chloroform extraction method recovered the highest yields of amplifiable DNA, with considerable polymerase chain reaction (PCR) inhibition, whereas Ambion’s RecoverAll method recovered the most amplifiable RNA.
Environmental microbial diversity is often investigated from a molecular perspective using 16S ribosomal RNA (rRNA) gene amplicons and shotgun metagenomics. While amplicon methods are fast, low-cost, and have curated reference databases, they can suffer from amplification bias and are limited in genomic scope. In contrast, shotgun metagenomic methods sample more genomic regions with fewer sequence acquisition biases, but are much more expensive (even with moderate sequencing depth) and computationally challenging. Here, we develop a set of 16S rRNA sequence capture baits that offer a potential middle ground with the advantages from both approaches for investigating microbial communities. These baits cover the diversity of all 16S rRNA sequences available in the Greengenes (v. 13.5) database, with no sequence having <78% sequence identity to at least one bait for all segments of 16S. The use of our baits provides comparable results to 16S amplicon libraries and shotgun metagenomic libraries when assigning taxonomic units from 16S sequences within the metagenomic reads. We demonstrate that 16S rRNA capture baits can be used on a range of microbial samples (i.e., mock communities and rodent fecal samples) to increase the proportion of 16S rRNA sequences (average > 400-fold) and decrease analysis time to obtain consistent community assessments. Furthermore, our study reveals that bioinformatic methods used to analyze sequencing data may have a greater influence on estimates of community composition than library preparation method used, likely due in part to the extent and curation of the reference databases considered. Thus, enriching existing aliquots of shotgun metagenomic libraries and obtaining modest numbers of reads from them offers an efficient orthogonal method for assessment of bacterial community composition.
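The coverage criterion stated above (every segment of every reference sequence has at least 78% identity to some bait) can be checked along the following lines. This is a simplified sketch with assumptions that go beyond the abstract: segments and baits are treated as pre-aligned, equal-length, non-overlapping windows keyed by start position, and all names and sequences are hypothetical.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def uncovered_segments(reference, baits_by_start, seg_len, min_identity=0.78):
    """Return start positions of segments whose best bait identity is below the threshold.

    `baits_by_start` maps a segment start position to the baits designed for
    that window (an assumed layout chosen to keep the example short).
    """
    gaps = []
    for start in range(0, len(reference) - seg_len + 1, seg_len):
        segment = reference[start:start + seg_len]
        baits = baits_by_start.get(start, [])
        best = max((identity(segment, bait) for bait in baits), default=0.0)
        if best < min_identity:
            gaps.append(start)
    return gaps


# Toy data (invented): one 20-base reference split into two 10-base segments.
reference = "ACGTACGTACGTACGTACGT"
sparse_baits = {0: ["ACGTACGTAC"], 10: ["GTACGTTTTT"]}                # second window: 70% identity
denser_baits = {0: ["ACGTACGTAC"], 10: ["GTACGTTTTT", "GTACGTACTT"]}  # adds a 90% identity bait

print(uncovered_segments(reference, sparse_baits, seg_len=10))
print(uncovered_segments(reference, denser_baits, seg_len=10))
```

A real design would compute identity from alignments against a full reference database rather than from fixed windows, but the pass/fail logic per segment is the same.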
Pregnancy complications are poorly represented in the archeological record, despite their importance in contemporary and ancient societies. While excavating a Byzantine cemetery in Troy, we discovered calcified abscesses among a woman's remains. Scanning electron microscopy of the tissue revealed 'ghost cells', resulting from dystrophic calcification, which preserved ancient maternal, fetal and bacterial DNA of a severe infection, likely chorioamnionitis.
Gardnerella vaginalis and Staphylococcus saprophyticus dominated the abscesses. Phylogenomic analyses of ancient, historical, and contemporary data showed that G. vaginalis Troy fell within contemporary genetic diversity, whereas S. saprophyticus Troy belongs to a lineage that does not appear to be commonly associated with human disease today. We speculate that the ecology of S. saprophyticus infection may have differed in the ancient world as a result of close contacts between humans and domesticated animals. These results highlight the complex and dynamic interactions with our microbial milieu that underlie severe maternal infections.
Ancient human remains of paleopathological interest typically contain highly degraded DNA in which pathogenic taxa are often minority components, making sequence-based metagenomic characterization costly. Microarrays may hold a potential solution to these challenges, offering a rapid, affordable, and highly informative snapshot of microbial diversity in complex samples without the lengthy analysis and/or high cost associated with high-throughput sequencing. Their versatility is well established for modern clinical specimens, but they have yet to be applied to ancient remains. Here we report bacterial profiles of archaeological and historical human remains using the Lawrence Livermore Microbial Detection Array (LLMDA). The array successfully identified previously-verified bacterial human pathogens, including Vibrio cholerae (cholera) in a 19th century intestinal specimen and Yersinia pestis ("Black Death" plague) in a medieval tooth, which represented only minute fractions (0.03% and 0.08% alignable high-throughput shotgun sequencing reads) of their respective DNA content. This demonstrates that the LLMDA can identify primary and/or co-infecting bacterial pathogens in ancient samples, thereby serving as a rapid and inexpensive paleopathological screening tool to study health across both space and time.