In this study, we sequenced, assembled, and annotated the plastome of Physalis cordata Mill. and compared it with seven species of the genus Physalis sensu stricto. Sequencing, annotating, and comparing plastomes allow us to understand the evolutionary mechanisms associated with physiological functions, select possible molecular markers, and identify the types of selection that have acted in different regions of the genome. The plastome of P. cordata is 157,000 bp long and presents the typical quadripartite structure with a large single-copy (LSC) region of 87,267 bp and a small single-copy (SSC) region of 18,501 bp, which are separated by two inverted repeat (IR) regions of 25,616 bp each. These values are similar to those found in the other species, except for P. angulata L. and P. pruinosa L., which presented an expansion of the LSC region and a contraction of the IR regions. The plastome in all Physalis species studied shows variation in the boundary of the regions with three distinct types, the percentage of the sequence identity between coding and non-coding regions, and the number of repetitive regions and microsatellites. Four genes and 10 intergenic regions show promise as molecular markers and eight genes were under positive selection. The maximum likelihood analysis showed that the plastome is a good source of information for phylogenetic inference in the genus, given the high support values and absence of polytomies. In the Physalis plastomes analyzed here, the differences found, the positive selection of genes, and the phylogenetic relationships do not show trends that correspond to the biological or ecological characteristics of the species studied.
DNA metabarcoding is a powerful new tool allowing characterization of species assemblages using high‐throughput amplicon sequencing. The utility of DNA metabarcoding for quantifying relative species abundances is currently limited by both biological and technical biases which influence sequence read counts. We tested the idea of sequencing 50/50 mixtures of target species and a control species in order to generate relative correction factors (RCFs) that account for multiple sources of bias and are applicable to field studies. RCFs will be most effective if they are not affected by input mass ratio or co‐occurring species. In a model experiment involving three target fish species and a fixed control, we found RCFs did vary with input ratio but in a consistent fashion, and that 50/50 RCFs applied to DNA sequence counts from various mixtures of the target species still greatly improved relative abundance estimates (e.g. average per species error of 19 ± 8% for uncorrected vs. 3 ± 1% for corrected estimates). To demonstrate the use of correction factors in a field setting, we calculated 50/50 RCFs for 18 harbour seal (Phoca vitulina) prey species (RCFs ranging from 0.68 to 3.68). Applying these corrections to field‐collected seal scats affected species percentages from individual samples (Δ 6.7 ± 6.6%) more than population‐level species estimates (Δ 1.7 ± 1.2%). Our results indicate that the 50/50 RCF approach is an effective tool for evaluating and correcting biases in DNA metabarcoding studies. The decision to apply correction factors will be influenced by the feasibility of creating tissue mixtures for the target species, and the level of accuracy needed to meet research objectives.
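The correction-factor idea can be sketched in a few lines. The abstract does not give the exact formula, so the sketch assumes that an RCF is the ratio of control reads to target reads in a 50/50 mixture, and that corrected proportions are obtained by multiplying each species' read count by its RCF and renormalising. The function names and all read counts below are hypothetical.

```python
def rcf_from_5050(target_reads: int, control_reads: int) -> float:
    """Assumed definition: RCF = control reads / target reads in a 50/50 mix.

    A target that is under-represented in the reads gets a factor above 1.
    """
    if target_reads <= 0:
        raise ValueError("target_reads must be positive")
    return control_reads / target_reads


def corrected_proportions(read_counts: dict, rcfs: dict) -> dict:
    """Scale each species' reads by its RCF, then renormalise to sum to 1."""
    weighted = {sp: read_counts[sp] * rcfs[sp] for sp in read_counts}
    total = sum(weighted.values())
    return {sp: w / total for sp, w in weighted.items()}


# Hypothetical 50/50 calibration runs (target reads, control reads).
rcfs = {
    "species_a": rcf_from_5050(400, 600),   # under-amplified target
    "species_b": rcf_from_5050(750, 250),   # over-amplified target
}

# Hypothetical reads from a field sample containing both targets.
sample_reads = {"species_a": 200, "species_b": 600}
props = corrected_proportions(sample_reads, rcfs)
for species, share in props.items():
    print(f"{species}: {share:.2f}")
```

In this toy case the uncorrected read shares are 25% and 75%, while the corrected shares swing towards the under-amplified species, which is the kind of shift the abstract reports for individual samples.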
Human and animal fungal pathogens are a growing threat worldwide leading to emerging infections and creating new risks for established ones. There is a growing need for rapid and accurate identification of pathogens to enable early diagnosis and targeted antifungal therapy. Morphological and biochemical identification methods are time-consuming and require trained experts. Alternatively, molecular methods, such as DNA barcoding, a powerful and easy tool for rapid monophasic identification, offer a practical approach to species identification and are less demanding in terms of taxonomic expertise. However, its widespread use is still limited by a lack of quality-controlled reference databases and the evolving recognition and definition of new fungal species/complexes. An international consortium of medical mycology laboratories was formed aiming to establish a quality-controlled ITS database under the umbrella of the ISHAM working group on “DNA barcoding of human and animal pathogenic fungi.” A new database, containing 2800 ITS sequences representing 421 fungal species, providing the medical community with a freely accessible tool at http://www.isham.org/ and http://its.mycologylab.org/ to rapidly and reliably identify most agents of mycoses, was established. The generated sequences included in the new database were used to evaluate the variation and overall utility of the ITS region for the identification of pathogenic fungi at intra- and interspecies level. The average intraspecies variation ranged from 0 to 2.25%. This highlighted selected pathogenic fungal species, such as the dermatophytes and emerging yeasts, for which additional molecular methods/genetic markers are required for their reliable identification from clinical and veterinary specimens.
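One common way to express intraspecies variation is the mean pairwise p-distance among aligned sequences of the same species. The abstract does not say how its figures were computed, so the following is a minimal sketch under that assumption; the function names and the short toy sequences are hypothetical and are not real ITS data.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites with a gap ('-') or 'N' in either sequence are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        if x != y:
            diffs += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared


def mean_intraspecies_variation(seqs: list) -> float:
    """Mean pairwise p-distance within one species, as a percentage."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 0.0
    return 100 * sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Toy alignment for a single hypothetical species.
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT"]
print(f"{mean_intraspecies_variation(toy):.2f}%")
```

Species whose value approaches the distance to their nearest neighbouring species are the ones for which a single marker is least likely to be sufficient.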
The proliferation of DNA data is revolutionizing all fields of systematic research. DNA barcode sequences, now available for millions of specimens and several hundred thousand species, are increasingly used in algorithmic species delimitations. This is complicated by occasional incongruences between species and gene genealogies, as indicated by situations where conspecific individuals do not form a monophyletic cluster in a gene tree. In two previous reviews, non-monophyly has been reported as being common in mitochondrial DNA gene trees. We developed a novel web service "Monophylizer" to detect non-monophyly in phylogenetic trees and used it to ascertain the incidence of species non-monophyly in COI (a.k.a. cox1) barcode sequence data from 4977 species and 41,583 specimens of European Lepidoptera, the largest data set of DNA barcodes analyzed in this regard. Particular attention was paid to accurate species identification to ensure data integrity. We investigated the effects of tree-building method, sampling effort, and other methodological issues, all of which can influence estimates of non-monophyly. We found a 12% incidence of non-monophyly, a value significantly lower than that observed in previous studies. Neighbor joining (NJ) and maximum likelihood (ML) methods yielded almost equal numbers of non-monophyletic species, but 24.1% of these cases of non-monophyly were only found by one of these methods. Non-monophyletic species tend to show either low genetic distances to their nearest neighbors or exceptionally high levels of intraspecific variability. Cases of polyphyly in COI trees arising as a result of deep intraspecific divergence are negligible, as the detected cases reflected misidentifications or methodological errors. Taking into consideration variation in sampling effort, we estimate that the true incidence of non-monophyly is ~23%, but with operational factors still being included.
Within the operational factors, we separately assessed the frequency of taxonomic limitations (presence of overlooked cryptic and oversplit species) and identification uncertainties. We observed that operational factors are potentially present in more than half (58.6%) of the detected cases of non-monophyly. Furthermore, we observed that in about 20% of non-monophyletic species and entangled species, the lineages involved are either allopatric or parapatric—conditions where species delimitation is inherently subjective and particularly dependent on the species concept that has been adopted. These observations suggest that species-level non-monophyly in COI gene trees is less common than previously supposed, with many cases reflecting misidentifications, the subjectivity of species delimitation or other operational factors.
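The basic test behind a non-monophyly count can be illustrated on a small rooted tree. This is not the Monophylizer implementation, whose internals are not described here; it is a minimal sketch assuming a rooted tree written as nested tuples, with tip labels of the hypothetical form `species|specimen`. A species is treated as monophyletic when some clade contains exactly its specimens and nothing else.

```python
def clade_leaf_sets(tree) -> set:
    """Collect the leaf set of every clade (including single tips)."""
    clades = set()

    def walk(node) -> frozenset:
        if isinstance(node, str):              # a tip
            leaves = frozenset([node])
        else:                                  # an internal node
            leaves = frozenset().union(*(walk(child) for child in node))
        clades.add(leaves)
        return leaves

    walk(tree)
    return clades


def non_monophyletic_species(tree) -> list:
    """Species whose specimens do not form an exclusive clade."""
    clades = clade_leaf_sets(tree)
    all_tips = max(clades, key=len)            # the root clade holds every tip
    by_species = {}
    for tip in all_tips:
        by_species.setdefault(tip.split("|")[0], set()).add(tip)
    return sorted(
        species
        for species, tips in by_species.items()
        if frozenset(tips) not in clades
    )


# Toy tree: species B is split by a specimen of species C.
toy_tree = (("A|1", "A|2"), (("B|1", "C|1"), "B|2"))
print(non_monophyletic_species(toy_tree))  # ['B']
```

Gene trees from NJ or ML are often unrooted, in which case the outcome depends on the rooting chosen, one of the methodological factors that can shift such estimates.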
Biological monitoring has failed to develop from simple binary assessment outcomes of the impacted/unimpacted type, towards more diagnostic frameworks, despite significant scientific effort over the past fifty years. It is our assertion that this is largely because of the limited information content of biological samples processed by traditional morphology‐based taxonomy, which is a slow, imprecise process, focused on restricted groups of organisms. We envision a new paradigm in ecosystem assessment, which we refer to as ‘Biomonitoring 2.0’. This new schema employs DNA‐based identification of taxa, coupled with high‐throughput DNA sequencing on next‐generation sequencing platforms. We discuss the transformational nature of DNA‐based approaches in biodiversity discovery and ecosystem assessment and outline a path forward for their future widespread application.
DNA barcoding is a powerful tool for species detection, identification and discovery. Metazoan DNA barcoding is primarily based upon a specific region of the cytochrome c oxidase subunit I gene that is PCR amplified by primers HCO2198 and LCO1490 (‘Folmer primers’) designed by Folmer et al. (Molecular Marine Biology and Biotechnology, 3, 1994, 294). Analysis of sequences published since 1994 has revealed mismatches in the Folmer primers to many metazoans. These sequences also show that an extremely high level of degeneracy would be necessary in updated Folmer primers to maintain broad taxonomic utility. In primers jgHCO2198 and jgLCO1490, we replaced most fully degenerated sites with inosine nucleotides that complement all four natural nucleotides and modified other sites to better match major marine invertebrate groups. The modified primers were used to amplify and sequence cytochrome c oxidase subunit I from 9105 specimens from Moorea, French Polynesia and San Francisco Bay, California, USA representing 23 phyla, 42 classes and 121 orders. The new primers, jgHCO2198 and jgLCO1490, are well suited for routine DNA barcoding, all‐taxon surveys and metazoan metagenomics.
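The trade-off between degenerate sites and inosine can be made concrete by counting how many distinct oligonucleotides a primer design represents. The sketch below uses the standard IUPAC ambiguity codes and, as an assumption, counts inosine (`I`) as a single base because it is one synthesised nucleotide that pairs with all four. The two primer strings are hypothetical illustrations, not the published Folmer or jg primer sequences.

```python
# Number of concrete bases each IUPAC code stands for.
# 'I' (inosine) is assumed to count as 1: a single synthesised nucleotide.
IUPAC_FOLD = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
    "I": 1,
}


def degeneracy(primer: str) -> int:
    """Number of distinct oligos in the primer pool (product over sites)."""
    fold = 1
    for base in primer.upper():
        fold *= IUPAC_FOLD[base]   # KeyError flags an unknown symbol
    return fold


# Hypothetical primers: the second swaps fully degenerate N sites for inosine.
with_n = "GGNCARCAYAAN"
with_inosine = "GGICARCAYAAI"

print(degeneracy(with_n))        # 64  (4 * 2 * 2 * 4)
print(degeneracy(with_inosine))  # 4   (2 * 2)
```

Each fully degenerate site multiplies the pool by four, so replacing such sites keeps the concentration of any one matching oligo from being diluted.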
•COI-representation contrasted to the number of currently recognized species.
•15.13% total COI-coverage across the recognized biodiversity on Earth.
•On average, 20.76% of each phylum covered with DNA barcodes.
•Several phyla are severely neglected by barcoding campaigns.
•Taxonomic expertise sorely needed in DNA barcoding.
The functionality of standard zoological DNA barcoding practice (the identification of unknown specimens by comparison of COI sequences) is contingent on working barcode databases with sufficient taxonomic coverage. It has already been established that the main barcoding repositories, NCBI and BOLD, are devoid of data for many animal groups but the specific taxonomic coverage of the repositories across animal biodiversity remains unexplored. Here, I shed light on this mystery by contrasting the number of unique taxon labels in the two databases with the number of currently recognized species for each animal phylum. The numbers reveal an overall paucity of COI sequence data in the repositories (15.13% total coverage across the recognized biodiversity on Earth, and 20.76% average taxonomic coverage for each phylum) and, more importantly, bear witness to the neglect of numerous phyla, rendering current barcoding efforts either ineffective or inaccurate. The importance of further integrating taxonomic expertise into barcoding practice is briefly discussed and some guidelines, previously mentioned in the barcoding literature, are suggested anew. Finally, the asserted values concerning the taxonomic coverage in barcoding databases for Animalia are contrasted with those of Plantae and Fungi.
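The abstract quotes two different coverage figures, a total across all recognized species and an average per phylum, and the two need not agree. The sketch below shows why, assuming total coverage is the pooled ratio of barcoded to recognized species and average coverage is the unweighted mean of per-phylum percentages. The function name and the phylum counts are hypothetical, chosen only to make the difference visible.

```python
def coverage_stats(phyla: dict) -> tuple:
    """Return (pooled coverage %, mean per-phylum coverage %).

    `phyla` maps a phylum name to (barcoded species, recognized species).
    """
    barcoded = sum(b for b, _ in phyla.values())
    recognized = sum(r for _, r in phyla.values())
    pooled = 100 * barcoded / recognized
    per_phylum = [100 * b / r for b, r in phyla.values()]
    mean = sum(per_phylum) / len(per_phylum)
    return pooled, mean


# Hypothetical counts: one large, poorly covered phylum and one small,
# better covered phylum.
toy_phyla = {
    "large_phylum": (100, 1000),   # 10% covered
    "small_phylum": (30, 100),     # 30% covered
}

pooled, mean = coverage_stats(toy_phyla)
print(f"pooled: {pooled:.2f}%  mean per phylum: {mean:.2f}%")
```

The pooled figure is dominated by species-rich phyla, whereas the per-phylum mean weights every phylum equally, so a mean above the pooled value is consistent with the largest phyla being relatively under-sampled.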
Poppies are beneficial plants with a variety of applications, including medicinal, edible, ornamental, and industrial purposes. Some Papaver species are forensically significant plants because they contain opium, a narcotic substance. Internationally trafficked species of illegal poppies are being identified by DNA barcoding employing multiple markers in response to their forensic value. However, effective markers for precise species identification of legal and illegal poppies are still under discussion, with research on illegal poppies focusing on Papaver somniferum L., and species identification studies of Papaver bracteatum and Papaver setigerum DC. still lacking. As a result, in order to evaluate the performance of genetic markers and classify their DNA sequences in the genus Papaver, this study developed the first machine learning-based two-layer model, in which the first layer classifies legal and illegal poppies from the given sequence and the second layer identifies species of illegal poppies using their sequences. We constructed the dataset and investigated biological features from four markers, internal transcribed spacer 1 (ITS1), internal transcribed spacer 2 (ITS2), transfer RNA Leucine (trnL), transfer RNA Leucine - transfer RNA Phenylalanine intergenic spacer (trnL–trnF intergenic spacer) and their combination, using four machine learning algorithms, K-nearest neighbor (KNN), Naïve Bayes (NB), extreme gradient boost (XGBoost) and Random Forest (RF). According to our findings, for Layer 1 to classify legal and illegal poppies, KNN-based models using the combined ITS region achieved the highest accuracy, 0.846 and 0.889 on the training and test sets, respectively. Additionally, for Layer 2 to identify illegal poppy species, KNN-based models using the combined ITS region achieved the best performance, 0.833 and 1.000 on the training and test sets, respectively.
To validate the model, the combined ITS region, which includes ITS 1 and 2 sequences, from blind poppy samples was used as a case study, with Layer 1 correctly classifying legal and illegal poppies with over 0.830 accuracy. Layer 2 correctly identified P. setigerum DC.; however, only one of the three P. somniferum L. species was accurately identified. Nevertheless, our research shows that machine learning can be used to classify and identify legal and illegal poppy species using DNA barcodes, which can then be used as an efficient and effective forensic tool for improved law enforcement and a safer society.
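The two-layer arrangement can be sketched with a small nearest-neighbour classifier. Everything specific here is an assumption: the abstract does not describe its features, so the sketch uses dinucleotide frequencies; the KNN is a plain standard-library version rather than the study's implementation; and the sequences and labels are toy placeholders, not real ITS data.

```python
from collections import Counter
from itertools import product

# All 16 dinucleotides, used as a fixed feature order (assumed feature set).
KMERS = ["".join(p) for p in product("ACGT", repeat=2)]


def featurize(seq: str) -> list:
    """Dinucleotide frequency vector of a sequence."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    n = max(len(seq) - 1, 1)
    return [counts[k] / n for k in KMERS]


def knn_predict(train: list, query: list, k: int = 1) -> str:
    """Majority label among the k training points closest to `query`."""
    ranked = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def two_layer_predict(seq: str, layer1: list, layer2: list, k: int = 1):
    """Layer 1: legal vs illegal. Layer 2 (illegal only): species label."""
    x = featurize(seq)
    status = knn_predict(layer1, x, k)
    if status != "illegal":
        return status, None
    return status, knn_predict(layer2, x, k)


# Toy training data (hypothetical sequences and labels).
legal = ["AAAAAAAAAA", "CCCCCCCCCC"]
illegal = {"GGGGGGGGGG": "species_A", "TTTTTTTTTT": "species_B"}

layer1 = [(featurize(s), "legal") for s in legal]
layer1 += [(featurize(s), "illegal") for s in illegal]
layer2 = [(featurize(s), sp) for s, sp in illegal.items()]

print(two_layer_predict("GGGGGGGGGA", layer1, layer2))
print(two_layer_predict("AAAAAAAAAC", layer1, layer2))
```

Keeping the layers separate means the species-level classifier is trained only on illegal poppies, so a legal sample is never forced into an illegal species label.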
•A two-layer machine learning model was developed for the classification of legal and illegal poppies.
•The KNN model trained on the ITS region sequences of poppy species showed the best performance in classifying legal and illegal poppies, and also in identifying the species of illegal poppies.
•The utility of machine learning in the classification and identification of legal and illegal poppy species using DNA barcoding was demonstrated.
•DNA barcoding is facing many challenges as it incorporates new technological advances.
•DNA barcoding and metabarcoding are highly complementary approaches.
•We need a coordinated advancement of DNA-based species identification.
•We need to unify traditional taxonomy, barcoding, and metabarcoding approaches.
DNA-based species identification, known as barcoding, transformed the traditional approach to the study of biodiversity science. The field is transitioning from barcoding individuals to metabarcoding communities. This revolution involves new sequencing technologies, bioinformatics pipelines, computational infrastructure, and experimental designs. In this dynamic genomics landscape, metabarcoding studies remain insular and biodiversity estimates depend on the particular methods used. In this opinion article, I discuss the need for a coordinated advancement of DNA-based species identification that integrates taxonomic and barcoding information. Such an approach would facilitate access to almost 3 centuries of taxonomic knowledge and 1 decade of building repository barcodes. Conservation projects are time sensitive, research funding is becoming restricted, and informed decisions depend on our ability to embrace integrative approaches to biodiversity science.
We live in an era of unprecedented biodiversity loss, affecting the taxonomic composition of ecosystems worldwide. The immense task of quantifying human imprints on global ecosystems has been greatly simplified by developments in high-throughput DNA sequencing technology (HTS). Approaches like DNA metabarcoding enable the study of biological communities at unparalleled detail. However, current protocols for HTS-based biodiversity exploration have several drawbacks. They are usually based on short sequences, with limited taxonomic and phylogenetic information content. Access to expensive HTS technology is often restricted in developing countries. Ecosystems of particular conservation priority are often remote and hard to access, requiring extensive time from field collection to laboratory processing of specimens. The advent of inexpensive mobile laboratory and DNA sequencing technologies shows great promise to facilitate monitoring projects in biodiversity hot-spots around the world. Recent attention has been given to portable DNA sequencing studies related to infectious organisms, such as bacteria and viruses, yet relatively few studies have focused on applying these tools to eukaryotes, such as plants and animals. Here, we outline the current state of genetic biodiversity monitoring of higher eukaryotes using Oxford Nanopore Technologies' MinION portable sequencing platform, as well as summarize areas of recent development.