Environmental DNA (eDNA) and metabarcoding are boosting our ability to acquire data on species distribution in a variety of ecosystems. Nevertheless, like most sampling approaches, eDNA is not perfect. It can fail to detect species that are actually present, and false positives are also possible: a species may apparently be detected in areas where it is actually absent. Controlling false positives remains a major challenge for eDNA analyses: in this issue of Molecular Ecology Resources, Lahoz‐Monfort et al. () test the performance of multiple statistical modelling approaches to estimate the rate of detection and false positives from eDNA data. Here, we discuss the importance of controlling for false detection from the early steps of eDNA analyses (laboratory, bioinformatics), to improve the quality of results and allow an efficient use of the site occupancy‐detection modelling (SODM) framework for limiting false presences in eDNA analysis.
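As a rough illustration of how an occupancy-detection model can accommodate false positives, the sketch below implements one common two-component mixture formulation: a site is occupied with probability psi, each replicate at an occupied site is positive with probability p11, and each replicate at an unoccupied site is (falsely) positive with probability p10. The function and parameter names are assumptions made for this example, and the model is not necessarily any of those compared by Lahoz‐Monfort et al.

```python
from math import comb


def binom_pmf(y: int, k: int, p: float) -> float:
    """Probability of exactly y positives among k independent replicates."""
    return comb(k, y) * p**y * (1 - p) ** (k - y)


def site_likelihood(y: int, k: int, psi: float, p11: float, p10: float) -> float:
    """Likelihood of y positive replicates out of k at a single site.

    psi : probability that the site is occupied (assumed name)
    p11 : per-replicate detection probability at an occupied site
    p10 : per-replicate false-positive probability at an unoccupied site
    """
    return psi * binom_pmf(y, k, p11) + (1 - psi) * binom_pmf(y, k, p10)


def prob_occupied(y: int, k: int, psi: float, p11: float, p10: float) -> float:
    """Posterior probability that the site is occupied, given y positives.

    Assumes all probabilities lie strictly between 0 and 1.
    """
    occupied = psi * binom_pmf(y, k, p11)
    return occupied / site_likelihood(y, k, psi, p11, p10)


if __name__ == "__main__":
    # Illustrative values only: 8 replicates, psi = 0.5, p11 = 0.8, p10 = 0.05.
    for y in (1, 6):
        print(y, prob_occupied(y, 8, 0.5, 0.8, 0.05))
```

With these illustrative values, a single positive among eight replicates yields a posterior occupancy probability below 0.01, whereas six positives yield one above 0.99. In practice psi, p11 and p10 are unknown and would be estimated from data across many sites rather than fixed in advance.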
Virtually all empirical ecological studies require species identification during data collection. DNA metabarcoding refers to the automated identification of multiple species from a single bulk sample containing entire organisms or from a single environmental sample containing degraded DNA (soil, water, faeces, etc.). It can be implemented for both modern and ancient environmental samples. The availability of next‐generation sequencing platforms and the ecologists’ need for high‐throughput taxon identification have facilitated the emergence of DNA metabarcoding. The potential power of DNA metabarcoding as it is implemented today is limited mainly by its dependency on PCR and by the considerable investment needed to build comprehensive taxonomic reference libraries. Further developments associated with the impressive progress in DNA sequencing will eliminate the currently required DNA amplification step, and comprehensive taxonomic reference libraries composed of whole organellar genomes and repetitive ribosomal nuclear DNA can be built based on the well‐curated DNA extract collections maintained by standardized barcoding initiatives. The near‐term future of DNA metabarcoding has an enormous potential to boost data acquisition in biodiversity research.
DNA barcoding has had a major impact on biodiversity science. The elegant simplicity of establishing massive scale databases for a few barcode loci is continuing to change our understanding of species diversity patterns, and continues to enhance human abilities to distinguish among species. Capitalizing on the developments of next generation sequencing technologies and decreasing costs of genome sequencing, there is now the opportunity for the DNA barcoding concept to be extended to new kinds of genomic data. We illustrate the benefits and capacity to do this, and also note the constraints and barriers to overcome before it is truly scalable. We advocate a twin track approach: (i) continuation and acceleration of global efforts to build the DNA barcode reference library of life on earth using standard DNA barcodes and (ii) active development and application of extended DNA barcodes using genome skimming to augment the standard barcoding approach.
Almost all empirical studies in ecology have to identify the species involved in the ecological process under examination. DNA metabarcoding, which couples the principles of DNA barcoding with next generation sequencing technology, provides an opportunity to easily produce large amounts of data on biodiversity. Microbiologists have long used metabarcoding approaches, but use of this technique in the assessment of biodiversity in plant and animal communities is under‐explored. Despite its relationship with DNA barcoding, several unique features of DNA metabarcoding justify the development of specific data analysis methodologies. In this review, we describe the bioinformatics tools available for DNA metabarcoding of plants and animals, and we revisit others developed for DNA barcoding or microbial metabarcoding. We also discuss the principles and associated tools for evaluating and comparing DNA barcodes in the context of DNA metabarcoding, for designing new custom‐made barcodes adapted to specific ecological questions, for dealing with PCR and sequencing errors, and for inferring taxonomic data from sequences.
DNA metabarcoding offers new perspectives in biodiversity research. This recently developed approach to ecosystem study relies heavily on the use of next‐generation sequencing (NGS) and thus calls upon the ability to deal with huge sequence data sets. The obitools package satisfies this requirement thanks to a set of programs specifically designed for analysing NGS data in a DNA metabarcoding context. Their capacity to filter and edit sequences while taking into account taxonomic annotation helps to set up tailor‐made analysis pipelines for a broad range of DNA metabarcoding applications, including biodiversity surveys or diet analyses. The obitools package is distributed as open source software, available on the following website: http://metabarcoding.org/obitools. A Galaxy wrapper is available on the GenOuest core facility toolshed: http://toolshed.genouest.org.
Using non-conventional markers, DNA metabarcoding allows biodiversity assessment from complex substrates. In this article, we present ecoPrimers, a software tool for identifying new barcode markers and their associated PCR primers. ecoPrimers scans whole genomes to find such markers without a priori knowledge. ecoPrimers optimizes two quality indices measuring taxonomical range and discrimination to select the most efficient markers from a set of reference sequences, according to specific experimental constraints such as marker length or specifically targeted taxa. The key step of the algorithm is the identification of conserved regions among reference sequences for anchoring primers. We propose an efficient algorithm, based on data mining, that allows the analysis of huge sets of sequences. We evaluate the efficiency of ecoPrimers by running it on three different sequence sets: mitochondrial, chloroplast and bacterial genomes. Identified barcode markers correspond either to barcode regions already in use for plants or animals, or to new potential barcodes. Results from empirical experiments carried out on a promising new barcode for analyzing vertebrate diversity fully agree with expectations based on bioinformatics analysis. These tests demonstrate the efficiency of ecoPrimers for inferring new barcodes suited to diverse experimental contexts. ecoPrimers is available as an open source project at: http://www.grenoble.prabi.fr/trac/ecoPrimers.
During the last 15 years the internal transcribed spacer (ITS) of nuclear DNA has been used as a target for analyzing fungal diversity in environmental samples, and has recently been selected as the standard marker for fungal DNA barcoding. In this study we explored the potential amplification biases that various commonly utilized ITS primers might introduce during amplification of different parts of the ITS region in samples containing mixed templates ('environmental barcoding'). We performed in silico PCR analyses with commonly used primer combinations using various ITS datasets obtained from public databases as templates.
Some of the ITS primers, such as ITS1-F, were hampered by a high proportion of mismatches relative to the target sequences, and most of them appeared to introduce taxonomic biases during PCR. Some primers, e.g. ITS1-F, ITS1 and ITS5, were biased towards amplification of basidiomycetes, whereas others, e.g. ITS2, ITS3 and ITS4, were biased towards ascomycetes. The assumed basidiomycete-specific primer ITS4-B only amplified a minor proportion of basidiomycete ITS sequences, even under relaxed PCR conditions. Due to systematic length differences in the ITS2 region as well as the entire ITS, we found that ascomycetes will amplify more easily than basidiomycetes using these regions as targets. This bias can be avoided by using primers amplifying ITS1 only, but this would imply preferential amplification of 'non-dikarya' fungi.
We conclude that ITS primers have to be selected carefully, especially when used for high-throughput sequencing of environmental samples. We suggest that different primer combinations or different parts of the ITS region should be analyzed in parallel, or that alternative ITS primers should be searched for.
DNA barcoding is a key tool for assessing biodiversity in both taxonomic and environmental studies. Essential features of barcodes include their applicability to a wide spectrum of taxa and their ability to identify even closely related species. Several DNA regions have been proposed as barcodes and the region selected strongly influences the output of a study. However, formal comparisons between barcodes have remained limited until now. Here we present a standard method for evaluating barcode quality, based on the use of a new bioinformatic tool that performs in silico PCR over large databases. We illustrate this approach by comparing the taxonomic coverage and the resolution of several DNA regions already proposed for the barcoding of vertebrates. To assess the relationship between in silico and in vitro PCR, we also developed specific primers amplifying different species of Felidae, and we tested them using both kinds of PCR. Tests on specific primers confirmed the correspondence between in silico and in vitro PCR. Nevertheless, results of in silico and in vitro PCRs can be somewhat different, partly because tuning PCR conditions can increase the performance of primers with limited taxonomic coverage. The in silico evaluation of DNA barcodes showed a strong variation of taxonomic coverage (i.e., universality): barcodes based on highly degenerate primers and those corresponding to the conserved region of the Cyt-b showed the highest coverage. As expected, longer barcodes had a better resolution than shorter ones, which are however more convenient for ecological studies analysing environmental samples.
In silico PCR could be used to improve the performance of a study, by allowing the preliminary comparison of several DNA regions in order to identify the most appropriate barcode depending on the study aims.
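The idea of in silico PCR and of the two barcode quality measures discussed above (taxonomic coverage and resolution) can be sketched in a few lines. This is a deliberately simplified toy, not the tool described in the abstract: it assumes exact A/C/G/T alphabets with no degenerate bases, counts mismatches uniformly along the primer, and uses made-up primers and reference sequences. All function names are hypothetical.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(primer: str, seq: str, max_mm: int) -> list:
    """Start positions where primer matches seq with at most max_mm mismatches."""
    n = len(primer)
    return [
        i
        for i in range(len(seq) - n + 1)
        if sum(a != b for a, b in zip(primer, seq[i:i + n])) <= max_mm
    ]


def in_silico_pcr(seq: str, fwd: str, rev: str, max_mm: int = 0):
    """Return the shortest region flanked by both primers, or None if none exists."""
    rev_site = revcomp(rev)  # the reverse primer anneals to the opposite strand
    best = None
    for i in find_sites(fwd, seq, max_mm):
        start = i + len(fwd)
        for j in find_sites(rev_site, seq, max_mm):
            if j >= start:
                amplicon = seq[start:j]
                if best is None or len(amplicon) < len(best):
                    best = amplicon
    return best


def coverage_and_resolution(refs: dict, fwd: str, rev: str, max_mm: int = 0):
    """Coverage: fraction of taxa amplified. Resolution: fraction of
    amplified taxa whose amplicon is shared with no other taxon."""
    amplicons = {t: in_silico_pcr(s, fwd, rev, max_mm) for t, s in refs.items()}
    amplified = {t: a for t, a in amplicons.items() if a is not None}
    coverage = len(amplified) / len(refs)
    counts = Counter(amplified.values())
    unique = sum(1 for a in amplified.values() if counts[a] == 1)
    resolution = unique / len(amplified) if amplified else 0.0
    return coverage, resolution


# Toy reference set (invented sequences, four hypothetical taxa).
REFS = {
    "sp_A": "GGACGTACAAATTCAAGGCC",
    "sp_B": "GGACGTACAAATTCAAGGTT",  # same barcode as sp_A
    "sp_C": "ACGTACAACTTCAAGG",
    "sp_D": "ACGAACGGCTTCAAGG",      # one mismatch in the forward primer site
}
FWD, REV = "ACGTAC", "CCTTGA"

if __name__ == "__main__":
    print(coverage_and_resolution(REFS, FWD, REV, max_mm=0))
    print(coverage_and_resolution(REFS, FWD, REV, max_mm=1))
```

On this toy set, requiring perfect primer matches amplifies three of the four taxa, while tolerating one mismatch amplifies all four. Real primer evaluation would also need to handle degenerate bases, constraints on the 3' end of the primers, and amplicon length limits.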
Marine sediments are home to one of the richest species pools on Earth, but logistics and a dearth of taxonomic work-force hinder the knowledge of their biodiversity. We characterized α- and β-diversity of deep-sea assemblages from submarine canyons in the western Mediterranean using environmental DNA metabarcoding. We used a new primer set targeting a short eukaryotic 18S sequence (ca. 110 bp). We applied a protocol designed to obtain extractions enriched in extracellular DNA from replicated sediment corers. With this strategy we captured information from DNA (local or deposited from the water column) that persists adsorbed to inorganic particles and buffered short-term spatial and temporal heterogeneity. We analysed replicated samples from 20 localities including two deep-sea canyons, one shallower canal, and two open slopes (depth range 100-2,250 m). We identified 1,629 MOTUs, among which the dominant groups were Metazoa (with representatives of 19 phyla), Alveolata, Stramenopiles, and Rhizaria. There was a marked small-scale heterogeneity as shown by differences in replicates within corers and within localities. The spatial variability between canyons was significant, as was the depth component in one of the canyons where it was tested. Likewise, the composition of the first layer (1 cm) of sediment was significantly different from deeper layers. We found that qualitative (presence-absence) and quantitative (relative number of reads) data showed consistent trends of differentiation between samples and geographic areas. The subset of exclusively benthic MOTUs showed similar patterns of β-diversity and community structure as the whole dataset. Separate analyses of the main metazoan phyla (in number of MOTUs) showed some differences in distribution attributable to different lifestyles.
Our results highlight the differentiation that can be found even between geographically close assemblages, and set the ground for future monitoring and conservation efforts on these bottoms of ecological and economic importance.
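The contrast drawn above between qualitative (presence-absence) and quantitative (relative number of reads) views of the same samples can be made concrete with two standard dissimilarity measures. The choice of Jaccard and Bray-Curtis here is an assumption made for illustration; the abstract does not state which indices were used, and the MOTU names and read counts below are invented.

```python
def jaccard_dissimilarity(a: dict, b: dict) -> float:
    """Presence-absence dissimilarity between two samples (MOTU -> read count)."""
    present_a = {m for m, n in a.items() if n > 0}
    present_b = {m for m, n in b.items() if n > 0}
    union = present_a | present_b
    if not union:
        return 0.0
    return 1.0 - len(present_a & present_b) / len(union)


def bray_curtis_relative(a: dict, b: dict) -> float:
    """Bray-Curtis dissimilarity computed on relative read abundances.

    With both samples rescaled to sum to 1, Bray-Curtis reduces to
    half the sum of absolute differences in relative abundance.
    """
    total_a, total_b = sum(a.values()), sum(b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("each sample needs at least one read")
    motus = set(a) | set(b)
    return 0.5 * sum(
        abs(a.get(m, 0) / total_a - b.get(m, 0) / total_b) for m in motus
    )


if __name__ == "__main__":
    # Invented read counts for two samples sharing the same two MOTUs.
    s1 = {"motu_1": 90, "motu_2": 10}
    s2 = {"motu_1": 10, "motu_2": 90}
    print(jaccard_dissimilarity(s1, s2))   # identical MOTU lists
    print(bray_curtis_relative(s1, s2))    # very different read proportions
```

The toy samples are chosen so that the two views disagree: both contain the same MOTUs, so the presence-absence measure sees no difference, while the read-based measure sees a large one. Consistent trends across both kinds of data, as reported above, are therefore an empirical finding rather than a mathematical necessity.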
Environmental DNA (eDNA) metabarcoding is increasingly used to study present and past biodiversity. eDNA analyses often rely on amplification of very small quantities of DNA or of degraded DNA. To avoid missing detection of taxa that are actually present (false negatives), multiple extractions and amplifications of the same samples are often performed. However, the level of replication needed for reliable estimates of the presence/absence patterns remains an unaddressed topic. Furthermore, degraded DNA and PCR/sequencing errors might produce false positives. We used simulations and empirical data to evaluate the level of replication required for accurate detection of targeted taxa in different contexts and to assess the performance of methods used to reduce the risk of false detections. Furthermore, we evaluated whether statistical approaches developed to estimate occupancy in the presence of observational errors can successfully estimate true prevalence, detection probability and false‐positive rates. Replications reduced the rate of false negatives; the optimal level of replication was strongly dependent on the detection probability of taxa. Occupancy models successfully estimated true prevalence, detection probability and false‐positive rates, but their performance increased with the number of replicates. At least eight PCR replicates should be performed if detection probability is not high, such as in ancient DNA studies. Multiple DNA extractions from the same sample yielded consistent results; in some cases, collecting multiple samples from the same locality allowed detecting more species. The optimal level of replication for accurate species detection strongly varies among studies and could be explicitly estimated to improve the reliability of results.
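The dependence of the required replication level on detection probability can be illustrated with a back-of-the-envelope calculation. The sketch below assumes that PCR replicates are independent, that each has the same detection probability p for a taxon that is truly present, and that there are no false positives; it is a simplification, not the simulation framework used in the study, and the function names and the 0.95 target are assumptions chosen for illustration.

```python
def prob_detected(p: float, k: int) -> float:
    """Probability of at least one positive among k independent replicates,
    each detecting a present taxon with probability p."""
    return 1.0 - (1.0 - p) ** k


def replicates_needed(p: float, target: float = 0.95) -> int:
    """Smallest number of replicates k with prob_detected(p, k) >= target."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be in (0, 1)")
    k = 1
    while prob_detected(p, k) < target:
        k += 1
    return k


if __name__ == "__main__":
    # Illustrative per-replicate detection probabilities.
    for p in (0.9, 0.5, 0.3):
        print(p, replicates_needed(p))
```

Under these assumptions the number of replicates needed grows quickly as p falls, which is in line with the recommendation above to use more replicates when detection probability is not high. When false positives are possible, counting any single positive as a detection is no longer safe, and a model that estimates detection and false-positive rates jointly is more appropriate.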