High-throughput techniques based on restriction site-associated DNA sequencing (RADseq) are enabling the low-cost discovery and genotyping of thousands of genetic markers for any species, including non-model organisms, which is revolutionizing ecological, evolutionary and conservation genetics. Technical differences among these methods lead to important considerations for all steps of genomics studies, from the specific scientific questions that can be addressed, and the costs of library preparation and sequencing, to the types of bias and error inherent in the resulting data. In this Review, we provide a comprehensive discussion of RADseq methods to aid researchers in choosing among the many different approaches and avoiding erroneous scientific conclusions from RADseq data, a problem that has plagued other genetic marker types in the past.
Comparative phylogeography of the ocean planet
Bowen, Brian W.; Gaither, Michelle R.; DiBattista, Joseph D. ...
Proceedings of the National Academy of Sciences - PNAS, 07/2016, Volume: 113, Issue: 29
Journal Article
Peer reviewed
Open access
Understanding how geography, oceanography, and climate have ultimately shaped marine biodiversity requires aligning the distributions of genetic diversity across multiple taxa. Here, we examine phylogeographic partitions in the sea against a backdrop of biogeographic provinces defined by taxonomy, endemism, and species composition. The taxonomic identities used to define biogeographic provinces are routinely accompanied by diagnostic genetic differences between sister species, indicating interspecific concordance between biogeography and phylogeography. In cases where individual species are distributed across two or more biogeographic provinces, shifts in genotype frequencies often align with biogeographic boundaries, providing intraspecific concordance between biogeography and phylogeography. Here, we provide examples of comparative phylogeography from (i) tropical seas that host the highest marine biodiversity, (ii) temperate seas with high productivity but volatile coastlines, (iii) migratory marine fauna, and (iv) plankton that are the most abundant eukaryotes on earth. Tropical and temperate zones both show impacts of glacial cycles, the former primarily through changing sea levels, and the latter through coastal habitat disruption. The general concordance between biogeography and phylogeography indicates that the population-level genetic divergences observed between provinces are a starting point for macroevolutionary divergences between species. However, isolation between provinces does not account for all marine biodiversity; the remainder arises through alternative pathways, such as ecological speciation and parapatric (semiisolated) divergences within provinces and biodiversity hotspots.
Next‐generation sequencing (NGS) technology is revolutionizing the fields of population genetics, molecular ecology and conservation biology. But it can be challenging for researchers to learn the new and rapidly evolving techniques required to use NGS data. A recent workshop entitled ‘Population Genomic Data Analysis’ was held to provide training in conceptual and practical aspects of data production and analysis for population genomics, with an emphasis on NGS data analysis. This workshop brought together 16 instructors who were experts in the field of population genomics and 31 student participants. Instructors provided helpful and often entertaining advice regarding how to choose and use an NGS method for a given research question, and regarding critical aspects of NGS data production and analysis such as library preparation, filtering to remove sequencing errors and outlier loci, and genotype calling. In addition, instructors provided general advice about how to approach population genomics data analysis and how to build a career in science. The overarching messages of the workshop were that NGS data analysis should be approached with a keen understanding of the theoretical models underlying the analyses, and with analyses tailored to each research question and project. When analysed carefully, NGS data provide extremely powerful tools for answering crucial questions in disciplines ranging from evolution and ecology to conservation and agriculture, including questions that could not be answered prior to the development of NGS technology.
Whole-genome sequencing data allow survey of variation from across the genome, reducing the constraint of balancing genome sub-sampling with estimating recombination rates and linkage between sampled markers and target loci. As sequencing costs decrease, low-coverage whole-genome sequencing of pooled or indexed-individual samples is commonly utilized to identify loci associated with phenotypes or environmental axes in non-model organisms. There are, however, relatively few publicly available bioinformatic pipelines designed explicitly to analyse these types of data, and fewer still that process the raw sequencing data, provide useful metrics of quality control and then execute analyses. Here, we present an updated version of a bioinformatics pipeline called PoolParty2 that can effectively handle either pooled or indexed DNA samples and includes new features to improve computational efficiency. Using simulated data, we demonstrate the ability of our pipeline to recover segregating variants, estimate their allele frequencies accurately, and identify genomic regions harbouring loci under selection. Based on the simulated data set, we benchmark the efficacy of our pipeline with another bioinformatic suite, angsd, and illustrate the compatibility and complementarity of these suites using angsd to generate genotype likelihoods as input for identifying linkage outlier regions using alignment files and variants provided by PoolParty2. Finally, we apply our updated pipeline to an empirical dataset of low-coverage whole genomic data from population samples of Columbia River steelhead trout (Oncorhynchus mykiss), results from which demonstrate the genomic impacts of decades of artificial selection in a prominent hatchery stock. Thus, we not only demonstrate the utility of PoolParty2 for genomic studies that combine sequencing data from multiple individuals, but also illustrate how it complements other bioinformatics resources such as angsd.
Recently, Lowry et al. addressed the ability of RADseq approaches to detect loci under selection in genome scans. While the authors raise important considerations, such as accounting for the extent of linkage disequilibrium in a study system, we strongly disagree with their overall view of the ability of RADseq to inform our understanding of the genetic basis of adaptation. The family of RADseq protocols has radically improved the field of population genomics, expanding by several orders of magnitude the number of markers available while substantially reducing the cost per marker. Researchers whose goal is to identify regions of the genome under selection must consider the LD of the experimental system; however, there is no magical LD cutoff below which researchers should refuse to use RADseq. Lowry et al. further made two major arguments: a theoretical argument that modeled the likelihood of detecting selective sweeps with RAD markers, and gross summaries based on an anecdotal collection of RAD studies. Unfortunately, their simulations were off by two orders of magnitude in the worst case, while their anecdotes merely showed that it is possible to get widely divergent densities of RAD tags for any particular experiment, either by design or due to experimental efficacy. We strongly argue that RADseq remains a powerful and efficient approach that provides sufficient marker density for studying selection in many natural populations. Given limited resources, we argue that researchers should consider a wide range of trade‐offs among genomic techniques, in light of their study question and the power of different techniques to answer it.
If similar evolutionary forces maintain intra‐ and interspecific diversity, patterns of diversity at both levels of biological organization can be expected to covary across space. Although this prediction of a positive species‐genetic diversity correlation (SGDC) has been tested for several taxa in natural landscapes, no study has yet evaluated the influence of the community delineation on these SGDCs. In this study, we focused on tropical fishes of the Indo‐Pacific Ocean, using range‐wide single nucleotide polymorphism data for a deep‐sea fish (Etelis coruscans) and species presence data of 4878 Teleostei species. We investigated whether a diversity continuum occurred, for different community delineations (subfamily, family, order and class) and spatial extents, and which processes explained these diversity patterns. We found no association between genetic diversity and species richness (α‐SGDC), regardless of the community and spatial extent. In contrast, we evidenced a positive relationship between genetic and species dissimilarities (β‐SGDC) when the community was defined at the subfamily or family level of the species of interest, and when the Western Indian Ocean was excluded. This relationship was related to the imprint of dispersal processes across levels of biological organization in Lutjanidae. However, this positive β‐SGDC was lost when considering higher taxonomic communities and at the scale of the entire Indo‐Pacific, suggesting different responses of populations and communities to evolutionary processes at these scales. This study provides evidence that the taxonomic scale at which communities are defined and the spatial extent are pivotal to better understand the processes shaping diversity across levels of biological organization.
• We used RADseq to provide a high resolution nuclear phylogeny of the Delphininae.
• The genus Stenella is least well resolved likely due to admixture among lineages.
• Within Tursiops, coastal ecotypes divided early with offshore lineage evolving later.
• Cross-lineage gene flow in this group has been more extensive than previously thought.
• Results improve our understanding of evolutionary processes during rapid radiations.
Phylogeographic inference has provided extensive insight into the relative roles of geographical isolation and ecological processes during evolutionary radiations. However, the importance of cross-lineage admixture in facilitating adaptive radiations is increasingly being recognised, and suggested as a main cause of phylogenetic uncertainty. In this study, we used a double digest RADseq protocol to provide a high resolution (~4 Million bp) nuclear phylogeny of the Delphininae. Phylogenetic resolution of this group has been especially intractable, likely because it has experienced a recent species radiation. We carried out cross-lineage reticulation analyses, and tested for several sources of potential bias in determining phylogenies from genome sampling data. We assessed the divergence time and historical demography of T. truncatus and T. aduncus by sequencing the T. aduncus genome and comparing it with the T. truncatus reference genome. Our results suggest monophyly for the genus Tursiops, with the recently proposed T. australis species falling within the T. aduncus lineage. We also show the presence of extensive cross-lineage gene flow between pelagic and European coastal ecotypes of T. truncatus, as well as in the early stages of diversification between spotted (Stenella frontalis; Stenella attenuata), spinner (Stenella longirostris), striped (Stenella coeruleoalba), common (Delphinus delphis), and Fraser’s (Lagenodelphis hosei) dolphins. Our study suggests that cross-lineage gene flow in this group has been more extensive and complex than previously thought. In the context of biogeography and local habitat dependence, these results improve our understanding of the evolutionary processes determining the history of this lineage.
Understanding transmission dynamics of SARS-CoV-2 in institutions of higher education (IHEs) is important because these settings have potential for rapid viral spread. Here, we used genomic surveillance to retrospectively investigate transmission dynamics throughout the 2020-2021 academic year for the University of Idaho ("University"), a mid-sized IHE in a small rural town. We generated genome assemblies for 1168 SARS-CoV-2 samples collected during the academic year, representing 46.8% of positive samples collected from the University population and 49.8% of positive samples collected from the surrounding community ("Community") at the local hospital during this time. Transmission dynamics differed for the University when compared to the Community, with more infection waves that lasted shorter lengths of time, potentially resulting from high-transmission congregate settings along with mitigation efforts implemented by the University to combat outbreaks. We found evidence for low transmission rates between the University and Community, with approximately 8% of transmissions into the Community originating from the University, and approximately 6% of transmissions into the University originating from the Community. Potential transmission risk factors identified for the University included congregate settings such as sorority and fraternity events and residences, holiday travel, and high caseloads in the surrounding community. Knowledge of these risk factors can help the University and other IHEs develop effective mitigation measures for SARS-CoV-2 and similar pathogens.
Zooplanktonic taxa have a greater number of distinct populations and species than might be predicted based on their large population sizes and open‐ocean habitat, which lacks obvious physical barriers to dispersal and gene flow. To gain insight into the evolutionary mechanisms driving genetic diversification in zooplankton, we developed eight microsatellite markers to examine the population structure of an abundant, globally distributed mesopelagic copepod, Haloptilus longicornis, at 18 sample sites across the Atlantic and Pacific Oceans (n = 761). When comparing our microsatellite results with those of a prior study that used a mtDNA marker (mtCOII, n = 1059, 43 sample sites), we unexpectedly found evidence for the presence of a cryptic species pair. These species were globally distributed and apparently sympatric, and were separated by relatively weak genetic divergence (reciprocally monophyletic mtCOII lineages 1.6% divergent; microsatellite FST ranging from 0.28 to 0.88 across loci, P < 0.00001). Using both mtDNA and microsatellite data for the most common of the two species (n = 669 for microsatellites, n = 572 for mtDNA), we also found evidence for allopatric barriers to gene flow within species, with distinct populations separated by continental landmasses and equatorial waters in both the Atlantic and Pacific Ocean basins. Our study shows that oceanic barriers to gene flow can act as a mechanism promoting allopatric diversification in holoplanktonic taxa, despite the high potential dispersal abilities and pelagic habitat for these species.