This collection of specially commissioned articles looks at fragmented habitats, bringing together recent theoretical advances and empirical studies applying the metapopulation approach. Several chapters closely integrate ecology with genetics and evolutionary biology, and others illustrate how metapopulation concepts and models can be applied to answer questions about conservation, epidemiology, and speciation. The extensive coverage of theory from highly regarded scientists and the many substantive applications in this one-of-a-kind work make it invaluable to graduate students and researchers in a wide range of disciplines.
* Provides a comprehensive and authoritative account of all aspects of metapopulation biology, integrating ecology, genetics, and evolution
* Developed by recognized experts, including Hanski, who won the Balzan Prize for Ecological Sciences
* Covers novel applications of the metapopulation approach to conservation
The recent availability of next‐generation sequencing (NGS) has made possible the use of dense genetic markers to identify regions of the genome that may be under the influence of selection. Several statistical methods have been developed recently for this purpose. Here, we present the results of an individual‐based simulation study investigating the power and error rate of popular or recent genome scan methods: linear regression, Bayescan, BayEnv and LFMM. Contrary to previous studies, we focus on complex, hierarchical population structure and on polygenic selection. Additionally, we use a false discovery rate (FDR)‐based framework, which provides a unified testing framework across frequentist and Bayesian methods. Finally, we investigate the influence of population allele frequencies versus individual genotype data specification for LFMM and the linear regression. The relative ranking between the methods is impacted by the consideration of polygenic selection, compared to a monogenic scenario. For strongly hierarchical scenarios with confounding effects between demography and environmental variables, the power of the methods can be very low. Except for one scenario, Bayescan exhibited moderate power and error rate. BayEnv performance was good under nonhierarchical scenarios, while LFMM provided the best compromise between power and error rate across scenarios. We found that it is possible to greatly reduce error rates by considering the results of all three methods when identifying outlier loci.
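As a rough illustration of the two ideas in this abstract (controlling the false discovery rate, then keeping only loci flagged by every method), here is a minimal Python sketch. It assumes the Benjamini-Hochberg step-up procedure on frequentist p-values, which the abstract does not name; the Bayesian side of the framework is not covered. The function names `benjamini_hochberg` and `consensus_outliers` and the locus labels are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per test: True if rejected at FDR level alpha.

    Assumption: Benjamini-Hochberg step-up procedure; the abstract only
    says "FDR-based framework" without naming a procedure.
    """
    m = len(p_values)
    # Indices of the tests, sorted by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k such that p_(k) <= k * alpha / m.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            max_k = rank
    # Reject every test ranked at or below max_k.
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


def consensus_outliers(*outlier_sets):
    """Loci flagged by every method (set intersection)."""
    return set.intersection(*map(set, outlier_sets))


# Hypothetical p-values for ten loci from one method.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
flags = benjamini_hochberg(p, alpha=0.05)

# Hypothetical outlier lists from three methods.
shared = consensus_outliers({"L1", "L2", "L7"}, {"L2", "L7", "L9"}, {"L2", "L7"})
```

Taking the intersection trades power for a lower error rate: a locus is kept only if all methods agree on it.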
Identifying loci under natural selection from genomic surveys is of great interest in different research areas. Commonly used methods to separate neutral effects from adaptive effects are based on locus-specific population differentiation coefficients to identify outliers. Here we extend such an approach to estimate directly the probability that each locus is subject to selection using a Bayesian method. We also extend it to allow the use of dominant markers like AFLPs. It has been shown that this model is robust to complex demographic scenarios for neutral genetic differentiation. Here we show that the inclusion of isolated populations that underwent a strong bottleneck can lead to a high rate of false positives. Nevertheless, we demonstrate that it is possible to avoid them by carefully choosing the populations that should be included in the analysis. We analyze two previously published data sets: a human data set of codominant markers and a Littorina saxatilis data set of dominant markers. We also perform a detailed sensitivity study to compare the power of the method using amplified fragment length polymorphism (AFLP), SNP, and microsatellite markers. The method has been implemented in new software available at our website (http://www-leca.ujf-grenoble.fr/logiciels.htm).
Computer simulations are excellent tools for understanding the evolutionary and genetic consequences of complex processes whose interactions cannot be analytically predicted. Simulations have traditionally been used in population genetics by a fairly small community with programming expertise, but the recent availability of dozens of sophisticated, customizable software packages for simulation now makes simulation an accessible option for researchers in many fields. The in silico genetic data produced by simulations, along with greater availability of population-genomics data, are transforming genetic epidemiology, anthropology, evolutionary and population genetics and conservation. In this Review of the state-of-the-art of simulation software, we identify applications of simulations, evaluate simulator capabilities, provide a guide for their use and summarize future directions.
Common garden experiments are valuable for studying adaptive phenomena and adaptive potential, in that they allow local adaptation to be studied without the confounding effect of phenotypic plasticity. The QST − FST comparison framework, comparing genetic differentiation at the phenotypic and molecular level, is the usual way to test and measure whether local adaptation influences phenotypic divergence between populations.
Here, we highlight that the assumptions behind the expected equality QST = FST under neutrality correspond to a very simple model of population genetics. While the equality might, on average, be robust to violation of such assumptions, more complex population structure can generate strong evolutionary noise.
Synthesis. We highlight recent methodological developments aimed at overcoming this issue and at providing a more general framework to detect local adaptation, using less restrictive assumptions. We invite empiricists to look into these methods and theorists to continue developing even more general methods.
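For reference, the quantity compared with FST in this framework is commonly written as follows for a diploid, outbreeding species with purely additive gene action. The symbols are assumptions introduced here, not taken from the abstract: \sigma^2_B is the additive genetic variance among populations and \sigma^2_W the additive genetic variance within populations.

```latex
Q_{ST} = \frac{\sigma^2_{B}}{\sigma^2_{B} + 2\,\sigma^2_{W}},
\qquad
\text{neutral expectation: } Q_{ST} = F_{ST}
```

Under this simple model, a QST clearly above FST is read as a sign of divergent selection on the trait, and a QST clearly below FST as a sign of uniform selection across populations.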
Living at high altitude is one of the most difficult challenges that humans have had to cope with during their evolution. Whereas several genomic studies have revealed some of the genetic bases of adaptations in Tibetan, Andean, and Ethiopian populations, relatively little evidence of convergent evolution to altitude in different continents has accumulated. This lack of evidence can be due to truly different evolutionary responses, but it can also be due to the low power of former studies that have mainly focused on populations from a single geographical region or performed separate analyses on multiple pairs of populations to avoid problems linked to shared histories between some populations. We introduce here a hierarchical Bayesian method to detect local adaptation that can deal with complex demographic histories. Our method can identify selection occurring at different scales, as well as convergent adaptation in different regions. We apply our approach to the analysis of a large SNP data set from low- and high-altitude human populations from America and Asia. The simultaneous analysis of these two geographic areas allows us to identify several candidate genome regions for altitudinal selection, and we show that convergent evolution among continents has been quite common. In addition to identifying several genes and biological processes involved in high-altitude adaptation, we identify two specific biological pathways that could have evolved in both continents to counter toxic effects induced by hypoxia.
Summary
Genome‐scan methods are used for screening genomewide patterns of DNA polymorphism to detect signatures of positive selection. There are two main types of methods: (i) ‘outlier’ detection methods based on FST that detect loci with high differentiation compared to the rest of the genome and (ii) environmental association methods that test the association between allele frequencies and environmental variables.
We present a new FST‐based genome‐scan method, BayeScEnv, which incorporates environmental information in the form of ‘environmental differentiation’. It is based on the F model, but, as opposed to existing approaches, it considers two locus‐specific effects: one due to divergent selection and the other due to various other processes different from local adaptation (e.g. range expansions, differences in mutation rates across loci or background selection). The method was developed in C++ and is available at http://github.com/devillemereuil/bayescenv.
A simulation study shows that our method has a much lower false positive rate than an existing FST‐based method, BayeScan, under a wide range of demographic scenarios. Although it has lower power, it leads to a better compromise between power and false positive rate.
We apply our method to a human data set and show that it can be used successfully to study local adaptation. We discuss its scope and compare it to other existing methods.
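To make the notion of an FST "outlier" concrete, the sketch below computes a plain descriptive FST per biallelic locus from population allele frequencies and ranks loci by it. This is not the Bayesian model used by BayeScan or BayeScEnv. It assumes equal population weights and known allele frequencies, and the names `fst_biallelic` and `rank_loci_by_fst` and the example frequencies are hypothetical.

```python
def fst_biallelic(freqs):
    """Descriptive FST = (H_T - H_S) / H_T for one biallelic locus.

    freqs: frequency of one allele in each population.
    Assumption: populations are weighted equally and frequencies are
    known without sampling error.
    """
    n = len(freqs)
    p_bar = sum(freqs) / n
    # Expected heterozygosity in the pooled (total) population.
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        # Locus is monomorphic overall: FST is undefined, so return 0.0
        # as a convention for this sketch.
        return 0.0
    # Mean expected heterozygosity within populations.
    h_s = sum(2 * p * (1 - p) for p in freqs) / n
    return (h_t - h_s) / h_t


def rank_loci_by_fst(freq_table):
    """Return locus names sorted from highest to lowest FST."""
    fst = {locus: fst_biallelic(f) for locus, f in freq_table.items()}
    return sorted(fst, key=fst.get, reverse=True)


# Hypothetical allele frequencies in two populations for three loci.
table = {"a": [0.5, 0.5], "b": [0.0, 1.0], "c": [0.2, 0.8]}
ranking = rank_loci_by_fst(table)
```

A naive scan would flag the top of this ranking. The model-based methods described above instead ask whether a locus is more differentiated than expected given the genome-wide background.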
Deep Learning in Population Genetics
Korfmann, Kevin; Gaggiotti, Oscar E; Fumagalli, Matteo
Genome biology and evolution, 02/2023, Volume 15, Issue 2
Journal Article
Open access
Population genetics is transitioning into a data-driven discipline thanks to the availability of large-scale genomic data and the need to study increasingly complex evolutionary scenarios. With likelihood and Bayesian approaches becoming either intractable or computationally infeasible, machine learning, and in particular deep learning, algorithms are emerging as popular techniques for population genetic inferences. These approaches rely on algorithms that learn non-linear relationships between the input data and the model parameters being estimated through representation learning from training data sets. Deep learning algorithms currently employed in the field comprise discriminative and generative models with fully connected, convolutional, or recurrent layers. Additionally, a wide range of powerful simulators to generate training data under complex scenarios are now available. The application of deep learning to empirical data sets mostly replicates previous findings of demography reconstruction and signals of natural selection in model organisms. To showcase the feasibility of deep learning to tackle new challenges, we designed a branched architecture to detect signals of recent balancing selection from temporal haplotypic data, which exhibited good predictive performance on simulated data. Investigations on the interpretability of neural networks, their robustness to uncertain training data, and creative representation of population genetic data, will provide further opportunities for technological advancements in the field.
Identifying genomic regions targeted by positive selection has been a long‐standing interest of evolutionary biologists. This objective was difficult to achieve until the recent emergence of next‐generation sequencing, which is fostering the development of large‐scale catalogues of genetic variation for an increasing number of species. Several statistical methods have been recently developed to analyse these rich data sets, but there is still a poor understanding of the conditions under which these methods produce reliable results. This study aims at filling this gap by assessing the performance of genome‐scan methods that consider explicitly the physical linkage among SNPs surrounding a selected variant. Our study compares the performance of seven recent methods for the detection of selective sweeps (iHS, nSL, EHHST, xp‐EHH, XP‐EHHST, XPCLR and hapFLK). We use an individual‐based simulation approach to investigate the power and accuracy of these methods under a wide range of population models under both hard and soft sweeps. Our results indicate that XPCLR and hapFLK perform best and can detect soft sweeps under simple population structure scenarios if migration rate is low. All methods perform poorly with moderate‐to‐high migration rates, or with weak selection and very poorly under a hierarchical population structure. Finally, no single method is able to detect both starting and nearly completed selective sweeps. However, combining several methods (XPCLR or hapFLK with iHS or nSL) can greatly increase the power to pinpoint the selected region.
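Several of the statistics compared here (iHS and xp‐EHH, for example) build on extended haplotype homozygosity (EHH): the probability that two haplotypes carrying the same core allele are identical over the whole interval from the core site to a given marker. The Python sketch below shows that basic quantity only, not any of the seven methods as implemented in the study. The function name `ehh`, the string encoding of haplotypes and the example data are hypothetical.

```python
from collections import Counter


def ehh(haplotypes, core, upto):
    """Extended haplotype homozygosity between site `core` and site `upto`.

    haplotypes: equal-length sequences of alleles (here strings of 0/1).
    Assumption: all haplotypes passed in already carry the same core
    allele, so no filtering on the core site is done here.
    """
    lo, hi = min(core, upto), max(core, upto)
    n = len(haplotypes)
    if n < 2:
        return 0.0  # no pairs to compare
    # Group haplotypes that are identical over the whole interval.
    counts = Counter(tuple(h[lo:hi + 1]) for h in haplotypes)
    # Pairs of identical haplotypes, divided by all possible pairs.
    same_pairs = sum(c * (c - 1) // 2 for c in counts.values())
    return same_pairs / (n * (n - 1) // 2)


# Four hypothetical haplotypes sharing allele "0" at the core site (index 0).
haps = ["0101", "0101", "0111", "0100"]
decay = [ehh(haps, core=0, upto=x) for x in range(4)]
```

EHH starts at 1 at the core site and can only decay with distance. A slow decay around one allele, relative to the other allele or to another population, is the kind of signal these sweep statistics summarise.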