Genetic studies focus on increasingly larger genomic regions of both extant and ancient DNA, and there is a need for simulation software to match these technological advances. We present here a new coalescent-based simulation program fastsimcoal, which is able to quickly simulate a variety of genetic markers scattered over very long genomic regions with arbitrary recombination patterns under complex evolutionary scenarios.
fastsimcoal is a C++ program compiled for Windows, MacOsX and Linux platforms. It is freely available at cmpg.unibe.ch/software/fastsimcoal/, together with its detailed user manual and example input files.
Expanding populations incur a mutation burden – the so‐called expansion load. Previous studies of expansion load have focused on codominant mutations. An important consequence of this assumption is that expansion load stems exclusively from the accumulation of new mutations occurring in individuals living at the wave front. Using individual‐based simulations, we study here the dynamics of standing genetic variation at the front of expansions, and its consequences on mean fitness if mutations are recessive. We find that deleterious genetic diversity is quickly lost at the front of the expansion, but the loss of deleterious mutations at some loci is compensated by an increase of their frequencies at other loci. The frequency of deleterious homozygotes therefore increases along the expansion axis, whereas the average number of deleterious mutations per individual remains nearly constant across the species range. This reveals two important differences to codominant models: (i) mean fitness at the front of the expansion drops much faster if mutations are recessive, and (ii) mutation load can increase during the expansion even if the total number of deleterious mutations per individual remains constant. We use our model to make predictions about the shape of the site frequency spectrum at the front of range expansion, and about correlations between heterozygosity and fitness in different parts of the species range. Importantly, these predictions provide opportunities to empirically validate our theoretical results. We discuss our findings in the light of recent results on the distribution of deleterious genetic variation across human populations and link them to empirical results on the correlation of heterozygosity and fitness found in many natural range expansions.
It has long been recognized that population demographic expansions lead to distinctive features in the molecular diversity of populations. However, recent simulation results have suggested that a distinction could be made between a pure demographic expansion in an unsubdivided population, and a range expansion in a subdivided population, both leading to a large increase in the total number of individuals. In order to better characterize the effect of a range expansion, I introduce a simple model of instantaneous expansion under an infinite‐island model, under which I derive the distribution of the number of mutation differences between pairs of genes (the mismatch distribution), the heterozygosity, the average number of pairwise differences, and the fixation index FST. These derivations are checked against simulations, and are shown to lead to results qualitatively similar to those one would obtain after a range expansion in a 2‐dimensional stepping‐stone model. I then apply these results to estimate immigration rates in hunter‐gatherer and post‐Neolithic human populations from patterns of mitochondrial (mtDNA) diversity. Potential problems with this estimation procedure are also discussed.
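As an illustration of the mismatch distribution defined above (the distribution of the number of differences between pairs of genes), here is a minimal Python sketch that tabulates it empirically from aligned sequences. The function name `mismatch_distribution` and the toy sequences are assumptions made for this example; the sketch does not implement the analytical derivation under the infinite-island model.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(sequences):
    """Count pairs of sequences by their number of pairwise differences.

    `sequences` is a list of aligned sequences of equal length.
    Returns a dict mapping number of differences -> number of pairs.
    """
    if len({len(seq) for seq in sequences}) > 1:
        raise ValueError("all sequences must have the same length")
    counts = Counter()
    for seq_a, seq_b in combinations(sequences, 2):
        # Hamming distance between the two aligned sequences
        differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
        counts[differences] += 1
    return dict(counts)
```

For three toy sequences `["AAAA", "AAAT", "AATT"]`, two pairs differ at one site and one pair differs at two sites. The mean of this distribution is the average number of pairwise differences mentioned in the abstract.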
Range expansions cause a series of founder events. We show that, in a one-dimensional habitat, these founder events are the spatial analog of genetic drift in a randomly mating population. The spatial series of allele frequencies created by successive founder events is equivalent to the time series of allele frequencies in a population of effective size ke, the effective number of founders. We derive an expression for ke in a discrete-population model that allows for local population growth and migration among established populations. If there is selection, the net effect is determined approximately by the product of the selection coefficients and the number of generations between successive founding events. We use the model of a single population to compute analytically several quantities for an allele present in the source population: (i) the probability that it survives the series of colonization events, (ii) the probability that it reaches a specified threshold frequency in the last population, and (iii) the mean and variance of the frequencies in each population. We show that the analytic theory provides a good approximation to simulation results. A consequence of our approximation is that the average heterozygosity of neutral alleles decreases by a factor of 1-1/(2ke) in each new population. Therefore, the population genetic consequences of surfing can be predicted approximately by the effective number of founders and the effective selection coefficients, even in the presence of migration among populations. We also show that our analytic results are applicable to a model of range expansion in a continuously distributed population.
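The per-population decay factor 1-1/(2ke) stated above can be applied repeatedly: after n successive founder events, the expected neutral heterozygosity would be H_n = H_0 (1 - 1/(2ke))^n. The following Python sketch computes this value. The function name and the example numbers are assumptions chosen for illustration, and the sketch treats ke as constant along the expansion axis.

```python
def expected_heterozygosity(h0, ke, n_events):
    """Expected neutral heterozygosity after n_events founder events.

    h0       -- heterozygosity in the source population
    ke       -- effective number of founders (assumed constant here)
    n_events -- number of successive founder events
    """
    if ke <= 0:
        raise ValueError("ke must be positive")
    if n_events < 0:
        raise ValueError("n_events must be non-negative")
    decay_per_event = 1.0 - 1.0 / (2.0 * ke)
    return h0 * decay_per_event ** n_events
```

With h0 = 0.5 and ke = 10, a single founder event multiplies heterozygosity by 0.95, giving 0.475.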
Abstract
Motivation
fastsimcoal2 extends fastsimcoal, a continuous time coalescent-based genetic simulation program, by enabling the estimation of demographic parameters under very complex scenarios from the site frequency spectrum under a maximum-likelihood framework.
Results
Other improvements include multi-threading, handling of population inbreeding, extended input file syntax facilitating the description of complex demographic scenarios, and more efficient simulations of sparsely structured populations and of large chromosomes.
Availability and implementation
fastsimcoal2 is freely available at http://cmpg.unibe.ch/software/fastsimcoal2/. It includes console versions for Linux, Windows and MacOS, additional scripts for the analysis and visualization of simulated and estimated scenarios, as well as detailed documentation and ready-to-use examples.
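For readers unfamiliar with the site frequency spectrum (SFS) used as input for parameter estimation, the following Python sketch shows how an unfolded SFS could be tabulated from per-site derived allele counts. It is a generic illustration only: the function name and input layout are assumptions, and it does not reproduce fastsimcoal2's own input format or likelihood computation.

```python
def site_frequency_spectrum(derived_counts, n_chromosomes):
    """Tabulate the unfolded SFS.

    derived_counts -- iterable with the derived allele count at each site
    n_chromosomes  -- number of sampled chromosomes
    Returns a list whose entry i is the number of sites at which the
    derived allele is carried by exactly i chromosomes.
    """
    sfs = [0] * (n_chromosomes + 1)
    for count in derived_counts:
        if not 0 <= count <= n_chromosomes:
            raise ValueError(f"derived count {count} is out of range")
        sfs[count] += 1
    return sfs
```

For example, five sites with derived counts 1, 1, 2, 3 and 1 in a sample of four chromosomes give three singletons, one doubleton and one tripleton.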
Disentangling the effect on genomic diversity of natural selection from that of demography is notoriously difficult, but necessary to properly reconstruct the history of species. Here, we use high-quality human genomic data to show that purifying selection at linked sites (i.e. background selection, BGS) and GC-biased gene conversion (gBGC) together affect as much as 95% of the variants of our genome. We find that the magnitude and relative importance of BGS and gBGC are largely determined by variation in recombination rate and base composition. Importantly, synonymous sites and non-transcribed regions are also affected, albeit to different degrees. Their use for demographic inference can lead to strong biases. However, by conditioning on genomic regions with recombination rates above 1.5 cM/Mb and mutation types (C↔G, A↔T), we identify a set of SNPs that is mostly unaffected by BGS or gBGC, and that avoids these biases in the reconstruction of human history.
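The two filtering criteria described above (local recombination rate above 1.5 cM/Mb, and only C↔G or A↔T mutation types) can be expressed as a simple predicate. The Python sketch below is a hypothetical illustration: the function name, the argument layout and the constant names are assumptions, and a real pipeline would also need a recombination map lookup that is not shown here.

```python
# Allele pairs of the two retained mutation types (C<->G and A<->T)
RETAINED_ALLELE_PAIRS = {frozenset("CG"), frozenset("AT")}
# Recombination rate threshold in cM/Mb, taken from the text above
MIN_RECOMBINATION_RATE = 1.5


def keep_snp(allele_1, allele_2, recombination_rate):
    """Return True if a biallelic SNP passes both filters.

    allele_1, allele_2  -- the two alleles, e.g. "C" and "G"
    recombination_rate  -- local recombination rate in cM/Mb
    """
    pair = frozenset((allele_1.upper(), allele_2.upper()))
    return (recombination_rate > MIN_RECOMBINATION_RATE
            and pair in RETAINED_ALLELE_PAIRS)
```

A C/G SNP in a region recombining at 2.0 cM/Mb is kept, whereas an A/G SNP in the same region, or any SNP in a region at or below the threshold, is discarded.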
Recent studies have shown that low-frequency alleles can sometimes surf on the wave of advance of a population range expansion, reaching high frequencies and spreading over large areas. Using microbial populations, Hallatschek and colleagues have provided the first experimental evidence of surfing during spatial expansions. They also show that the newly colonized area should become structured into sectors of low genetic diversity separated by sharp allele frequency gradients, increasing the global genetic differentiation of the population. These experimental results can be easily reproduced in silico and they should apply to a wide variety of higher organisms. They also suggest that a single range expansion can create very complex patterns at neutral loci, mimicking adaptive processes and resembling postglacial segregation of clades from distinct refuge areas.
Understanding why some evolutionary lineages generate exceptionally high species diversity is an important goal in evolutionary biology. Haplochromine cichlid fishes of Africa's Lake Victoria region encompass >700 diverse species that all evolved in the last 150,000 years. How this 'Lake Victoria Region Superflock' could evolve on such rapid timescales is an enduring question. Here, we demonstrate that hybridization between two divergent lineages facilitated this process by providing genetic variation that subsequently became recombined and sorted into many new species. Notably, the hybridization event generated exceptional allelic variation at an opsin gene known to be involved in adaptation and speciation. More generally, differentiation between new species is accentuated around variants that were fixed differences between the parental lineages, and that now appear in many new combinations in the radiation species. We conclude that hybridization between divergent lineages, when coincident with ecological opportunity, may facilitate rapid and extensive adaptive radiation.
The estimation of demographic parameters from genetic data often requires the computation of likelihoods. However, the likelihood function is computationally intractable for many realistic evolutionary models, and the use of Bayesian inference has therefore been limited to very simple models. The situation changed recently with the advent of Approximate Bayesian Computation (ABC) algorithms allowing one to obtain parameter posterior distributions based on simulations not requiring likelihood computations.
Here we present ABCtoolbox, a series of open source programs to perform Approximate Bayesian Computations (ABC). It implements various ABC algorithms including rejection sampling, MCMC without likelihood, a Particle-based sampler and ABC-GLM. ABCtoolbox is bundled with, but not limited to, a program that allows parameter inference in a population genetics context and the simultaneous use of different types of markers with different ploidy levels. In addition, ABCtoolbox can also interact with most simulation and summary statistics computation programs. The usability of the ABCtoolbox is demonstrated by inferring the evolutionary history of two evolutionary lineages of Microtus arvalis. Using nuclear microsatellites and mitochondrial sequence data in the same estimation procedure enabled us to infer sex-specific population sizes and migration rates and to find that males show smaller population sizes but much higher levels of migration than females.
ABCtoolbox allows a user to perform all the necessary steps of a full ABC analysis: parameter sampling from prior distributions, data simulations, computation of summary statistics, estimation of posterior distributions, model choice, validation of the estimation procedure, and visualization of the results.
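To make the simplest of the listed algorithms concrete, here is a minimal Python sketch of ABC rejection sampling: draw parameters from the prior, simulate a summary statistic, and retain the draws whose simulated statistic is closest to the observed one. This is a generic toy under stated assumptions, not ABCtoolbox's implementation or interface. All names (`abc_rejection`, `toy_example`), the uniform prior, the Gaussian simulator and the retained fraction are hypothetical choices for the example.

```python
import random
import statistics


def abc_rejection(observed, prior_sampler, simulator, distance,
                  n_sims, keep_fraction):
    """Return the parameter draws whose simulated summary statistics
    are closest to the observed one (rejection sampling without
    any likelihood computation)."""
    draws = []
    for _ in range(n_sims):
        theta = prior_sampler()          # sample a parameter from the prior
        simulated = simulator(theta)     # simulate a summary statistic
        draws.append((distance(simulated, observed), theta))
    draws.sort(key=lambda pair: pair[0])  # smallest distance first
    n_keep = max(1, round(n_sims * keep_fraction))
    return [theta for _, theta in draws[:n_keep]]


def toy_example(seed=1, observed_mean=4.0):
    """Toy inference of the mean of a normal distribution (sd = 1)
    from the sample mean of 20 simulated observations."""
    rng = random.Random(seed)

    def prior_sampler():
        return rng.uniform(0.0, 10.0)

    def simulator(theta):
        return statistics.fmean(rng.gauss(theta, 1.0) for _ in range(20))

    def distance(simulated, observed):
        return abs(simulated - observed)

    return abc_rejection(observed_mean, prior_sampler, simulator,
                         distance, n_sims=1000, keep_fraction=0.05)
```

Calling `toy_example()` returns 50 retained parameter values that approximate the posterior distribution of the mean. Retaining a fixed fraction of the closest simulations, rather than using a fixed distance threshold, is one common design choice and is used here only for simplicity.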
Recent studies have revealed that 2–3% of the genome of non-Africans might come from Neanderthals, suggesting a more complex scenario of modern human evolution than previously anticipated. In this paper, we use a model of admixture during a spatial expansion to study the hybridization of Neanderthals with modern humans during their spread out of Africa. We find that observed low levels of Neanderthal ancestry in Eurasians are compatible with a very low rate of interbreeding (<2%), potentially attributable to a very strong avoidance of interspecific matings, a low fitness of hybrids, or both. These results suggesting the presence of very effective barriers to gene flow between the two species are robust to uncertainties about the exact demography of the Paleolithic populations, and they are also found to be compatible with the observed lack of mtDNA introgression. Our model additionally suggests that similarly low levels of introgression in Europe and Asia may result from distinct admixture events having occurred beyond the Middle East, after the split of Europeans and Asians. This hypothesis could be tested because it predicts that different components of Neanderthal ancestry should be present in Europeans and in Asians.