Rice is a staple crop that has undergone substantial phenotypic and physiological changes during domestication. Here we resequenced the genomes of 40 cultivated accessions selected from the major ...groups of rice and 10 accessions of their wild progenitors (Oryza rufipogon and Oryza nivara) to >15× raw data coverage. We investigated genome-wide variation patterns in rice and obtained 6.5 million high-quality single nucleotide polymorphisms (SNPs) after excluding sites with missing data in any accession. Using these population SNP data, we identified thousands of genes with significantly lower diversity in cultivated but not wild rice, which represent candidate regions selected during domestication. Some of these variants are associated with important biological features, whereas others have yet to be functionally characterized. The molecular markers we have identified should be valuable for breeding and for identifying agronomically important genes in rice.
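The comparison of diversity between cultivated and wild accessions can be illustrated with a small sketch. It assumes phased haplotypes coded as 0/1 lists and uses nucleotide diversity (average pairwise differences per site) as the diversity measure. The abstract does not state which statistic or threshold was used, so the function names `nucleotide_diversity` and `diversity_ratio` and the choice of statistic are illustrative assumptions, not the authors' pipeline.

```python
def nucleotide_diversity(haplotypes, region_length):
    """Average number of pairwise differences per site (pi).

    haplotypes: list of equal-length lists of 0/1 alleles (hypothetical input format).
    region_length: number of sites in the region, including invariant ones.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    total = 0.0
    for site in zip(*haplotypes):
        k = sum(site)  # copies of the '1' allele at this site
        # k * (n - k) differing pairs out of n * (n - 1) / 2 pairs
        total += 2.0 * k * (n - k) / (n * (n - 1))
    return total / region_length


def diversity_ratio(wild, cultivated, region_length):
    """Ratio pi_cultivated / pi_wild; values well below 1 flag reduced diversity."""
    pi_wild = nucleotide_diversity(wild, region_length)
    if pi_wild == 0.0:
        raise ValueError("no diversity in the wild sample for this region")
    return nucleotide_diversity(cultivated, region_length) / pi_wild


# Toy region of 4 sites: the wild sample is variable, the cultivated one is nearly fixed.
wild = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 0, 1, 1], [1, 1, 0, 0]]
cultivated = [[0, 0, 0, 1], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
ratio = diversity_ratio(wild, cultivated, region_length=4)
```

A genome scan along these lines would compute the ratio per gene or per window and treat the lowest values as candidates, subject to a significance procedure that the abstract does not describe.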
A major goal of population genomics is to reconstruct the history of natural populations and to infer the neutral and selective scenarios that can explain the present-day polymorphism patterns. ...However, the separation between neutral and selective hypotheses has proven hard, mainly because both may predict similar patterns in the genome. This study focuses on the development of methods that can be used to distinguish neutral from selective hypotheses in equilibrium and nonequilibrium populations. These methods utilize a combination of statistics on the basis of the site frequency spectrum (SFS) and linkage disequilibrium (LD). We investigate the patterns of genetic variation along recombining chromosomes using a multitude of comparisons between neutral and selective hypotheses, such as selection or neutrality in equilibrium and nonequilibrium populations and recurrent selection models. We perform hypothesis testing using the classical P-value approach, but we also introduce methods from the machine-learning field. We demonstrate that the combination of SFS- and LD-based statistics increases the power to detect recent positive selection in populations that have experienced past demographic changes.
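To make the two families of statistics concrete, the sketch below computes one standard SFS-based statistic (Tajima's D) and one standard LD-based statistic (r² between two sites). The abstract does not list the specific statistics combined in the study, so these two are chosen here only as familiar representatives; the function names and input formats are assumptions.

```python
import math


def tajimas_d(derived_counts, n):
    """Tajima's D from derived-allele counts at segregating sites.

    derived_counts: one count per segregating site, each between 1 and n - 1.
    n: number of sampled chromosomes (n >= 4, otherwise the variance term is zero).
    """
    if n < 4:
        raise ValueError("n must be at least 4")
    seg_sites = len(derived_counts)
    if seg_sites == 0:
        raise ValueError("no segregating sites")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1.0) / (3.0 * (n - 1.0))
    b2 = 2.0 * (n * n + n + 3.0) / (9.0 * n * (n - 1.0))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2.0) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    # pi: mean pairwise differences; theta_w: Watterson's estimator
    pi = sum(2.0 * k * (n - k) / (n * (n - 1.0)) for k in derived_counts)
    theta_w = seg_sites / a1
    return (pi - theta_w) / math.sqrt(e1 * seg_sites + e2 * seg_sites * (seg_sites - 1))


def r_squared(site_a, site_b):
    """Squared correlation between 0/1 alleles at two sites across the same haplotypes."""
    n = len(site_a)
    p_a = sum(site_a) / n
    p_b = sum(site_b) / n
    p_ab = sum(1 for a, b in zip(site_a, site_b) if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    return (p_ab - p_a * p_b) ** 2 / denom
```

An excess of rare variants gives a negative D and an excess of intermediate-frequency variants gives a positive D. Both a recent sweep and population growth can produce an excess of rare variants, which is the ambiguity that motivates adding LD information.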
The recent increase in time-series population genomic data from experimental, natural, and ancient populations has been accompanied by a promising growth in methodologies for inferring demographic ...and selective parameters from such data. However, these methods have largely presumed that the populations of interest are well-described by the Kingman coalescent. In reality, many groups of organisms, including viruses, marine organisms, and some plants, protists, and fungi, typified by high variance in progeny number, may be best characterized by multiple-merger coalescent models. Estimation of population genetic parameters under Wright-Fisher assumptions for these organisms may thus be prone to serious mis-inference. We propose a novel method for the joint inference of demography and selection under the Ψ-coalescent model, termed Multiple-Merger Coalescent Approximate Bayesian Computation, or MMC-ABC. We first demonstrate mis-inference under the Kingman coalescent, and then exhibit the superior performance of MMC-ABC under conditions of skewed offspring distributions. In order to highlight the utility of this approach, we reanalyzed previously published drug-selection lines of influenza A virus. We jointly inferred the extent of progeny-skew inherent to viral replication and identified putative drug-resistance mutations.
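The general idea of Approximate Bayesian Computation on time-series allele frequencies can be sketched with a plain rejection sampler. This is not MMC-ABC: the simulator below is a haploid Wright-Fisher model, which is exactly the assumption the abstract cautions against for organisms with skewed offspring distributions, and it is used here only because it is short. The function names, the uniform prior, and the squared-distance summary are all illustrative assumptions.

```python
import random


def wf_trajectory(s, p0, pop_size, generations, rng):
    """Allele-frequency trajectory under haploid Wright-Fisher selection and drift."""
    p = p0
    freqs = [p]
    for _ in range(generations):
        # deterministic change due to selection
        p_sel = p * (1.0 + s) / (1.0 + s * p)
        # binomial resampling of pop_size individuals (genetic drift)
        copies = sum(1 for _ in range(pop_size) if rng.random() < p_sel)
        p = copies / pop_size
        freqs.append(p)
    return freqs


def abc_rejection(observed, p0, pop_size, prior, n_sims, n_keep, rng):
    """Return the n_keep prior draws of s whose simulations lie closest to the data."""
    generations = len(observed) - 1
    scored = []
    for _ in range(n_sims):
        s = rng.uniform(prior[0], prior[1])  # draw from a uniform prior
        sim = wf_trajectory(s, p0, pop_size, generations, rng)
        dist = sum((a - b) ** 2 for a, b in zip(sim, observed))
        scored.append((dist, s))
    scored.sort()
    return [s for _, s in scored[:n_keep]]


rng = random.Random(1)
observed = wf_trajectory(0.2, 0.1, 200, 20, rng)  # pseudo-observed data
posterior = abc_rejection(observed, 0.1, 200, (-0.2, 0.5), 400, 20, rng)
posterior_mean = sum(posterior) / len(posterior)
```

A method in the spirit of the abstract would replace the simulator with one that allows multiple mergers and would add the offspring-skew parameter to the prior, so that both quantities are inferred jointly.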
Population genetics has evolved from a theory-driven field with little empirical data into a data-driven discipline in which genome-scale data sets test the limits of available models and ...computational analysis methods. In humans and a few model organisms, analyses of whole-genome sequence polymorphism data are currently under way, and in light of the falling costs of next-generation sequencing technologies, such studies will soon become common in many other organisms as well. Here, we assess the challenges to analyzing whole-genome sequence polymorphism data, and we discuss the potential of these data to yield new insights concerning population history and the genomic prevalence of natural selection.
Abstract
Building evolutionarily appropriate baseline models for natural populations is not only important for answering fundamental questions in population genetics—including quantifying the ...relative contributions of adaptive versus nonadaptive processes—but also essential for identifying candidate loci experiencing relatively rare and episodic forms of selection (e.g., positive or balancing selection). Here, a baseline model was developed for a human population of West African ancestry, the Yoruba, comprising processes constantly operating on the genome (i.e., purifying and background selection, population size changes, recombination rate heterogeneity, and gene conversion). Specifically, to perform joint inference of selective effects with demography, an approximate Bayesian approach was employed that utilizes the decay of background selection effects around functional elements, taking into account genomic architecture. This approach inferred a recent 6-fold population growth together with a distribution of fitness effects that is skewed towards effectively neutral mutations. Importantly, these results further suggest that, although strong and/or frequent recurrent positive selection is inconsistent with observed data, weak to moderate positive selection is consistent but unidentifiable if rare.
The role of balancing selection in maintaining genetic variation remains an open question in population genetics. Recent years have seen numerous studies identifying candidate loci potentially ...experiencing balancing selection, most predominantly in human populations. There are however numerous alternative evolutionary processes that may leave similar patterns of variation, thereby potentially confounding inference, and the expected signatures of balancing selection additionally change in a temporal fashion. Here we use forward-in-time simulations to quantify expected statistical power to detect balancing selection using both site frequency spectrum (SFS)- and linkage disequilibrium (LD)-based methods under a variety of evolutionarily realistic null models. We find that whilst SFS-based methods have little power immediately after a balanced mutation begins segregating, power increases with time since the introduction of the balanced allele. Conversely, LD-based methods have considerable power whilst the allele is young, and power dissipates rapidly as the time since introduction increases. Taken together, this suggests that SFS-based methods are most effective at detecting long-term balancing selection (>25N generations since the introduction of the balanced allele) whilst LD-based methods are effective over much shorter timescales (<1N generations), thereby leaving a large time frame over which current methods have little power to detect the action of balancing selection. Finally, we investigate the extent to which alternative evolutionary processes may mimic these patterns, and demonstrate the need for caution in attempting to distinguish the signatures of balancing selection from those of both neutral processes (e.g., population structure and admixture) as well as of alternative selective processes (e.g., partial selective sweeps).
Nonequilibrium demography impacts coalescent genealogies leaving detectable, well-studied signatures of variation. However, similar genomic footprints are also expected under models of large ...reproductive skew, posing a serious problem when trying to make inference. Furthermore, current approaches consider only one of the two processes at a time, neglecting any genomic signal that could arise from their simultaneous effects, preventing the possibility of jointly inferring parameters relating to both offspring distribution and population history. Here, we develop an extended Moran model with exponential population growth, and demonstrate that the underlying ancestral process converges to a time-inhomogeneous psi-coalescent. However, by applying a nonlinear change of time scale, analogous to the Kingman coalescent, we find that the ancestral process can be rescaled to its time-homogeneous analog, allowing the process to be simulated quickly and efficiently. Furthermore, we derive analytical expressions for the expected site-frequency spectrum under the time-inhomogeneous psi-coalescent, and develop an approximate-likelihood framework for the joint estimation of the coalescent and growth parameters. By means of extensive simulation, we demonstrate that both can be estimated accurately from whole-genome data. In addition, not accounting for demography can lead to serious biases in the inferred coalescent model, with broad implications for genomic studies ranging from ecology to conservation biology. Finally, we use our method to analyze sequence data from Japanese sardine populations, and find evidence of high variation in individual reproductive success, but few signs of a recent demographic expansion.
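The time-rescaling idea can be illustrated in the familiar Kingman case. Under exponential growth with rate beta (population size N(t) = N(0) exp(-beta t) going backward in time, with t in coalescent units), coalescence times from a constant-size simulation can be mapped to times under growth by t = ln(1 + beta * tau) / beta. The sketch below covers only this Kingman analog; the corresponding transformation for the psi-coalescent is not given in the abstract and is not reproduced here. Function names are illustrative.

```python
import math
import random


def rescale_to_growth(tau, beta):
    """Map a constant-size coalescent time tau to the time under exponential growth."""
    if beta < 0:
        raise ValueError("this sketch only handles beta >= 0 (growth, not decline)")
    if beta == 0:
        return tau  # no growth: the time scale is unchanged
    return math.log1p(beta * tau) / beta


def coalescence_times_with_growth(n, beta, rng):
    """Cumulative coalescence times for n lineages under exponential growth.

    Times are first drawn from the time-homogeneous (constant-size) process,
    then transformed, so no time-inhomogeneous rates need to be simulated.
    """
    times = []
    tau = 0.0
    for k in range(n, 1, -1):
        rate = k * (k - 1) / 2.0  # pairwise coalescence rate with k lineages
        tau += rng.expovariate(rate)
        times.append(rescale_to_growth(tau, beta))
    return times


times = coalescence_times_with_growth(n=5, beta=2.0, rng=random.Random(0))
```

Because the transformation compresses deep times, genealogies under growth have relatively long external branches, which is the source of the excess of rare variants that can be confounded with reproductive skew.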
Abstract
The interplay of gene flow, genetic drift, and local selective pressure is a dynamic process that has been well studied from a theoretical perspective over the last century. Wright and ...Haldane laid the foundation for expectations under an island-continent model, demonstrating that an island-specific beneficial allele may be maintained locally if the selection coefficient is larger than the rate of migration of the ancestral allele from the continent. Subsequent extensions of this model have provided considerably more insight. Yet, connecting theoretical results with empirical data has proven challenging, owing to a lack of information on the relationship between genotype, phenotype, and fitness. Here, we examine the demographic and selective history of deer mice in and around the Nebraska Sand Hills, a system in which variation at the Agouti locus affects cryptic coloration that in turn affects the survival of mice in their local habitat. We first genotyped 250 individuals from 11 sites along a transect spanning the Sand Hills at 660,000 single nucleotide polymorphisms across the genome. Using these genomic data, we found that deer mice first colonized the Sand Hills following the last glacial period. Subsequent high rates of gene flow have served to homogenize the majority of the genome between populations on and off the Sand Hills, with the exception of the Agouti pigmentation locus. Furthermore, mutations at this locus are strongly associated with the pigment traits that are strongly correlated with local soil coloration and thus responsible for cryptic coloration.
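The island-continent expectation can be checked numerically with a deterministic recursion. The sketch assumes a haploid island population in which selection with coefficient s favours the local allele, followed each generation by a fraction m of migrants from a continent fixed for the ancestral allele. In this particular discrete formulation the local allele persists when m < s / (1 + s), which reduces to the familiar s > m condition when s is small. The function names and the haploid simplification are assumptions made for illustration.

```python
def island_frequency(s, m, p0=0.5, generations=10000):
    """Frequency of the locally beneficial allele after iterating selection and migration."""
    p = p0
    for _ in range(generations):
        p = p * (1.0 + s) / (1.0 + s * p)  # selection on the island
        p = (1.0 - m) * p                  # migrants carry only the ancestral allele
    return p


def predicted_equilibrium(s, m):
    """Analytical equilibrium of the recursion above, floored at zero."""
    return max(0.0, 1.0 - m * (1.0 + s) / s)


maintained = island_frequency(s=0.1, m=0.01)  # selection outweighs migration
swamped = island_frequency(s=0.01, m=0.05)    # migration outweighs selection
```

With s = 0.1 and m = 0.01 the allele settles at a high equilibrium frequency, whereas with s = 0.01 and m = 0.05 it is swamped by gene flow.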
Adaptation is central to population persistence in the face of environmental change, yet we seldom precisely understand the origin and spread of adaptive variation in natural populations. Snowshoe ...hares (Lepus americanus) along the Pacific Northwest coast have evolved brown winter camouflage through positive selection on recessive variation at the Agouti pigmentation gene introgressed from black-tailed jackrabbits (Lepus californicus). Here, we combine new and published whole-genome and exome sequences with targeted genotyping of Agouti to investigate the evolutionary history of local seasonal camouflage adaptation in the Pacific Northwest. We find evidence of significantly elevated inbreeding and mutational load in coastal winter-brown hares, consistent with a recent range expansion into temperate coastal environments that incurred indirect fitness costs. The genome-wide distribution of introgression tract lengths supports a pulse of hybridization near the end of the last glacial maximum, which may have facilitated range expansion via introgression of winter-brown camouflage variation. However, signatures of a selective sweep at Agouti indicate a much more recent spread of winter-brown camouflage. Through simulations, we show that the delay between the hybrid origin and subsequent selective sweep of the recessive winter-brown allele can be largely attributed to the limits of natural selection imposed by simple allelic dominance. We argue that while hybridization during periods of environmental change may provide a critical reservoir of adaptive variation at range edges, the probability and pace of local adaptation will strongly depend on population demography and the genetic architecture of introgressed variation.
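The effect of dominance on the pace of a sweep can be illustrated with a deterministic single-locus recursion. The sketch assumes diploid genotype fitnesses 1 + s, 1 + h*s, and 1, and counts the generations needed for a beneficial allele to rise from a low starting frequency to a target frequency. It ignores drift and demography, both of which matter in the study's simulations, so it is an illustration of the principle rather than a reconstruction of those simulations. The function name and parameter values are hypothetical.

```python
def generations_to_reach(target, p0, s, h, max_generations=1_000_000):
    """Generations for a beneficial allele to go from frequency p0 to target.

    h is the dominance coefficient: h = 0 is fully recessive, h = 0.5 is additive.
    Returns None if the target is not reached within max_generations.
    """
    p = p0
    for generation in range(max_generations):
        if p >= target:
            return generation
        q = 1.0 - p
        mean_fitness = p * p * (1.0 + s) + 2.0 * p * q * (1.0 + h * s) + q * q
        # allele frequency after one generation of viability selection
        p = p * (p * (1.0 + s) + q * (1.0 + h * s)) / mean_fitness
    return None


recessive = generations_to_reach(target=0.9, p0=0.01, s=0.05, h=0.0)
additive = generations_to_reach(target=0.9, p0=0.01, s=0.05, h=0.5)
```

When the allele is rare, almost all copies sit in heterozygotes, which a recessive allele leaves unaffected, so the recessive case spends a long time at low frequency before selection becomes effective. This is consistent with the delay between introgression and sweep described in the abstract.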