A powerful way to detect selection in a population is by modeling local allele frequency changes in a particular region of the genome under scenarios of selection and neutrality and finding which model is most compatible with the data. A previous method based on a cross-population composite likelihood ratio (XP-CLR) uses an outgroup population to detect departures from neutrality that could be compatible with hard or soft sweeps at linked sites near a beneficial allele. However, this method is most sensitive to recent selection and may miss selective events that happened a long time ago. To overcome this, we developed an extension of XP-CLR that jointly models the behavior of a selected allele in a three-population tree. Our method, called "3-population composite likelihood ratio" (3P-CLR), outperforms XP-CLR when testing for selection that occurred before two populations split from each other and can distinguish between those events and events that occurred specifically in each of the populations after the split. We applied our new test to population genomic data from the 1000 Genomes Project, to search for selective sweeps that occurred before the split of Yoruba and Eurasians, but after their split from Neanderthals, and that could have led to the spread of modern-human-specific phenotypes. We also searched for sweep events that occurred in East Asians, Europeans, and the ancestors of both populations, after their split from Yoruba. In both cases, we are able to confirm a number of regions identified by previous methods and find several new candidates for selection in recent and ancient times. For some of these, we also find suggestive functional mutations that may have driven the selective events.
Ancient DNA and human history. Slatkin, Montgomery; Racimo, Fernando. Proceedings of the National Academy of Sciences - PNAS, 06/2016, Volume 113, Issue 23. Journal Article. Peer reviewed. Open access.
We review studies of genomic data obtained by sequencing hominin fossils with particular emphasis on the unique information that ancient DNA (aDNA) can provide about the demographic history of humans and our closest relatives. We concentrate on nuclear genomic sequences that have been published in the past few years. In many cases, particularly in the Arctic, the Americas, and Europe, aDNA has revealed historical demographic patterns in a way that could not be resolved by analyzing present-day genomes alone. Ancient DNA from archaic hominins has revealed a rich history of admixture between early modern humans, Neanderthals, and Denisovans, and has allowed us to disentangle complex selective processes. Information from aDNA studies is nowhere near saturation, and we believe that future aDNA sequences will continue to change our understanding of hominin history.
The sequencing of ancient DNA from archaic humans (Neanderthals and Denisovans) has revealed that modern and archaic humans interbred at least twice during the Pleistocene. The field of human paleogenomics has now turned its attention towards understanding the nature of this genetic legacy in the gene pool of present-day humans. What exactly did modern humans obtain from interbreeding with Neanderthals and Denisovans? Was the introgressed genetic material beneficial, neutral or maladaptive? Can differences in phenotypes among present-day human populations be explained by archaic human introgression? These questions are of prime importance for our understanding of recent human evolution, but will require careful computational modeling and extensive functional assays before they can be answered in full. Here, we review the recent literature characterizing introgressed DNA and the likely biological consequences for their modern human carriers. We focus particularly on archaic human haplotypes that were beneficial to modern humans as they expanded across the globe, and on ways to understand how populations harboring these haplotypes evolved over time.
Several recent papers have reported strong signals of selection on European polygenic height scores. These analyses used height effect estimates from the GIANT consortium and replication studies. Here, we describe a new analysis based on the UK Biobank (UKB), a large, independent dataset. We find that the signals of selection using UKB effect estimates are strongly attenuated or absent. We also provide evidence that previous analyses were confounded by population stratification. Therefore, the conclusion of strong polygenic adaptation now lacks support. Moreover, these discrepancies highlight (1) that methods for correcting for population stratification in GWAS may not always be sufficient for polygenic trait analyses, and (2) that claims of differences in polygenic scores between populations should be treated with caution until these issues are better understood.
This article has been through an editorial process in which the authors decide how to respond to the issues raised during peer review. The Reviewing Editor's assessment is that all the issues have been addressed (see decision letter).
Studies in a variety of species have shown evidence for positively selected variants introduced into a population via introgression from another, distantly related population, a process known as adaptive introgression. However, there are few explicit frameworks for jointly modelling introgression and positive selection, in order to detect these variants using genomic sequence data. Here, we develop an approach based on convolutional neural networks (CNNs). CNNs do not require the specification of an analytical model of allele frequency dynamics and have outperformed alternative methods for classification and parameter estimation tasks in various areas of population genetics. Thus, they are potentially well suited to the identification of adaptive introgression. Using simulations, we trained CNNs on genotype matrices derived from genomes sampled from the donor population, the recipient population and a related non-introgressed population, in order to distinguish regions of the genome evolving under adaptive introgression from those evolving neutrally or experiencing selective sweeps. Our CNN architecture exhibits 95% accuracy on simulated data, even when the genomes are unphased, and accuracy decreases only moderately in the presence of heterosis. As a proof of concept, we applied our trained CNNs to human genomic datasets, both phased and unphased, to detect candidates for adaptive introgression that shaped our evolutionary history.
Comparisons of DNA from archaic and modern humans show that these groups interbred, and in some cases received an evolutionary advantage from doing so. This process, known as adaptive introgression, may lead to a faster rate of adaptation than is predicted from models with mutation and selection alone. Within the last couple of years, a series of studies have identified regions of the genome that are likely examples of adaptive introgression. In many cases, once a region was ascertained as being introgressed, commonly used statistics based on both haplotype and allele frequency information were employed to test for positive selection. Introgression by itself, however, changes both the haplotype structure and the distribution of allele frequencies, thus confounding traditional tests for detecting positive selection. Therefore, patterns generated by introgression alone may lead to false inferences of positive selection. Here we explore models involving both introgression and positive selection to investigate the behavior of various statistics under adaptive introgression. In particular, we find that the number and allelic frequencies of sites that are uniquely shared between archaic humans and specific present-day populations are particularly useful for detecting adaptive introgression. We then examine the 1000 Genomes dataset to characterize the landscape of uniquely shared archaic alleles in human populations. Finally, we identify regions that were likely subject to adaptive introgression and discuss some of the most promising candidate genes located in these regions.
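As a rough illustration of the "uniquely shared sites" idea in the abstract above, the following sketch counts sites where an allele carried by the archaic genome is rare in an outgroup panel but common in a target population. The function name, the input layout, and both frequency thresholds are assumptions made for this example; they are not taken from the paper and do not reproduce its exact statistics.

```python
from typing import Sequence


def count_uniquely_shared(
    outgroup_freq: Sequence[float],
    target_freq: Sequence[float],
    archaic_has_allele: Sequence[bool],
    max_outgroup_freq: float = 0.01,  # hypothetical cutoff: "rare in the outgroup"
    min_target_freq: float = 0.2,     # hypothetical cutoff: "common in the target"
) -> int:
    """Count sites whose archaic allele is rare in the outgroup and common in the target."""
    if not (len(outgroup_freq) == len(target_freq) == len(archaic_has_allele)):
        raise ValueError("all inputs must describe the same set of sites")
    count = 0
    for out_f, tgt_f, archaic in zip(outgroup_freq, target_freq, archaic_has_allele):
        # A site qualifies only if the archaic genome carries the allele at all.
        if archaic and out_f < max_outgroup_freq and tgt_f > min_target_freq:
            count += 1
    return count


# Toy data for four sites (invented values, for illustration only).
outgroup = [0.0, 0.0, 0.30, 0.005]
target = [0.45, 0.05, 0.60, 0.25]
archaic = [True, True, True, False]

n_shared = count_uniquely_shared(outgroup, target, archaic)
print(n_shared)
```

In a genome scan, a count like this would typically be computed in windows along the genome, and windows with unusually high counts would be examined further.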
In the field of human history, ancient DNA has provided answers to long-standing debates about major movements of people and has begun to inform on other important facets of the human experience. The field is now moving from mostly large-scale supraregional studies to a more local perspective, shedding light on socioeconomic processes, inheritance rules, marriage practices and technological diffusion. In this Review, we summarize recent studies showcasing these types of insights, focusing on methods used to infer sociocultural aspects of human behaviour. This approach often involves working across disciplines (such as anthropology, archaeology, linguistics and genetics) that have until recently evolved in separation. Multidisciplinary dialogue is important for an integrated reconstruction of human history, which can yield extraordinary insights about past societies, reproductive behaviours and even lifestyle habits that would not be possible to obtain otherwise.
An open question in human evolution is the importance of polygenic adaptation: adaptive changes in the mean of a multifactorial trait due to shifts in allele frequencies across many loci. In recent years, several methods have been developed to detect polygenic adaptation using loci identified in genome-wide association studies (GWAS). Though powerful, these methods suffer from limited interpretability: they can detect which sets of populations have evidence for polygenic adaptation, but are unable to reveal where in the history of multiple populations these processes occurred. To address this, we created a method to detect polygenic adaptation in an admixture graph, which is a representation of the historical divergences and admixture events relating different populations through time. We developed a Markov chain Monte Carlo (MCMC) algorithm to infer branch-specific parameters reflecting the strength of selection in each branch of a graph. Additionally, we developed a set of summary statistics that are fast to compute and can indicate which branches are most likely to have experienced polygenic adaptation. We show via simulations that this method, which we call PolyGraph, has good power to detect polygenic adaptation, and applied it to human population genomic data from around the world. We also provide evidence that variants associated with several traits, including height, educational attainment, and self-reported unibrow, have been influenced by polygenic adaptation in different populations during human evolution.
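To make the notion of a shift in the mean of a multifactorial trait concrete, here is a minimal sketch of a population-level polygenic score computed from GWAS effect sizes and population allele frequencies. It assumes a purely additive diploid model (hence the factor of 2), and the function name and toy numbers are invented for illustration. It is not the PolyGraph method, which models such shifts along the branches of an admixture graph.

```python
from typing import Sequence


def mean_polygenic_score(effect_sizes: Sequence[float], allele_freqs: Sequence[float]) -> float:
    """Expected additive score of a diploid individual: 2 * sum(effect * frequency)."""
    if len(effect_sizes) != len(allele_freqs):
        raise ValueError("one allele frequency is needed per effect size")
    return 2.0 * sum(beta * p for beta, p in zip(effect_sizes, allele_freqs))


# Toy example: three trait-associated loci in two populations (invented values).
effects = [0.5, -0.25, 1.0]
freqs_pop_a = [0.5, 0.5, 0.25]
freqs_pop_b = [0.75, 0.5, 0.5]

score_a = mean_polygenic_score(effects, freqs_pop_a)
score_b = mean_polygenic_score(effects, freqs_pop_b)
score_diff = score_b - score_a
print(score_a, score_b, score_diff)
```

A difference between two such scores is not by itself evidence of selection, since genetic drift and stratification in the underlying effect estimates can also produce one, which is why explicit tests against a neutral model are needed.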
Ancient DNA is revealing new insights into the genetic relationship between Pleistocene hominins and modern humans. Nuclear DNA indicated Neanderthals as a sister group of Denisovans after diverging from modern humans. However, the closer affinity of the Neanderthal mitochondrial DNA (mtDNA) to modern humans than Denisovans has recently been suggested as the result of gene flow from an African source into Neanderthals before 100,000 years ago. Here we report the complete mtDNA of an archaic femur from the Hohlenstein-Stadel (HST) cave in southwestern Germany. HST carries the deepest divergent mtDNA lineage that splits from other Neanderthals ∼270,000 years ago, providing a lower boundary for the time of the putative mtDNA introgression event. We demonstrate that a complete Neanderthal mtDNA replacement is feasible over this time interval even with minimal hominin introgression. The highly divergent HST branch is indicative of greater mtDNA diversity during the Middle Pleistocene than in later periods.
The explosion in population genomic data demands ever more complex modes of analysis, and increasingly, these analyses depend on sophisticated simulations. Recent advances in population genetic simulation have made it possible to simulate large and complex models, but specifying such models for a particular simulation engine remains a difficult and error-prone task. Computational genetics researchers currently re-implement simulation models independently, leading to inconsistency and duplication of effort. This situation presents a major barrier to empirical researchers seeking to use simulations for power analyses of upcoming studies or sanity checks on existing genomic data. Population genetics, as a field, also lacks standard benchmarks by which new tools for inference might be measured. Here, we describe a new resource, stdpopsim, that attempts to rectify this situation. Stdpopsim is a community-driven open source project, which provides easy access to a growing catalog of published simulation models from a range of organisms and supports multiple simulation engine backends. This resource is available as a well-documented Python library with a simple command-line interface. We share some examples demonstrating how stdpopsim can be used to systematically compare demographic inference methods, and we encourage a broader community of developers to contribute to this growing resource.