Archaeogenomic research has proven to be a valuable tool to trace migrations of historic and prehistoric individuals and groups, whereas relationships within a group or burial site have not been investigated to a large extent. Knowing the genetic kinship of historic and prehistoric individuals would give important insights into social structures of ancient and historic cultures. Most archaeogenetic research concerning kinship has been restricted to uniparental markers, while studies using genome-wide information were mainly focused on comparisons between populations. Applications which infer the degree of relationship based on modern-day DNA information typically require diploid genotype data. Low concentration of endogenous DNA, fragmentation and other post-mortem damage to ancient DNA (aDNA) make the application of such tools unfeasible for most archaeological samples. To infer family relationships for degraded samples, we developed the software READ (Relationship Estimation from Ancient DNA). We show that our heuristic approach can successfully infer up to second-degree relationships with as little as 0.1x shotgun coverage per genome for pairs of individuals. We uncover previously unknown relationships among prehistoric individuals by applying READ to published aDNA data from several human remains excavated from different cultural contexts. In particular, we find a group of five closely related males from the same Corded Ware culture site in modern-day Germany, suggesting patrilocality, which highlights the possibility to uncover social structures of ancient populations by applying READ to genome-wide aDNA data. READ is publicly available from https://bitbucket.org/tguenther/read.
The identification of the genetic structure of populations from multilocus genotype data has become a central component of modern population-genetic data analysis. Application of model-based clustering programs often entails a number of steps, in which the user considers different modelling assumptions, compares results across different predetermined values of the number of assumed clusters (a parameter typically denoted K), examines multiple independent runs for each fixed value of K, and distinguishes among runs belonging to substantially distinct clustering solutions. Here, we present Clumpak (Cluster Markov Packager Across K), a method that automates the postprocessing of results of model-based population structure analyses. For analysing multiple independent runs at a single K value, Clumpak identifies sets of highly similar runs, separating distinct groups of runs that represent distinct modes in the space of possible solutions. This procedure, which generates a consensus solution for each distinct mode, is performed by the use of a Markov clustering algorithm that relies on a similarity matrix between replicate runs, as computed by the software Clumpp. Next, Clumpak identifies an optimal alignment of inferred clusters across different values of K, extending a similar approach implemented for a fixed K in Clumpp and simplifying the comparison of clustering results across different K values. Clumpak incorporates additional features, such as implementations of methods for choosing K and comparing solutions obtained by different programs, models, or data subsets. Clumpak, available at http://clumpak.tau.ac.il, simplifies the use of model-based analyses of population structure in population genetics and molecular ecology.
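The grouping of replicate runs into modes can be illustrated with a small Markov clustering sketch. This is a hedged illustration only, not Clumpak's implementation: the function name `markov_cluster`, the similarity threshold of 0.9, the inflation value of 2, the use of unweighted edges, and the example matrix are all assumptions, and in practice the similarity matrix would come from Clumpp.

```python
def markov_cluster(similarity, threshold=0.9, inflation=2.0, max_iter=100, tol=1e-9):
    """Group runs by Markov clustering on a thresholded similarity matrix.

    similarity: square list of lists with pairwise similarities between runs.
    threshold and inflation are illustrative assumptions, not Clumpak defaults.
    """
    n = len(similarity)
    # Keep an unweighted edge when two runs are similar enough; always keep self-loops.
    m = [[1.0 if i == j or similarity[i][j] >= threshold else 0.0
          for j in range(n)] for i in range(n)]

    def normalize(mat):
        # Rescale every column so that it sums to 1 (column-stochastic matrix).
        sums = [sum(mat[i][j] for i in range(n)) for j in range(n)]
        return [[mat[i][j] / sums[j] for j in range(n)] for i in range(n)]

    m = normalize(m)
    for _ in range(max_iter):
        # Expansion: one step of flow spreading (matrix product of m with itself).
        expanded = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
                    for i in range(n)]
        # Inflation: raise entries to a power and renormalize, favouring strong flows.
        inflated = normalize([[x ** inflation for x in row] for row in expanded])
        change = max(abs(inflated[i][j] - m[i][j]) for i in range(n) for j in range(n))
        m = inflated
        if change < tol:
            break

    # Rows with a positive diagonal act as attractors; each lists its cluster members.
    clusters = {frozenset(j for j in range(n) if m[i][j] > 1e-6)
                for i in range(n) if m[i][i] > 1e-6}
    return sorted(sorted(c) for c in clusters)


# Hypothetical similarities among five replicate runs at one value of K.
example = [
    [1.00, 0.97, 0.95, 0.40, 0.38],
    [0.97, 1.00, 0.96, 0.41, 0.40],
    [0.95, 0.96, 1.00, 0.39, 0.42],
    [0.40, 0.41, 0.39, 1.00, 0.98],
    [0.38, 0.40, 0.42, 0.98, 1.00],
]
print(markov_cluster(example))
```

In this toy input, runs 0 to 2 and runs 3 to 4 fall into two separate groups, each of which would then be summarized by its own consensus solution.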
Advances in the sequencing and the analysis of the genomes of both modern and ancient peoples have facilitated a number of breakthroughs in our understanding of human evolutionary history. These include the discovery of interbreeding between anatomically modern humans and extinct hominins; the development of an increasingly detailed description of the complex dispersal of modern humans out of Africa and their population expansion worldwide; and the characterization of many of the genetic adaptations of humans to local environmental conditions. Our interpretation of the evolutionary history and adaptation of humans is being transformed by analyses of these new genomic data.
$F_{ST}$ is frequently used as a summary of genetic differentiation among groups. It has been suggested that $F_{ST}$ depends on the allele frequencies at a locus, as it exhibits a variety of peculiar properties related to genetic diversity: higher values for biallelic single-nucleotide polymorphisms (SNPs) than for multiallelic microsatellites, low values among high-diversity populations viewed as substantially distinct, and low values for populations that differ primarily in their profiles of rare alleles. A full mathematical understanding of the dependence of $F_{ST}$ on allele frequencies, however, has been elusive. Here, we examine the relationship between $F_{ST}$ and the frequency of the most frequent allele, demonstrating that the range of values that $F_{ST}$ can take is restricted considerably by the allele-frequency distribution. For a two-population model, we derive strict bounds on $F_{ST}$ as a function of the frequency $M$ of the allele with highest mean frequency between the pair of populations. Using these bounds, we show that for a value of $M$ chosen uniformly between 0 and 1 at a multiallelic locus whose number of alleles is left unspecified, the mean maximum $F_{ST}$ is ∼0.3585. Further, $F_{ST}$ is restricted to values much less than 1 when $M$ is low or high, and the contribution to the maximum $F_{ST}$ made by the most frequent allele is on average ∼0.4485. Using bounds on homozygosity that we have previously derived as functions of $M$, we describe strict bounds on $F_{ST}$ in terms of the homozygosity of the total population, finding that the mean maximum $F_{ST}$ given this homozygosity is $1 - \ln 2 \approx 0.3069$. Our results provide a conceptual basis for understanding the dependence of $F_{ST}$ on allele frequencies and genetic diversity and for interpreting the roles of these quantities in computations of $F_{ST}$ from population-genetic data. Further, our analysis suggests that many unusual observations of $F_{ST}$, including the relatively low $F_{ST}$ values in high-diversity human populations from Africa and the relatively low estimates of $F_{ST}$ for microsatellites compared to SNPs, can be understood not as biological phenomena associated with different groups of populations or classes of markers but rather as consequences of the intrinsic mathematical dependence of $F_{ST}$ on the properties of allele-frequency distributions.
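As a minimal numerical sketch of the two-population setting described above, the following Python computes F_ST under the common definition (H_T - H_S) / H_T with equal population weights. That definition, the function names, and the example configuration are assumptions made for illustration and are not taken from the abstract. For the configuration shown (population 1 fixed for the most frequent allele, population 2 carrying that allele and one other), the computed value equals (1 - M) / M for M >= 1/2, so F_ST for this configuration drops well below 1 as M approaches 1.

```python
def fst_two_populations(p1, p2):
    """F_ST = (H_T - H_S) / H_T for two equally weighted populations.

    p1 and p2 are allele-frequency lists over the same alleles, each summing to 1.
    """
    # Mean within-population heterozygosity.
    h_s = 1.0 - 0.5 * (sum(x * x for x in p1) + sum(x * x for x in p2))
    # Heterozygosity of the pooled population, based on mean allele frequencies.
    mean = [(x + y) / 2.0 for x, y in zip(p1, p2)]
    h_t = 1.0 - sum(f * f for f in mean)
    return (h_t - h_s) / h_t


def example_configuration(m):
    """Illustrative configuration whose most frequent allele has mean frequency m >= 1/2.

    Population 1 is fixed for allele A; population 2 carries A at frequency
    2m - 1 and a second allele at 2 - 2m, so the mean frequency of A is m.
    """
    return [1.0, 0.0], [2.0 * m - 1.0, 2.0 - 2.0 * m]


for m in (0.5, 0.6, 0.75, 0.9):
    p1, p2 = example_configuration(m)
    # The two printed values agree: computed F_ST and (1 - m) / m.
    print(m, round(fst_two_populations(p1, p2), 4), round((1.0 - m) / m, 4))
```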
Archaic human ancestry in East Asia. Skoglund, Pontus; Jakobsson, Mattias
Proceedings of the National Academy of Sciences (PNAS), 11/2011, Volume 108, Issue 45
Journal Article, Peer Reviewed, Open Access
Recent studies of ancient genomes have suggested that gene flow from archaic hominin groups to the ancestors of modern humans occurred on two separate occasions during the modern human expansion out of Africa. At the same time, decreasing levels of human genetic diversity have been found at increasing distance from Africa as a consequence of human expansion out of Africa. We analyzed the signal of archaic ancestry in modern human populations, and we investigated how serial founder models of human expansion affect the signal of archaic ancestry using simulations. For descendants of an archaic admixture event we show that genetic drift coupled with ascertainment bias for common alleles can cause artificial but largely predictable differences in similarity to archaic genomes. In genotype data from non-Africans, this effect results in a biased genetic similarity to Neandertals with increasing distance from Africa. However, in addition to the previously reported gene flow between Neandertals and non-Africans as well as gene flow between an archaic human population from Siberia ("Denisovans") and Oceanians, we found a significant affinity between East Asians, particularly Southeast Asians, and the Denisova genome, a pattern that is not expected under a model of solely Neandertal admixture in the ancestry of East Asians. These results suggest admixture between Denisovans or a Denisova-related population and the ancestors of East Asians, and that the history of anatomically modern and archaic humans might be more complex than previously proposed.
Accurate identification of the biological sex of ancient remains is vital for critically testing hypotheses about social structure in prehistoric societies. However, morphological methods are imprecise for juvenile individuals and fragmentary remains, and molecular methods that rely on particular sex-specific marker loci such as the amelogenin gene suffer from allelic dropout and sensitivity to modern contamination. Analyzing shotgun sequencing data from 14 present-day humans of known biological sex and 16 ancient individuals from a time span of 100 to ∼70,000 years ago, we show that even relatively sparse shotgun sequencing (about 100,000 human sequences) can be used to reliably identify chromosomal sex simply by considering the ratio of sequences aligning to the X and Y chromosomes, and highlight two examples where the genetic assignments indicate morphological misassignment. Furthermore, we show that accurate sex identification of highly degraded remains can be performed in the presence of substantial amounts of present-day contamination by utilizing the signature of cytosine deamination, a characteristic feature of ancient DNA.
• We present a simple sex identification method using low-coverage DNA sequencing.
• The approach is validated using a panel of 30 modern and ancient individuals.
• The method can be robust to modern-day contamination by using degradation patterns.
• The study illustrates the risk of misidentification of sex using morphological approaches.
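The idea of reading chromosomal sex from the ratio of X- and Y-aligned sequences can be sketched as follows. This is a hedged illustration rather than the published procedure: the function names, the normal-approximation confidence interval, and the cutoff values `female_max` and `male_min` are assumptions chosen for the example and should be calibrated against reference individuals of known sex.

```python
import math


def ry_statistic(n_x, n_y, z=1.96):
    """Return R_y = n_y / (n_x + n_y) with a normal-approximation interval.

    n_x and n_y are counts of sequences aligning to the X and Y chromosomes.
    """
    total = n_x + n_y
    if total == 0:
        raise ValueError("no sequences aligned to X or Y")
    r_y = n_y / total
    # Binomial standard error of the Y fraction.
    se = math.sqrt(r_y * (1.0 - r_y) / total)
    return r_y, r_y - z * se, r_y + z * se


def assign_sex(n_x, n_y, female_max=0.016, male_min=0.075):
    """Assign XX or XY only when the whole interval clears a cutoff.

    female_max and male_min are illustrative placeholder cutoffs (assumptions).
    """
    _, lower, upper = ry_statistic(n_x, n_y)
    if upper < female_max:
        return "XX"
    if lower > male_min:
        return "XY"
    return "undetermined"


# Hypothetical read counts for three samples.
print(assign_sex(5000, 20))   # very few Y-aligned sequences
print(assign_sex(4000, 400))  # clear Y-chromosome signal
print(assign_sex(30, 2))      # too few sequences to decide
```

Requiring the whole interval to clear a cutoff keeps very sparse samples in the "undetermined" class instead of forcing a call.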
In the last three decades, genetic studies have played an increasingly important role in exploring human history. They have helped to conclusively establish that anatomically modern humans first appeared in Africa roughly 250,000-350,000 years before present and subsequently migrated to other parts of the world. The history of humans in Africa is complex and includes demographic events that influenced patterns of genetic variation across the continent. Through genetic studies, it has become evident that deep African population history is captured by relationships among African hunter-gatherers, as the world's deepest population divergences occur among these groups, and that the deepest population divergence dates to 300,000 years before present. However, the spread of pastoralism and agriculture in the last few thousand years has shaped the geographic distribution of present-day Africans and their genetic diversity. With today's sequencing technologies, we can obtain full genome sequences from diverse sets of extant and prehistoric Africans. The coming years will contribute exciting new insights toward deciphering human evolutionary history in Africa.
The majority of sub-Saharan Africans today speak a number of closely related languages collectively referred to as ‘Bantu’ languages. The current distribution of Bantu-speaking populations has been found to largely be a consequence of the movement of people rather than a diffusion of language alone. Linguistic and single marker genetic studies have generated various hypotheses regarding the timing and the routes of the Bantu expansion, but these hypotheses have not been thoroughly investigated. In this study, we re-analysed microsatellite markers typed for a large number of African populations that, owing to their fast mutation rates, capture signatures of recent population history. We confirm the spread of west African people across most of sub-Saharan Africa and estimate the expansion of Bantu-speaking groups, using a Bayesian approach, to around 5600 years ago. We tested four different divergence models for Bantu-speaking populations with a distribution comprising three geographical regions in Africa. We found that the most likely model for the movement of the eastern branch of Bantu-speakers involves migration of Bantu-speaking groups to the east followed by migration to the south. This model, however, is only marginally more likely than other models, which might indicate direct movement from the west and/or significant gene flow with the western branch of Bantu-speakers. Our study uses multilocus genetic data to explicitly investigate the timing and mode of the Bantu expansion, and it demonstrates that west African groups rapidly expanded both in numbers and over a large geographical area, affirming that the Bantu expansion was one of the most dramatic demographic events in human history.
Southern Africa is consistently placed as a potential region for the evolution of Homo sapiens. We present genome sequences, up to 13x coverage, from seven ancient individuals from KwaZulu-Natal, South Africa. The remains of three Stone Age hunter-gatherers (about 2000 years old) were genetically similar to current-day southern San groups, and those of four Iron Age farmers (300 to 500 years old) were genetically similar to present-day Bantu-language speakers. We estimate that all modern-day Khoe-San groups have been influenced by 9 to 30% genetic admixture from East Africans/Eurasians. Using traditional and new approaches, we estimate the first modern human population divergence time to between 350,000 and 260,000 years ago. This estimate increases the deepest divergence among modern humans, coinciding with anatomical developments of archaic humans into modern humans, as represented in the local fossil record.
Motivation: Analysis of the distribution of alleles across populations is a useful tool for examining population diversity and relationships. However, sample sizes often differ across populations, sometimes making it difficult to assess allelic distributions across groups. Results: We introduce a generalized rarefaction approach for counting alleles private to combinations of populations. Our method evaluates the number of alleles found in each of a set of populations but absent in all remaining populations, considering equal-sized subsamples from each population. Applying this method to a worldwide human microsatellite dataset, we observe a high number of alleles private to the combination of African and Oceanian populations. This result supports the possibility of a migration out of Africa into Oceania separate from the migrations responsible for the majority of the ancestry of the modern populations of Asia, and it highlights the utility of our approach to sample size correction in evaluating hypotheses about population history. Availability: We have implemented our method in the computer program ADZE, which is available for download at http://rosenberglab.bioinformatics.med.umich.edu/adze.html. Contact: szpiechz@umich.edu
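The rarefaction idea for alleles private to a combination of populations can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the ADZE source code: the function names, the data layout (a dictionary of allele counts per population), and the example counts are hypothetical, and the sketch uses the standard hypergeometric probability that an allele is missing from a subsample drawn without replacement.

```python
from math import comb


def prob_absent(n_total, n_allele, g):
    """Probability that a subsample of g gene copies, drawn without replacement
    from n_total copies, contains none of the n_allele copies of an allele."""
    return comb(n_total - n_allele, g) / comb(n_total, g)


def private_richness(counts, combo, g):
    """Expected number of alleles present in every population in `combo`
    and absent from all other populations, for subsamples of size g.

    counts: dict mapping population name -> list of allele counts, with
            alleles listed in the same order for every population.
    combo:  set of population names; g must not exceed any sample size.
    """
    pops = list(counts)
    n_alleles = len(counts[pops[0]])
    total = 0.0
    for i in range(n_alleles):
        term = 1.0
        for pop in pops:
            absent = prob_absent(sum(counts[pop]), counts[pop][i], g)
            # Require presence inside the combination and absence outside it.
            term *= (1.0 - absent) if pop in combo else absent
        total += term
    return total


# Hypothetical allele counts at one locus in three populations.
counts = {"A": [4, 0, 2], "B": [3, 3, 0], "C": [0, 6, 0]}
print(private_richness(counts, {"A", "B"}, g=2))
print(private_richness(counts, {"C"}, g=2))
```

Because every population is evaluated at the same subsample size g, populations with larger samples do not gain private alleles simply by having been sampled more deeply.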