The history of southern Africa involved interactions between indigenous hunter–gatherers and a range of populations that moved into the region. Here we use genome-wide genetic data to show that there are at least two admixture events in the history of Khoisan populations (southern African hunter–gatherers and pastoralists who speak non-Bantu languages with click consonants). One involved populations related to Niger–Congo-speaking African populations, and the other introduced ancestry most closely related to west Eurasian (European or Middle Eastern) populations. We date this latter admixture event to ∼900–1,800 y ago and show that it had the largest demographic impact in Khoisan populations that speak Khoe–Kwadi languages. A similar signal of west Eurasian ancestry is present throughout eastern Africa. In particular, we also find evidence for two admixture events in the history of Kenyan, Tanzanian, and Ethiopian populations, the earlier of which involved populations related to west Eurasians and which we date to ∼2,700–3,300 y ago. We reconstruct the allele frequencies of the putative west Eurasian population in eastern Africa and show that this population is a good proxy for the west Eurasian ancestry in southern Africa. The most parsimonious explanation for these findings is that west Eurasian ancestry entered southern Africa indirectly through eastern Africa.
The establishment of agrarian economy in Eneolithic East Europe is associated with the Pre-Cucuteni-Cucuteni-Trypillia complex (PCCTC). PCCTC farmers interacted with Eneolithic forager-pastoralist groups of the North Pontic steppe as PCCTC extended from the Carpathian foothills to the Dnipro Valley beginning in the late 5th millennium BCE. While the cultural interaction between the two groups is evident through the Cucuteni C pottery style that carries steppe influence, the extent of biological interactions between Trypillian farmers and the steppe remains unclear. Here we report the analysis of artefacts from the late 5th millennium Trypillian settlement at the Kolomiytsiv Yar Tract (KYT) archaeological complex in central Ukraine, focusing on a human bone fragment found in the Trypillian context at KYT. Diet stable isotope ratios obtained from the bone fragment suggest the diet of the KYT individual to be within the range of forager-pastoralists of the North Pontic area. Strontium isotope ratios of the KYT individual are consistent with having originated from contexts of the Serednii Stih (Sredny Stog) culture sites of the Middle Dnipro Valley. Genetic analysis of the KYT individual indicates ancestry derived from a proto-Yamna population such as Serednii Stih. Overall, the KYT archaeological site presents evidence of interactions between Trypillians and Eneolithic Pontic steppe inhabitants of the Serednii Stih horizon and suggests a potential for gene flow between the two groups as early as the beginning of the 4th millennium BCE.
Mutations are the raw material of evolution but have been difficult to study directly. We report the largest study of new mutations to date, comprising 2,058 germline changes discovered by analyzing 85,289 Icelanders at 2,477 microsatellites. The paternal-to-maternal mutation rate ratio is 3.3, and the rate in fathers doubles from age 20 to 58, whereas there is no association with age in mothers. Longer microsatellite alleles are more mutagenic and tend to decrease in length, whereas the opposite is seen for shorter alleles. We use these empirical observations to build a model that we apply to individuals for whom we have both genome sequence and microsatellite data, allowing us to estimate key parameters of evolution without calibration to the fossil record. We infer that the sequence mutation rate is 1.4–2.3 × 10⁻⁸ mutations per base pair per generation (90% credible interval) and that human-chimpanzee speciation occurred 3.7–6.6 million years ago.
Modern humans originated in Africa within the past 300 thousand years. To infer the complex demographic history of African populations and their adaptation to diverse environments, we sequenced the genomes of 92 individuals from 44 indigenous African populations.
Genetic structure analyses indicate that among Africans, genetic ancestry is largely partitioned by geography and language, though we observe mixed ancestry in many individuals, consistent with both short- and long-range migration events followed by admixture. Phylogenetic analysis indicates that the San genetic lineage is basal to all modern human lineages. The San and Niger-Congo, Afroasiatic, and Nilo-Saharan lineages were substantially diverged by 160 kya (thousand years ago). In contrast, the San and Central African rainforest hunter-gatherer (CRHG), Hadza hunter-gatherer, and Sandawe hunter-gatherer lineages were diverged by ~120-100 kya. Niger-Congo, Nilo-Saharan, and Afroasiatic lineages diverged more recently by ~54-16 kya. Eastern and western CRHG lineages diverged by ~50-31 kya, and the western CRHG lineages diverged by ~18-12 kya. The San and CRHG populations maintained the largest effective population size compared to other populations prior to 60 kya. Further, we observed signatures of positive selection at genes involved in muscle development, bone synthesis, reproduction, immune function, energy metabolism, and cell signaling, which may contribute to local adaptation of African populations.
We observe high levels of genomic variation among ethnically diverse Africans, and this variation is largely correlated with geography and language. Our study indicates ancient population substructure and local adaptation of Africans.
Using DNA extracted from a finger bone found in Denisova Cave in southern Siberia, we have sequenced the genome of an archaic hominin to about 1.9-fold coverage. This individual is from a group that shares a common origin with Neanderthals. This population was not involved in the putative gene flow from Neanderthals into Eurasians; however, the data suggest that it contributed 4-6% of its genetic material to the genomes of present-day Melanesians. We designate this hominin population 'Denisovans' and suggest that it may have been widespread in Asia during the Late Pleistocene epoch. A tooth found in Denisova Cave carries a mitochondrial genome highly similar to that of the finger bone. This tooth shares no derived morphological features with Neanderthals or modern humans, further indicating that Denisovans have an evolutionary history distinct from Neanderthals and modern humans.
Clinical exome sequencing routinely identifies missense variants in disease-related genes, but functional characterization is rarely undertaken, leading to diagnostic uncertainty. For example, mutations in PPARG cause Mendelian lipodystrophy and increase risk of type 2 diabetes (T2D). Although approximately 1 in 500 people harbor missense variants in PPARG, most are of unknown consequence. To prospectively characterize PPARγ variants, we used highly parallel oligonucleotide synthesis to construct a library encoding all 9,595 possible single-amino acid substitutions. We developed a pooled functional assay in human macrophages, experimentally evaluated all protein variants, and used the experimental data to train a variant classifier by supervised machine learning. When applied to 55 new missense variants identified in population-based and clinical sequencing, the classifier annotated 6 variants as pathogenic; these were subsequently validated by single-variant assays. Saturation mutagenesis and prospective experimental characterization can support immediate diagnostic interpretation of newly discovered missense variants in disease-related genes.
Imputation using external reference panels (e.g. 1000 Genomes) is a widely used approach for increasing power in genome-wide association studies and meta-analysis. Existing hidden Markov model (HMM)-based imputation approaches require individual-level genotypes. Here, we develop a new method for Gaussian imputation from summary association statistics, a type of data that is becoming widely available.
In simulations using 1000 Genomes (1000G) data, this method recovers 84% (54%) of the effective sample size for common (>5%) and low-frequency (1-5%) variants, increasing to 87% (60%) when summary linkage disequilibrium information is available from target samples, versus the gold standard of 89% (67%) for HMM-based imputation, which cannot be applied to summary statistics. Our approach accounts for the limited sample size of the reference panel, a crucial step to eliminate false-positive associations, and it is computationally very fast. As an empirical demonstration, we apply our method to seven case-control phenotypes from the Wellcome Trust Case Control Consortium (WTCCC) data and a study of height in the British 1958 birth cohort (1958BC). Gaussian imputation from summary statistics recovers 95% (105%) of the effective sample size (as quantified by the ratio of χ² association statistics) compared with HMM-based imputation from individual-level genotypes at the 227 (176) published single nucleotide polymorphisms (SNPs) in the WTCCC (1958BC height) data. In addition, for publicly available summary statistics from large meta-analyses of four lipid traits, we publicly release imputed summary statistics at 1000G SNPs, which could not have been obtained using previously published methods, and demonstrate their accuracy by masking subsets of the data. We show that 1000G imputation using our approach increases the magnitude and statistical evidence of enrichment at genic versus non-genic loci for these traits, as compared with an analysis without 1000G imputation. Thus, imputation of summary statistics will be a valuable tool in future functional enrichment analyses.
A software package is publicly available at http://bogdan.bioinformatics.ucla.edu/software/.
Contact: bpasaniuc@mednet.ucla.edu or aprice@hsph.harvard.edu
Supplementary materials are available at Bioinformatics online.
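As a rough illustration of the Gaussian imputation idea summarised in the abstract above, the following Python sketch imputes the z-score of one untyped SNP as the conditional mean of a multivariate normal, given z-scores at typed SNPs and linkage disequilibrium (LD) correlations estimated from a reference panel. It is a minimal sketch under stated assumptions, not the authors' software: the function names (`solve_linear`, `impute_z`), the ridge value of 0.1 standing in for the adjustment for limited reference-panel size, the quality score, and the toy numbers are all hypothetical.

```python
def solve_linear(a, b):
    """Solve a @ x = b for a small dense system (Gaussian elimination, partial pivoting)."""
    n = len(a)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        rhs = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = rhs / m[i][i]
    return x


def impute_z(ld_tt, ld_ut, z_typed, ridge=0.1):
    """Impute the z-score of one untyped SNP from typed-SNP z-scores.

    ld_tt   : LD correlation matrix among typed SNPs (reference panel)
    ld_ut   : LD correlations between the untyped SNP and each typed SNP
    z_typed : observed association z-scores at the typed SNPs
    ridge   : ASSUMED regularisation added to the diagonal of ld_tt to
              damp noise from a finite reference panel (illustrative value)

    Returns (z_hat, quality): the conditional-mean z-score and a rough
    quality score ld_ut' (ld_tt + ridge*I)^-1 ld_ut.
    """
    n = len(z_typed)
    regularised = [
        [ld_tt[i][j] + (ridge if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    # The matrix is symmetric, so solving regularised @ w = ld_ut yields the
    # weights w = (ld_tt + ridge*I)^-1 ld_ut.
    weights = solve_linear(regularised, ld_ut)
    z_hat = sum(w * z for w, z in zip(weights, z_typed))
    quality = sum(w * r for w, r in zip(weights, ld_ut))
    return z_hat, quality


if __name__ == "__main__":
    # Toy example: two typed SNPs in moderate LD with each other.
    ld_typed = [[1.0, 0.5], [0.5, 1.0]]
    ld_untyped = [0.6, 0.3]
    z_obs = [3.0, 1.0]
    print(impute_z(ld_typed, ld_untyped, z_obs))
```

With no regularisation, an untyped SNP that is well tagged by a single typed SNP simply inherits that SNP's z-score scaled by their correlation. A positive `ridge` shrinks the imputed value toward zero, which is one simple way to avoid overconfident imputed statistics when LD is estimated from few reference haplotypes.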
Austronesian languages are spread across half the globe, from Easter Island to Madagascar. Evidence from linguistics and archaeology indicates that the 'Austronesian expansion,' which began 4,000-5,000 years ago, likely had roots in Taiwan, but the ancestry of present-day Austronesian-speaking populations remains controversial. Here, we analyse genome-wide data from 56 populations using new methods for tracing ancestral gene flow, focusing primarily on Island Southeast Asia. We show that all sampled Austronesian groups harbour ancestry that is more closely related to aboriginal Taiwanese than to any present-day mainland population. Surprisingly, western Island Southeast Asian populations have also inherited ancestry from a source nested within the variation of present-day populations speaking Austro-Asiatic languages, which have historically been nearly exclusive to the mainland. Thus, either there was once a substantial Austro-Asiatic presence in Island Southeast Asia, or Austronesian speakers migrated to and through the mainland, admixing there before continuing to western Indonesia.
The more than 1.5 billion people who live in South Asia are correctly viewed not as a single large population but as many small endogamous groups. We assembled genome-wide data from over 2,800 individuals from over 260 distinct South Asian groups. We identified 81 unique groups, 14 of which had estimated census sizes of more than 1 million, that descend from founder events more extreme than those in Ashkenazi Jews and Finns, both of which have high rates of recessive disease due to founder events. We identified multiple examples of recessive diseases in South Asia that are the result of such founder events. This study highlights an underappreciated opportunity for decreasing disease burden among South Asians through discovery of and testing for recessive disease-associated genes.
The material culture of the Late Chalcolithic period in the southern Levant (4500-3900/3800 BCE) is qualitatively distinct from previous and subsequent periods. Here, to test the hypothesis that the advent and decline of this culture was influenced by movements of people, we generated genome-wide ancient DNA from 22 individuals from Peqi'in Cave, Israel. These individuals were part of a homogeneous population that can be modeled as deriving ~57% of its ancestry from groups related to those of the local Levant Neolithic, ~17% from groups related to those of the Iran Chalcolithic, and ~26% from groups related to those of the Anatolian Neolithic. The Peqi'in population also appears to have contributed differently to later Bronze Age groups, one of which we show cannot plausibly have descended from the same population as that of Peqi'in Cave. These results provide an example of how population movements propelled cultural changes in the deep past.