The human history of Oceania comprises two extremes: the initial colonizations of Near Oceania, one of the oldest out-of-Africa migrations, and of Remote Oceania, the most recent expansion into unoccupied territories. Genetic studies, mostly using uniparentally inherited DNA, have shed some light on human origins in Oceania, particularly indicating that Polynesians are of mixed East Asian and Near Oceanian ancestry. Here, we use ∼1 million single nucleotide polymorphisms (SNPs) to investigate the demographic history of Oceania in greater detail.
We developed a new approach to account for SNP ascertainment bias, used approximate Bayesian computation simulations to choose the best-fitting model of population history, and estimated demographic parameters. We find that the ancestors of Near Oceanians diverged from ancestral Eurasians ∼27 thousand years ago (kya), suggesting separate initial occupations of both territories. The genetic admixture in Polynesian history between East Asians (∼87%) and Near Oceanians (∼13%) occurred ∼3 kya, prior to the colonization of Polynesia. Fijians are of Polynesian (∼65%) and additional Near Oceanian (∼35%) ancestry not found in Polynesians, with this admixture occurring considerably after the initial settlement of Remote Oceania. Our data support a greater contribution of East Asian women than men in the admixture history of Remote Oceania and highlight population substructure in Polynesia and New Guinea.
Despite the inherent ascertainment bias, genome-wide SNP data provide new insights into the genetic history of Oceania. Our approach to correct for ascertainment bias and obtain reliable inferences concerning demographic history should prove useful in other such studies.
► Near Oceanians and Eurasians split ∼27,000 years ago, indicating separate migrations ► Polynesian admixture occurred ∼3,000 years ago, prior to colonization of Polynesia ► Fijians obtained additional Near Oceanian contributions after Polynesian settlement ► Polynesian admixture involved more East Asian women than men
We describe a PCA-based genome scan approach to analyze genome-wide admixture structure, and introduce wavelet transform analysis as a method for estimating the time of admixture. We test the wavelet transform method with simulations and apply it to genome-wide SNP data from eight admixed human populations. The wavelet transform method offers better resolution than existing methods for dating admixture, and can be applied to either SNP or sequence data from humans or other species.
Nonrecombining Y-chromosomal microsatellites (Y-STRs) are widely used to infer population histories, discover genealogical relationships, and identify males for criminal justice purposes. Although a key requirement for their application is reliable mutability knowledge, empirical data are only available for a small number of Y-STRs thus far. To rectify this, we analyzed 186 Y-STR markers in nearly 2000 DNA-confirmed father-son pairs, covering 352,999 meiotic transfers. Following confirmation by DNA sequence analysis, the retrieved mutation data were modeled via a Bayesian approach, resulting in mutation rates from 3.78 × 10⁻⁴ (95% credible interval [CI], 1.38 × 10⁻⁵ to 2.02 × 10⁻³) to 7.44 × 10⁻² (95% CI, 6.51 × 10⁻² to 9.09 × 10⁻²) per marker per generation. With the 924 mutations at 120 Y-STR markers, a nonsignificant excess of repeat losses versus gains (1.16:1), as well as a strong and significant excess of single-repeat versus multirepeat changes (25.23:1), was observed. Although the total repeat number influenced Y-STR locus mutability most strongly, repeat complexity, the length in base pairs of the repeated motif, and the father's age also contributed to Y-STR mutability. To exemplify how to practically utilize this knowledge, we analyzed the 13 most mutable Y-STRs in an independent sample set and empirically demonstrated their suitability for distinguishing close and distantly related males. This finding is expected to revolutionize Y-chromosomal applications in forensic biology, from previous male lineage differentiation toward future male individual identification.
The panels of 9–17 Y-chromosomal short tandem repeats (Y-STRs) currently used in forensic genetics have adequate resolution of different paternal lineages in many human populations, but have lower abilities to separate paternal lineages in populations expressing low Y-chromosome diversity. Moreover, current Y-STR sets usually fail to differentiate between related males who belong to the same paternal lineage and, as a consequence, conclusions cannot be drawn on the individual level as is desirable for forensic interpretations. Recently, we identified a new panel of rapidly mutating (RM) Y-STRs, composed of 13 markers with mutation rates above 1 × 10⁻², whereas most Y-STRs, including all currently used in forensics, have mutation rates in the order of 1 × 10⁻³ or lower. In the present study, we demonstrate in 604 unrelated males sampled from 51 worldwide populations (HGDP-CEPH) that the RM Y-STRs provide substantially higher haplotype diversity and haplotype discrimination capacity (with only 3 haplotypes shared between 8 of the 604 worldwide males) than obtained with the largest set of 17 currently used Y-STRs (Yfiler) in the same samples (33 haplotypes shared between 85 males). Hence, RM Y-STRs yield high-resolution paternal lineage differentiation and provide a considerable improvement compared to Yfiler. We also find in this worldwide dataset substantially less genetic population substructure within and between geographic regions with RM Y-STRs than with Yfiler Y-STRs. Furthermore, with the present study we provide further evidence that the RM Y-STR panel is extremely successful in differentiating between closely and distantly related males.
Among 305 male relatives, paternally connected by 1–20 meiotic transfers in 127 independent pedigrees, we show that 66% were separated by mutation events with the RM Y-STR panel whereas only 15% were with Yfiler; hence, RM Y-STRs provide a statistically significant 4.4-fold increase of average male relative differentiation relative to Yfiler. The RM Y-STR panel is powerful enough to separate closely related males; nearly 50% of fathers and sons, and 60% of brothers, could be distinguished with RM Y-STRs, whereas only 7.7% and 8%, respectively, could be distinguished with Yfiler. Thus, by introducing RM Y-STRs to the forensic genetic community we provide important solutions to several of the current limitations of Y chromosome analysis in forensic genetics.
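The haplotype-sharing counts reported above can be converted into a discrimination capacity with simple arithmetic. The following is a minimal sketch, assuming discrimination capacity is defined as the number of distinct haplotypes divided by the number of males sampled (a common convention that the abstract itself does not spell out) and that every male not counted as sharing carries a unique haplotype. The function names are illustrative and not taken from the study.

```python
def distinct_haplotypes(n_males: int, shared_haplotypes: int, males_sharing: int) -> int:
    """Distinct haplotypes when `males_sharing` males collapse into
    `shared_haplotypes` haplotypes and all remaining males are unique."""
    return n_males - males_sharing + shared_haplotypes


def discrimination_capacity(n_males: int, shared_haplotypes: int, males_sharing: int) -> float:
    """Assumed definition: distinct haplotypes / sample size."""
    return distinct_haplotypes(n_males, shared_haplotypes, males_sharing) / n_males


# Counts taken from the abstract: 604 males from 51 populations.
rm_dc = discrimination_capacity(604, shared_haplotypes=3, males_sharing=8)
yfiler_dc = discrimination_capacity(604, shared_haplotypes=33, males_sharing=85)

# Relative differentiation among male relatives: 66% (RM) versus 15% (Yfiler).
fold_increase = 66 / 15

print(f"RM Y-STR discrimination capacity: {rm_dc:.4f}")
print(f"Yfiler discrimination capacity:   {yfiler_dc:.4f}")
print(f"Fold increase in relative differentiation: {fold_increase:.1f}")
```

Under these assumptions the 4.4-fold figure in the text is simply the ratio of the two differentiation percentages.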
Previous studies have successfully identified genetic variants in several genes associated with human iris (eye) color; however, they all used simplified categorical trait information. Here, we quantified continuous eye color variation into hue and saturation values using high-resolution digital full-eye photographs and conducted a genome-wide association study on 5,951 Dutch Europeans from the Rotterdam Study. Three new regions, 1q42.3, 17q25.3, and 21q22.13, met the criterion for genome-wide statistically significant association. The latter two loci were replicated in 2,261 individuals from the UK and in 1,282 from Australia. The LYST gene at 1q42.3 and the DSCR9 gene at 21q22.13 serve as promising functional candidates. A model for predicting quantitative eye colors explained over 50% of trait variance in the Rotterdam Study. Overall, our data exemplify that fine phenotyping is a useful strategy for finding genes involved in human complex traits.
Attempts to detect genetic population substructure in humans are troubled by the fact that the vast majority of the total amount of observed genetic variation is present within populations rather than between populations. Here we introduce a new algorithm for transforming a genetic distance matrix that reduces the within-population variation considerably. Extensive computer simulations revealed that the transformed matrix captured the genetic population differentiation better than the original one, which was based on the T1 statistic. In an empirical genomic data set comprising 2,457 individuals from 23 different European subpopulations, the proportion of individuals that were determined as a genetic neighbour to another individual from the same sampling location increased from 25% with the original matrix to 52% with the transformed matrix. Similarly, the percentage of genetic variation explained between populations by means of Analysis of Molecular Variance (AMOVA) increased from 1.62% to 7.98%. Furthermore, the first two dimensions of a classical multidimensional scaling (MDS) using the transformed matrix explained 15% of the variance, compared to 0.7% obtained with the original matrix. Application of MDS with Mclust, SPA with Mclust, and GemTools algorithms to the same dataset also showed that the transformed matrix gave a better association of the genetic clusters with the sampling locations, and particularly so when it was used in the AMOVA framework with a genetic algorithm. Overall, the new matrix transformation introduced here substantially reduces the within-population genetic differentiation, and can be broadly applied to methods such as AMOVA to enhance their sensitivity to reveal population substructure.
We herewith provide a publicly available (http://www.erasmusmc.nl/fmb/resources/GAGA) model-free method for improved genetic population substructure detection that can be applied to human as well as any other species data in future studies relevant to evolutionary biology, behavioural ecology, medicine, and forensics.
The Y-chromosomal short tandem repeat (Y-STR) polymorphisms included in the AmpFlSTR® Yfiler® polymerase chain reaction amplification kit have become widely used for forensic and evolutionary applications where reliable knowledge of mutation properties is necessary for correct data interpretation. Therefore, we investigated the 17 Yfiler Y-STRs in 1,730–1,764 DNA-confirmed father–son pairs per locus and found 84 sequence-confirmed mutations among the 29,792 meiotic transfers covered. Of the 84 mutations, 83 (98.8%) were single-repeat changes and one (1.2%) was a double-repeat change (ratio, 1:0.01), and 43 (51.2%) were repeat gains and 41 (48.8%) repeat losses (ratio, 1:0.95). Medians from Bayesian estimation of locus-specific mutation rates ranged from 0.0003 for DYS448 to 0.0074 for DYS458, with a median rate across all 17 Y-STRs of 0.0025. The mean age (at the time of son’s birth) of fathers with mutations was 34.40 (±11.63) years, higher than that of fathers without mutations at 30.32 (±10.22) years, a difference that is highly statistically significant (p < 0.001). A Poisson-based modeling revealed that the Y-STR mutation rate increased with increasing father’s age on a statistically significant level (α = 0.0294, 2.5% quantile = 0.0001). From combining our data with those previously published, considering all together 135,212 meiotic events and 331 mutations, we conclude for the Yfiler Y-STRs that (1) none had a mutation rate of >1%, 12 had mutation rates of >0.1% and four of <0.1%, (2) single-repeat changes were strongly favored over multiple-repeat ones for all loci but one and (3) considerable variation existed among loci in the ratio of repeat gains versus losses. Our finding of three Y-STR mutations in one father–son pair (and two pairs with two mutations each) has consequences for determining the threshold of allelic differences to conclude exclusion constellations in future applications of Y-STRs in paternity testing and pedigree analyses.
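To illustrate how a mutation rate and its uncertainty can be obtained from counts such as those above, the following is a minimal Bayesian sketch, not the authors' model. It assumes a binomial likelihood with a uniform Beta(1, 1) prior (the abstract does not state the prior used) and pools all 17 loci, whereas the study estimated locus-specific rates; the pooled value therefore need not equal the reported across-locus median of 0.0025. The function name and the Monte Carlo settings are illustrative.

```python
import random


def beta_posterior_summary(mutations, transfers, prior_a=1.0, prior_b=1.0,
                           draws=20000, seed=0):
    """Posterior mean and 95% credible interval of a per-transfer mutation rate.

    With a Beta(prior_a, prior_b) prior and a binomial likelihood, the
    posterior is Beta(prior_a + mutations, prior_b + transfers - mutations).
    The interval is approximated from Monte Carlo draws of that posterior.
    """
    a = prior_a + mutations
    b = prior_b + transfers - mutations
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    lower = samples[int(0.025 * draws)]
    upper = samples[int(0.975 * draws) - 1]
    return a / (a + b), lower, upper


# Counts from the abstract: this study alone, then combined with published data.
own_mean, own_lo, own_hi = beta_posterior_summary(84, 29_792)
all_mean, all_lo, all_hi = beta_posterior_summary(331, 135_212)

print(f"This study: {own_mean:.5f} (95% CI {own_lo:.5f} to {own_hi:.5f})")
print(f"Combined:   {all_mean:.5f} (95% CI {all_lo:.5f} to {all_hi:.5f})")
```

The larger combined data set yields a narrower interval, which is the practical reason for merging new counts with previously published ones.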
The relationship between quantitative genetics and population genetics has been studied for nearly a century, almost since the existence of these two disciplines. Here we ask to what extent quantitative genetic models in which selection is assumed to operate on a polygenic trait predict adaptive fixations that may lead to footprints in the genome (selective sweeps). We study two-locus models of stabilizing selection (with and without genetic drift) by simulations and analytically. For symmetric viability selection we find that ∼16% of the trajectories may lead to fixation if the initial allele frequencies are sampled from the neutral site-frequency spectrum and the effect sizes are uniformly distributed. However, if the population is preadapted when it undergoes an environmental change (i.e., sits in one of the equilibria of the model), the fixation probability decreases dramatically. In other two-locus models with general viabilities or an optimum shift, the proportion of adaptive fixations may increase to >24%. Similarly, genetic drift leads to a higher probability of fixation. The predictions of alternative quantitative genetics models, initial conditions, and effect-size distributions are also discussed.
Recently, the field of predicting phenotypes of externally visible characteristics (EVCs) from DNA genotypes with the final aim of concentrating police investigations to find persons completely unknown to investigating authorities, also referred to as Forensic DNA Phenotyping (FDP), has started to become established in forensic biology. We previously developed and forensically validated the IrisPlex system for accurate prediction of blue and brown eye colour from DNA, and recently showed that all major hair colour categories are predictable from carefully selected DNA markers. Here, we introduce the newly developed HIrisPlex system, which is capable of simultaneously predicting both hair and eye colour from DNA. HIrisPlex consists of a single multiplex assay targeting 24 eye and hair colour predictive DNA variants including all 6 IrisPlex SNPs, as well as two prediction models, a newly developed model for hair colour categories and shade, and the previously developed IrisPlex model for eye colour. The HIrisPlex assay was designed to cope with low amounts of template DNA, as well as degraded DNA, and preliminary sensitivity testing revealed full DNA profiles down to 63 pg input DNA. The power of the HIrisPlex system to predict hair colour was assessed in 1551 individuals from three different parts of Europe showing different hair colour frequencies. Using a 20% subset of individuals, while 80% were used for model building, the individual-based prediction accuracies employing a prediction-guided approach were 69.5% for blond, 78.5% for brown, 80% for red and 87.5% for black hair colour on average. Results from HIrisPlex analysis on worldwide DNA samples imply that HIrisPlex hair colour prediction is reliable independent of bio-geographic ancestry (similar to previous IrisPlex findings for eye colour).
We furthermore demonstrate that it is possible to infer with a prediction accuracy of >86% if a brown-eyed, black-haired individual is of non-European (excluding regions nearby Europe) versus European (including nearby regions) bio-geographic origin solely from the strength of HIrisPlex eye and hair colour probabilities, which can provide extra intelligence for future forensic applications. The HIrisPlex system introduced here, including a single multiplex test assay, an interactive tool and prediction guide, and recommendations for reporting final outcomes, represents the first tool for simultaneously establishing categorical eye and hair colour of a person from DNA. The practical forensic application of the HIrisPlex system is expected to benefit cases where other avenues of investigation, including STR profiling, provide no leads on who the unknown crime scene sample donor or the unknown missing person might be.
Global skin colour prediction from DNA. Walsh, Susan; Chaitanya, Lakshmi; Breslin, Krystal ... Human Genetics, 07/2017, Volume 136, Issue 7.
Human skin colour is highly heritable and externally visible with relevance in medical, forensic, and anthropological genetics. Although eye and hair colour can already be predicted with high accuracies from small sets of carefully selected DNA markers, knowledge about the genetic predictability of skin colour is limited. Here, we investigate the skin colour predictive value of 77 single-nucleotide polymorphisms (SNPs) from 37 genetic loci previously associated with human pigmentation using 2025 individuals from 31 global populations. We identified a minimal set of 36 highly informative skin colour predictive SNPs and developed a statistical prediction model capable of skin colour prediction on a global scale. Average cross-validated prediction accuracies expressed as area under the receiver-operating characteristic curve (AUC) ± standard deviation were 0.97 ± 0.02 for Light, 0.83 ± 0.11 for Dark, and 0.96 ± 0.03 for Dark-Black. When using a 5-category scale, this resulted in 0.74 ± 0.05 for Very Pale, 0.72 ± 0.03 for Pale, 0.73 ± 0.03 for Intermediate, 0.87 ± 0.1 for Dark, and 0.97 ± 0.03 for Dark-Black. A comparative analysis in 194 independent samples from 17 populations demonstrated that our model outperformed a previously proposed 10-SNP-classifier approach with AUCs rising from 0.79 to 0.82 for White, comparable at the intermediate level of 0.63 and 0.62, respectively, and a large increase from 0.64 to 0.92 for Black. Overall, this study demonstrates that the chosen DNA markers and prediction model, particularly at the 5-category level, allow skin colour predictions within and between continental regions for the first time, which will serve as a valuable resource for future applications in forensic and anthropologic genetics.
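For readers unfamiliar with the AUC values quoted above, the following minimal sketch shows how an AUC can be computed for one category against all others. It uses the rank-based (Mann-Whitney) formulation: the probability that a randomly chosen individual from the category receives a higher predicted probability than a randomly chosen individual outside it, with ties counted as one half. The scores below are hypothetical toy values invented for illustration, not data from the study, and the function name is an assumption.

```python
def auc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts the fraction of (positive, negative) pairs in which the
    positive case scores higher; ties contribute 0.5.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical predicted probabilities of belonging to one colour category.
in_category = [0.9, 0.8, 0.4]       # individuals who truly have the category
out_of_category = [0.7, 0.3, 0.2]   # individuals who do not

toy_auc = auc(in_category, out_of_category)
print(f"Toy AUC: {toy_auc:.3f}")
```

An AUC of 0.5 corresponds to random prediction and 1.0 to perfect separation, which gives a scale for reading the per-category values reported in the abstract.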