Micronesia began to be peopled earlier than other parts of Remote Oceania, but the origins of its inhabitants remain unclear. We generated genome-wide data from 164 ancient and 112 modern individuals. Analysis reveals five migratory streams into Micronesia. Three are East Asian related, one is Polynesian, and a fifth is a Papuan source related to mainland New Guineans that is different from the New Britain-related Papuan source for southwest Pacific populations but is similarly derived from male migrants ~2500 to 2000 years ago. People of the Mariana Archipelago may derive all of their precolonial ancestry from East Asian sources, making them the only Remote Oceanians without Papuan ancestry. Female-inherited mitochondrial DNA was highly differentiated across early Remote Oceanian communities but homogeneous within, implying matrilocal practices whereby women almost never raised their children in communities different from the ones in which they grew up.
Preeclampsia is a multi-organ complication of pregnancy characterized by sudden hypertension and proteinuria that is among the leading causes of preterm delivery and maternal morbidity and mortality worldwide. The heterogeneity of preeclampsia poses a challenge for understanding its etiology and molecular basis. Intriguingly, risk for the condition increases in high-altitude regions such as the Peruvian Andes. To investigate the genetic basis of preeclampsia in a population living at high altitude, we characterized genome-wide variation in a cohort of preeclamptic and healthy Andean families (n = 883) from Puno, Peru, a city located above 3,800 meters of altitude. Our study collected genomic DNA and medical records from case-control trios and duos in local hospital settings. We generated genotype data for 439,314 SNPs, determined global ancestry patterns, and mapped associations between genetic variants and preeclampsia phenotypes. A transmission disequilibrium test (TDT) revealed variants near genes of biological importance for placental and blood vessel function. The top candidate region was found on chromosome 13 of the fetal genome and contains clotting factor genes PROZ, F7, and F10. These findings provide supporting evidence that common genetic variants within coagulation genes play an important role in preeclampsia. A selection scan revealed a potential adaptive signal around the ADAM12 locus on chromosome 10, implicated in pregnancy disorders. Our discovery of an association in a functional pathway relevant to pregnancy physiology in an understudied population of Native American origin demonstrates the increased power of family-based study design and underscores the importance of conducting genetic research in diverse populations.
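The transmission disequilibrium test mentioned above can be illustrated with its classic single-marker form, a McNemar-type chi-square on allele transmissions from heterozygous parents to affected offspring. The sketch below is only an illustration of that textbook statistic; the function name and the transmission counts are hypothetical, and the abstract does not say which TDT implementation or variant the study used.

```python
from math import erfc, sqrt


def tdt_statistic(transmitted: int, untransmitted: int) -> tuple[float, float]:
    """Classic TDT chi-square (1 degree of freedom) and its p-value.

    transmitted:   times heterozygous parents passed the candidate allele
                   to an affected child
    untransmitted: times they passed the alternative allele instead
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative (heterozygous) parental transmissions")
    chi2 = (transmitted - untransmitted) ** 2 / total
    # For 1 degree of freedom the chi-square survival function
    # equals erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts, for illustration only (not taken from the study).
chi2, p = tdt_statistic(transmitted=60, untransmitted=40)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

Under the null hypothesis of no linkage or association, each allele of a heterozygous parent is transmitted with probability 1/2, so a large imbalance between the two counts is evidence against the null. Because the comparison is made within families, the test is robust to population stratification.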
A general imbalance in the proportion of disembarked males and females in the Americas has been documented during the Trans-Atlantic Slave Trade and the Colonial Era and, although less prominent, more recently. This imbalance may have left a signature on the genomes of modern-day populations characterised by high levels of admixture. Analyses of uniparental systems and comparisons of continental ancestry proportions between autosomes and the X chromosome have revealed a general sex imbalance towards males for European ancestry and towards females for African and Indigenous American ancestries. However, the consistency and degree of this imbalance are variable, suggesting that other factors, such as cultural and social practices, may have played a role in shaping it. Moreover, very few investigations have evaluated the sex imbalance using haplotype data, which carry more information than genotypes. Here, we analysed genome-wide data for more than 5000 admixed American individuals to assess the presence, direction and magnitude of sex-biased admixture in the Americas. For this purpose, we applied two haplotype-based approaches, ELAI and NNLS, and we compared them with a genotype-based method, ADMIXTURE. Besides a general agreement between methods, we found that the post-colonial admixture dynamics show higher complexity than previously described.
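The logic of comparing autosomal and X-chromosome ancestry can be made concrete with a deliberately simple model. Assume a single admixture event at equilibrium in which a source population supplies a fraction f of the founding females and a fraction m of the founding males. Autosomal ancestry is then (f + m) / 2, and X-chromosome ancestry is (2f + m) / 3 because two thirds of X chromosomes are carried by females. The sketch below inverts those two equations. The model, the function name and the input values are assumptions made for illustration; they are not the procedure applied with ELAI, NNLS or ADMIXTURE in the study.

```python
def sex_specific_contributions(autosomal: float, x_chrom: float) -> tuple[float, float]:
    """Solve A = (f + m) / 2 and X = (2f + m) / 3 for f and m.

    autosomal: ancestry proportion from one source on the autosomes (A)
    x_chrom:   ancestry proportion from the same source on the X chromosome (X)
    Returns (female_contribution, male_contribution).
    """
    female = 3 * x_chrom - 2 * autosomal
    male = 4 * autosomal - 3 * x_chrom
    return female, male


# Hypothetical proportions: lower ancestry on the X than on the autosomes
# points to a male-biased contribution from this source.
female, male = sex_specific_contributions(autosomal=0.5, x_chrom=0.4)
print(f"female contribution = {female:.2f}, male contribution = {male:.2f}")
```

With real estimates the solved values can fall outside the 0 to 1 range, which is itself a sign that a single-pulse equilibrium model is too simple and is in line with the abstract's conclusion that admixture dynamics were more complex.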
Papua New Guinea (PNG) hosts distinct environments mainly represented by the ecoregions of the Highlands and Lowlands that display increased altitude and a predominance of pathogens, respectively. Since its initial peopling approximately 50,000 years ago, inhabitants of these ecoregions might have differentially adapted to the environmental pressures exerted by each of them. However, the genetic basis of adaptation in populations from these areas remains understudied. Here, we investigated signals of positive selection in 62 highlanders and 43 lowlanders across 14 locations in the main island of PNG using whole-genome genotype data from the Oceanian Genome Variation Project (OGVP) and searched for signals of positive selection through population differentiation and haplotype-based selection scans. Additionally, we performed archaic ancestry estimation to detect selection signals in highlanders within introgressed regions of the genome. Among highland populations we identified candidate genes representing known biomarkers for mountain sickness (SAA4, SAA1, PRDX1, LDHA) as well as candidate genes of the Notch signaling pathway (PSEN1, NUMB, RBPJ, MAML3), a novel proposed pathway for high altitude adaptation in multiple organisms. We also identified candidate genes involved in oxidative stress, inflammation, and angiogenesis, processes inducible by hypoxia, as well as in components of the eye lens and the immune response. In contrast, candidate genes in the lowlands are mainly related to the immune response (HLA-DQB1, HLA-DQA2, TAAR6, TAAR9, TAAR8, RNASE4, RNASE6, ANG). Moreover, we find two candidate regions to be also enriched with archaic introgressed segments, suggesting that archaic admixture has played a role in the local adaptation of PNG populations.
The present dataset comprises 36,931 SNPs genotyped in 46 maize landraces native to Mexico as well as the teosinte subspecies Zea mays ssp. parviglumis and ssp. mexicana. These landraces were collected directly from farmers mostly between 2006 and 2010. We accompany these data with a short description of the variation within each landrace, as well as maps, principal component analyses and neighbor joining trees showing the distribution of the genetic diversity relative to landrace, geographical features and maize biogeography. High levels of genetic variation were detected for the maize landraces (HE = 0.234 to 0.318, mean 0.311), while slightly lower levels were detected in Zea m. mexicana and Zea m. parviglumis (HE = 0.262 and 0.234, respectively). The distribution of genetic variation was better explained by environmental variables given by the interaction of altitude and latitude than by landrace identity. This dataset is a follow-up product of the Global Native Maize Project, an initiative to update the data on Mexican maize landraces and their wild relatives, and to generate information that is necessary for implementing the Mexican Biosafety Law.
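For readers unfamiliar with the HE values quoted above, expected heterozygosity at a biallelic SNP with allele frequency p is 2p(1 - p), and a population-level value is commonly obtained by averaging over loci. The sketch below shows that calculation without any small-sample correction. The function name and the allele frequencies are hypothetical, and the abstract does not state which estimator the authors used.

```python
def expected_heterozygosity(allele_freqs: list[float]) -> float:
    """Mean expected heterozygosity across biallelic SNPs.

    allele_freqs: frequency of one allele at each SNP, each in [0, 1]
    """
    if not allele_freqs:
        raise ValueError("at least one SNP is required")
    # Per-locus HE for a biallelic marker: 1 - p^2 - (1 - p)^2 = 2p(1 - p).
    per_locus = [2 * p * (1 - p) for p in allele_freqs]
    return sum(per_locus) / len(per_locus)


# Hypothetical allele frequencies at three SNPs in one landrace.
he = expected_heterozygosity([0.5, 0.1, 0.25])
print(f"HE = {he:.3f}")
```

The per-locus maximum for a biallelic marker is 0.5, reached when both alleles are equally frequent, which gives a scale for reading the reported range of 0.234 to 0.318.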
Current Genome-Wide Association Studies (GWAS) rely on genotype imputation to increase statistical power, improve fine-mapping of association signals, and facilitate meta-analyses. Due to the complex demographic history of Latin America and the lack of balanced representation of Native American genomes in current imputation panels, locally relevant disease variants are likely to be missed, limiting the scope and impact of biomedical research in these populations. Better representation of diversity in genomic databases is therefore a scientific imperative. Here, we expand the 1000 Genomes reference panel (1KGP) with 134 Native American genomes (1KGP + NAT) to assess imputation performance in Latin American individuals of mixed ancestry. Our panel increased the number of SNPs above the GWAS quality threshold, thus improving statistical power for association studies in the region. It also increased imputation accuracy, particularly in low-frequency variants segregating in Native American ancestry tracts. The improvement is subtle but consistent across countries and proportional to the number of genomes added from local source populations. To project the potential improvement with a higher number of reference genomes, we performed simulations and found that at least 3,000 Native American genomes are needed to equal the imputation performance of variants in European ancestry tracts. This reflects the concerning imbalance of diversity in current references and highlights the contribution of our work to reducing it while complementing efforts to improve global equity in genomic research.
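One common way to quantify imputation accuracy at a single variant is the squared Pearson correlation between the true genotypes (0, 1 or 2 copies of an allele) and the imputed dosages. The sketch below computes that quantity with the standard library only. It is an illustration under that assumption: the function name and the data are hypothetical, and the abstract does not specify which accuracy metric or quality threshold the authors applied.

```python
def dosage_r2(true_genotypes: list[float], imputed_dosages: list[float]) -> float:
    """Squared Pearson correlation between true genotypes and imputed dosages."""
    if len(true_genotypes) != len(imputed_dosages) or not true_genotypes:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(true_genotypes)
    mean_t = sum(true_genotypes) / n
    mean_d = sum(imputed_dosages) / n
    cov = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(true_genotypes, imputed_dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_d = sum((d - mean_d) ** 2 for d in imputed_dosages)
    if var_t == 0 or var_d == 0:
        # Correlation is undefined when either vector is constant
        # (for example, a monomorphic variant in the test sample).
        return float("nan")
    return cov * cov / (var_t * var_d)


# Hypothetical data for four individuals at one variant.
r2 = dosage_r2([0, 1, 2, 1], [0.1, 1.0, 1.9, 1.0])
print(f"r2 = {r2:.3f}")
```

Low-frequency variants tend to score poorly on this metric because few carriers are available to anchor the correlation, which is why gains in such variants are a useful indicator of a better-matched reference panel.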
It has been suggested that the higher susceptibility of Hispanics to metabolic disease is related to their Native American heritage. A frequent cholesterol transporter ABCA1 (ATP-binding cassette transporter A1) gene variant (R230C, rs9282541) apparently exclusive to Native American individuals was associated with low high-density lipoprotein cholesterol (HDL-C) levels, obesity and type 2 diabetes in Mexican Mestizos. We performed a more extensive analysis of this variant in 4405 Native Americans and 863 individuals from other ethnic groups to investigate genetic evidence of positive selection, to assess its functional effect in vitro and to explore associations with HDL-C levels and other metabolic traits. The C230 allele was found in 29 of 36 Native American groups, but not in European, Asian or African individuals. C230 was observed on a single haplotype, and C230-bearing chromosomes showed longer relative haplotype extension compared with other haplotypes in the Americas. Additionally, single-nucleotide polymorphism data from the Human Genome Diversity Panel Native American populations were enriched in significant integrated haplotype score values in the region upstream of the ABCA1 gene. Cells expressing the C230 allele showed a 27% cholesterol efflux reduction (P < 0.001), confirming this variant has a functional effect in vitro. Moreover, the C230 allele was associated with lower HDL-C levels (P = 1.77 × 10⁻¹¹) and with higher body mass index (P = 0.0001) in the combined analysis of Native American populations. This is the first report of a common functional variant exclusive to Native American and descent populations, which is a major determinant of HDL-C levels and may have contributed to the adaptive evolution of Native American populations.
Many disease-susceptible SNPs exhibit significant disparity in ancestral and derived allele frequencies across worldwide populations. While previous studies have examined population differentiation of alleles at specific SNPs, global ethnic patterns of ensembles of disease risk alleles across human diseases are unexamined. To examine these patterns, we manually curated ethnic disease association data from 5,065 papers on human genetic studies representing 1,495 diseases, recording the precise risk alleles and their measured population frequencies and estimated effect sizes. We systematically compared the population frequencies of cross-ethnic risk alleles for each disease across 1,397 individuals from 11 HapMap populations, 1,064 individuals from 53 HGDP populations, and 49 individuals with whole-genome sequences from 10 populations. Type 2 diabetes (T2D) demonstrated extreme directional differentiation of risk allele frequencies across human populations, compared with null distributions of European-frequency matched control genomic alleles and risk alleles for other diseases. Most T2D risk alleles share a consistent pattern of decreasing frequencies along human migration into East Asia. Furthermore, we show that these patterns contribute to disparities in predicted genetic risk across 1,397 HapMap individuals, T2D genetic risk being consistently higher for individuals in the African populations and lower in the Asian populations, irrespective of the ethnicity considered in the initial discovery of risk alleles. We observed a similar pattern in the distribution of T2D Genetic Risk Scores, which are associated with an increased risk of developing diabetes in the Diabetes Prevention Program cohort, for the same individuals. This disparity may be attributable to the promotion of energy storage and usage appropriate to environments and inconsistent energy intake.
Our results indicate that the differential frequencies of T2D risk alleles may contribute to the observed disparity in T2D incidence rates across ethnic populations.
Current South American populations trace their origins mainly to three continental ancestries, i.e. European, Amerindian and African. Individual variation in relative proportions of each of these ancestries may be confounded with socio-economic factors due to population stratification. Therefore, ancestry is a potential confounder variable that should be considered in epidemiologic studies and in public health plans. However, there are few studies that have assessed the ancestry of the current admixed Chilean population. This is partly due to the high cost of genome-scale technologies commonly used to estimate ancestry. In this study we have designed a small panel of SNPs to accurately assess ancestry in the largest sampling to date of the Chilean mestizo population (n = 3349) from eight cities. Our panel is also able to distinguish between the two main Amerindian components of Chileans: Aymara from the north and Mapuche from the south.
A panel of 150 ancestry-informative markers (AIMs) of SNP type was selected to maximize ancestry informativeness and genome coverage. Of these, 147 were successfully genotyped by KASPar assays in 2843 samples, with an average missing rate of 0.012, and a 0.95 concordance with microarray data. The ancestries estimated with the panel of AIMs had relatively high correlations (0.88 for European, 0.91 for Amerindian, 0.70 for Aymara, and 0.68 for Mapuche components) with those obtained with the AXIOM LAT1 array. The country's average ancestry was 0.53 ± 0.14 European, 0.04 ± 0.04 African, and 0.42 ± 0.14 Amerindian, disaggregated into 0.18 ± 0.15 Aymara and 0.25 ± 0.13 Mapuche. However, Mapuche ancestry was highest in the south (40.03%) and Aymara in the north (35.61%), as expected from the historical location of these ethnic groups. We make our results available through an online app and demonstrate how it can be used to adjust for ancestry when testing association between incidence of a disease and nongenetic risk factors.
We have conducted the most extensive sampling, across many different cities, of the current Chilean population. Ancestry varied significantly by latitude and human development. The panel of AIMs is available to the community for estimating ancestry at low cost in Chileans and other populations with similar ancestry.