Linear mixed models (LMMs) are widely used in genome-wide association studies (GWASs) to account for population structure and relatedness, for both continuous and binary traits. Motivated by the failure of LMMs to control type I errors in a GWAS of asthma, a binary trait, we show that LMMs are generally inappropriate for analyzing binary traits when population stratification leads to violation of the LMM's constant residual variance assumption. To overcome this problem, we develop a computationally efficient logistic mixed model approach for genome-wide analysis of binary traits, the generalized linear mixed model association test (GMMAT). This approach fits a logistic mixed model once per GWAS and performs score tests under the null hypothesis of no association between a binary trait and individual genetic variants. We show in simulation studies and real data analysis that GMMAT effectively controls for population structure and relatedness when analyzing binary traits in a wide variety of study designs.
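The "fit the null model once, then score-test each variant" idea in the abstract above can be illustrated with a deliberately simplified sketch. It assumes a plain logistic null model with fixed-effect covariates only and no random effects for relatedness, so it is not GMMAT itself; the function names `fit_null_logistic` and `score_test` and the simulated data are hypothetical and chosen only for illustration.

```python
import math
import numpy as np


def fit_null_logistic(X, y, n_iter=25):
    """Fit a fixed-effects logistic null model by Newton-Raphson.

    Assumption: no random effects, unlike the logistic mixed model in the abstract.
    Returns the fitted probabilities under the null (no variant in the model).
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))
        w = mu * (1.0 - mu)
        grad = X.T @ (y - mu)
        hess = X.T @ (X * w[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return 1.0 / (1.0 + np.exp(-X @ beta))


def score_test(g, X, y, mu):
    """One-degree-of-freedom score test for a single variant g.

    Reuses the null fit (mu), so nothing is refitted per variant.
    """
    w = mu * (1.0 - mu)
    score = g @ (y - mu)
    # Variance of the score after projecting out the covariates.
    XtWX = X.T @ (X * w[:, None])
    XtWg = X.T @ (w * g)
    var = g @ (w * g) - XtWg @ np.linalg.solve(XtWX, XtWg)
    stat = score ** 2 / var
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Simulated example: one covariate, one causal and one null variant.
rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
g_causal = rng.binomial(2, 0.3, size=n).astype(float)
g_null = rng.binomial(2, 0.3, size=n).astype(float)
logit = -0.5 + 0.5 * X[:, 1] + 0.8 * g_causal
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(float)

mu_null = fit_null_logistic(X, y)  # fitted once for the whole scan
for name, g in [("causal", g_causal), ("null", g_null)]:
    stat, p = score_test(g, X, y, mu_null)
    print(name, round(stat, 2), p)
```

Because the null fit is shared, the per-variant cost is a few vector products, which is the source of the computational efficiency the abstract refers to; a mixed-model version would replace the weight matrix with one that also accounts for the random effects.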
Populations change in size over time due to factors such as population growth, migration, bottleneck events, natural disasters, and disease. The historical effective size of a population affects the power and resolution of genetic association studies. For admixed populations, it is not only the overall effective population size that is of interest, but also the effective sizes of the component ancestral populations. We use identity by descent and local ancestry inferred from genome-wide genetic data to estimate overall and ancestry-specific effective population size during the past hundred generations for nine admixed American populations from the Hispanic Community Health Study/Study of Latinos, and for African-American and European-American populations from two US cities. In these populations, the estimated pre-admixture effective sizes of the ancestral populations vary by sampled population, suggesting that the ancestors of different sampled populations were drawn from different sub-populations. In addition, we estimate that overall effective population sizes dropped substantially in the generations immediately after the commencement of European and African immigration, reaching a minimum around 12 generations ago, but rebounded within a small number of generations afterwards. Of the populations that we considered, the population of individuals originating from Puerto Rico has the smallest bottleneck size of one thousand, while the Pittsburgh African-American population has the largest bottleneck size of two hundred thousand.
GWASTools is an R/Bioconductor package for quality control and analysis of genome-wide association studies (GWAS). GWASTools brings the interactive capability and extensive statistical libraries of R to GWAS. Data are stored in NetCDF format to accommodate extremely large datasets that cannot fit within R's memory limits. The documentation includes instructions for converting data from multiple formats, including variants called from sequencing. GWASTools provides a convenient interface for linking genotypes and intensity data with sample and single nucleotide polymorphism annotation.
Variation in levels of the human metabolome reflects changes in homeostasis, providing a window into health and disease. The genetic impact on circulating metabolites in Hispanics, a population with high cardiometabolic disease burden, is largely unknown. We conducted genome-wide association analyses on 640 circulating metabolites in 3,926 Hispanic Community Health Study/Study of Latinos participants. The estimated heritability for the 640 metabolites ranged from 0% to 54%, with a median of 2.5%. We discovered 46 variant-metabolite pairs (p value < 1.2 × 10⁻¹⁰, minor allele frequency ≥ 1%, proportion of variance explained [PEV] mean = 3.4%, PEV range = 1%–22%) with generalized effects in two population-based studies and confirmed 301 known locus-metabolite associations. Half of the identified variants with generalized effect were located in genes, including five nonsynonymous variants. We identified co-localization with expression quantitative trait loci at 105 discovered and 151 known locus-metabolite sets. rs5855544, upstream of SLC51A, was associated with higher levels of three steroid sulfates and co-localized with expression levels of SLC51A in several tissues. Mendelian randomization (MR) analysis identified several metabolites associated with coronary heart disease (CHD) and type 2 diabetes. For example, two variants located in or near CYP4F2 (rs2108622 and rs79400241, respectively), a gene involved in vitamin E metabolism, were associated with the levels of octadecanedioate and vitamin E metabolites (gamma-CEHC and gamma-CEHC glucuronide); MR analysis showed that genetically high levels of these metabolites were associated with lower odds of CHD. Our findings document the genetic architecture of circulating metabolites in an underrepresented Hispanic/Latino community, shedding light on disease etiology.
Numerous lines of evidence point to a genetic basis for facial morphology in humans, yet little is known about how specific genetic variants relate to the phenotypic expression of many common facial features. We conducted genome-wide association meta-analyses of 20 quantitative facial measurements derived from the 3D surface images of 3118 healthy individuals of European ancestry belonging to two US cohorts. Analyses were performed on just under one million genotyped SNPs (Illumina OmniExpress+Exome v1.2 array) imputed to the 1000 Genomes reference panel (Phase 3). We observed genome-wide significant associations (p < 5 × 10⁻⁸) for cranial base width at 14q21.1 and 20q12, intercanthal width at 1p13.3 and Xq13.2, nasal width at 20p11.22, nasal ala length at 14q11.2, and upper facial depth at 11q22.1. Several genes in the associated regions are known to play roles in craniofacial development or in syndromes affecting the face: MAFB, PAX9, MIPOL1, ALX3, HDAC8, and PAX1. We also tested genotype-phenotype associations reported in two previous genome-wide studies and found evidence of replication for nasal ala length and SNPs in CACNA2D3 and PRDM16. These results provide further evidence that common variants in regions harboring genes of known craniofacial function contribute to normal variation in human facial features. Improved understanding of the genes associated with facial morphology in healthy individuals can provide insights into the pathways and mechanisms controlling normal and abnormal facial morphogenesis.
African ancestry alleles may contribute to CKD among Hispanics/Latinos, but whether associations differ by Hispanic/Latino background remains unknown. We examined the association of CKD measures with African ancestry-specific APOL1 alleles that were directly genotyped and sickle cell trait (hemoglobin subunit β gene [HBB] variant) on the basis of imputation in 12,226 adult Hispanics/Latinos grouped according to Caribbean or Mainland background. We also performed an unbiased genome-wide association scan of urine albumin-to-creatinine ratios. Overall, 41.4% of participants were male, 44.6% of participants had a Caribbean background, and the mean age of all participants was 46.1 years. The Caribbean background group, compared with the Mainland background group, had a higher frequency of two APOL1 alleles (1.0% versus 0.1%) and the HBB variant (2.0% versus 0.7%). In the Caribbean background group, presence of APOL1 alleles (2 versus 0/1 copies) or the HBB variant (1 versus 0 copies) was significantly associated with albuminuria (odds ratio [OR], 3.2; 95% confidence interval [95% CI], 1.7 to 6.1; and OR, 2.6; 95% CI, 1.8 to 3.8, respectively) and albuminuria and/or eGFR<60 ml/min per 1.73 m² (OR, 2.9; 95% CI, 1.5 to 5.4; and OR, 2.4; 95% CI, 1.7 to 3.5, respectively). The urine albumin-to-creatinine ratio genome-wide association scan identified associations with the HBB variant among all participants, with the strongest association in the Caribbean background group (P=3.1×10^[…] versus P=9.3×10^[…] for the Mainland background group). In conclusion, African-specific alleles associate with CKD in Hispanics/Latinos, but allele frequency varies by Hispanic/Latino background/ancestry.
Blood soluble E-selectin (sE-selectin) levels have been related to various conditions such as type 2 diabetes. We performed a genome-wide association study among women of European ancestry from the Nurses' Health Study, and identified genome-wide significant associations between a cluster of markers at the ABO locus (9q34) and plasma sE-selectin concentration. The strongest association was with rs651007, which explained ∼9.71% of the variation in sE-selectin concentrations. SNP rs651007 was also nominally associated with soluble intercellular adhesion molecule-1 (sICAM-1) (P = 0.026) and TNF-R2 levels (P = 0.018), independent of sE-selectin. In addition, the genetically inferred ABO blood group genotypes were associated with sE-selectin concentrations (P = 3.55 × 10⁻⁴⁷). Moreover, we found that the genetically inferred blood group B was associated with a decreased risk (OR = 0.44, 0.27–0.70) of type 2 diabetes compared with blood group O, adjusting for sE-selectin, sICAM-1, TNF-R2 and other covariates. Our findings indicate that the genetic variants at the ABO locus affect plasma sE-selectin levels and diabetes risk. The genetic associations with diabetes risk were independent of sE-selectin levels.
Co-inheritance of α-thalassemia has a significant protective effect on the severity of complications of sickle cell disease (SCD), including stroke. However, little information exists on the association and interactions for the common African ancestral α-thalassemia mutation (-α3.7 deletion) and β-globin traits (HbS trait [SCT] and HbC trait) on important clinical phenotypes such as red blood cell parameters, anemia, and chronic kidney disease (CKD). In a community-based cohort of 2,916 African Americans from the Jackson Heart Study, we confirmed the expected associations between SCT, HbC trait, and the -α3.7 deletion with lower mean corpuscular volume/mean corpuscular hemoglobin and higher red blood cell count and red cell distribution width. In addition to the recently recognized association of SCT with lower estimated glomerular filtration rate and glycated hemoglobin (HbA1c), we observed a novel association of the -α3.7 deletion with higher HbA1c levels. Co-inheritance of each additional copy of the -α3.7 deletion significantly lowered the risk of anemia and chronic kidney disease among individuals with SCT (P-interaction = 0.031 and 0.019, respectively). Furthermore, co-inheritance of a novel α-globin regulatory variant was associated with normalization of red cell parameters in individuals with the -α3.7 deletion and significantly negated the protective effect of α-thalassemia on stroke in 1,139 patients with sickle cell anemia from the Cooperative Study of Sickle Cell Disease (CSSCD) (P-interaction = 0.0049). Functional assays determined that rs11865131, located in the major alpha-globin enhancer MCS-R2, was the most likely causal variant. These findings suggest that common α- and β-globin variants interact to influence hematologic and clinical phenotypes in African Americans, with potential implications for risk-stratification and counseling of individuals with SCD and SCT.
Dental caries is the most common chronic disease worldwide, and exhibits profound disparities in the USA, with racial and ethnic minorities experiencing disproportionate disease burden. Though heritable, the specific genes influencing risk of dental caries remain largely unknown. Therefore, we performed genome-wide association scans (GWASs) for dental caries in a population-based cohort of 12 000 Hispanic/Latino participants aged 18-74 years from the HCHS/SOL. Intra-oral examinations were used to generate two common indices of dental caries experience, which were tested for association with 27.7 M genotyped or imputed single-nucleotide polymorphisms separately in the six ancestry groups. A mixed-models approach was used, which adjusted for age, sex, recruitment site, five principal components of ancestry and additional features of the sampling design. Meta-analyses were used to combine GWAS results across ancestry groups. Heritability estimates ranged from 20% to 53% in the six ancestry groups. The most significant association observed via meta-analysis for both phenotypes was in the region of the NAMPT gene (rs190395159; P-value = 6 × 10⁻¹⁰), which is involved in many biological processes including periodontal healing. Another significant association was observed for rs72626594 (P-value = 3 × 10⁻⁸) downstream of BMP7, a tooth development gene. Other associations were observed in genes lacking known or plausible roles in dental caries. In conclusion, this was the largest GWAS of dental caries to date and was the first to target Hispanic/Latino populations. Understanding the factors influencing dental caries susceptibility may lead to improvements in prediction, prevention and disease management, which may ultimately reduce the disparities in oral health across racial, ethnic and socioeconomic strata.
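The abstract above does not say which meta-analysis method was used to combine the ancestry-group results. As one common possibility, and purely as an assumption, the sketch below shows a fixed-effects inverse-variance weighted combination of per-group effect estimates for a single variant. The function name `inverse_variance_meta` and the input numbers are hypothetical and are not taken from the study.

```python
import math


def inverse_variance_meta(betas, ses):
    """Fixed-effects inverse-variance meta-analysis for one variant.

    betas: per-group effect estimates; ses: their standard errors.
    Returns the pooled effect, its standard error, the Z statistic,
    and a two-sided normal p-value.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p_value


# Hypothetical per-group estimates for six ancestry groups (illustrative only).
group_betas = [0.12, 0.08, 0.15, 0.10, 0.05, 0.11]
group_ses = [0.04, 0.05, 0.06, 0.03, 0.07, 0.05]
print(inverse_variance_meta(group_betas, group_ses))
```

Groups with smaller standard errors (typically the larger groups) receive more weight, so the pooled estimate is driven mainly by the best-powered strata; a random-effects or sample-size weighted scheme would be a reasonable alternative when effects are expected to differ between groups.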