The association between a geographical region and one or more mtDNA haplogroups has provided the basis for using mtDNA haplogroups to infer an individual’s place of origin and genetic ancestry. Although it is well known that ancestry inferences using mtDNA haplogroups and those using genome-wide markers are frequently discrepant, little empirical information exists on the magnitude and scope of such discrepancies between multiple mtDNA haplogroups and worldwide populations. We compared genetic-ancestry inferences made by mtDNA-haplogroup membership to those made by autosomal SNPs in ∼940 samples of the Human Genome Diversity Panel and recently admixed populations from the 1000 Genomes Project. Continental-ancestry proportions often varied widely among individuals sharing the same mtDNA haplogroup. For only half of mtDNA haplogroups did the highest average continental-ancestry proportion match the highest continental-ancestry proportion of a majority of individuals with that haplogroup. Prediction of an individual’s mtDNA haplogroup from his or her continental-ancestry proportions was often incorrect. Collectively, these results indicate that for most individuals in the worldwide populations sampled, mtDNA-haplogroup membership provides limited information about either continental ancestry or continental region of origin.
Biological mechanisms underlying human germline mutations remain largely unknown. We statistically decompose variation in the rate and spectra of mutations along the genome using volume-regularized nonnegative matrix factorization. The analysis of a sequencing dataset (TOPMed) reveals nine processes that explain the variation in mutation properties between loci. We provide a biological interpretation for seven of these processes. We associate one process with bulky DNA lesions that are resolved asymmetrically with respect to transcription and replication. Two processes track the direction of the replication fork and replication timing, respectively. We identify a mutagenic effect of active demethylation primarily acting in regulatory regions and a mutagenic effect of long interspersed nuclear elements. We localize a mutagenic process specific to oocytes from population sequencing data. This process appears transcriptionally asymmetric.
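As a rough illustration of the decomposition idea in the abstract above, the sketch below factors a toy loci-by-mutation-type matrix into nonnegative "process" loadings and spectra. It uses plain multiplicative-update NMF (Lee and Seung), not the volume-regularized variant the abstract names. The function name, matrix sizes, and random toy data are all assumptions made for illustration.

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Plain NMF, V ~ W @ H, via Lee-Seung multiplicative updates.

    NOTE: a stand-in for illustration; it has no volume regularizer.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = V.shape
    W = rng.random((n_rows, rank)) + eps   # per-locus process intensities
    H = rng.random((rank, n_cols)) + eps   # per-process mutation spectra
    for _ in range(n_iter):
        # Each update keeps the factors nonnegative and does not
        # increase the squared reconstruction error.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def relative_error(V, W, H):
    """Frobenius reconstruction error relative to the norm of V."""
    return np.linalg.norm(V - W @ H) / np.linalg.norm(V)

# Hypothetical toy data: 50 loci x 6 mutation types from 2 hidden processes.
toy_rng = np.random.default_rng(1)
V = toy_rng.random((50, 2)) @ toy_rng.random((2, 6))

W_start, H_start = nmf_multiplicative(V, rank=2, n_iter=1)
W, H = nmf_multiplicative(V, rank=2, n_iter=200)
err_start = relative_error(V, W_start, H_start)
err_final = relative_error(V, W, H)
```

The rows of `H` play the role of process spectra and the columns of `W` give each locus's exposure to a process. A volume penalty on the spectra would be an additional term in the objective and is not modeled here.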
Determining historical sex ratios throughout human evolution can provide insight into patterns of genomic variation, the structure and composition of ancient populations, and the cultural factors that influence the sex ratio (e.g., sex-specific migration rates). Although numerous studies have suggested that unequal sex ratios have existed in human evolutionary history, a coherent picture of sex-biased processes has yet to emerge. For example, two recent studies compared human X chromosome to autosomal variation to make inferences about historical sex ratios but reached seemingly contradictory conclusions, with one study finding evidence for a male bias and the other study identifying a female bias. Here, we show that a large part of this discrepancy can be explained by methodological differences. Specifically, through reanalysis of empirical data, derivation of explicit analytical formulae, and extensive simulations we demonstrate that two estimators of the effective sex ratio based on population structure and nucleotide diversity preferentially detect biases that have occurred on different timescales. Our results clarify apparently contradictory evidence on the role of sex-biased processes in human evolutionary history and show that extant patterns of human genomic variation are consistent with both a recent male bias and an earlier, persistent female bias.
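For readers unfamiliar with how X-to-autosome comparisons relate to the sex ratio, the sketch below encodes the textbook expectation for the ratio of X-linked to autosomal effective population size as a function of the breeding female fraction. It assumes an idealized constant-size population with random mating and Poisson offspring numbers. It is not one of the estimators derived in the study above, and the function names are hypothetical.

```python
def x_to_autosome_ne_ratio(p_female: float) -> float:
    """Expected N_e(X) / N_e(autosomes) for a breeding female fraction p_female.

    Follows from N_e(A) = 4*Nm*Nf / (Nm + Nf) and
    N_e(X) = 9*Nm*Nf / (4*Nm + 2*Nf), with p_female = Nf / (Nm + Nf).
    """
    if not 0.0 < p_female < 1.0:
        raise ValueError("p_female must lie strictly between 0 and 1")
    return 9.0 / (8.0 * (2.0 - p_female))

def female_fraction_from_ratio(q: float) -> float:
    """Invert the relation: the female fraction implied by an observed ratio q."""
    return 2.0 - 9.0 / (8.0 * q)

equal_sexes = x_to_autosome_ne_ratio(0.5)   # 0.75 when the sexes are balanced
female_biased = x_to_autosome_ne_ratio(0.8)  # above 0.75
male_biased = x_to_autosome_ne_ratio(0.2)    # below 0.75
```

Under these assumptions, a ratio above 0.75 points to a female bias and a ratio below 0.75 to a male bias. Estimators built on different summaries of variation can weight different time periods, which is the discrepancy the abstract addresses.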
Abstract
Introduction
Genetic variants associated with nicotine dependence have previously been identified, primarily in European-ancestry populations. No genome-wide association studies (GWAS) have been reported for smoking behaviors in Hispanics/Latinos in the United States and Latin America, who are of mixed ancestry with European, African, and American Indigenous components.
Methods
We examined genetic associations with smoking behaviors in the Hispanic Community Health Study/Study of Latinos (HCHS/SOL) (N = 12 741 with smoking data, 5119 ever-smokers), using ~2.3 million genotyped variants imputed to the 1000 Genomes Project phase 3. Mixed logistic regression models accounted for population structure, sampling, relatedness, sex, and age.
Results
The known region of CHRNA5, which encodes the α5 cholinergic nicotinic receptor subunit, was associated with heavy smoking at genome-wide significance (p ≤ 5 × 10⁻⁸) in a comparison of 1929 ever-smokers reporting cigarettes per day (CPD) > 10 versus 3156 reporting CPD ≤ 10. The functional variant rs16969968 in CHRNA5 had a p value of 2.20 × 10⁻⁷ and odds ratio (OR) of 1.32 for the minor allele (A); its minor allele frequency was 0.22 overall and similar across Hispanic/Latino background groups (Central American = 0.17; South American = 0.19; Mexican = 0.18; Puerto Rican = 0.22; Cuban = 0.29; Dominican = 0.19). CHRNA4 on chromosome 20 attained p < 10⁻⁴, supporting prior findings in non-Hispanics. For nondaily smoking, which is prevalent in Hispanic/Latino smokers, compared to daily smoking, loci on chromosomes 2 and 4 achieved genome-wide significance; replication attempts were limited by small Hispanic/Latino sample sizes.
Conclusions
Associations of nicotinic receptor gene variants with smoking, first reported in non-Hispanic European-ancestry populations, generalized to Hispanics/Latinos despite different patterns of smoking behavior.
Implications
We conducted the first large-scale genome-wide association study (GWAS) of smoking behavior in a US Hispanic/Latino cohort, and the first GWAS of daily/nondaily smoking in any population. Results show that the region of the nicotinic receptor subunit gene CHRNA5, which in non-Hispanic European-ancestry smokers has been associated with heavy smoking as well as cessation and treatment efficacy, is also significantly associated with heavy smoking in this Hispanic/Latino cohort. The results are an important addition to understanding the impact of genetic variants in understudied Hispanic/Latino smokers.
Despite ethnic disparities in lipid profiles, there are few genome-wide association studies investigating genetic variation of lipids in non-European ancestry populations. In this study, we present findings from genetic association analyses for total cholesterol, low density lipoprotein cholesterol (LDL), high density lipoprotein cholesterol (HDL), and triglycerides in a large Hispanic/Latino cohort in the U.S., the Hispanic Community Health Study / Study of Latinos (HCHS/SOL).
We estimated a heritability of approximately 20% for each lipid trait, similar to previous estimates in Europeans. To search for novel lipid loci, we performed conditional association analysis in which the statistical model was adjusted for previously reported SNPs associated with any of the four lipid traits. SNPs that remained genome-wide significant (P < 5 × 10⁻⁸) after conditioning on known loci were evaluated for replication.
We identified eight potentially novel lipid signals with minor allele frequencies <1%, none of which replicated. We tested previously reported SNP-trait associations for generalization to Hispanics/Latinos via a statistical framework. The generalization analysis revealed that approximately 50% of previously established lipid variants generalize to HCHS/SOL based on directional FDR r-value < 0.05. Some failures to generalize were due to lack of power.
These results demonstrate that many loci associated with lipid levels are shared across populations.
Objective
Associations of IRS1 genetic variation with adiposity and metabolic profile in U.S. Hispanic/Latino individuals of diverse backgrounds were examined.
Methods
IRS1 variants previously identified by genome‐wide association studies (rs2943650, rs2972146, rs2943641, and rs2943634) as related to body fat percentage (BF%) and multiple metabolic traits were tested among up to 12,730 adults (5,232 men; 7,515 women) from the Hispanic Community Health Study/Study of Latinos.
Results
The C‐allele (frequency = 26%) of rs2943650 was significantly associated with higher BF% overall (β = 0.34 ± 0.11% per allele; P = 0.002) and in women (β = 0.41 ± 0.14% per C‐allele; P = 0.003), but not in men (β = 0.28 ± 0.18% per C‐allele; P = 0.11), though there was no significant sex difference. Using the inverse normal‐transformed data to compare effect sizes, it was found that the association with BF% was stronger in Hispanic/Latino women than that previously reported in European women (β = 0.054 ± 0.018SD vs. β = 0.008 ± 0.011SD per C‐allele; P = 0.03). The BF%‐increasing allele of rs2943650 was significantly associated with lower levels of fasting insulin, homeostatic model assessment of insulin resistance, hemoglobin A1c, and triglycerides and higher high‐density lipoprotein cholesterol (P < 0.05).
Conclusions
This study confirmed and extended previous findings of IRS1 variation associated with increased adiposity but a favorable metabolic profile in U.S. Hispanics/Latinos, with a relatively stronger genetic effect on BF% in Hispanic/Latino women compared with European women.
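The comparison of effect sizes between Hispanic/Latino and European women reported in the Results can be sanity-checked with a simple two-sample z-test. The sketch assumes that the "±" values are standard errors and that the two estimates come from independent samples. Neither assumption is stated in the abstract, and the test the authors actually used may differ.

```python
import math

def two_sided_p_for_difference(beta_a, se_a, beta_b, se_b):
    """Two-sided normal p-value for beta_a - beta_b, assuming independence."""
    z = (beta_a - beta_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p

# Per-allele effects on inverse normal-transformed BF% quoted in the abstract
# (SD units): Hispanic/Latino women versus European women.
z_score, p_value = two_sided_p_for_difference(0.054, 0.018, 0.008, 0.011)
```

Under these assumptions the z-score is a little under 2.2 and the p-value rounds to 0.03, in line with the value quoted in the abstract.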
The advent of large-scale population genomic datasets has enabled detailed inferences regarding human evolutionary history. Demographic changes and positive selection have left their marks on the genome and we can now begin to decipher them. In this dissertation, I present the work I have completed on the topic of human population genomic inference. In chapter 1, I begin by reviewing the importance of human genetic variation and the factors that influence it, focusing on the effects of demographic changes and positive selection. Chapter 2 describes an analysis of genetic ancestry in a worldwide sample of human populations. I show that mitochondrial lineage tests overlook large amounts of variation in genetic ancestry. In chapter 3, I focus on inferences regarding the effective sex ratio in the recent evolutionary past. I present a reanalysis of SNP and resequencing data that resolves a set of conflicting results from previous studies. Using coalescent simulations, I present a model of a recent male bias in effective population size, coupled with an earlier female bias, which is consistent with existing genetic variation on the X chromosome and the autosomes. In chapter 4, I present a comprehensive study of the performance of a battery of neutrality test statistics under a wide range of realistic models of positive selection in recent human evolution. I demonstrate that existing tests perform better than expected for detecting the signatures of a soft sweep from standing variation. Then, I develop a genome-wide approach, the Cumulative Selection Score (CSS), for combining the signals from multiple neutrality test statistics to detect the signatures of positive selection with greater accuracy. By implementing this approach in genomic variation data for chromosome 2, I show that the CSS can be applied to whole-genome datasets. I conclude in chapter 5 by discussing the potential of population genomic inferences and the future of the field.
Oncogenic mutations in the serine/threonine kinase B-RAF (also known as BRAF) are found in 50-70% of malignant melanomas. Pre-clinical studies have demonstrated that the B-RAF(V600E) mutation predicts a dependency on the mitogen-activated protein kinase (MAPK) signalling cascade in melanoma, an observation that has been validated by the success of RAF and MEK inhibitors in clinical trials. However, clinical responses to targeted anticancer therapeutics are frequently confounded by de novo or acquired resistance. Identification of resistance mechanisms in a manner that elucidates alternative 'druggable' targets may inform effective long-term treatment strategies. Here we expressed ∼600 kinase and kinase-related open reading frames (ORFs) in parallel to interrogate resistance to a selective RAF kinase inhibitor. We identified MAP3K8 (the gene encoding COT/Tpl2) as a MAPK pathway agonist that drives resistance to RAF inhibition in B-RAF(V600E) cell lines. COT activates ERK primarily through MEK-dependent mechanisms that do not require RAF signalling. Moreover, COT expression is associated with de novo resistance in B-RAF(V600E) cultured cell lines and acquired resistance in melanoma cells and tissue obtained from relapsing patients following treatment with MEK or RAF inhibitors. We further identify combinatorial MAPK pathway inhibition or targeting of COT kinase activity as possible therapeutic strategies for reducing MAPK pathway activation in this setting. Together, these results provide new insights into resistance mechanisms involving the MAPK pathway and articulate an integrative approach through which high-throughput functional screens may inform the development of novel therapeutic strategies.