The molecular basis of adaptation--and, in particular, the relative roles of protein-coding versus gene expression changes--has long been the subject of speculation and debate. Recently, the genotyping of diverse human populations has led to the identification of many putative "local adaptations" that differ between populations. Here I show that these local adaptations are over 10-fold more likely to affect gene expression than amino acid sequence. In addition, a novel framework for identifying polygenic local adaptations detects recent positive selection on the expression levels of genes involved in UV radiation response, immune cell proliferation, and diabetes-related pathways. These results provide the first examples of polygenic gene expression adaptation in humans, as well as the first genome-scale support for the hypothesis that changes in gene expression have driven human adaptation.
Distinguishing which traits have evolved under natural selection, as opposed to neutral evolution, is a major goal of evolutionary biology. Several tests have been proposed to accomplish this, but these either rely on false assumptions or suffer from low power. Here, I introduce an approach to detecting selection that makes minimal assumptions and only requires phenotypic data from ∼10 individuals. The test compares the phenotypic difference between two populations to what would be expected by chance under neutral evolution, which can be estimated from the phenotypic distribution of an F₂ cross between those populations. Simulations show that the test is robust to variation in the number of loci affecting the trait, the distribution of locus effect sizes, heritability, dominance, and epistasis. Comparing its performance to the QTL sign test (an existing test of selection that requires both genotype and phenotype data), the new test achieves comparable power with 50- to 100-fold fewer individuals (and no genotype data). Applying the test to empirical data spanning over a century shows strong directional selection in many crops, as well as on naturally selected traits such as head shape in Hawaiian Drosophila and skin color in humans. Applied to gene expression data, the test reveals that the strength of stabilizing selection acting on mRNA levels in a species is strongly associated with that species’ effective population size. In sum, this test is applicable to phenotypic data from almost any genetic cross, allowing selection to be detected more easily and powerfully than previously possible.
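The logic of comparing a parental difference to F₂ variation can be illustrated with a short calculation. The following is a minimal sketch, not the test statistic used in the study: it assumes a purely additive trait, unlinked loci, homozygous parental lines, and no environmental variance. Under those assumptions each locus with effect a contributes a²/2 to the F₂ variance, and if the direction of each locus is random, the expected squared parental difference is 8 times the F₂ variance. The function names `divergence_ratio` and `simulate_f2` are hypothetical.

```python
import random
import statistics


def divergence_ratio(parent1_mean, parent2_mean, f2_values):
    """Squared parental difference scaled by 8 x F2 variance.

    Under the stated assumptions (additive, unlinked loci, random effect
    directions, no environmental variance) the expectation is about 1.
    Values far above 1 suggest effects that point in the same direction.
    """
    diff = parent1_mean - parent2_mean
    return diff ** 2 / (8 * statistics.variance(f2_values))


def simulate_f2(effects, n_individuals, seed=0):
    """Simulate additive F2 phenotypes.

    Each locus has genotype value -a, 0, or +a with probabilities
    1/4, 1/2, 1/4, where a is that locus's effect size.
    """
    rng = random.Random(seed)
    f2 = []
    for _ in range(n_individuals):
        value = sum(a * rng.choice((-1, 0, 0, 1)) for a in effects)
        f2.append(value)
    return f2


if __name__ == "__main__":
    # 20 loci of equal effect, all favoring parent 1 (reinforcing).
    effects = [1.0] * 20
    f2 = simulate_f2(effects, n_individuals=2000)
    parent1 = sum(effects)    # parent 1 carries every "+" allele
    parent2 = -sum(effects)   # parent 2 carries every "-" allele
    print(divergence_ratio(parent1, parent2, f2))
```

With 20 equal-effect loci that all point the same way, the ratio comes out near 20 (the number of loci), whereas loci whose directions cancel give a ratio near 0. Environmental variance would inflate the F₂ variance and shrink the ratio, so a real analysis would need to account for it.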
Interspecific hybrids have played a key role in research on gene expression regulation. A growing number of studies have measured genome-wide allele-specific expression in hybrids and observed that cis-regulatory changes often oppose trans-acting changes affecting the same genes, suggesting stabilizing selection for compensatory changes. However, the most common method for estimating these effects is biased, producing artifactual patterns of compensatory evolution. Here I introduce a simple modification leveraging biological replicates that ameliorates the bias.
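One way such a bias can arise is sketched in the simulation below. It assumes the commonly used estimators: cis as the hybrid log allelic ratio, and trans as the parental log ratio minus that same cis estimate. Because the same noisy hybrid measurement enters both estimates with opposite sign, cis and trans appear negatively correlated even when the true effects are independent. Taking the cis value used in the subtraction from a separate hybrid replicate removes the shared error. All names and parameter values here are hypothetical, and this is an illustration of the idea rather than the published implementation.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate_cis_trans(n_genes=5000, effect_sd=0.5, noise_sd=0.3, seed=0):
    """Return (naive_r, cross_replicate_r) for independent true effects."""
    rng = random.Random(seed)
    cis_est, trans_naive, trans_cross = [], [], []
    for _ in range(n_genes):
        true_cis = rng.gauss(0, effect_sd)
        true_trans = rng.gauss(0, effect_sd)  # independent of true_cis
        # Measured log2 expression ratios, each with its own error.
        parental = true_cis + true_trans + rng.gauss(0, noise_sd)
        hybrid_rep1 = true_cis + rng.gauss(0, noise_sd)
        hybrid_rep2 = true_cis + rng.gauss(0, noise_sd)
        cis_est.append(hybrid_rep1)
        # Naive: the same hybrid replicate is used for cis and for trans.
        trans_naive.append(parental - hybrid_rep1)
        # Cross-replicate: trans uses a different hybrid replicate.
        trans_cross.append(parental - hybrid_rep2)
    return pearson(cis_est, trans_naive), pearson(cis_est, trans_cross)


if __name__ == "__main__":
    naive_r, cross_r = simulate_cis_trans()
    print(f"naive r = {naive_r:.3f}, cross-replicate r = {cross_r:.3f}")
```

In this setup the naive correlation is clearly negative while the cross-replicate correlation is close to zero, even though no compensatory evolution was simulated.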
The recent advent of ribosome profiling (sequencing of short ribosome-bound fragments of mRNA) has offered an unprecedented opportunity to interrogate the sequence features responsible for modulating translational rates. Nevertheless, numerous analyses of the first riboprofiling data set have produced equivocal and often incompatible results. Here we analyze three independent yeast riboprofiling data sets, including two with much higher coverage than previously available, and find that all three show substantial technical sequence biases that confound interpretations of ribosomal occupancy. After accounting for these biases, we find no effect of previously implicated factors on ribosomal pausing. Rather, we find that incorporation of proline, whose unique side-chain stalls peptide synthesis in vitro, also slows the ribosome in vivo. We also reanalyze a method that implicated positively charged amino acids as the major determinant of ribosomal stalling and demonstrate that it produces false signals of stalling in low-coverage data. Our results suggest that any analysis of riboprofiling data should account for sequencing biases and sparse coverage. To this end, we establish a robust methodology that enables analysis of ribosome profiling data without prior assumptions regarding which positions spanned by the ribosome cause stalling.
Measuring allele-specific expression in interspecies hybrids is a powerful way to detect cis-regulatory changes underlying adaptation. However, it remains difficult to identify genes most likely to explain species-specific traits. Here, we outline a simple strategy that leverages population-scale allele-specific RNA-seq data to identify genes that show constrained cis-regulation within species yet show divergence between species. Applying this strategy to data from human-chimpanzee hybrid cortical organoids, we identify signatures of lineage-specific selection on genes related to saccharide metabolism, neurodegeneration, and primary cilia. We also highlight cis-regulatory divergence in CUX1 and EDNRB that may shape the trajectory of human brain development.
Somatic mutations in healthy tissues contribute to aging, neurodegeneration, and cancer initiation, yet they remain largely uncharacterized.
To gain a better understanding of the genome-wide distribution and functional impact of somatic mutations, we leverage the genomic information contained in the transcriptome to uniformly call somatic mutations from over 7500 tissue samples, representing 36 distinct tissues. This catalog, containing over 280,000 mutations, reveals a wide diversity of tissue-specific mutation profiles associated with gene expression levels and chromatin states. For example, lung samples with low expression of the mismatch-repair gene MLH1 show a mutation signature of deficient mismatch repair. In addition, we find pervasive negative selection acting on missense and nonsense mutations, except for mutations previously observed in cancer samples, which are under positive selection and are highly enriched in many healthy tissues.
These findings reveal fundamental patterns of tissue-specific somatic evolution and shed light on aging and the earliest stages of tumorigenesis.
Cis-regulatory elements such as transcription factor (TF) binding sites can be identified genome-wide, but it remains far more challenging to pinpoint genetic variants affecting TF binding. Here, we introduce a pooling-based approach to mapping quantitative trait loci (QTLs) for molecular-level traits. Applying this to five TFs and a histone modification, we mapped thousands of cis-acting QTLs, with over 25-fold lower cost compared to standard QTL mapping. We found that single genetic variants frequently affect binding of multiple TFs, and CTCF can recruit all five TFs to its binding sites. These QTLs often affect local chromatin and transcription but can also influence long-range chromosomal contacts, demonstrating a role for natural genetic variation in chromosomal architecture. Thousands of these QTLs have been implicated in genome-wide association studies, providing candidate molecular mechanisms for many disease risk loci and suggesting that TF binding variation may underlie a large fraction of human phenotypic variation.
• A pooling-based approach maps QTLs for molecular-level traits with reduced cost
• Thousands of cis-acting QTLs affect transcription factor binding in humans
• CTCF anchors binding of multiple transcription factors
• Binding QTLs link genetic variation to 3D genome architecture and complex traits
Examination of thousands of human genetic variants that affect transcription factor binding demonstrates a role for natural gene variation in chromosomal architecture and illustrates the efficiency and economy of using pooled samples for these analyses.
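The idea behind pooled mapping of binding QTLs can be sketched as an allele-frequency comparison. The example below assumes that, for a given variant, reads carrying each allele are counted both in a pooled assay sample (for example, ChIP reads) and in the pooled input DNA, and that a shift in allele frequency between the two suggests the variant affects binding. It uses a simple two-proportion z-test as a stand-in. The function name `allele_shift_test` and the choice of test are assumptions for illustration, not the statistical procedure of the study.

```python
import math


def allele_shift_test(chip_ref, chip_alt, input_ref, input_alt):
    """Two-proportion z-test on reference-allele frequency.

    Compares the reference-allele fraction among assay (e.g. ChIP) reads
    with the fraction among input DNA reads from the same pool.
    Returns (z, two_sided_p_value).
    """
    n_chip = chip_ref + chip_alt
    n_input = input_ref + input_alt
    if n_chip == 0 or n_input == 0:
        raise ValueError("both samples need at least one read")
    p_chip = chip_ref / n_chip
    p_input = input_ref / n_input
    # Pooled frequency under the null of no allele-specific enrichment.
    pooled = (chip_ref + input_ref) / (n_chip + n_input)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_chip + 1 / n_input))
    if se == 0:
        return 0.0, 1.0  # only one allele observed anywhere: no evidence
    z = (p_chip - p_input) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts: 80/20 ref/alt in ChIP versus 50/50 in input.
    z, p = allele_shift_test(80, 20, 50, 50)
    print(f"z = {z:.2f}, p = {p:.2e}")
```

A real analysis would also need to handle sequencing error, mapping bias toward the reference allele, overdispersion of read counts, and multiple testing across variants, none of which this sketch addresses.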
Despite the greater functional importance of protein levels, our knowledge of gene expression evolution is based almost entirely on studies of mRNA levels. In contrast, our understanding of how translational regulation evolves has lagged far behind. Here we have applied ribosome profiling--which measures both global mRNA levels and their translation rates--to two species of Saccharomyces yeast and their interspecific hybrid in order to assess the relative contributions of changes in mRNA abundance and translation to regulatory evolution. We report that both cis- and trans-acting regulatory divergence in translation are abundant, affecting at least 35% of genes. The majority of translational divergence acts to buffer changes in mRNA abundance, suggesting a widespread role for stabilizing selection acting across regulatory levels. Nevertheless, we observe evidence of lineage-specific selection acting on several yeast functional modules, including instances of reinforcing selection acting at both levels of regulation. Finally, we also uncover multiple instances of stop-codon readthrough that are conserved between species. Our analysis reveals the underappreciated complexity of post-transcriptional regulatory divergence and indicates that partitioning the search for the locus of selection into the binary categories of "coding" versus "regulatory" may overlook a significant source of selection, acting at multiple regulatory levels along the path from genotype to phenotype.
Epigenetics is emerging as an attractive mechanism to explain the persistent genomic embedding of early-life experiences. Tightly linked to chromatin, which packages DNA into chromosomes, epigenetic marks primarily serve to regulate the activity of genes. DNA methylation is the most accessible and characterized component of the many chromatin marks that constitute the epigenome, making it an ideal target for epigenetic studies in human populations. Here, using peripheral blood mononuclear cells collected from a community-based cohort stratified for early-life socioeconomic status, we measured DNA methylation in the promoter regions of more than 14,000 human genes. Using this approach, we broadly assessed and characterized epigenetic variation, identified some of the factors that sculpt the epigenome, and determined its functional relation to gene expression. We found that the leukocyte composition of peripheral blood covaried with patterns of DNA methylation at many sites, as did demographic factors, such as sex, age, and ethnicity. Furthermore, psychosocial factors, such as perceived stress, and cortisol output were associated with DNA methylation, as was early-life socioeconomic status. Interestingly, we determined that DNA methylation was strongly correlated to the ex vivo inflammatory response of peripheral blood mononuclear cells to stimulation with microbial products that engage Toll-like receptors. In contrast, our work found limited effects of DNA methylation marks on the expression of associated genes across individuals, suggesting a more complex relationship than anticipated.