Gene regulation shapes the evolution of phenotypic diversity. We investigated the evolution of liver promoters and enhancers in six primate species using ChIP-seq (H3K27ac and H3K4me1) to profile cis-regulatory elements (CREs) and using RNA-seq to characterize gene expression in the same individuals. To quantify regulatory divergence, we compared CRE activity across species by testing differential ChIP-seq read depths directly measured for orthologous sequences. We show that the primate regulatory landscape is largely conserved across the lineage, with 63% of the tested human liver CREs showing similar activity across species. Conserved CRE function is associated with sequence conservation, proximity to coding genes, cell-type specificity, and transcription factor binding. Newly evolved CREs are enriched in immune response and neurodevelopmental functions. We further demonstrate that conserved CREs bind master regulators, suggesting that while CREs contribute to species adaptation to the environment, core functions remain intact. Newly evolved CREs are enriched in young transposable elements (TEs), including long terminal repeats (LTRs) and SINE-VNTR-Alus (SVAs), that significantly affect gene expression. Conversely, only 16% of conserved CREs overlap TEs. We tested the cis-regulatory activity of 69 TE subfamilies by luciferase reporter assays, spanning all major TE classes, and showed that 95.6% of tested TEs can function as either transcriptional activators or repressors. In conclusion, we demonstrated the critical role of TEs in primate gene regulation and illustrated potential mechanisms underlying evolutionary divergence among the primate species through the noncoding genome.
The resources generated by the GTEx consortium offer unprecedented opportunities to advance our understanding of the biology of human diseases. Here, we present an in-depth examination of the phenotypic consequences of transcriptome regulation and a blueprint for the functional interpretation of genome-wide association study-discovered loci. Across a broad set of complex traits and diseases, we demonstrate widespread dose-dependent effects of RNA expression and splicing. We develop a data-driven framework to benchmark methods that prioritize causal genes and find no single approach outperforms the combination of multiple approaches. Using colocalization and association approaches that take into account the observed allelic heterogeneity of gene expression, we propose potential target genes for 47% (2519 out of 5385) of the GWAS loci examined.
Genome-wide association studies have struggled to identify functional genes and variants underlying complex phenotypes. We recruited a multi-ethnic cohort of healthy volunteers (n = 91) and used their tissue to generate induced pluripotent stem cells (iPSCs) and hepatocyte-like cells (HLCs) for genome-wide mapping of expression quantitative trait loci (eQTLs) and allele-specific expression (ASE). We identified many eQTL genes (eGenes) not observed in the comparably sized Genotype-Tissue Expression project’s human liver cohort (n = 96). Focusing on blood lipid-associated loci, we performed massively parallel reporter assays to screen candidate functional variants and used genome-edited stem cells, CRISPR interference, and mouse modeling to establish rs2277862-CPNE1, rs10889356-DOCK7, rs10889356-ANGPTL3, and rs10872142-FRK as functional SNP-gene sets. We demonstrated HLC eGenes CPNE1, VKORC1, UBE2L3, and ANGPTL3 and HLC ASE gene ACAA2 to be lipid-functional genes in mouse models. These findings endorse an iPSC-based experimental framework to discover functional variants and genes contributing to complex human traits.
• Genome-wide maps of eQTL/ASE genes in a multi-ethnic cohort of 91 healthy donors
• Reporters screen candidate functional variants in blood lipid-associated eQTLs
• CRISPR-Cas9 in hPSCs and mice define functional variants of lipid-associated eQTLs
• Mouse models establish HLC eQTL/ASE genes as lipid-functional genes
Musunuru, Brown, Rader, and colleagues of the NHLBI NextGen consortium use multi-ethnic population cohorts of iPSCs and differentiated hepatocyte-like cells, in combination with mouse models, to discover and validate functional DNA variants and genes at blood lipid-associated loci previously identified by genome-wide association studies.
Population structure among study subjects may confound genetic association studies, and lack of proper correction can lead to spurious findings. The Genotype-Tissue Expression (GTEx) project largely contains individuals of European ancestry, but the v8 release also includes up to 15% of individuals of non-European ancestry. Assessing ancestry-based adjustments in GTEx improves portability of this research across populations and further characterizes the impact of population structure on GWAS colocalization.
Here, we identify a subset of 117 individuals in GTEx (v8) with a high degree of population admixture and estimate genome-wide local ancestry. We perform genome-wide cis-eQTL mapping using admixed samples in seven tissues, adjusted by either global or local ancestry. Consistent with previous work, we observe improved power with local ancestry adjustment. At loci where the two adjustments produce different lead variants, we observe 31 loci (0.02%) where a significant colocalization is called only with one eQTL ancestry adjustment method. Notably, both adjustments produce similar numbers of significant colocalizations within each of two different colocalization methods, COLOC and FINEMAP. Finally, we identify a small subset of eQTL-associated variants highly correlated with local ancestry, providing a resource to enhance functional follow-up.
We provide a local ancestry map for admixed individuals in the GTEx v8 release and describe the impact of ancestry and admixture on gene expression, eQTLs, and GWAS colocalization. While the majority of the results are concordant between local and global ancestry-based adjustments, we identify distinct advantages and disadvantages to each approach.
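To make the contrast between global and local ancestry adjustment concrete, the following is a minimal sketch on simulated data. It assumes a plain ordinary-least-squares eQTL model with a single ancestry covariate; the abstract does not specify the model used in the study, and the function name `eqtl_slope`, the allele frequencies, and the effect sizes are all hypothetical choices made for illustration.

```python
import numpy as np


def eqtl_slope(expression, genotype, ancestry):
    """OLS of expression on genotype dosage plus one ancestry covariate.

    Returns (beta, standard_error) for the genotype term.
    """
    # Design matrix: intercept, genotype dosage, ancestry covariate.
    X = np.column_stack(
        [np.ones(len(genotype)), np.asarray(genotype, dtype=float), ancestry]
    )
    coef, _, _, _ = np.linalg.lstsq(X, expression, rcond=None)
    resid = expression - X @ coef
    dof = len(expression) - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef[1], np.sqrt(cov[1, 1])


rng = np.random.default_rng(0)
n = 117  # sample size taken from the text; everything below is simulated

# Global ancestry: genome-wide fraction of ancestry A per individual.
global_anc = rng.uniform(0.1, 0.9, n)
# Local ancestry: number of ancestry-A haplotypes (0, 1, or 2) at this locus.
local_anc = rng.binomial(2, global_anc)

# Hypothetical allele frequencies: 0.6 on ancestry-A haplotypes, 0.1 otherwise.
genotype = rng.binomial(local_anc, 0.6) + rng.binomial(2 - local_anc, 0.1)

# Hypothetical truth: genotype effect 0.5, local-ancestry effect 0.8, unit noise.
expression = 0.5 * genotype + 0.8 * local_anc + rng.normal(0.0, 1.0, n)

beta_global, se_global = eqtl_slope(expression, genotype, global_anc)
beta_local, se_local = eqtl_slope(expression, genotype, local_anc)
print(f"global adjustment: beta={beta_global:.3f} se={se_global:.3f}")
print(f"local adjustment:  beta={beta_local:.3f} se={se_local:.3f}")
```

In this toy setup, expression depends on ancestry at the locus itself, so the global covariate only partly absorbs that effect. Real analyses would typically include further covariates and multiple-testing control, which are omitted here.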
From a forward mutagenesis screen to discover mutations associated with obesity, we identified mutations in the Spag7 gene linked to metabolic dysfunction in mice. Here, we show that SPAG7 KO mice are born smaller and develop obesity and glucose intolerance in adulthood. This obesity does not stem from hyperphagia, but from a decrease in energy expenditure. The KO animals also display reduced exercise tolerance and muscle function due to impaired mitochondrial function. Furthermore, SPAG7-deficiency in developing embryos leads to intrauterine growth restriction, brought on by placental insufficiency, likely due to abnormal development of the placental junctional zone. This insufficiency leads to loss of SPAG7-deficient fetuses in utero and reduced birth weights of those that survive. We hypothesize that a ‘thrifty phenotype’ is ingrained in SPAG7 KO animals during development that leads to adult obesity. Collectively, these results indicate that SPAG7 is essential for embryonic development and energy homeostasis later in life.
Obesity rates are climbing worldwide, leading to an increase in associated conditions such as type 2 diabetes. While new pharmaceutical approaches are available to help individuals manage their weight, many patients do not respond to them or experience prohibitive side effects. Identifying alternative treatments will likely require pinpointing the genes and molecular actors involved in the biological processes that control weight regulation. Previous research suggests that a protein known as SPAG7 could help shape how mice use and store the energy they extract from food. Flaherty et al. therefore set out to investigate the role this protein plays in the body. To do so, they created a line of mice born without SPAG7, which they monitored closely throughout life. These animals were underweight at birth and did not eat more than other mice, yet they were obese as adults. Their ability to exercise was reduced, their muscles were weaker and contained fibers with functional defects. The mice also exhibited biological changes associated with the onset of diabetes. Yet deleting SPAG7 during adulthood led to no such changes; these mice maintained normal muscle function and body weight. Closely examining how SPAG7-deficient mice developed in the womb revealed placental defects which likely caused these animals to receive fewer nutrients from their mother. Such early-life deprivation is known to be associated with the body shifting towards maximizing its use of resources and privileging fat storage, even into and throughout adulthood. By shedding light on the biological role of SPAG7, the work by Flaherty et al. helps to better understand how developmental events can increase the likelihood of obesity later in life. Further investigations are now needed to explore whether this knowledge could help design interventions relevant to human health.
Human genetics studies have implicated GALNT2, encoding GalNAc-T2, as a regulator of high-density lipoprotein cholesterol (HDL-C) metabolism, but the mechanisms relating GALNT2 to HDL-C remain unclear. We investigated the impact of homozygous GALNT2 deficiency on HDL-C in humans and mammalian models. We identified two humans homozygous for loss-of-function mutations in GALNT2 who demonstrated low HDL-C. We also found that GALNT2 loss of function in mice, rats, and nonhuman primates decreased HDL-C. O-glycoproteomics studies of a human GALNT2-deficient subject validated ANGPTL3 and ApoC-III as GalNAc-T2 targets. Additional glycoproteomics in rodents identified targets influencing HDL-C, including phospholipid transfer protein (PLTP). GALNT2 deficiency reduced plasma PLTP activity in humans and rodents, and in mice this was rescued by reconstitution of hepatic Galnt2. We further found that GALNT2 GWAS SNPs associated with reduced HDL-C correlate with lower hepatic GALNT2 expression. These results posit GALNT2 as a direct modulator of HDL metabolism across mammals.
• Genetic loss of function of GALNT2 lowers HDL-C in man, rodents, and nonhuman primates
• ANGPTL3 and ApoC-III are non-redundant targets of GalNAc-T2 in humans
• PLTP O-glycosylation by GalNAc-T2 potentiates its activity and raises HDL-C
• GALNT2 GWAS SNP alleles associated with lower HDL-C reduce hepatic GALNT2 expression
SNPs in GALNT2 are associated with HDL-C metabolism, but whether GALNT2 causes HDL-C to go up or down has been debated. Khetarpal et al. show that loss of function of GALNT2 reduces HDL-C in humans, rodents, and nonhuman primates. They also show species-specific glycosylation targets for GalNAc-T2.
Induced pluripotent stem cells (iPSCs) are an established cellular system to study the impact of genetic variants in derived cell types and developmental contexts. However, in their pluripotent state, the disease impact of genetic variants is less well known. Here, we integrate data from 1,367 human iPSC lines to comprehensively map common and rare regulatory variants in human pluripotent cells. Using this population-scale resource, we report hundreds of new colocalization events for human traits specific to iPSCs, and find increased power to identify rare regulatory variants compared with somatic tissues. Finally, we demonstrate how iPSCs enable the identification of causal genes for rare diseases.
Genome-wide association studies have identified multiple novel genomic loci associated with vascular diseases. Many of these loci are common non-coding variants that affect the expression of disease-relevant genes within coronary vascular cells. To identify such genes on a genome-wide level, we performed deep transcriptomic analysis of genotyped primary human coronary artery smooth muscle cells (HCASMCs) and coronary endothelial cells (HCAECs) from the same subjects, including splicing quantitative trait loci (sQTL), allele-specific expression (ASE), and colocalization analyses. We identified sQTLs for TARS2, YAP1, CFDP1, and STAT6 in HCASMCs and HCAECs, and 233 ASE genes, a subset of which are also GTEx eGenes in arterial tissues. Colocalization of GWAS association signals for coronary artery disease (CAD), migraine, stroke and abdominal aortic aneurysm with GTEx eGenes in aorta, coronary artery and tibial artery discovered novel candidate risk genes for these diseases. At the CAD and stroke locus tagged by rs2107595 we demonstrate colocalization with expression of the proximal gene TWIST1. We show that disrupting the rs2107595 locus alters TWIST1 expression and that the risk allele has increased binding of the NOTCH signaling protein RBPJ. Finally, we provide data that TWIST1 expression influences vascular SMC phenotypes, including proliferation and calcification, as a potential mechanism supporting a role for TWIST1 in CAD.
Deciphering the impact of genetic variation on gene regulation is fundamental to understanding common, complex human diseases. Although histone modifications are important markers of gene regulatory elements of the genome, no specific histone modification has been assayed in more than a few individuals in the human liver. As a result, the effects of genetic variation on histone modification states in the liver are poorly understood. Here, we generate the most comprehensive genome-wide dataset of two epigenetic marks, H3K4me3 and H3K27ac, and annotate thousands of putative regulatory elements in the human liver. We integrate these findings with genome-wide gene expression data collected from the same human liver tissues and high-resolution promoter-focused chromatin interaction maps collected from human liver-derived HepG2 cells. We demonstrate widespread functional consequences of natural genetic variation on putative regulatory element activity and gene expression levels. Leveraging these extensive datasets, we fine-map a total of 74 GWAS loci that have been associated with at least one complex phenotype. Our results reveal a repertoire of genes and regulatory mechanisms governing complex disease development and further the basic understanding of genetic and epigenetic regulation of gene expression in human liver tissue.
After emerging in China in late 2019, the novel coronavirus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spread worldwide, and as of mid-2021, it remains a significant threat globally. Only a few coronaviruses are known to infect humans, and only two cause infections similar in severity to SARS-CoV-2: Severe acute respiratory syndrome-related coronavirus, a species closely related to SARS-CoV-2 that emerged in 2002, and Middle East respiratory syndrome-related coronavirus, which emerged in 2012. Unlike the current pandemic, previous epidemics were controlled rapidly through public health measures, but the body of research investigating severe acute respiratory syndrome and Middle East respiratory syndrome has proven valuable for identifying approaches to treating and preventing novel coronavirus disease 2019 (COVID-19). Building on this research, the medical and scientific communities have responded rapidly to the COVID-19 crisis and identified many candidate therapeutics. The approaches used to identify candidates fall into four main categories: adaptation of clinical approaches to diseases with related pathologies, adaptation based on virological properties, adaptation based on host response, and data-driven identification (ID) of candidates based on physical properties or on pharmacological compendia. To date, a small number of therapeutics have already been authorized by regulatory agencies such as the Food and Drug Administration (FDA), while most remain under investigation. The scale of the COVID-19 crisis offers a rare opportunity to collect data on the effects of candidate therapeutics. This information provides insight not only into the management of coronavirus diseases but also into the relative success of different approaches to identifying candidate therapeutics against an emerging disease.
The COVID-19 pandemic is a rapidly evolving crisis. With the worldwide scientific community shifting focus onto the SARS-CoV-2 virus and COVID-19, a large number of possible pharmaceutical approaches for treatment and prevention have been proposed. What was known about each of these potential interventions evolved rapidly throughout 2020 and 2021. This fast-paced area of research provides important insight into how the ongoing pandemic can be managed and also demonstrates the power of interdisciplinary collaboration to rapidly understand a virus and match its characteristics with existing or novel pharmaceuticals. As illustrated by the continued threat of viral epidemics during the current millennium, a rapid and strategic response to emerging viral threats can save lives. In this review, we explore how different modes of identifying candidate therapeutics have borne out during COVID-19.