Cell type-specific gene expression patterns and dynamics during development or in disease are controlled by cis-regulatory elements (CREs), such as promoters and enhancers. Distinct classes of CREs can be characterized by their epigenomic features, including DNA methylation, chromatin accessibility, combinations of histone modifications and conformation of local chromatin. Tremendous progress has been made in cataloguing CREs in the human genome using bulk transcriptomic and epigenomic methods. However, single-cell epigenomic and multi-omic technologies have the potential to provide deeper insight into cell type-specific gene regulatory programmes as well as into how they change during development, in response to environmental cues and through disease pathogenesis. Here, we highlight recent advances in single-cell epigenomic methods and analytical tools and discuss their readiness for human tissue profiling.
Current catalogs of regulatory sequences in the human genome are still incomplete and lack cell type resolution. To profile the activity of gene regulatory elements in diverse cell types and tissues in the human body, we applied single-cell chromatin accessibility assays to 30 adult human tissue types from multiple donors. We integrated these datasets with previous single-cell chromatin accessibility data from 15 fetal tissue types to reveal the status of open chromatin for ∼1.2 million candidate cis-regulatory elements (cCREs) in 222 distinct cell types comprised of >1.3 million nuclei. We used these chromatin accessibility maps to delineate cell-type-specificity of fetal and adult human cCREs and to systematically interpret the noncoding variants associated with complex human traits and diseases. This rich resource provides a foundation for the analysis of gene regulatory programs in human cell types across tissues, life stages, and organ systems.
•Integrating >1.3 million single-cell chromatin profiles from adult/fetal human tissues
•An atlas of ∼1.2 million candidate cis-regulatory elements across 222 cell types
•Cell-type specificity of fetal and adult candidate cis-regulatory elements
•Interpretation of noncoding variants associated with complex traits and diseases
A cell-type-resolved map of human cis-regulatory elements, derived from single-cell analysis of diverse tissue types, facilitates functional interpretation of the noncoding variants associated with complex human traits and diseases.
Genome-wide association studies (GWAS) have linked hundreds of thousands of sequence variants in the human genome to common traits and diseases. However, translating this knowledge into a mechanistic understanding of disease-relevant biology remains challenging, largely because such variants are predominantly in non-protein-coding sequences that still lack functional annotation at cell-type resolution. Recent advances in single-cell epigenomics assays have enabled the generation of cell type-, subtype- and state-resolved maps of the epigenome in heterogeneous human tissues. These maps have facilitated cell type-specific annotation of candidate cis-regulatory elements and their gene targets in the human genome, enhancing our ability to interpret the genetic basis of common traits and diseases.
Detection of recent natural selection is a challenging problem in population genetics. Here we introduce the singleton density score (SDS), a method to infer very recent changes in allele frequencies from contemporary genome sequences. Applied to data from the UK10K Project, SDS reflects allele frequency changes in the ancestors of modern Britons during the past ~2000 to 3000 years. We see strong signals of selection at lactase and the major histocompatibility complex, and in favor of blond hair and blue eyes. For polygenic adaptation, we find that recent selection for increased height has driven allele frequency shifts across most of the genome. Moreover, we identify shifts associated with other complex traits, suggesting that polygenic adaptation has played a pervasive role in shaping genotypic and phenotypic variation in modern humans.
Genome and exome sequencing in large cohorts enables characterization of the role of rare variation in complex diseases. Success in this endeavor, however, requires investigators to test a diverse array of genetic hypotheses which differ in the number, frequency and effect sizes of underlying causal variants. In this study, we evaluated the power of gene-based association methods to interrogate such hypotheses, and examined the implications for study design. We developed a flexible simulation approach, using 1000 Genomes data, to (a) generate sequence variation at human genes in up to 10K case-control samples, and (b) quantify the statistical power of a panel of widely used gene-based association tests under a variety of allelic architectures, locus effect sizes, and significance thresholds. For loci explaining ~1% of phenotypic variance underlying a common dichotomous trait, we find that all methods have low absolute power to achieve exome-wide significance (~5-20% power at α = 2.5 × 10⁻⁶) in 3K individuals; even in 10K samples, power is modest (~60%). The combined application of multiple methods increases sensitivity, but does so at the expense of a higher false positive rate. MiST, SKAT-O, and KBAC have the highest individual mean power across simulated datasets, but we observe wide architecture-dependent variability in the individual loci detected by each test, suggesting that inferences about disease architecture from analysis of sequencing studies can differ depending on which methods are used. Our results imply that tens of thousands of individuals, extensive functional annotation, or highly targeted hypothesis testing will be required to confidently detect or exclude rare variant signals at complex disease loci.
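The abstract does not say how the exome-wide threshold was chosen. One common convention, assumed here purely for illustration, is a Bonferroni correction of a family-wise error rate of 0.05 over roughly 20,000 gene-based tests, which gives the quoted value. The function name and the gene count are assumptions, not details taken from the study.

```python
def bonferroni_threshold(family_alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return family_alpha / n_tests


# Assumption: ~20,000 protein-coding genes, each tested once.
exome_wide_alpha = bonferroni_threshold(0.05, 20_000)


def is_exome_wide_significant(p_value: float) -> bool:
    """True if a gene-based p-value passes the assumed exome-wide threshold."""
    return p_value < exome_wide_alpha


print(f"{exome_wide_alpha:.1e}")
```

Under this assumption, a gene-based test must reach a p-value below about 2.5 × 10⁻⁶ to count as a discovery, which is why power at realistic sample sizes is low.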
Recent advances in the understanding of the genetics of type 2 diabetes (T2D) susceptibility have focused attention on the regulation of transcriptional activity within the pancreatic beta-cell. MicroRNAs (miRNAs) represent an important component of regulatory control, and have proven roles in the development of human disease and control of glucose homeostasis. We set out to establish the miRNA profile of human pancreatic islets and of enriched beta-cell populations, and to explore their potential involvement in T2D susceptibility. We used Illumina small RNA sequencing to profile the miRNA fraction in three preparations each of primary human islets and of enriched beta-cells generated by fluorescence-activated cell sorting. In total, 366 miRNAs were found to be expressed (i.e. >100 cumulative reads) in islets and 346 in beta-cells; of the total of 384 unique miRNAs, 328 were shared. A comparison of the islet-cell miRNA profile with those of 15 other human tissues identified 40 miRNAs predominantly expressed (i.e. >50% of all reads seen across the tissues) in islets. Several highly-expressed islet miRNAs, such as miR-375, have established roles in the regulation of islet function, but others (e.g. miR-27b-3p, miR-192-5p) have not previously been described in the context of islet biology. As a first step towards exploring the role of islet-expressed miRNAs and their predicted mRNA targets in T2D pathogenesis, we looked at published T2D association signals across these sites. We found evidence that predicted mRNA targets of islet-expressed miRNAs were globally enriched for signals of T2D association (p-values <0.01, q-values <0.1). At six loci with genome-wide evidence for T2D association (AP3S2, KCNK16, NOTCH2, SLC30A8, VPS26A, and WFS1) predicted mRNA target sites for islet-expressed miRNAs overlapped potentially causal variants.
In conclusion, we have described the miRNA profile of human islets and beta-cells and provide evidence linking islet miRNAs to T2D pathogenesis.
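The two expression criteria quoted in this abstract (more than 100 cumulative reads, and more than 50% of all reads seen across tissues) can be sketched as simple filters. This is a minimal illustration, not the authors' pipeline: the function names, the miRNA labels and the read counts are hypothetical, and it assumes raw read counts and that the cross-tissue total includes the islet reads.

```python
def expressed(cumulative_reads, min_reads=100):
    """miRNAs whose cumulative read count exceeds min_reads."""
    return {m for m, n in cumulative_reads.items() if n > min_reads}


def predominantly_in(target, reads_by_tissue, min_fraction=0.5):
    """miRNAs for which the target tissue holds more than min_fraction
    of all reads summed across every tissue (target included)."""
    result = set()
    for mirna, target_reads in reads_by_tissue[target].items():
        total = sum(t.get(mirna, 0) for t in reads_by_tissue.values())
        if total > 0 and target_reads / total > min_fraction:
            result.add(mirna)
    return result


# Hypothetical counts for three made-up miRNAs in two tissues.
reads = {
    "islet": {"miR-A": 900, "miR-B": 150, "miR-C": 40},
    "liver": {"miR-A": 100, "miR-B": 400, "miR-C": 10},
}
islet_expressed = expressed(reads["islet"])
islet_predominant = islet_expressed & predominantly_in("islet", reads)
print(sorted(islet_expressed), sorted(islet_predominant))

# The reported counts are mutually consistent:
# |islet| + |beta-cell| - |shared| = |union|
assert 366 + 346 - 328 == 384
```

In the toy data, miR-C passes the 50% criterion but not the read-count criterion, so it is excluded once the two filters are intersected.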
The Major Histocompatibility (MHC) class II haplotypes HLA-DR3 and HLA-DR4 confer substantial genetic risk for type 1 diabetes (T1D), where 90% of individuals with T1D carry at least one copy of DR3 and/or DR4. We sought to understand whether the remaining 10% of T1D individuals without DR3 or DR4 haplotypes (non-DR3/DR4) have differences in genetic risk and whether current risk scores are effective for these individuals. We tested for T1D association using 12,716 non-DR3/DR4 samples from five European ancestry cohorts using genotypes imputed into the Michigan HLA and TOPMed v2 reference panels and identified independent signals using stepwise conditional analyses. There were 21 signals with significant effects on T1D, including 13 at the MHC locus and 8 genome-wide, of which 5 were novel. Across all T1D signals, 14 lead variants exhibited significant heterogeneity in effects on T1D in the non-DR3/DR4 subset compared to the full T1D population, including HLA-B*39 and the PTPN22 locus. Nearly 50% of non-DR3/DR4 T1D would be misclassified by a current T1D genetic risk score (T1D GRS1), which is weighted heavily by DR3/DR4 tag variants. We developed a new GRS using the 21 independent signals and determined whether this GRS could better discriminate T1D in individuals with non-DR3/DR4 haplotypes. The non-DR3/DR4 T1D GRS effectively differentiated T1D case and control samples (AUC=0.87) including in samples not in the initial discovery set (AUC=0.86), and improved prediction compared to GRS1 (AUC=0.52 using all controls, AUC=0.79 using only non-DR3/DR4 controls). These findings reveal a subset of T1D cases with heterogeneity in genetic risk and which are not as well represented by current risk scores, and argue that risk scores developed within heterogeneous subsets may enable even more accurate prediction of T1D.
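For readers unfamiliar with the quantities reported above, the sketch below shows one generic way a genetic risk score and its AUC can be computed. It assumes an additive score (risk-allele dosage times a log-odds-ratio weight, summed over variants) and a rank-based AUC; the abstract does not describe the actual scoring model, and the weights and genotypes here are invented for illustration only.

```python
import math


def genetic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1 or 2) across variants."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def auc(case_scores, control_scores):
    """Probability that a random case outscores a random control
    (ties count as one half)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical log-odds-ratio weights for three variants (not from the study).
weights = [math.log(1.5), math.log(1.2), math.log(2.0)]
cases = [[2, 1, 1], [1, 2, 2], [2, 0, 1]]
controls = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]

case_scores = [genetic_risk_score(g, weights) for g in cases]
control_scores = [genetic_risk_score(g, weights) for g in controls]
print(auc(case_scores, control_scores))
```

An AUC of 0.5 corresponds to chance-level discrimination, which is why the reported AUC of 0.52 for GRS1 against all controls indicates that it barely separates non-DR3/DR4 cases from controls.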
Disclosure
C.Mcgrail: None. K.J.Gaulton: Consultant; Genentech, Inc., Stock/Shareholder; Neurocrine Biosciences, Inc., Vertex Pharmaceuticals Incorporated.
Funding
National Institutes of Health (DK120429, DK122607)
To address the challenge of translating genetic discoveries for type 1 diabetes (T1D) into mechanistic insight, we have developed the T1D Knowledge Portal (T1DKP), an open-access resource for hypothesis development and target discovery in T1D.
The intersection of genome-wide association analyses with physiological and functional data indicates that variants regulating islet gene transcription influence type 2 diabetes (T2D) predisposition and glucose homeostasis. However, the specific genes through which these regulatory variants act remain poorly characterized. We generated expression quantitative trait locus (eQTL) data in 118 human islet samples using RNA-sequencing and high-density genotyping. We identified fourteen loci at which cis-exon-eQTL signals overlapped active islet chromatin signatures and were coincident with established T2D and/or glycemic trait associations. At some, these data provide an experimental link between GWAS signals and biological candidates, such as DGKB and ADCY5. At others, the cis-signals implicate genes with no prior connection to islet biology, including WARS and ZMIZ1. At the ZMIZ1 locus, we show that perturbation of ZMIZ1 expression in human islets and beta-cells influences exocytosis and insulin secretion, highlighting a novel role for ZMIZ1 in the maintenance of glucose homeostasis. Together, these findings provide a significant advance in the mechanistic insights of T2D and glycemic trait association loci.
Purpose of Review
Deciphering the mechanisms of type 2 diabetes (T2DM) risk loci can greatly inform on disease pathology. This review discusses current knowledge of mechanisms through which genetic variants influence T2DM risk and considerations for future studies.
Recent Findings
Over 100 T2DM risk loci to date have been identified. Candidate causal variants at risk loci map predominantly to non-coding sequence. Physiological, epigenomic and gene expression data suggest that variants at many known T2DM risk loci affect pancreatic islet regulation, although variants at other loci also affect protein function and regulatory processes in adipose, pre-adipose, liver, skeletal muscle and brain. The effects of T2DM variants on regulatory activity in these tissues appear largely, but not exclusively, due to altered transcription factor binding. Putative target genes of T2DM variants have been defined at an increasing number of loci and some, such as FTO, may entail several genes and multiple tissues. Gene networks in islets and adipocytes have been implicated in T2DM risk, although the molecular pathways of risk genes remain largely undefined.
Summary
Efforts to fully define the mechanisms of T2DM risk loci are just beginning. Continued identification of risk mechanisms will benefit from combining genetic fine-mapping with detailed phenotypic association data, high-throughput epigenomics data from diabetes-relevant tissue, functional screening of candidate genes and genome editing of cellular and animal models.