In this study, we used whole-genome sequencing and gene expression profiling of 215 human induced pluripotent stem cell (iPSC) lines from different donors to identify genetic variants associated with RNA expression for 5,746 genes. We were able to predict causal variants for these expression quantitative trait loci (eQTLs) that disrupt transcription factor binding and validated a subset of them experimentally. We also identified copy-number variant (CNV) eQTLs, including some that appear to affect gene expression by altering the copy number of intergenic regulatory regions. In addition, we were able to identify effects on gene expression of rare genic CNVs and regulatory single-nucleotide variants and found that reactivation of gene expression on the X chromosome depends on gene chromosomal position. Our work highlights the value of iPSCs for genetic association analyses and provides a unique resource for investigating the genetic regulation of gene expression in pluripotent cells.
• Profiling of 215 hiPSC lines enables eQTL mapping of gene expression variation
• iPSC eQTLs are enriched in stem cell gene regulatory regions and affect TF binding
• Copy-number eQTLs in intergenic regulatory regions also affect expression
• Whole-genome sequencing highlights the influence of rare and copy-number variants
Working as part of the NextGen consortium, DeBoever et al. use whole-genome and RNA sequencing to map expression quantitative trait loci in a set of 215 human induced pluripotent stem cell lines. These genotype-expression associations provide a foundation for understanding the genetic regulation of gene expression in pluripotent cells.
The impact of genetic regulatory variation active in early pancreatic development on adult pancreatic disease and traits is not well understood. Here, we generate a panel of 107 fetal-like iPSC-derived pancreatic progenitor cells (iPSC-PPCs) from whole genome-sequenced individuals and identify 4065 genes and 4016 isoforms whose expression and/or alternative splicing are affected by regulatory variation. We integrate eQTLs identified in adult islets and whole pancreas samples, which reveal 1805 eQTL associations that are unique to the fetal-like iPSC-PPCs and 1043 eQTLs that exhibit regulatory plasticity across the fetal-like and adult pancreas tissues. Colocalization with GWAS risk loci for pancreatic diseases and traits shows that some putative causal regulatory variants are active only in the fetal-like iPSC-PPCs and likely influence disease by modulating expression of disease-associated genes in early development, while others with regulatory plasticity likely exert their effects in both the fetal and adult pancreas by modulating expression of different disease genes in the two developmental stages.
Stem cells exist in vitro in a spectrum of interconvertible pluripotent states. Analyzing hundreds of hiPSCs derived from different individuals, we show that the proportions of these pluripotent states vary considerably across lines. We discover 13 gene network modules (GNMs) and 13 regulatory network modules (RNMs), which are highly correlated with each other, suggesting that the coordinated co-accessibility of regulatory elements in the RNMs likely underlies the coordinated expression of genes in the GNMs. Epigenetic analyses reveal that regulatory networks underlying self-renewal and pluripotency are more complex than previously realized. Genetic analyses identify thousands of regulatory variants that overlap predicted transcription factor binding sites and are associated with chromatin accessibility in the hiPSCs. We show that the master regulator of pluripotency, the NANOG-OCT4 complex, and its associated network are significantly enriched for regulatory variants with large effects, suggesting that they play a role in the varying cellular proportions of pluripotency states between hiPSCs. Our work bins tens of thousands of regulatory elements in hiPSCs into discrete regulatory networks, shows that pluripotency and self-renewal processes have a surprising level of regulatory complexity, and suggests that genetic factors may contribute to cell state transitions in human iPSC lines.
Induced pluripotent stem cells (iPSCs) show variable methylation patterns between lines, some of which reflect aberrant differences relative to embryonic stem cells (ESCs). To examine whether this aberrant methylation results from genetic variation or non-genetic mechanisms, we generated human iPSCs from monozygotic twins to investigate how genetic background, clone, and passage number contribute. We found that aberrantly methylated CpGs are enriched in regulatory regions associated with MYC protein motifs and affect gene expression. We classified differentially methylated CpGs as being associated with genetic and/or non-genetic factors (clone and passage), and we found that aberrant methylation preferentially occurs at CpGs associated with clone-specific effects. We further found that clone-specific effects play a strong role in recurrent aberrant methylation at specific CpG sites across different studies. Our results argue that a non-genetic biological mechanism underlies aberrant methylation in iPSCs and that it is likely based on a probabilistic process involving MYC that takes place during or shortly after reprogramming.
• Aberrant methylation in human iPSCs is enriched in functionally relevant regions
• Aberrantly methylated regions show changes in gene expression
• Genetic analysis of twins highlights a strong link to clone-specific variation
• Association with binding sites suggests a role of MYC in aberrant methylation
Working as part of the NHLBI NextGen consortium, Panopoulos et al. use iPSCs derived from monozygotic twins to examine factors regulating aberrant CpG methylation. Their findings suggest that aberrant methylation is likely due to a clone-associated biological mechanism involving Myc proteins and gene expression changes.
Large-scale collections of induced pluripotent stem cells (iPSCs) could serve as powerful model systems for examining how genetic variation affects biology and disease. Here we describe the iPSCORE resource: a collection of systematically derived and characterized iPSC lines from 222 ethnically diverse individuals that allows for both familial and association-based genetic studies. iPSCORE lines are pluripotent with high genomic integrity (no or low numbers of somatic copy-number variants) as determined using high-throughput RNA-sequencing and genotyping arrays, respectively. Using iPSCs from a family of individuals, we show that iPSC-derived cardiomyocytes demonstrate gene expression patterns that cluster by genetic background, and can be used to examine variants associated with physiological and disease phenotypes. The iPSCORE collection contains representative individuals for risk and non-risk alleles for 95% of SNPs associated with human phenotypes through genome-wide association studies. Our study demonstrates the utility of iPSCORE for examining how genetic variants influence molecular and physiological traits in iPSCs and derived cell lines.
The causal variants and genes underlying thousands of cardiac GWAS signals have yet to be identified. Here, we leverage spatiotemporal information on 966 RNA-seq cardiac samples and perform an expression quantitative trait locus (eQTL) analysis detecting eQTLs considering both eGenes and eIsoforms. We identify 2,578 eQTLs associated with a specific developmental stage-, tissue- and/or cell type. Colocalization between eQTL and GWAS signals of five cardiac traits identified variants with high posterior probabilities for being causal in 210 GWAS loci. Pulse pressure GWAS loci are enriched for colocalization with fetal- and smooth muscle- eQTLs; pulse rate with adult- and cardiac muscle- eQTLs; and atrial fibrillation with cardiac muscle- eQTLs. Fine mapping identifies 79 credible sets with five or fewer SNPs, of which 15 were associated with spatiotemporal eQTLs. Our study shows that many cardiac GWAS variants impact traits and disease in a developmental stage-, tissue- and/or cell type-specific fashion.
Mucinous neoplasms of the appendix (MNA) are rare tumors which may progress from benign to malignant disease with an aggressive biological behavior. MNA is often diagnosed after metastasis to the peritoneal surfaces resulting in mucinous carcinomatosis peritonei (MCP). Genetic alterations in MNA are poorly characterized due to its low incidence, the hypo-cellularity of MCPs, and a lack of relevant pre-clinical models. As such, application of targeted therapies to this disease is limited to those developed for colorectal cancer and not based on molecular rationale.
We sequenced the whole exomes of 10 MCPs of appendiceal origin to identify genome-wide somatic mutations and copy number aberrations and validated significant findings in 19 additional cases.
Our study demonstrates that MNA has a different molecular makeup than colorectal cancer. Most tumors have co-existing oncogenic mutations in KRAS (26/29) and GNAS (20/29) and are characterized by downstream PKA activation. High-grade tumors are GNAS wild-type (5/6), suggesting they do not progress from low-grade tumors. MNAs do share some genetic alterations with colorectal cancer including gain of 1q (5/10), Wnt, and TGFβ pathway alterations. In contrast, mutations in TP53 (1/10) and APC (0/10), common in colorectal cancer, are rare in MNA. Concurrent activation of the KRAS and GNAS mediated signaling pathways appears to be shared with pancreatic intraductal papillary mucinous neoplasm.
MNA genome-wide mutational analysis reveals genetic alterations distinct from colorectal cancer, in support of its unique pathophysiology, and suggests new targeted therapeutic opportunities.
Reprogramming somatic cells to induced pluripotent stem cells (iPSCs) offers the possibility of studying the molecular mechanisms underlying human diseases in cell types difficult to extract from living patients, such as neurons and cardiomyocytes. To date, studies have been published that use small panels of iPSC-derived cell lines to study monogenic diseases. However, to study complex diseases, where the genetic variation underlying the disorder is unknown, a sizable number of patient-specific iPSC lines and controls need to be generated. Currently the methods for deriving and characterizing iPSCs are time consuming, expensive, and, in some cases, descriptive but not quantitative. Here we set out to develop a set of simple methods that reduce cost and increase throughput in the characterization of iPSC lines. Specifically, we outline methods for high-throughput quantification of surface markers, gene expression analysis of in vitro differentiation potential, and evaluation of karyotype with markedly reduced cost.
• Combining three high-throughput methods provides low-cost characterization of iPSCs
• iPSC line heterogeneity is assessed by fluorescent cell barcoding flow cytometry
• 12-gene qPCR enables gene expression analysis of in vitro differentiation potential
• SNP arrays provide inexpensive high-resolution digital karyotyping
Working as part of the NHLBI NextGen consortium, D'Antonio and colleagues developed three simple methods that reduce cost and increase throughput in the characterization of iPSCs. These methods include: (1) fluorescent cell barcoding flow cytometry to investigate heterogeneity; (2) gene expression analysis to examine in vitro differentiation potential; and (3) high-resolution digital karyotyping to detect chromosomal aberrations.
Efforts to identify driver mutations in cancer have largely focused on genes, whereas non-coding sequences remain relatively unexplored. Here we develop a statistical method based on characteristics known to influence local mutation rate and a series of enrichment filters in order to identify distal regulatory elements harboring putative driver mutations in breast cancer. We identify ten DNase I hypersensitive sites that are significantly mutated in breast cancers and associated with the aberrant expression of neighboring genes. A pan-cancer analysis shows that three of these elements are significantly mutated across multiple cancer types and have mutation densities similar to protein-coding driver genes. Functional characterization of the most highly mutated DNase I hypersensitive sites in breast cancer (using in silico and experimental approaches) confirms that they are regulatory elements and affect the expression of cancer genes. Our study suggests that mutations of regulatory elements in tumors likely play an important role in cancer development.
Cancer driver mutations can occur within noncoding genomic sequences. Here, the authors develop a statistical approach to identify candidate noncoding driver mutations in DNase I hypersensitive sites in breast cancer and experimentally demonstrate they are regulatory elements of known cancer genes.