Phenotypes extracted from Electronic Health Records (EHRs) are increasingly prevalent in genetic studies. EHRs contain hundreds of distinct clinical laboratory test results, providing a trove of health data beyond diagnoses. Such lab data is complex and lacks a ubiquitous coding scheme, making it more challenging than diagnosis data. Here we describe the first large-scale cross-health system genome-wide association study (GWAS) of EHR-based quantitative laboratory-derived phenotypes. We meta-analyzed 70 lab traits matched between the BioVU cohort from the Vanderbilt University Health System and the Michigan Genomics Initiative (MGI) cohort from Michigan Medicine. We show high replication of known associations for these traits, validating EHR-based measurements as high-quality phenotypes for genetic analysis. Notably, our analysis provides the first replication for 699 previous GWAS associations across 46 different traits. We discovered 31 novel associations at genome-wide significance for 22 distinct traits, including the first reported associations for two lab-based traits. We replicated 22 of these novel associations in an independent tranche of BioVU samples. The summary statistics for all association tests are freely available to benefit other researchers. Finally, we performed mirrored analyses in BioVU and MGI to assess competing analytic practices for EHR lab traits. We find that using the mean of all available lab measurements provides a robust summary value, but alternate summarizations can improve power in certain circumstances. This study provides a proof-of-principle for cross-health system GWAS and is a framework for future studies of quantitative EHR lab traits.
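The two analytic steps named in this abstract, collapsing each person's repeated lab values into one phenotype and combining per-cohort association estimates, can be illustrated with a minimal Python sketch. The abstract does not state which meta-analysis method was used; the inverse-variance fixed-effect combination below is an assumption chosen for illustration, and the function names `summarize_labs` and `inverse_variance_meta` are hypothetical.

```python
import math
from statistics import mean, median


def summarize_labs(measurements, how="mean"):
    """Collapse each person's repeated lab values into a single value.

    `measurements` maps a person ID to a list of numeric lab results.
    People with no measurements are dropped.
    """
    if how not in ("mean", "median"):
        raise ValueError("how must be 'mean' or 'median'")
    reducer = mean if how == "mean" else median
    return {pid: reducer(vals) for pid, vals in measurements.items() if vals}


def inverse_variance_meta(estimates):
    """Fixed-effect meta-analysis of (beta, standard_error) pairs.

    Assumption: inverse-variance weighting; the abstract does not
    specify the method actually used.
    Returns the pooled beta, its standard error, and the z-score.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    se = math.sqrt(1.0 / total)
    return beta, se, beta / se


# Hypothetical per-person lab values and per-cohort effect estimates.
labs = {"person_1": [1.0, 2.0, 6.0], "person_2": [4.0], "person_3": []}
by_mean = summarize_labs(labs, how="mean")
by_median = summarize_labs(labs, how="median")
pooled_beta, pooled_se, pooled_z = inverse_variance_meta([(0.10, 0.02), (0.14, 0.02)])
```

The mean and median disagree for `person_1` because of the single high value, which is the kind of case where the choice of summary could matter.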
Clonal hematopoiesis in sickle cell disease Liggett, L Alexander; Cato, Liam D; Weinstock, Joshua S ...
The Journal of clinical investigation, 02/2022, Volume: 132, Issue: 4
Journal Article
Peer reviewed
Open access
BACKGROUND: Curative gene therapies for sickle cell disease (SCD) are currently undergoing clinical evaluation. The occurrence of myeloid malignancies in these trials has prompted safety concerns. Individuals with SCD are predisposed to myeloid malignancies, but the underlying causes remain undefined. Clonal hematopoiesis (CH) is a premalignant condition that also confers significant predisposition to myeloid cancers. While it has been speculated that CH may play a role in SCD-associated cancer predisposition, limited data addressing this issue have been reported. METHODS: Here, we leveraged 74,190 whole-genome sequences to robustly study CH in SCD. Somatic mutation calling methods were used to assess CH in all samples and comparisons between individuals with and without SCD were performed. RESULTS: While we had sufficient power to detect a greater than 2-fold increased rate of CH, we found no detectable variation in rate or clone properties between individuals affected by SCD and controls. The rate of CH in individuals with SCD was unaltered by hydroxyurea use. CONCLUSIONS: We did not observe an increased risk for acquiring detectable CH in SCD, at least as measured by whole-genome sequencing. These results should help guide ongoing efforts and further studies that seek to better define the risk factors underlying myeloid malignancy predisposition in SCD and help ensure that curative therapies can be more safely applied. FUNDING: New York Stem Cell Foundation and the NIH.
Background: Presence of clonal hematopoiesis of indeterminate potential (CHIP) is associated with a higher risk of atherosclerotic cardiovascular disease, cancer, and mortality. The relationship between a healthy lifestyle and CHIP is unknown. Methods and Results: This analysis included 8709 postmenopausal women (mean age, 66.5 years) enrolled in the WHI (Women's Health Initiative), free of cancer or cardiovascular disease, with deep-coverage whole genome sequencing data available. Information on lifestyle factors (body mass index, smoking, physical activity, and diet quality) was obtained, and a healthy lifestyle score was created on the basis of healthy criteria met (0 points, least healthy, to 4 points, most healthy). CHIP was derived on the basis of a prespecified list of leukemogenic driver mutations. The prevalence of CHIP was 8.6%. A higher healthy lifestyle score was not associated with CHIP (multivariable-adjusted odds ratio [OR] [95% CI], 0.99 [0.80-1.23] and 1.13 [0.93-1.37]) for the upper (3 or 4 points) and middle category (2 points), respectively, versus referent (0 or 1 point). Across score components, a normal and overweight body mass index compared with obese was significantly associated with lower odds for CHIP (OR, 0.71 [95% CI, 0.57-0.88] and 0.83 [95% CI, 0.68-1.01], respectively; P-trend 0.0015). Having never smoked compared with being a current smoker tended to be associated with lower odds for CHIP. Conclusions: A healthy lifestyle, based on a composite score, was not related to CHIP among postmenopausal women. However, across individual lifestyle factors, having a normal body mass index was strongly associated with a lower prevalence of CHIP. These findings support the idea that certain healthy lifestyle factors are associated with a lower frequency of CHIP.
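As a reading aid for the score categories and odds ratios reported in this abstract, the following minimal Python sketch groups the 0-to-4 lifestyle score into the three categories the abstract names and converts a log-odds coefficient into an odds ratio with a Wald 95% confidence interval. The coefficient and standard error in the example are hypothetical, the function names are illustrative, and the sketch does not reproduce the study's multivariable-adjusted model.

```python
import math


def lifestyle_category(score):
    """Map a healthy lifestyle score (0-4) to the categories in the abstract:
    0 or 1 point = referent, 2 points = middle, 3 or 4 points = upper."""
    if score not in (0, 1, 2, 3, 4):
        raise ValueError("score must be an integer from 0 to 4")
    if score <= 1:
        return "referent"
    if score == 2:
        return "middle"
    return "upper"


def odds_ratio_ci(log_or, se, z=1.96):
    """Convert a log-odds coefficient and its standard error into an
    odds ratio with a Wald confidence interval (z=1.96 gives about 95%)."""
    return (
        math.exp(log_or),
        math.exp(log_or - z * se),
        math.exp(log_or + z * se),
    )


# Hypothetical coefficient: a log-odds of 0 corresponds to an odds ratio of 1.
or_point, or_low, or_high = odds_ratio_ci(0.0, 0.1)
categories = [lifestyle_category(s) for s in range(5)]
```

An interval that spans 1, as with the composite-score estimates quoted above, indicates no detectable association at that confidence level.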
Clonal hematopoiesis (CH) is characterized by the acquisition of a somatic mutation in a hematopoietic stem cell that results in a clonal expansion. These driver mutations can be single nucleotide variants in cancer driver genes or larger structural rearrangements called mosaic chromosomal alterations (mCAs). The factors that influence the variations in mCA fitness and ultimately result in different clonal expansion rates are not well understood. We used the Passenger-Approximated Clonal Expansion Rate (PACER) method to estimate clonal expansion rate as PACER scores for 6,381 individuals in the NHLBI TOPMed cohort with gain, loss, and copy-neutral loss of heterozygosity mCAs. Our mCA fitness estimates, derived by aggregating per-individual PACER scores, were correlated (R = 0.49) with an alternative approach that estimated fitness of mCAs in the UK Biobank using population-level distributions of clonal fraction. Among individuals with JAK2 V617F clonal hematopoiesis of indeterminate potential or mCAs affecting the JAK2 gene on chromosome 9, PACER score was strongly correlated with erythrocyte count. In a cross-sectional analysis, a genome-wide association study of estimates of mCA expansion rate identified a TCL1A locus variant associated with mCA clonal expansion rate, with suggestive variants in NRIP1 and TERT.
Array genotyping is a cost‐effective and widely used tool that enables assessment of up to millions of genetic markers in hundreds of thousands of individuals. Genotyping array data are typically highly accurate but sensitive to mixing of DNA samples from multiple individuals before or during genotyping. Contaminated samples can lead to genotyping errors and consequently cause false positive signals or reduce power of association analyses. Here, we propose a new method to identify contaminated samples and the sources of contamination within a genotyping batch. Through analysis of array intensity and genotype data from intentionally mixed samples and 22,366 samples of the Michigan Genomics Initiative, an ongoing biobank‐based study, we show that our method can reliably estimate contamination. We also show that identifying sources of contamination can implicate problematic sample processing steps and guide process improvements. Compared to existing methods, our approach can estimate the proportion of contaminating DNA more accurately, eliminate the need for external databases of allele frequencies, and provide contamination estimates that are more robust to the ancestral origin of the contaminating sample.
The power of genetic association analyses can be increased by jointly meta‐analyzing multiple correlated phenotypes. Here, we develop a meta‐analysis framework, Meta‐MultiSKAT, that uses summary statistics to test for association between multiple continuous phenotypes and variants in a region of interest. Our approach models the heterogeneity of effects between studies through a kernel matrix and performs a variance component test for association. Using a genotype kernel, our approach can test for rare‐variants and the combined effects of both common and rare‐variants. To achieve robust power, within Meta‐MultiSKAT, we developed fast and accurate omnibus tests combining different models of genetic effects, functional genomic annotations, multiple correlated phenotypes, and heterogeneity across studies. In addition, Meta‐MultiSKAT accommodates situations where studies do not share exactly the same set of phenotypes or have differing correlation patterns among the phenotypes. Simulation studies confirm that Meta‐MultiSKAT can maintain the type‐I error rate at the exome‐wide level of 2.5 × 10⁻⁶. Further simulations under different models of association show that Meta‐MultiSKAT can improve the power of detection from 23% to 38% on average over single phenotype‐based meta‐analysis approaches. We demonstrate the utility and improved power of Meta‐MultiSKAT in the meta‐analyses of four white blood cell subtype traits from the Michigan Genomics Initiative (MGI) and SardiNIA studies.
Clonal hematopoiesis results from somatic mutations in cancer driver genes in hematopoietic stem cells. We sought to identify novel drivers of clonal expansion using an unbiased analysis of sequencing data from 84,683 persons and identified common mutations in the 5-methylcytosine reader, [gene name missing], as well as in [gene name missing], [gene name missing], and [gene name missing]. We also identified these mutations at low frequency in myelodysplastic syndrome patients. [Gene name missing]-edited mouse hematopoietic stem and progenitor cells exhibited a competitive advantage [text missing] and increased genome-wide intron retention. [Gene name missing] mutations potentially link DNA methylation and RNA splicing, the two most commonly mutated pathways in clonal hematopoiesis and MDS.
Age-related changes to the genome-wide DNA methylation (DNAm) pattern observed in blood are well-documented. Clonal hematopoiesis of indeterminate potential (CHIP), characterized by the age-related acquisition and expansion of leukemogenic mutations in hematopoietic stem cells (HSCs), is associated with blood cancer and coronary artery disease (CAD). Epigenetic regulators DNMT3A and TET2 are the two most frequently mutated CHIP genes. Here, we present results from an epigenome-wide association study for CHIP in 582 Cardiovascular Health Study (CHS) participants, with replication in 2655 Atherosclerosis Risk in Communities (ARIC) Study participants. We show that DNMT3A and TET2 CHIP have distinct and directionally opposing genome-wide DNAm association patterns consistent with their regulatory roles, albeit both promoting self-renewal of HSCs. Mendelian randomization analyses indicate that a subset of DNAm alterations associated with these two leading CHIP genes may promote the risk for CAD.