Hypertrophic cardiomyopathy (HCM) exhibits genetic heterogeneity that is dominated by variation in eight sarcomeric genes. Genetic variation in a large number of non-sarcomeric genes has also been implicated in HCM but not formally assessed. Here we used very large case and control cohorts to determine the extent to which variation in non-sarcomeric genes contributes to HCM.
We sequenced known and putative HCM genes in a new large prospective HCM cohort (n = 804) and analysed data alongside the largest published series of clinically genotyped HCM patients (n = 6179), previously published HCM cohorts and reference population samples from the exome aggregation consortium (ExAC, n = 60 706) to assess variation in 31 genes implicated in HCM. We found no significant excess of rare (minor allele frequency < 1:10 000 in ExAC) protein-altering variants over controls for most genes tested and conclude that novel variants in these genes are rarely interpretable, even for genes with previous evidence of co-segregation (e.g. ACTN2). To provide an aid for variant interpretation, we integrated HCM gene sequence data with aggregated pedigree and functional data and suggest a means of assessing gene pathogenicity in HCM using this evidence.
We show that genetic variation in the majority of non-sarcomeric genes implicated in HCM is not associated with the condition, reinforce the fact that the sarcomeric gene variation is the primary cause of HCM known to date and underscore that the aetiology of HCM is unknown in the majority of patients.
Alcoholic cardiomyopathy (ACM) is defined by a dilated and impaired left ventricle due to chronic excess alcohol consumption. It is largely unknown which factors determine cardiac toxicity on exposure to alcohol.
This study sought to evaluate the role of variation in cardiomyopathy-associated genes in the pathophysiology of ACM, and to examine the effects of alcohol intake and genotype on dilated cardiomyopathy (DCM) severity.
The authors characterized 141 ACM cases, 716 DCM cases, and 445 healthy volunteers. The authors compared the prevalence of rare, protein-altering variants in 9 genes associated with inherited DCM. They evaluated the effect of genotype and alcohol consumption on phenotype in DCM.
Variants in well-characterized DCM-causing genes were more prevalent in patients with ACM than control subjects (13.5% vs. 2.9%; p = 1.2 × 10⁻⁵), but similar between patients with ACM and DCM (19.4%; p = 0.12) and with a predominant burden of titin truncating variants (TTNtv) (9.9%). Separately, we identified an interaction between TTN genotype and excess alcohol consumption in a cohort of DCM patients not meeting ACM criteria. On multivariate analysis, DCM patients with a TTNtv who consumed excess alcohol had an 8.7% absolute reduction in ejection fraction (95% confidence interval: −2.3% to −15.1%; p < 0.007) compared with those without TTNtv and excess alcohol consumption. The presence of TTNtv did not predict phenotype, outcome, or functional recovery on treatment in ACM patients.
TTNtv represent a prevalent genetic predisposition for ACM, and are also associated with a worse left ventricular ejection fraction in DCM patients who consume alcohol above recommended levels. Familial evaluation and genetic testing should be considered in patients presenting with ACM.
Dilated cardiomyopathy (DCM) is genetically heterogeneous, with >100 purported disease genes tested in clinical laboratories. However, many genes were originally identified based on candidate-gene studies that did not adequately account for background population variation. Here we define the frequency of rare variation in 2538 patients with DCM across protein-coding regions of 56 commonly tested genes and compare this to both 912 confirmed healthy controls and a reference population of 60 706 individuals to identify clinically interpretable genes robustly associated with dominant monogenic DCM.
We used the TruSight Cardio sequencing panel to evaluate the burden of rare variants in 56 putative DCM genes in 1040 patients with DCM and 912 healthy volunteers processed with identical sequencing and bioinformatics pipelines. We further aggregated data from 1498 patients with DCM sequenced in diagnostic laboratories and the Exome Aggregation Consortium database for replication and meta-analysis.
Truncating variants in [gene name missing] and [gene name missing] were associated with DCM in all comparisons. Variants in [gene names missing], and [gene name missing] were significantly enriched in specific patient subsets, with the last 2 genes potentially contributing primarily to early-onset forms of DCM. Overall, rare variants in these 12 genes potentially explained 17% of cases in the outpatient clinic cohort representing a broad range of adult patients with DCM and 26% of cases in the diagnostic referral cohort enriched in familial and early-onset DCM. Although the absence of a significant excess in other genes cannot preclude a limited role in disease, such genes have limited diagnostic value because novel variants will be uninterpretable and their diagnostic yield is minimal.
In the largest sequenced DCM cohort yet described, we observe robust disease association with 12 genes, highlighting their importance in DCM and translating into high interpretability in diagnostic testing. The other genes analyzed here will need to be rigorously evaluated in ongoing curation efforts to determine their validity as Mendelian DCM genes but have limited value in diagnostic testing in DCM at present. This data will contribute to community gene curation efforts and will reduce erroneous and inconclusive findings in diagnostic testing.
Internationally adopted variant interpretation guidelines from the American College of Medical Genetics and Genomics (ACMG) are generic and require disease-specific refinement. Here we developed CardioClassifier (http://www.cardioclassifier.org), a semiautomated decision-support tool for inherited cardiac conditions (ICCs).
CardioClassifier integrates data retrieved from multiple sources with user-input case-specific information, through an interactive interface, to support variant interpretation. Combining disease- and gene-specific knowledge with variant observations in large cohorts of cases and controls, we refined 14 computational ACMG criteria and created three ICC-specific rules.
We benchmarked CardioClassifier on 57 expertly curated variants and show full retrieval of all computational data, concordantly activating 87.3% of rules. A generic annotation tool identified fewer than half as many clinically actionable variants (64/219 vs. 156/219, Fisher's P = 1.1 × 10^[exponent missing]), with important false positives, illustrating the critical importance of disease and gene-specific annotations. CardioClassifier identified putatively disease-causing variants in 33.7% of 327 cardiomyopathy cases, comparable with leading ICC laboratories. Through addition of manually curated data, variants found in over 40% of cardiomyopathy cases are fully annotated, without requiring additional user-input data.
CardioClassifier is an ICC-specific decision-support tool that integrates expertly curated computational annotations with case-specific data to generate fast, reproducible, and interactive variant pathogenicity reports, according to best practice guidelines.
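The comparison of actionable-variant counts reported above (64/219 vs. 156/219) is a 2×2 contingency test. The following is a minimal sketch of a two-sided Fisher's exact test in plain Python, assuming the two tools are treated as independent groups of 219 variants each; the function name and table layout are illustrative choices, not taken from the CardioClassifier paper.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability of x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Assumed layout: rows are tools, columns are actionable / not actionable.
generic = (64, 219 - 64)
cardioclassifier = (156, 219 - 156)
p_value = fisher_exact_two_sided(*generic, *cardioclassifier)
print(f"two-sided Fisher's exact P = {p_value:.2e}")
```

Exact integer arithmetic via `math.comb` avoids the overflow that factorial-based formulas run into at these sample sizes.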
Sequence-based variation in gene expression is a key driver of disease risk. Common variants regulating expression in cis have been mapped in many expression quantitative trait locus (eQTL) studies, typically in single tissues from unrelated individuals. Here, we present a comprehensive analysis of gene expression across multiple tissues conducted in a large set of mono- and dizygotic twins that allows systematic dissection of genetic (cis and trans) and non-genetic effects on gene expression. Using identity-by-descent estimates, we show that at least 40% of the total heritable cis effect on expression cannot be accounted for by common cis variants, a finding that reveals the contribution of low-frequency and rare regulatory variants with respect to both transcriptional regulation and complex trait susceptibility. We show that a substantial proportion of gene expression heritability is trans to the structural gene, and we identify several replicating trans variants that act predominantly in a tissue-restricted manner and may regulate the transcription of many genes.
Epigenetic modifications such as DNA methylation play a key role in gene regulation and disease susceptibility. However, little is known about the genome-wide frequency, localization, and function of methylation variation and how it is regulated by genetic and environmental factors. We utilized the Multiple Tissue Human Expression Resource (MuTHER) and generated Illumina 450K adipose methylome data from 648 twins. We found that individual CpGs had low variance and that variability was suppressed in promoters. We noted that DNA methylation variation was highly heritable (median h² = 0.34) and that shared environmental effects correlated with metabolic phenotype-associated CpGs. Analysis of methylation quantitative-trait loci (metQTL) revealed that 28% of CpGs were associated with nearby SNPs, and when overlapping them with adipose expression quantitative-trait loci (eQTL) from the same individuals, we found that 6% of the loci played a role in regulating both gene expression and DNA methylation. These associations were bidirectional, but there were pronounced negative associations for promoter CpGs. Integration of metQTL with adipose reference epigenomes and disease associations revealed significant enrichment of metQTL overlapping metabolic-trait or disease loci in enhancers (the strongest effects were for high-density lipoprotein cholesterol and body mass index [BMI]). We followed up with the BMI SNP rs713586, a cg01884057 metQTL that overlaps an enhancer upstream of ADCY3, and used bisulphite sequencing to refine this region. Our results showed widespread population invariability yet sequence dependence on adipose DNA methylation but that incorporating maps of regulatory elements aids in linking CpG variation to gene regulation and disease risk in a tissue-dependent manner.
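As a rough illustration of how a twin design yields a heritability estimate like the one quoted above, the sketch below uses Falconer's classical approximation, h² ≈ 2(r_MZ − r_DZ), where r_MZ and r_DZ are the within-pair correlations of a trait in monozygotic and dizygotic twins. This is an assumption made for illustration: the abstract does not state which model was fitted, and the correlations in the example are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between paired measurements (twin 1 vs. twin 2)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def falconer_h2(r_mz, r_dz):
    """Falconer's estimate of narrow-sense heritability, clamped to [0, 1]."""
    return min(1.0, max(0.0, 2.0 * (r_mz - r_dz)))


# Hypothetical within-pair correlations for methylation at a single CpG.
h2 = falconer_h2(r_mz=0.50, r_dz=0.33)
print(f"h2 = {h2:.2f}")
```

In practice the correlations would come from `pearson_r` applied per CpG to the MZ and DZ pairs, and a variance-components model would usually be preferred because it also separates shared environmental effects.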
Accurate discrimination of benign and pathogenic rare variation remains a priority for clinical genome interpretation. State-of-the-art machine learning variant prioritization tools are imprecise and ignore important parameters defining gene–disease relationships, e.g., distinct consequences of gain-of-function versus loss-of-function variants. We hypothesized that incorporating disease-specific information would improve tool performance.
We developed a disease-specific variant classifier, CardioBoost, that estimates the probability of pathogenicity for rare missense variants in inherited cardiomyopathies and arrhythmias. We assessed CardioBoost’s ability to discriminate known pathogenic from benign variants, prioritize disease-associated variants, and stratify patient outcomes.
CardioBoost has high global discrimination accuracy (precision recall area under the curve [AUC] 0.91 for cardiomyopathies; 0.96 for arrhythmias), outperforming existing tools (4–24% improvement). CardioBoost obtains excellent accuracy (cardiomyopathies 90.2%; arrhythmias 91.9%) for variants classified with >90% confidence, and increases the proportion of variants classified with high confidence more than twofold compared with existing tools. Variants classified as disease-causing are associated with both disease status and clinical severity, including a 21% increased risk (95% confidence interval [CI] 11–29%) of severe adverse outcomes by age 60 in patients with hypertrophic cardiomyopathy.
A disease-specific variant classifier outperforms state-of-the-art genome-wide tools for rare missense variants in inherited cardiac conditions (https://www.cardiodb.org/cardioboost/), highlighting broad opportunities for improved pathogenicity prediction through disease specificity.
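The precision-recall AUC used above as the headline metric can be estimated from ranked classifier scores. The sketch below computes average precision, one common step-wise estimator of that area; the labels and scores are hypothetical, and this is not presented as the evaluation code used for CardioBoost.

```python
def average_precision(labels, scores):
    """Average precision: mean of precision values at each true positive.

    labels: 1 for pathogenic, 0 for benign.
    scores: predicted probability of pathogenicity (higher = more pathogenic).
    Tied scores are ranked in input order; no tie correction is applied.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("at least one positive label is required")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, running = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            running += hits / rank  # precision at this recall step
    return running / total_pos


# Hypothetical variants: two pathogenic, two benign.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
print(round(average_precision(labels, scores), 3))
```

Precision-recall summaries are often preferred over ROC AUC when pathogenic variants are a small minority, because they are not inflated by the large number of true negatives.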
While there have been studies exploring regulatory variation in one or more tissues, the complexity of tissue-specificity in multiple primary tissues is not yet well understood. We explore in depth the role of cis-regulatory variation in three human tissues: lymphoblastoid cell lines (LCL), skin, and fat. The samples (156 LCL, 160 skin, 166 fat) were derived simultaneously from a subset of well-phenotyped healthy female twins of the MuTHER resource. We discover an abundance of cis-eQTLs in each tissue similar to previous estimates (858 or 4.7% of genes). In addition, we apply factor analysis (FA) to remove effects of latent variables, thus more than doubling the number of our discoveries (1,822 eQTL genes). The unique study design (Matched Co-Twin Analysis--MCTA) permits immediate replication of eQTLs using co-twins (93%-98%) and validation of the considerable gain in eQTL discovery after FA correction. We highlight the challenges of comparing eQTLs between tissues. After verifying previous significance threshold-based estimates of tissue-specificity, we show their limitations given their dependency on statistical power. We propose that continuous estimates of the proportion of tissue-shared signals and direct comparison of the magnitude of effect on the fold change in expression are essential properties that jointly provide a biologically realistic view of tissue-specificity. Under this framework we demonstrate that 30% of eQTLs are shared among the three tissues studied, while another 29% appear exclusively tissue-specific. However, even among the shared eQTLs, a substantial proportion (10%-20%) have significant differences in the magnitude of fold change between genotypic classes across tissues. Our results underline the need to account for the complexity of eQTL tissue-specificity in an effort to assess consequences of such variants for complex traits.
International guidelines for variant interpretation in Mendelian disease set stringent criteria to report a variant as (likely) pathogenic, prioritising control of false-positive rate over test sensitivity and diagnostic yield. Genetic testing is also more likely informative in individuals with well-characterised variants from extensively studied European-ancestry populations. Inherited cardiomyopathies are relatively common Mendelian diseases that allow empirical calibration and assessment of this framework.
We compared rare variants in large hypertrophic cardiomyopathy (HCM) cohorts (up to 6179 cases) to reference populations to identify variant classes with high prior likelihoods of pathogenicity, as defined by etiological fraction (EF). We analysed the distribution of variants using a bespoke unsupervised clustering algorithm to identify gene regions in which variants are significantly clustered in cases.
Analysis of variant distribution identified regions in which variants are significantly enriched in cases and variant location was a better discriminator of pathogenicity than generic computational functional prediction algorithms. Non-truncating variant classes with an EF ≥ 0.95 were identified in five established HCM genes. Applying this approach leads to an estimated 14-20% increase in cases with actionable HCM variants, i.e. variants classified as pathogenic/likely pathogenic that might be used for predictive testing in probands' relatives.
When found in a patient confirmed to have disease, novel variants in some genes and regions are empirically shown to have a sufficiently high probability of pathogenicity to support a "likely pathogenic" classification, even without additional segregation or functional data. This could increase the yield of high confidence actionable variants, consistent with the framework and recommendations of current guidelines. The techniques outlined offer a consistent and unbiased approach to variant interpretation for Mendelian disease genetic testing. We propose adaptations to ACMG/AMP guidelines to incorporate such evidence in a quantitative and transparent manner.
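The etiological fraction (EF) threshold used above can be made concrete with a short calculation. The sketch below assumes the usual epidemiological definition, EF = (OR − 1)/OR, where OR is the odds ratio for carrying a variant of the given class in cases versus the reference population. The cohort sizes come from the abstracts in this collection, while the carrier counts are hypothetical.

```python
def odds_ratio(case_carriers, case_total, control_carriers, control_total):
    """Odds ratio for carrying a variant of a given class (cases vs. controls)."""
    case_non = case_total - case_carriers
    control_non = control_total - control_carriers
    return (case_carriers * control_non) / (case_non * control_carriers)


def etiological_fraction(or_value):
    """Estimated proportion of case variants that are causal: (OR - 1) / OR."""
    if or_value <= 0:
        raise ValueError("odds ratio must be positive")
    return (or_value - 1.0) / or_value


# Hypothetical carrier counts for one variant class in one gene region.
or_value = odds_ratio(case_carriers=120, case_total=6179,
                      control_carriers=30, control_total=60706)
ef = etiological_fraction(or_value)
print(f"OR = {or_value:.1f}, EF = {ef:.3f}")
```

Under this definition, an EF of at least 0.95 corresponds to an odds ratio of at least 20, which shows how strongly enriched in cases a variant class must be before a novel variant in it is likely to be causal.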
Most genome-wide methylation studies (EWAS) of multifactorial disease traits use targeted arrays or enrichment methodologies preferentially covering CpG-dense regions, to characterize sufficiently large samples. To overcome this limitation, we present here a new customizable, cost-effective approach, methylC-capture sequencing (MCC-Seq), for sequencing functional methylomes, while simultaneously providing genetic variation information. To illustrate MCC-Seq, we use whole-genome bisulfite sequencing on adipose tissue (AT) samples and public databases to design AT-specific panels. We establish its efficiency for high-density interrogation of methylome variability by systematic comparisons with other approaches and demonstrate its applicability by identifying novel methylation variation within enhancers strongly correlated to plasma triglyceride and HDL-cholesterol, including at CD36. Our more comprehensive AT panel assesses tissue methylation and genotypes in parallel at ∼4 and ∼3 M sites, respectively. Our study demonstrates that MCC-Seq provides comparable accuracy to alternative approaches but enables more efficient cataloguing of functional and disease-relevant epigenetic and genetic variants for large-scale EWAS.