Motivation: Genome-wide association studies (GWASs) have been widely used to map loci contributing to variation in complex traits and risk of diseases in humans. Accurate specification of familial relationships is crucial for family-based GWAS, as well as in population-based GWAS with unknown (or unrecognized) family structure. The family structure in a GWAS should be routinely investigated using the SNP data prior to the analysis of population structure or phenotype. Existing algorithms for relationship inference have a major weakness of estimating allele frequencies at each SNP from the entire sample, under a strong assumption of homogeneous population structure. This assumption is often untenable. Results: Here, we present a rapid algorithm for relationship inference using high-throughput genotype data typical of GWAS that allows the presence of unknown population substructure. The relationship of any pair of individuals can be precisely inferred by robust estimation of their kinship coefficient, independent of sample composition or population structure (sample invariance). We present simulation experiments to demonstrate that the algorithm has sufficient power to provide reliable inference on millions of unrelated pairs and thousands of relative pairs (up to 3rd-degree relationships). Application of our robust algorithm to HapMap and GWAS datasets demonstrates that it performs properly even under extreme population stratification, while algorithms assuming a homogeneous population give systematically biased results. Our extremely efficient implementation performs relationship inference on millions of pairs of individuals in a matter of minutes, dozens of times faster than the most efficient existing algorithm known to us. Availability: Our robust relationship inference algorithm is implemented in a freely available software package, KING, available for download at http://people.virginia.edu/∼wc9c/KING.
Contact: wmchen@virginia.edu Supplementary information: Supplementary data are available at Bioinformatics online.
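The abstract does not state the kinship estimator itself. As a minimal sketch, assuming a heterozygote-count estimator that needs no allele frequencies (the exact formula, the 0/1/2 genotype coding, and the power-of-two cutoffs are assumptions here, not taken from the text above), pairwise inference could look like this:

```python
def robust_kinship(g_i, g_j):
    """Estimate a kinship coefficient from two genotype vectors.

    Genotypes are coded as minor-allele counts 0/1/2; None marks a missing call.
    Assumed form: only heterozygote counts and opposite-homozygote counts of
    the pair itself are used, so no sample-wide allele frequency is needed.
    """
    het_i = het_j = het_both = opposite_hom = 0
    for a, b in zip(g_i, g_j):
        if a is None or b is None:
            continue  # use only SNPs genotyped in both individuals
        het_i += (a == 1)
        het_j += (b == 1)
        het_both += (a == 1 and b == 1)
        opposite_hom += (abs(a - b) == 2)  # AA in one, aa in the other
    n_min = min(het_i, het_j)
    if n_min == 0:
        raise ValueError("no heterozygous SNPs; estimator undefined")
    return ((het_both - 2 * opposite_hom) / (2 * n_min)
            + 0.5 - (het_i + het_j) / (4 * n_min))


def infer_degree(phi):
    """Map a kinship estimate to a relationship class (assumed cutoffs)."""
    cutoffs = [
        (2 ** -1.5, "duplicate/MZ twin"),
        (2 ** -2.5, "1st-degree"),
        (2 ** -3.5, "2nd-degree"),
        (2 ** -4.5, "3rd-degree"),
    ]
    for cutoff, label in cutoffs:
        if phi > cutoff:
            return label
    return "unrelated"
```

Because every quantity is computed from the two individuals alone, the estimate for a pair does not change when other samples are added or removed, which is the sample-invariance property the abstract describes.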
The genetic basis of left ventricular (LV) image-derived phenotypes, which play a vital role in the diagnosis, management, and risk stratification of cardiovascular diseases, is unclear at present.
The LV parameters were measured from the cardiovascular magnetic resonance studies of the UK Biobank. Genotyping was done using Affymetrix arrays, augmented by imputation. We performed genome-wide association studies of 6 LV traits: LV end-diastolic volume, LV end-systolic volume, LV stroke volume, LV ejection fraction, LV mass, and LV mass to end-diastolic volume ratio. The replication analysis was performed in the MESA study (Multi-Ethnic Study of Atherosclerosis). We identified the candidate genes at genome-wide significant loci based on the evidence from extensive bioinformatic analyses. Polygenic risk scores were constructed from the summary statistics of LV genome-wide association studies to predict the heart failure events.
The study comprised 16 923 European UK Biobank participants (mean age 62.5 years; 45.8% men) without prevalent myocardial infarction or heart failure. We discovered 14 genome-wide significant loci (3 loci each for LV end-diastolic volume, LV end-systolic volume, and LV mass to end-diastolic volume ratio; 4 loci for LV ejection fraction, and 1 locus for LV mass) at a stringent P<1×10^[…]. Three loci were replicated at Bonferroni significance and 7 loci at nominal significance (P<0.05 with concordant direction of effect) in the MESA study (n=4383). Follow-up bioinformatic analyses identified 28 candidate genes that were enriched in the cardiac developmental pathways and regulation of the LV contractile mechanism. Eight genes supported by at least 2 independent lines of in silico evidence were implicated in the cardiac morphogenesis and heart failure development. The polygenic risk scores of LV phenotypes were predictive of heart failure in a holdout UK Biobank sample of 3106 cases and 224 134 controls (odds ratio 1.41, 95% CI 1.26 - 1.58, for the top quintile versus the bottom quintile of the LV end-systolic volume risk score).
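To make the quintile comparison concrete, the sketch below computes an unadjusted odds ratio with a Wald-type 95% confidence interval on the log scale from a 2×2 table. The function name and the counts in the usage note are hypothetical, and the published estimate may well come from an adjusted regression model rather than this simple calculation.

```python
import math


def odds_ratio_ci(cases_top, controls_top, cases_bottom, controls_bottom, z=1.96):
    """Unadjusted odds ratio (top vs bottom quintile) with a Wald CI.

    The standard error of log(OR) is the square root of the summed
    reciprocals of the four cell counts.
    """
    odds_ratio = (cases_top * controls_bottom) / (controls_top * cases_bottom)
    se_log_or = math.sqrt(
        1 / cases_top + 1 / controls_top + 1 / cases_bottom + 1 / controls_bottom
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

For example, with made-up counts, `odds_ratio_ci(20, 80, 10, 90)` gives an odds ratio of 2.25; an interval that excludes 1 corresponds to a nominally significant association.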
We report 14 genetic loci and indicate several candidate genes that not only enhance our understanding of the genetic architecture of prognostically important LV phenotypes but also shed light on potential novel therapeutic targets for LV remodeling.
The promoter variant (rs35705950) and telomere length are linked to pulmonary fibrosis and CT-based qualitative assessments of interstitial abnormalities, but their associations with longitudinal quantitative changes of the lung interstitium among community-dwelling adults are unknown.
We used data from participants in the Multi-Ethnic Study of Atherosclerosis with high-attenuation areas (HAAs, Examinations 1-6 (2000-2018)) and genotype (n=4552) and telomere length (n=4488) assessments. HAA was defined as the per cent of imaged lung with attenuation of -600 to -250 Hounsfield units. We used linear mixed-effects models to examine associations of the risk allele (T) and telomere length with longitudinal changes in HAAs. Joint models were used to examine associations of longitudinal changes in HAAs with death and interstitial lung disease (ILD).
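The HAA definition above is a simple voxel fraction. A minimal sketch, assuming the imaged lung is already segmented into a flat sequence of Hounsfield-unit values and that both bounds are inclusive (the abstract does not say), could be:

```python
def percent_high_attenuation(hu_values, low=-600, high=-250):
    """Per cent of lung voxels with attenuation between low and high HU.

    hu_values: iterable of Hounsfield-unit values for segmented lung voxels.
    Inclusive bounds are an assumption of this sketch.
    """
    voxels = list(hu_values)
    if not voxels:
        raise ValueError("no lung voxels supplied")
    in_range = sum(1 for hu in voxels if low <= hu <= high)
    return 100.0 * in_range / len(voxels)
```

Under this definition, a change in HAA between examinations is a change in percentage points of imaged lung.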
The risk allele (T) was associated with an absolute change in HAAs of 2.60% (95% CI 0.36% to 4.86%) per 10 years overall. This association was stronger among those with a telomere length below an age-adjusted percentile of 5% (p value for interaction=0.008). A 1% increase in HAAs per year was associated with a 7% increase in mortality risk (rate ratio (RR)=1.07, 95% CI 1.02 to 1.12) for overall death and a 34% increase in ILD (RR=1.34, 95% CI 1.20 to 1.50). Longer baseline telomere length was cross-sectionally associated with lower HAAs on baseline scans, but not with longitudinal changes in HAAs.
Longitudinal increases in HAAs were associated with the risk allele and a higher risk of death and ILD.
Polygenic risk scores (PRS) are valuable for translating the results of genome-wide association studies (GWAS) into clinical practice. To date, most GWAS have been based on individuals of European ancestry, leading to poor performance in populations of non-European ancestry.
We introduce the polygenic transcriptome risk score (PTRS), which is based on predicted transcript levels (rather than SNPs), and explore the portability of PTRS across populations using UK Biobank data.
We show that PTRS has significantly higher portability than PRS (Wilcoxon p=0.013) in the African-descent samples, where the loss of performance is most acute, and that PTRS used in combination with PRS performs better than PRS alone.
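The idea of scoring predicted transcript levels rather than SNPs can be sketched in two steps: predict each gene's expression from genotype dosages, then take a weighted sum over genes. The linear form of both steps, the function names, and all weights below are illustrative assumptions, not the authors' fitted models.

```python
def predict_expression(dosages, snp_weights):
    """Predicted transcript level as a weighted sum of SNP dosages (assumed linear)."""
    return sum(weight * dosages[snp] for snp, weight in snp_weights.items())


def ptrs(dosages, expression_models, gene_weights):
    """Polygenic transcriptome risk score: weighted sum over predicted transcripts.

    dosages: {snp_id: dosage in [0, 2]}
    expression_models: {gene: {snp_id: weight}} (hypothetical prediction models)
    gene_weights: {gene: effect of predicted expression on the trait}
    """
    return sum(
        gene_weights[gene] * predict_expression(dosages, expression_models[gene])
        for gene in gene_weights
    )
```

A conventional PRS would instead sum SNP effect sizes times dosages directly; here the SNPs only enter through the gene-level predictions.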
Whereas several interventions can effectively lower lipid levels in people at risk for atherosclerotic cardiovascular disease (ASCVD), cardiovascular event risks remain, suggesting an unmet medical need to identify factors contributing to cardiovascular event risk. Monocytes and macrophages play central roles in atherosclerosis, but studies have yet to provide a detailed view of macrophage populations involved in increased ASCVD risk.
A novel macrophage foaming analytics tool, AtheroSpectrum, was developed using 2 quantitative indices depicting lipid metabolism and the inflammatory status of macrophages. A machine learning algorithm was developed to analyze gene expression patterns in the peripheral monocyte transcriptome of MESA participants (Multi-Ethnic Study of Atherosclerosis; set 1; n=911). A list of 30 genes was generated and integrated with traditional risk factors to create an ASCVD risk prediction model (30-gene cardiovascular disease risk score, CR-30), which was subsequently validated in the remaining MESA participants (set 2; n=228); performance of CR-30 was also tested in 2 independent human atherosclerotic tissue transcriptome data sets (GTEx [Genotype-Tissue Expression] and GSE43292).
Using single-cell transcriptomic profiles (GSE97310, GSE116240, GSE97941, and FR-FCM-Z23S), AtheroSpectrum detected 2 distinct programs in plaque macrophages, homeostatic foaming and inflammatory pathogenic foaming, the latter of which was positively associated with severity of atherosclerosis in multiple studies. A pool of 2209 pathogenic foaming genes was extracted and screened to select a subset of 30 genes correlated with cardiovascular event in MESA set 1. A cardiovascular disease risk score model (CR-30) was then developed by incorporating this gene set with traditional variables sensitive to cardiovascular event in MESA set 1 after cross-validation generalizability analysis. The performance of CR-30 was then tested in MESA set 2 (P=2.60×10^[…]; area under the receiver operating characteristic curve, 0.742) and 2 independent data sets (GTEx: P=7.32×10^[…]; area under the receiver operating characteristic curve, 0.664; GSE43292: P=7.04×10^[…]; area under the receiver operating characteristic curve, 0.633). Model sensitivity tests confirmed the contribution of the 30-gene panel to the prediction model (likelihood ratio test; […]=31, P=0.03).
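The area under the receiver operating characteristic curve reported for each data set can be read as the probability that a randomly chosen case receives a higher risk score than a randomly chosen control. A minimal sketch of that pairwise computation follows; the function name and the scores in the usage note are hypothetical, and this is not the authors' evaluation code.

```python
def roc_auc(case_scores, control_scores):
    """AUC as the fraction of case/control pairs ranked correctly.

    Ties between a case and a control count as half a correct pair.
    """
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    correct = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                correct += 1.0
            elif case == control:
                correct += 0.5
    return correct / (len(case_scores) * len(control_scores))
```

With made-up scores, `roc_auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.4, 0.1])` returns 0.875; a value of 0.5 would indicate no discrimination.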
Our novel computational program (AtheroSpectrum) identified a specific gene expression profile associated with inflammatory macrophage foam cells. A subset of 30 genes expressed in circulating monocytes jointly contributed to prediction of symptomatic atherosclerotic vascular disease. Incorporating a pathogenic foaming gene set with known risk factors can significantly strengthen the power to predict ASCVD risk. Our programs may facilitate both mechanistic investigations and development of therapeutic and prognostic strategies for ASCVD risk.
Higher serum 25-hydroxyvitamin D (25(OH)D) concentrations are positively associated with pulmonary function. Investigating genome-wide interactions with 25(OH)D may reveal new biological insights into pulmonary function.
We aimed to identify novel genetic variants associated with pulmonary function by accounting for 25(OH)D interactions.
We included 211,264 participants from the observational United Kingdom Biobank study with pulmonary function tests (PFTs), genome-wide genotypes, and 25(OH)D concentrations from 4 ancestral backgrounds—European, African, East Asian, and South Asian. Among PFTs, we focused on forced expiratory volume in the first second (FEV1) and forced vital capacity (FVC) because both were previously associated with 25(OH)D. We performed genome-wide association study (GWAS) analyses that accounted for variant×25(OH)D interaction using the joint 2 degree-of-freedom (2df) method, stratified by participants’ smoking history and ancestry, and meta-analyzed results. We evaluated interaction effects to determine how variant-PFT associations were modified by 25(OH)D concentrations and conducted pathway enrichment analysis to examine the biological relevance of our findings.
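A joint 2 degree-of-freedom test asks whether the variant main effect and the variant×25(OH)D interaction effect are simultaneously zero. One common way to do this, sketched here as an assumption rather than as the authors' exact procedure, is a Wald statistic built from the two estimates and their 2×2 covariance matrix, referred to a chi-square distribution with 2 degrees of freedom.

```python
import math


def joint_2df_test(beta_main, beta_int, var_main, var_int, cov):
    """Joint Wald test of a main effect and an interaction effect.

    Computes W = b' V^{-1} b for b = (beta_main, beta_int) and covariance V,
    using the closed-form inverse of a 2x2 matrix. For 2 degrees of freedom
    the chi-square upper-tail probability is exactly exp(-W / 2).
    """
    det = var_main * var_int - cov ** 2
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    wald = (
        beta_main ** 2 * var_int
        - 2 * beta_main * beta_int * cov
        + beta_int ** 2 * var_main
    ) / det
    p_value = math.exp(-wald / 2)
    return wald, p_value
```

The joint test can flag a variant whose effect only appears at certain 25(OH)D concentrations, which a test of the main effect alone may miss.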
Our GWAS meta-analyses, accounting for interaction with 25(OH)D, revealed 30 genetic variants significantly associated with FEV1 or FVC (P2df<5.00×10^-8) that were not previously reported for PFT-related traits. These novel variant signals were enriched in lung function-relevant pathways, including the p38 MAPK pathway. Among variants with genome-wide-significant 2df results, smoking-stratified meta-analyses identified 5 variants with 25(OH)D interactions that influenced FEV1 in both smoking groups (never smokers P1df interaction<2.65×10^-4; ever smokers P1df interaction<1.71×10^-5); rs3130553, rs2894186, rs79277477, and rs3130929 associations were only evident in never smokers, and the rs4678408 association was only found in ever smokers.
Genetic variant associations with lung function can be modified by 25(OH)D, and smoking history can further modify variant×25(OH)D interactions. These results expand the known genetic architecture of pulmonary function and add evidence that gene-environment interactions, including with 25(OH)D and smoking, influence lung function.
To investigate whether hyperpolarised xenon-129 MRI (HXeMRI) enables regional and physiological resolution of diffusing capacity limitations in chronic obstructive pulmonary disease (COPD), we evaluated 34 COPD subjects and 11 healthy volunteers. We report significant correlations between airflow abnormality quantified by HXeMRI and per cent predicted forced expiratory volume in 1 s; HXeMRI gas transfer capacity to red blood cells and carbon monoxide diffusion capacity (%DLCO); and HXeMRI gas transfer capacity to interstitium and per cent emphysema quantified by multidetector chest CT. We further demonstrate the capability of HXeMRI to distinguish varying pathology underlying COPD in subjects with low %DLCO and minimal emphysema.
Chronic obstructive pulmonary disease (COPD) has been associated with numerous genetic variants, yet the extent to which its genetic risk is mediated by variation in lung structure remains unknown.
To characterize associations between a genetic risk score (GRS) associated with COPD susceptibility and lung structure on computed tomography (CT).
We analyzed data from MESA Lung (Multi-Ethnic Study of Atherosclerosis Lung Study), a U.S. general population-based cohort, and SPIROMICS (Subpopulations and Intermediate Outcome Measures in COPD Study). A weighted GRS was calculated from 83 SNPs that were previously associated with lung function. Lung density, spatially matched airway dimensions, and airway counts were assessed on full-lung CT. Generalized linear models were adjusted for age, age squared, sex, height, principal components of genetic ancestry, smoking status, pack-years, CT model, milliamperes, and total lung volume.
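A weighted GRS of this kind is typically a sum of risk-allele counts, each multiplied by a per-SNP weight taken from earlier association studies. The sketch below assumes that form; the function name and the values in the usage note are placeholders, not the 83 published SNP weights.

```python
def weighted_grs(risk_allele_counts, weights):
    """Weighted genetic risk score: sum of weight x risk-allele count.

    risk_allele_counts: per-SNP counts (0, 1, or 2; imputed dosages also work).
    weights: per-SNP effect sizes, in the same SNP order.
    """
    if len(risk_allele_counts) != len(weights):
        raise ValueError("counts and weights must cover the same SNPs")
    return sum(w * c for w, c in zip(weights, risk_allele_counts))
```

For instance, `weighted_grs([2, 1, 0], [0.1, 0.2, 0.3])` is approximately 0.4; the score is then used as a single covariate in the regression models described above.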
MESA Lung and SPIROMICS contributed 2,517 and 2,339 participants, respectively. Higher GRS was associated with lower lung function and increased COPD risk, as well as lower lung density, smaller airway lumens, and fewer small airways, without effect modification by smoking. Adjustment for CT lung structure, particularly small airway measures, attenuated associations between the GRS and FEV1/FVC by 100% and 60% in MESA and SPIROMICS, respectively. Lung structure (P < 0.0001), but not the GRS (P > 0.10), improved discrimination of moderate-to-severe COPD cases relative to clinical factors alone.
A GRS associated with COPD susceptibility was associated with CT lung structure. Lung structure may be an important mediator of heritability and determinant of personalized COPD risk.
The identification of quantitative trait loci (QTL) and their interactions is a crucial step toward the discovery of genes responsible for variation in experimental crosses. The problem is best viewed as one of model selection, and the most important aspect of the problem is the comparison of models of different sizes. We present a penalized likelihood approach, with penalties on QTL and pairwise interactions chosen to control false positive rates. This extends the work of Broman and Speed to allow for pairwise interactions among QTL. A conservative version of our penalized LOD score provides strict control over the rate of extraneous QTL and interactions; a more liberal criterion is more lenient on interactions but seeks to maintain control over the rate of inclusion of false loci. The key advance is that one needs only to specify a target false positive rate rather than a prior on the number of QTL and interactions. We illustrate the use of our model selection criteria as exploratory tools; simulation studies demonstrate reasonable power to detect QTL. Our liberal criterion is comparable in power to two Bayesian approaches.
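The comparison of models of different sizes can be sketched as a LOD score reduced by a fixed penalty per main-effect QTL and per pairwise interaction. The single interaction penalty, the dictionary layout, and the numeric penalties in the usage note are assumptions for illustration; the abstract does not give the penalty values, and the liberal criterion it mentions treats interactions more leniently than this simple form.

```python
def penalized_lod(lod, n_main, n_interactions, main_penalty, interaction_penalty):
    """LOD score minus a penalty for each QTL and each pairwise interaction."""
    return lod - main_penalty * n_main - interaction_penalty * n_interactions


def best_model(candidates, main_penalty, interaction_penalty):
    """Pick the candidate model with the largest penalized LOD.

    candidates: list of dicts with keys "lod", "n_main", "n_int"
    (a hypothetical layout chosen for this sketch).
    """
    return max(
        candidates,
        key=lambda m: penalized_lod(
            m["lod"], m["n_main"], m["n_int"], main_penalty, interaction_penalty
        ),
    )
```

With illustrative penalties of 3.5 per QTL and 4.3 per interaction, a 2-QTL additive model with LOD 8 scores 1.0, while a 3-QTL model with one interaction and LOD 12 scores about -2.8, so the smaller model is preferred. In practice the penalties would be derived from the target false positive rate, for example from permutation thresholds.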