Relatives provide the basic material for the study of inheritance of human disease. However, the methodologies for the estimation of heritability and the interpretation of the results have been controversial. The debate arises from the plethora of methods used, the validity of the methodological assumptions and the inconsistent and sometimes erroneous genetic interpretations made. We will discuss how to estimate disease heritability, how to interpret it, how biases in heritability estimates arise and how heritability relates to other measures of familial disease aggregation.
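As an illustration of how disease heritability can be estimated from relatives, the sketch below uses Falconer's classical liability-threshold approximation. It is not taken from the article itself: the function name, the argument names and the example prevalences are all hypothetical, and the approximation assumes normally distributed liability and no shared environment between relatives.

```python
from statistics import NormalDist


def falconer_heritability(pop_prevalence, relative_prevalence, relatedness):
    """Approximate liability-scale heritability (Falconer's method).

    pop_prevalence      -- disease prevalence in the general population
    relative_prevalence -- prevalence among relatives of affected probands
    relatedness         -- coefficient of relationship (0.5 for first-degree)
    """
    std_normal = NormalDist()
    # Liability thresholds implied by each prevalence.
    x_pop = std_normal.inv_cdf(1.0 - pop_prevalence)
    x_rel = std_normal.inv_cdf(1.0 - relative_prevalence)
    # Mean liability of affected individuals in the general population.
    mean_liability_affected = std_normal.pdf(x_pop) / pop_prevalence
    # Regression of relatives' liability on probands' liability.
    regression = (x_pop - x_rel) / mean_liability_affected
    return regression / relatedness


if __name__ == "__main__":
    # Hypothetical numbers: 1% population prevalence, 5% in siblings.
    print(round(falconer_heritability(0.01, 0.05, 0.5), 3))
```

Because any environment shared by relatives is attributed to genes in this calculation, an estimate obtained this way is best read as an upper bound.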
Genome-wide association studies (GWASs) have become the focus of the statistical analysis of complex traits in humans, successfully shedding light on several aspects of genetic architecture and biological aetiology. Single-nucleotide polymorphisms (SNPs) are usually modelled as having additive, cumulative and independent effects on the phenotype. Although evidently a useful approach, it is often argued that this is not a realistic biological model and that epistasis (that is, the statistical interaction between SNPs) should be included. The purpose of this Review is to summarize recent directions in methodology for detecting epistasis and to discuss evidence of the role of epistasis in human complex trait variation. We also discuss the relevance of epistasis in the context of GWASs and potential hazards in the interpretation of statistical interaction terms.
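The contrast between the additive model and a model with epistasis can be written for two SNPs as a minimal sketch. The notation is an assumption made here for illustration and is not taken from the Review: x_{i1} and x_{i2} are allele counts (0, 1 or 2) of individual i at the two SNPs, and y_i is the phenotype.

```latex
% Two-SNP model with a statistical interaction term (illustrative notation).
y_i = \mu + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_{12}\, x_{i1} x_{i2} + \varepsilon_i
% The usual additive GWAS model is the special case \beta_{12} = 0;
% a test for epistasis in this parameterisation is a test of H_0 : \beta_{12} = 0.
```

The value of the interaction term depends on how the genotypes and the phenotype scale are coded, which is one reason statistical interaction terms need careful interpretation.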
The heritability of Major Depressive Disorder (MDD) has been estimated at 37% based largely on twin studies that rely on contested assumptions. More recently, the heritability of MDD has been estimated on large populations from registries such as the Swedish, Finnish, and Chinese cohorts. Family-based designs utilise a number of different relationships and provide an alternative means of estimating heritability. Generation Scotland: Scottish Family Health Study (GS:SFHS) is a large (n = 20,198), family-based population study designed to identify the genetic determinants of common diseases, including Major Depressive Disorder. Two thousand seven hundred and six individuals were SCID diagnosed with MDD, 13.5% of the cohort, from which we inferred a population prevalence of 12.2% (95% credible interval: 11.4% to 13.1%). Increased risk of MDD was associated with being female, being unemployed due to a disability, being a current smoker or a former drinker, and living in areas of greater social deprivation. The heritability of MDD in GS:SFHS was between 28% and 44%, estimated from a pedigree model. The genetic correlations of MDD between sexes, ages of onset, and illness courses were examined and were strong. The genetic correlation between males and females with MDD was 0.75 (0.43 to 0.99); between earlier (≤ age 40) and later (> age 40) onset was 0.85 (0.66 to 0.98); and between single and recurrent episodic illness course was 0.87 (0.72 to 0.98). We found that the heritability of recurrent MDD illness course was significantly greater than the heritability of single MDD illness course. The study confirms a moderate genetic contribution to depression, with a small contribution of the common family environment (variance proportion = 0.07, CI: 0.01 to 0.15), and supports the relationship of MDD with previously identified risk factors. This study did not find robust support for genetic differences in MDD due to sex, age of onset, or illness course.
However, we found an intriguing difference in heritability between recurrent and single MDD illness course. These findings establish GS:SFHS as a valuable cohort for the genetic investigation of MDD.
Genome-wide association studies have detected many loci underlying susceptibility to disease, but most of the genetic factors that contribute to disease susceptibility remain unknown. Here we provide evidence that part of the 'missing heritability' can be explained by an overestimation of heritability. We estimated the heritability of 12 complex human diseases using family history of disease in 1,555,906 individuals of white ancestry from the UK Biobank. Estimates using simple family-based statistical models were inflated on average by ∼47% when compared with those from structural equation modeling (SEM), which specifically accounted for shared familial environmental factors. In addition, heritabilities estimated using SNP data explained an average of 44.2% of the simple family-based estimates across diseases and an average of 57.3% of the SEM-estimated heritabilities, accounting for almost all of the SEM heritability for hypertension. Our results show that both genetics and familial environment make substantial contributions to familial clustering of disease.
Pedigree-based analyses of intelligence have reported that genetic differences account for 50-80% of the phenotypic variation. For personality traits these effects are smaller, with 34-48% of the variance being explained by genetic differences. However, molecular genetic studies using unrelated individuals typically report a heritability estimate of around 30% for intelligence and between 0 and 15% for personality variables. Pedigree-based estimates and molecular genetic estimates may differ because current genotyping platforms are poor at tagging causal variants, variants with low minor allele frequency, copy number variants, and structural variants. Using ~20,000 individuals in the Generation Scotland family cohort genotyped for ~700,000 single-nucleotide polymorphisms (SNPs), we exploit the high levels of linkage disequilibrium (LD) found in members of the same family to quantify the total effect of genetic variants that are not tagged in GWAS of unrelated individuals. In our models, genetic variants in low LD with genotyped SNPs explain over half of the genetic variance in intelligence, education, and neuroticism. By capturing these additional genetic effects our models closely approximate the heritability estimates from twin studies for intelligence and education, but not for neuroticism and extraversion. We then replicated our finding using imputed molecular genetic data from unrelated individuals to show that ~50% of differences in intelligence, and ~40% of the differences in education, can be explained by genetic effects when a larger number of rare SNPs are included. From an evolutionary genetic perspective, a substantial contribution of rare genetic variants to individual differences in intelligence and education is consistent with mutation-selection balance.
Genome-wide DNA methylation (DNAm) profiling has allowed for the development of molecular predictors for a multitude of traits and diseases. Such predictors may be more accurate than the self-reported phenotypes and could have clinical applications.
Here, penalized regression models are used to develop DNAm predictors for ten modifiable health and lifestyle factors in a cohort of 5087 individuals. Using an independent test cohort comprising 895 individuals, the proportion of phenotypic variance explained in each trait is examined for DNAm-based and genetic predictors. Receiver operating characteristic curves are generated to investigate the predictive performance of DNAm-based predictors, using dichotomized phenotypes. The relationship between DNAm scores and all-cause mortality (n = 212 events) is assessed via Cox proportional hazards models. DNAm predictors for smoking, alcohol, education, and waist-to-hip ratio are shown to predict mortality in multivariate models. The predictors show moderate discrimination of obesity, alcohol consumption, and HDL cholesterol. There is excellent discrimination of current smoking status, and poorer discrimination of college-educated individuals and those with high total cholesterol, LDL with remnant cholesterol, and total:HDL cholesterol ratios.
DNAm predictors correlate with lifestyle factors that are associated with health and mortality. They may supplement DNAm-based predictors of age to identify the lifestyle profiles of individuals and predict disease risk.
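The discrimination of a dichotomized phenotype by a continuous score, as summarised by receiver operating characteristic curves above, is usually reported as the area under the curve (AUC). The sketch below computes the AUC through its rank interpretation: the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half. The function name and the toy data are hypothetical and are not taken from the study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve for binary labels (1 = case, 0 = control)."""
    case_scores = [s for s, y in zip(scores, labels) if y == 1]
    control_scores = [s for s, y in zip(scores, labels) if y == 0]
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    # Compare every case with every control (quadratic, fine for a sketch).
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # ties count as half a win
    return wins / (len(case_scores) * len(control_scores))


if __name__ == "__main__":
    # Hypothetical predictor scores for two cases and two controls.
    toy_scores = [0.9, 0.4, 0.6, 0.2]
    toy_labels = [1, 1, 0, 0]
    print(roc_auc(toy_scores, toy_labels))  # 3 of 4 case-control pairs ranked correctly: 0.75
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of cases from controls.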
Interactions among loci or between genes and environmental factors make a substantial contribution to variation in complex traits such as disease susceptibility. Nonetheless, many studies that attempt to identify the genetic basis of complex traits ignore the possibility that loci interact. We argue that epistasis should be accounted for in complex trait studies; we critically assess current study designs for detecting epistasis and discuss how these might be adapted for use in additional populations, including humans.
The 10th anniversary 'Genomics of Common Diseases' meeting was held in Baltimore, September 25-28, 2016. Professor Chris Haley reports from the meeting on progress and challenges in the field.
Genome-wide association studies have successfully identified thousands of loci for a range of human complex traits and diseases. The proportion of phenotypic variance explained by significant associations is, however, limited. Given the same dense SNP panels, mixed model analyses capture a greater proportion of phenotypic variance than single SNP analyses but the total is generally still less than the genetic variance estimated from pedigree studies. Combining information from pedigree relationships and SNPs, we examined 16 complex anthropometric and cardiometabolic traits in a Scottish family-based cohort comprising up to 20,000 individuals genotyped for ~520,000 common autosomal SNPs. The inclusion of related individuals provides the opportunity to also estimate the genetic variance associated with pedigree as well as the effects of common family environment. Trait variation was partitioned into SNP-associated and pedigree-associated genetic variation, shared nuclear family environment, shared couple (partner) environment and shared full-sibling environment. Results demonstrate that trait heritabilities vary widely but, on average across traits, SNP-associated and pedigree-associated genetic effects each explain around half the genetic variance. For most traits the recently-shared environment of couples is also significant, accounting for ~11% of the phenotypic variance on average. On the other hand, the environment shared largely in the past by members of a nuclear family or by full-siblings has a more limited impact. Our findings point to appropriate models to use in future studies, as pedigree-associated genetic effects and couple environmental effects have seldom been taken into account in genotype-based analyses. Appropriate description of the trait variation could help in understanding the causes of intra-individual variation and in detecting contributing loci and environmental factors.
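The partition described above can be sketched as a sum of variance components. The symbols below are assumptions chosen for illustration and are not the notation of the study: g denotes SNP-associated genetic effects, k pedigree-associated genetic effects, f nuclear family environment, c couple environment, s full-sibling environment and e the residual.

```latex
% Phenotypic variance as a sum of the components named in the abstract.
V_P = \sigma^2_g + \sigma^2_k + \sigma^2_f + \sigma^2_c + \sigma^2_s + \sigma^2_e
% Proportions of phenotypic variance (illustrative definitions).
h^2_{\mathrm{SNP}} = \frac{\sigma^2_g}{V_P}, \qquad
h^2_{\mathrm{ped}} = \frac{\sigma^2_k}{V_P}, \qquad
h^2 = h^2_{\mathrm{SNP}} + h^2_{\mathrm{ped}}, \qquad
c^2_{\mathrm{couple}} = \frac{\sigma^2_c}{V_P}
```

In this notation, the reported average result corresponds to h^2_SNP and h^2_ped being roughly equal, with c^2_couple close to 0.11.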
Genome-wide association studies can be applied to identify useful SNPs associated with complex traits. Furthermore, regional genomic mapping can be used to estimate regional variance and clarify the genomic relationships within and outside regions but has not previously been applied to milk traits in cattle. We applied both single SNP analysis and regional genomic mapping to investigate SNPs or regions associated with milk yield traits in dairy cattle. The de-regressed breeding values of three traits, total yield (kg) of milk (MLK), fat (FAT), and protein (PRT) in 305 days, from 2,590 Holstein sires in Japan were analyzed. All sires were genotyped with 40,646 single-nucleotide polymorphism (SNP) markers. A genome-wide significant region (P < 0.01) common to all three traits was identified by regional genomic mapping on chromosome (BTA) 14. In contrast, single SNP analysis identified significant SNPs only for MLK and FAT (P < 0.01), but not PRT in the same region. Regional genomic mapping revealed an additional significant region (P < 0.01) for FAT on BTA5 that was not identified by single SNP analysis. The additive whole-genomic effects estimated in the regional genomic mapping analysis for the three traits were positively correlated with one another (0.830-0.924). However, the regional genomic effects obtained by using a window size of 20 SNPs for FAT on BTA14 were negatively correlated (P < 0.01) with the regional genomic effect for MLK (-0.940) and PRT (-0.878). The BTA14 regional effect for FAT also showed significant negative correlations (P < 0.01) with the whole genomic effects for MLK (-0.153), FAT (-0.172), and PRT (-0.181). These negative genomic correlations between loci are consistent with the negative linkage disequilibrium expected for traits under directional selection. Such antagonistic correlations may hamper the fixation of the FAT increasing alleles on BTA14.
In summary, regional genomic mapping found more regions associated with milk production traits than did single SNP analysis. In addition, the existence of non-zero covariances between regional and whole genomic effects may influence the detection of regional effects, and antagonistic correlations could hamper the fixation of major genes under intensive selection.