The rising prevalence of childhood obesity has been postulated as an explanation for the increasing rate of individuals diagnosed with type 1 diabetes (T1D). In this study, we use Mendelian randomization (MR) to provide evidence that childhood body size has an effect on T1D risk (OR = 2.05 per change in body size category, 95% CI = 1.20 to 3.50, P = 0.008), which remains after accounting for body size at birth and during adulthood using multivariable MR (OR = 2.32, 95% CI = 1.21 to 4.42, P = 0.013). We validate this direct effect of childhood body size using data from a large-scale T1D meta-analysis based on n = 15,573 cases and n = 158,408 controls (OR = 1.94, 95% CI = 1.21 to 3.12, P = 0.006). We also provide evidence that childhood body size influences risk of asthma, eczema and hypothyroidism, although multivariable MR suggested that these effects are mediated by body size in later life. Our findings support a causal role for higher childhood body size on risk of being diagnosed with T1D, whereas its influence on the other immune-associated diseases is likely explained by a long-term effect of remaining overweight for many years over the lifecourse.
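As a rough consistency check on the first estimate quoted above (OR = 2.05, 95% CI = 1.20 to 3.50), the standard error, z-score and two-sided P value can be recovered from an odds ratio and its confidence interval. The sketch below assumes a normal approximation on the log-odds scale; the function name `p_from_or_ci` is hypothetical and is not taken from the study's analysis code.

```python
import math

def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate two-sided P value from an OR and its 95% CI.

    Assumes the estimate is normally distributed on the log-odds scale,
    so the CI half-width equals z_crit standard errors.
    """
    beta = math.log(odds_ratio)                      # log odds ratio
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))             # two-sided normal tail
    return beta, se, z, p

# Univariable MR estimate for childhood body size on T1D risk
beta, se, z, p = p_from_or_ci(2.05, 1.20, 3.50)
print(f"beta={beta:.3f} se={se:.3f} z={z:.2f} p={p:.4f}")
```

Under these assumptions the recovered P value lands close to the reported 0.008; small differences are expected because the published odds ratio and interval are rounded.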
New sequencing technologies allow genomic variation to be surveyed in much greater detail than previously possible. While detailed analysis of a single individual typically requires deep sequencing, when many individuals are sequenced it is possible to combine shallow sequence data across individuals to generate accurate calls in shared stretches of chromosome. Here, we show that, as progressively larger numbers of individuals are sequenced, increasingly accurate genotype calls can be generated for a given sequence depth. We evaluate the implications of low-coverage sequencing for complex trait association studies. We systematically compare study designs based on genotyping of tagSNPs, sequencing of many individuals at depths ranging between 2× and 30×, and imputation of variants discovered by sequencing a subset of individuals into the remainder of the sample. We show that sequencing many individuals at low depth is an attractive strategy for studies of complex trait genetics. For example, for disease-associated variants with frequency >0.2%, sequencing 3000 individuals at 4× depth provides similar power to deep sequencing of >2000 individuals at 30× depth but requires only ~20% of the sequencing effort. We also show low-coverage sequencing can be used to build a reference panel that can drive imputation into additional samples to increase power further. We provide guidance for investigators wishing to combine results from sequenced, genotyped, and imputed samples.
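The "~20% of the sequencing effort" figure follows from simple arithmetic if effort is taken to be the number of individuals multiplied by the per-individual depth. The sketch below makes that assumption explicit; `sequencing_effort` is a hypothetical helper, and real costs also depend on library preparation and other per-sample overheads that are ignored here.

```python
def sequencing_effort(n_individuals, depth):
    """Total effort in genome-equivalents (individuals x depth); an assumed cost model."""
    return n_individuals * depth

low_coverage = sequencing_effort(3000, 4)    # 3000 individuals at 4x
deep = sequencing_effort(2000, 30)           # 2000 individuals at 30x
ratio = low_coverage / deep

print(f"{low_coverage} vs {deep} genome-equivalents, ratio = {ratio:.0%}")
```

Because the abstract compares against more than 2000 deeply sequenced individuals, the 20% obtained with exactly 2000 is an upper bound on the ratio under this cost model.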
We developed a novel software package, XCAVATOR, for the identification of genomic regions involved in copy number variants/alterations (CNVs/CNAs) from short- and long-read whole-genome sequencing experiments.
Using simulated and real datasets, we showed that our tool, which is based on a read-count approach, can predict the boundaries and the absolute number of DNA copies of CNVs/CNAs at high resolution. To demonstrate the power of our software, we applied it to the analysis of Illumina and Pacific Biosciences data and compared its performance with that of ten other state-of-the-art tools.
All the analyses we performed demonstrate that XCAVATOR can detect germline and somatic CNVs/CNAs and outperforms all the other tools we compared it with. XCAVATOR is freely available at http://sourceforge.net/projects/xcavator/.
DNA sequencing identifies common and rare genetic variants for association studies, but studies typically focus on variants in nuclear DNA and ignore the mitochondrial genome. In fact, analyzing variants in mitochondrial DNA (mtDNA) sequences presents special problems, which we resolve here with a general solution for the analysis of mtDNA in next-generation sequencing studies. The new program package comprises 1) an algorithm designed to identify mtDNA variants (i.e., homoplasmies and heteroplasmies), incorporating sequencing error rates at each base in a likelihood calculation and allowing allele fractions at a variant site to differ across individuals; and 2) an estimation of mtDNA copy number in a cell directly from whole-genome sequencing data. We also apply the methods to DNA sequence from lymphocytes of ~2,000 SardiNIA Project participants. As expected, mothers and offspring share all homoplasmies but a lesser proportion of heteroplasmies. Both homoplasmies and heteroplasmies show 5-fold higher transition/transversion ratios than variants in nuclear DNA. Also, heteroplasmy increases with age, though on average only ~1 heteroplasmy reaches the 4% level between ages 20 and 90. In addition, we find that mtDNA copy number averages ~110 copies/lymphocyte and is ~54% heritable, implying substantial genetic regulation of the level of mtDNA. Copy numbers also decrease modestly but significantly with age, and females on average have significantly more copies than males. The mtDNA copy numbers are significantly associated with waist circumference (p-value = 0.0031) and waist-hip ratio (p-value = 2.4×10⁻⁵), but not with body mass index, indicating an association with central fat distribution. To our knowledge, this is the largest population analysis to date of mtDNA dynamics, revealing the age-imposed increase in heteroplasmy, the relatively high heritability of copy number, and the association of copy number with metabolic traits.
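The abstract does not spell out how copy number is estimated from whole-genome sequencing data. One commonly used approach, shown here purely as an assumption and not as the authors' method, scales the ratio of mitochondrial to autosomal read depth by two, since a diploid cell carries two copies of each autosome. The function name and the depth values are hypothetical.

```python
def estimate_mtdna_copy_number(mt_mean_depth, autosomal_mean_depth):
    """Assumed estimator: copies per cell = 2 * (mtDNA depth / autosomal depth).

    The factor 2 reflects the two autosomal copies per diploid cell.
    This is an illustrative estimator, not the one described in the paper.
    """
    if autosomal_mean_depth <= 0:
        raise ValueError("autosomal depth must be positive")
    return 2.0 * mt_mean_depth / autosomal_mean_depth

# Hypothetical sample: 4x autosomal coverage, 220x mitochondrial coverage
copies = estimate_mtdna_copy_number(mt_mean_depth=220.0, autosomal_mean_depth=4.0)
print(f"estimated mtDNA copies per cell: {copies:.0f}")
```

The example depths were chosen so that the estimate matches the ~110 copies per lymphocyte reported above; they are not measurements from the study.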
Adult height is one of the earliest putative examples of polygenic adaptation in humans. However, this conclusion was recently challenged because residual uncorrected stratification from large-scale consortium studies was considered responsible for the previously noted genetic difference. It thus remains an open question whether height loci exhibit signals of polygenic adaptation in any human population. We re-examined this question, focusing on one of the shortest European populations, the Sardinians, in addition to mainland European populations. We utilized height-associated loci from the Biobank Japan (BBJ) dataset to further alleviate concerns of biased ascertainment of GWAS loci and showed that the Sardinians remain significantly shorter than expected under neutrality (∼0.22 standard deviation shorter than Utah residents with ancestry from northern and western Europe [CEU] on the basis of polygenic height scores, p = 3.89 × 10⁻⁴). We also found the trajectory of polygenic height scores between the Sardinian and the British populations diverged over at least the last 10,000 years (p = 0.0082), consistent with a signature of polygenic adaptation driven primarily by the Sardinian population. Although the polygenic score-based analysis showed a much subtler signature in mainland European populations, we found a clear and robust adaptive signature in the UK population by using a haplotype-based statistic, the trait singleton density score (tSDS), driven by the height-increasing alleles (p = 9.1 × 10⁻⁴). In summary, by ascertaining height loci in a distant East Asian population, we further supported the evidence of polygenic adaptation at height-associated loci among the Sardinians. In mainland Europeans, the adaptive signature was detected in haplotype-based analysis but not in polygenic score-based analysis.
Aims/hypothesis
Given the potential shared aetiology between type 1 and type 2 diabetes, we aimed to identify any genetic regions associated with both diseases. For associations where there is a shared signal and the allele that increases risk to one disease also increases risk to the other, inference about shared aetiology could be made, with the potential to develop therapeutic strategies to treat or prevent both diseases simultaneously. Alternatively, if a genetic signal co-localises with divergent effect directions, it could provide valuable biological insight into how the association affects the two diseases differently.
Methods
Using publicly available type 2 diabetes summary statistics from a genome-wide association study (GWAS) meta-analysis of European ancestry individuals (74,124 cases and 824,006 controls) and type 1 diabetes GWAS summary statistics from a meta-analysis of studies on individuals from the UK and Sardinia (7467 cases and 10,218 controls), we identified all regions of 0.5 Mb that contained variants associated with both diseases (false discovery rate <0.01). In each region, we performed forward stepwise logistic regression to identify independent association signals, then examined co-localisation of each type 1 diabetes signal with each type 2 diabetes signal using coloc. Any association with a co-localisation posterior probability of ≥0.9 was considered a genuine shared association with both diseases.
Results
Of the 81 association signals from 42 genetic regions that showed association with both type 1 and type 2 diabetes, four association signals co-localised between both diseases (posterior probability ≥0.9): (1) chromosome 16q23.1, near CTRB1/BCAR1, which has been previously identified; (2) chromosome 11p15.5, near the INS gene; (3) chromosome 4p16.3, near TMEM129; and (4) chromosome 1p31.3, near PGM1. In each of these regions, the effect of genetic variants on type 1 diabetes was in the opposite direction to the effect on type 2 diabetes. Use of additional datasets also supported the previously identified co-localisation on chromosome 9p24.2, near the GLIS3 gene, in this case with a concordant direction of effect.
Conclusions/interpretation
Four of five association signals that co-localise between type 1 diabetes and type 2 diabetes are in opposite directions, suggesting a complex genetic relationship between the two diseases.
We report sequencing-based whole-genome association analyses to evaluate the impact of rare and founder variants on stature in 6,307 individuals on the island of Sardinia. We identify two variants with large effects. One variant, which introduces a stop codon in the GHR gene, is relatively frequent in Sardinia (0.87% versus <0.01% elsewhere) and in the homozygous state causes Laron syndrome involving short stature. We find that this variant reduces height in heterozygotes by an average of 4.2 cm (-0.64 s.d.). The other variant, in the imprinted KCNQ1 gene (minor allele frequency (MAF) = 7.7% in Sardinia versus <1% elsewhere) reduces height by an average of 1.83 cm (-0.31 s.d.) when maternally inherited. Additionally, polygenic scores indicate that known height-decreasing alleles are at systematically higher frequencies in Sardinians than would be expected by genetic drift. The findings are consistent with selection for shorter stature in Sardinia and a suggestive human example of the proposed 'island effect' reducing the size of large mammals.
Identifying the genes that influence levels of pro-inflammatory molecules can help to elucidate the mechanisms underlying this process. We first conducted a two-stage genome-wide association scan (GWAS) for the key inflammatory biomarkers Interleukin-6 (IL-6), the general measure of inflammation erythrocyte sedimentation rate (ESR), monocyte chemotactic protein-1 (MCP-1), and high-sensitivity C-reactive protein (hsCRP) in a large cohort of individuals from the founder population of Sardinia. By analysing 731,213 autosomal or X chromosome SNPs and an additional ∼1.9 million imputed variants in 4,694 individuals, we identified several SNPs associated with the selected quantitative trait loci (QTLs) and replicated all the top signals in an independent sample of 1,392 individuals from the same population. Next, to increase power to detect and resolve associations, we further genotyped the whole cohort (6,145 individuals) for 293,875 variants included on the ImmunoChip and MetaboChip custom arrays. Overall, our combined approach led to the identification of 9 genome-wide significant novel independent signals (5 of which were identified only with the custom arrays) and provided confirmatory evidence for an additional 7. Novel signals include: for IL-6, in the ABO gene (rs657152, p = 2.13×10⁻²⁹); for ESR, at the HBB (rs4910472, p = 2.31×10⁻¹¹) and UNC119B/SPPL3 (rs11829037, p = 8.91×10⁻¹⁰) loci; for MCP-1, near its receptor CCR2 (rs17141006, p = 7.53×10⁻¹³) and in CADM3 (rs3026968, p = 7.63×10⁻¹³); for hsCRP, within the CRP gene (rs3093077, p = 5.73×10⁻²¹), near DARC (rs3845624, p = 1.43×10⁻¹⁰), UNC119B/SPPL3 (rs11829037, p = 1.50×10⁻¹⁴), and ICOSLG/AIRE (rs113459440, p = 1.54×10⁻⁸) loci. Confirmatory evidence was found for IL-6 in the IL-6R gene (rs4129267); for ESR at CR1 (rs12567990) and TMEM57 (rs10903129); for MCP-1 at DARC (rs12075); and for hsCRP at CRP (rs1205), HNF1A (rs225918), and APOC-I (rs4420638).
Our results improve the current knowledge of genetic variants underlying inflammation and provide novel clues for the understanding of the molecular mechanisms regulating this complex process.
Complex trait genome-wide association studies (GWAS) provide an efficient strategy for evaluating large numbers of common variants in large numbers of individuals and for identifying trait-associated variants. Nevertheless, GWAS often leave much of the trait heritability unexplained. We hypothesized that some of this unexplained heritability might be due to common and rare variants that reside in GWAS identified loci but lack appropriate proxies in modern genotyping arrays. To assess this hypothesis, we re-examined 7 genes (APOE, APOC1, APOC2, SORT1, LDLR, APOB, and PCSK9) in 5 loci associated with low-density lipoprotein cholesterol (LDL-C) in multiple GWAS. For each gene, we first catalogued genetic variation by re-sequencing 256 Sardinian individuals with extreme LDL-C values. Next, we genotyped variants identified by us and by the 1000 Genomes Project (totaling 3,277 SNPs) in 5,524 volunteers. We found that in one locus (PCSK9) the GWAS signal could be explained by a previously described low-frequency variant and that in three loci (PCSK9, APOE, and LDLR) there were additional variants independently associated with LDL-C, including a novel and rare LDLR variant that seems specific to Sardinians. Overall, this more detailed assessment of SNP variation in these loci increased estimates of the heritability of LDL-C accounted for by these genes from 3.1% to 6.5%. All association signals and the heritability estimates were successfully confirmed in a sample of ∼10,000 Finnish and Norwegian individuals. Our results thus suggest that focusing on variants accessible via GWAS can lead to clear underestimates of the trait heritability explained by a set of loci. Further, our results suggest that, as prelude to large-scale sequencing efforts, targeted re-sequencing efforts paired with large-scale genotyping will increase estimates of complex trait heritability explained by known loci.
Family samples, which can be enriched for rare causal variants by focusing on families with multiple extreme individuals and which facilitate detection of de novo mutation events, provide an attractive resource for next-generation sequencing studies. Here, we describe, implement, and evaluate a likelihood-based framework for analysis of next generation sequence data in family samples. Our framework is able to identify variant sites accurately and to assign individual genotypes, and can handle de novo mutation events, increasing the sensitivity and specificity of variant calling and de novo mutation detection. Through simulations we show explicit modeling of family relationships is especially useful for analyses of low-frequency variants and that genotype accuracy increases with the number of individuals sequenced per family. Compared with the standard approach of ignoring relatedness, our methods identify and accurately genotype more variants, and have high specificity for detecting de novo mutation events. The improvement in accuracy using our methods over the standard approach is particularly pronounced for low-frequency variants. Furthermore the family-aware calling framework dramatically reduces Mendelian inconsistencies and is beneficial for family-based analysis. We hope our framework and software will facilitate continuing efforts to identify genetic factors underlying human diseases.
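To illustrate why modeling relatedness helps at low depth, the following toy sketch jointly calls a mother-father-child trio at a single biallelic site and compares the result with calling the child alone. It is not the authors' framework: the binomial read model, the fixed error rate, the Hardy-Weinberg prior on parents, strict Mendelian transmission (no de novo mutation term), and all function names and read counts are assumptions made for illustration.

```python
from itertools import product
from math import comb

def genotype_likelihood(n_ref, n_alt, g, err=0.01):
    """P(reads | genotype g), g = number of alt alleles; binomial read model."""
    p_alt = (err, 0.5, 1.0 - err)[g]
    return comb(n_ref + n_alt, n_alt) * p_alt**n_alt * (1.0 - p_alt)**n_ref

def hwe_prior(g, freq):
    """Hardy-Weinberg prior for an unrelated individual."""
    return ((1 - freq)**2, 2 * freq * (1 - freq), freq**2)[g]

def child_prior(gc, gm, gf):
    """Mendelian transmission probability; de novo mutation is ignored here."""
    pm, pf = gm / 2, gf / 2          # chance each parent transmits the alt allele
    return ((1 - pm) * (1 - pf), pm * (1 - pf) + (1 - pm) * pf, pm * pf)[gc]

def call_single(reads, freq=0.01):
    """Most probable genotype for one individual, ignoring relatives."""
    return max(range(3), key=lambda g: hwe_prior(g, freq) * genotype_likelihood(*reads, g))

def call_trio(mother, father, child, freq=0.01):
    """Most probable (mother, father, child) genotypes and their joint posterior."""
    post = {}
    for gm, gf, gc in product(range(3), repeat=3):
        prior = hwe_prior(gm, freq) * hwe_prior(gf, freq) * child_prior(gc, gm, gf)
        lik = (genotype_likelihood(*mother, gm)
               * genotype_likelihood(*father, gf)
               * genotype_likelihood(*child, gc))
        post[(gm, gf, gc)] = prior * lik
    best = max(post, key=post.get)
    return best, post[best] / sum(post.values())

# Hypothetical (ref, alt) read counts: the child is covered by only 3 reads
mother, father, child = (20, 0), (10, 10), (2, 1)
single_call = call_single(child)
trio_call, trio_posterior = call_trio(mother, father, child)
print("child alone:", single_call, "| trio:", trio_call, f"posterior={trio_posterior:.2f}")
```

With these made-up counts, the child called alone is assigned the reference homozygote because a single alternate read cannot overcome the low-frequency prior, whereas the joint call uses the clearly heterozygous father to favor a heterozygous child. A fuller model would add a small de novo mutation probability to `child_prior` rather than setting Mendelian-inconsistent configurations to zero.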