Despite thousands of genetic loci identified to date, a large proportion of genetic variation predisposing to complex disease and traits remains unaccounted for. Advances in sequencing technology enable focused explorations on the contribution of low-frequency and rare variants to human traits. Here we review experimental approaches and current knowledge on the contribution of these genetic variants in complex disease and discuss challenges and opportunities for personalised medicine.
Genetic architecture describes the characteristics of genetic variation that are responsible for heritable phenotypic variability. It depends on the number of genetic variants affecting a trait, their frequencies in the population, the magnitude of their effects and their interactions with each other and the environment. Defining the genetic architecture of a complex trait or disease is central to the scientific and clinical goals of human genetics, which are to understand disease aetiology and aid in disease screening, diagnosis, prognosis and therapy. Recent technological advances have enabled genome-wide association studies and emerging next-generation sequencing studies to begin to decipher the nature of the heritable contribution to traits and disease. Here, we describe the types of genetic architecture that have been observed, how architecture can be measured and why an improved understanding of genetic architecture is central to future advances in the field.
Glycated hemoglobin A1c (HbA1c) indicates the percentage of total hemoglobin that is bound by glucose, produced from the nonenzymatic chemical modification by glucose of hemoglobin molecules carried in erythrocytes. HbA1c represents a surrogate marker of average blood glucose concentration over the previous 8 to 12 weeks, or the average lifespan of the erythrocyte, and thus represents a more stable indicator of glycemic status compared with fasting glucose. HbA1c levels are genetically determined, with heritability of 47% to 59%. Over the past few years, inroads into understanding genetic predisposition by glycemic and nonglycemic factors have been achieved through genome-wide analyses. Here I review current research aimed at discovering genetic determinants of HbA1c levels, discussing insights into biologic factors influencing variability in the general and diabetic population, and across different ethnicities. Furthermore, I discuss briefly the relevance of findings for diabetes monitoring and diagnosis.
Epigenomic studies that use next generation sequencing experiments typically rely on the alignment of reads to a reference sequence. However, because of genetic diversity and the diploid nature of the human genome, we hypothesize that using a generic reference could lead to incorrectly mapped reads and bias downstream results.
We show that accounting for genetic variation using a modified reference genome or a de novo assembled genome can alter histone H3K4me1 and H3K27ac ChIP-seq peak calls either by creating new personal peaks or by the loss of reference peaks. Using permissive cutoffs, modified reference genomes are found to alter approximately 1% of peak calls while de novo assembled genomes alter up to 5% of peaks. We also show statistically significant differences in the number of reads observed in regions associated with the new, altered, and unchanged peaks. We report that short insertions and deletions (indels), followed by single nucleotide variants (SNVs), have the highest probability of modifying peak calls. We show that using a graph personalized genome represents a reasonable compromise between modified reference genomes and de novo assembled genomes. We demonstrate that altered peaks have a genomic distribution typical of other peaks.
Analyzing epigenomic datasets with personalized and graph genomes allows the recovery of new peaks enriched for indels and SNVs. These altered peaks are more likely to differ between individuals and, as such, could be relevant in the study of various human phenotypes.
Many existing cohorts contain a range of relatedness between genotyped individuals, either by design or by chance. Haplotype estimation in such cohorts is a central step in many downstream analyses. Using genotypes from six cohorts from isolated populations and two cohorts from non-isolated populations, we have investigated the performance of different phasing methods designed for nominally 'unrelated' individuals. We find that SHAPEIT2 produces much lower switch error rates in all cohorts compared to other methods, including those designed specifically for isolated populations. In particular, when a large amount of IBD sharing is present, SHAPEIT2 infers close to perfect haplotypes. Based on these results we have developed a general strategy for phasing cohorts with any level of implicit or explicit relatedness between individuals. First, SHAPEIT2 is run ignoring all explicit family information. We then apply a novel HMM method (duoHMM) to combine the SHAPEIT2 haplotypes with any family information to infer the inheritance pattern of each meiosis at all sites across each chromosome. This allows the correction of switch errors and the detection of recombination events and genotyping errors. We show that the method detects numbers of recombination events that align very well with expectations based on genetic maps, and that it infers far fewer spurious recombination events than Merlin. The method can also detect genotyping errors and infer recombination events in otherwise uninformative families, such as trios and duos. The detected recombination events can be used in association scans for recombination phenotypes. The method provides a simple and unified approach to haplotype estimation that will be of interest to researchers in the fields of human, animal and plant genetics.
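The switch error rate used above to compare phasing methods can be illustrated with a minimal sketch. It assumes biallelic sites coded 0/1, a known true phase for one individual, and no genotyping errors; the function name `switch_error_rate` and the toy haplotypes are hypothetical and are not taken from SHAPEIT2 or duoHMM.

```python
from typing import Sequence, Tuple

# A phased individual: two haplotypes of equal length, alleles coded 0/1.
Haplotypes = Tuple[Sequence[int], Sequence[int]]


def switch_error_rate(truth: Haplotypes, inferred: Haplotypes) -> float:
    """Fraction of consecutive heterozygous-site pairs whose relative
    phase in `inferred` disagrees with `truth`."""
    t0, t1 = truth
    i0, i1 = inferred
    if not (len(t0) == len(t1) == len(i0) == len(i1)):
        raise ValueError("all haplotypes must have the same length")

    # Only heterozygous sites carry phase information.
    het = [k for k in range(len(t0)) if t0[k] != t1[k]]
    for k in het:
        # Assumption of this sketch: inferred genotypes match the truth.
        if {i0[k], i1[k]} != {t0[k], t1[k]}:
            raise ValueError(f"genotype mismatch at site {k}")

    if len(het) < 2:
        return 0.0  # no pair of sites to compare

    # Orientation per site: does inferred haplotype 0 copy true haplotype 0?
    same = [i0[k] == t0[k] for k in het]
    # A switch is a change of orientation between adjacent het sites.
    switches = sum(1 for a, b in zip(same, same[1:]) if a != b)
    return switches / (len(het) - 1)


# Hypothetical toy data: four heterozygous sites and one homozygous site.
true_haps = ([0, 1, 0, 1, 1], [1, 0, 1, 0, 1])
one_switch = ([0, 1, 1, 0, 1], [1, 0, 0, 1, 1])
rate = switch_error_rate(true_haps, one_switch)
```

Because only the relative orientation between adjacent heterozygous sites is scored, swapping the two inferred haplotypes wholesale counts as zero switches, which matches the usual convention that haplotype labels are arbitrary.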
Sequence-based variation in gene expression is a key driver of disease risk. Common variants regulating expression in cis have been mapped in many expression quantitative trait locus (eQTL) studies, typically in single tissues from unrelated individuals. Here, we present a comprehensive analysis of gene expression across multiple tissues conducted in a large set of mono- and dizygotic twins that allows systematic dissection of genetic (cis and trans) and non-genetic effects on gene expression. Using identity-by-descent estimates, we show that at least 40% of the total heritable cis effect on expression cannot be accounted for by common cis variants, a finding that reveals the contribution of low-frequency and rare regulatory variants with respect to both transcriptional regulation and complex trait susceptibility. We show that a substantial proportion of gene expression heritability is trans to the structural gene, and we identify several replicating trans variants that act predominantly in a tissue-restricted manner and may regulate the transcription of many genes.
The production of megakaryocytes (MKs), the precursors of blood platelets, from human pluripotent stem cells (hPSCs) offers exciting clinical opportunities for transfusion medicine. Here we describe an original approach for the large-scale generation of MKs in chemically defined conditions using a forward programming strategy relying on the concurrent exogenous expression of three transcription factors: GATA1, FLI1 and TAL1. The forward programmed MKs proliferate and differentiate in culture for several months with MK purity over 90%, reaching up to 2 × 10⁵ mature MKs per input hPSC. Functional platelets are generated throughout the culture, allowing the prospective collection of several transfusion units from as few as 1 million starting hPSCs. The high cell purity and yield achieved by MK forward programming, combined with efficient cryopreservation and good manufacturing practice (GMP)-compatible culture, make this approach eminently suitable to both in vitro production of platelets for transfusion and basic research in MK and platelet biology.
We report the first genome-wide association study (GWAS) whose sample size (1,053 Swedish subjects) is sufficiently powered to detect genome-wide significance (p < 1.5 × 10⁻⁷) for polymorphisms that modestly alter therapeutic warfarin dose. The anticoagulant drug warfarin is widely prescribed for reducing the risk of stroke, thrombosis, pulmonary embolism, and coronary malfunction. However, Caucasians vary widely (20-fold) in the dose needed for therapeutic anticoagulation, and hence prescribed doses may be too low (risking serious illness) or too high (risking severe bleeding). Prior work established that approximately 30% of the dose variance is explained by single nucleotide polymorphisms (SNPs) in the warfarin drug target VKORC1 and another approximately 12% by two non-synonymous SNPs (*2, *3) in the cytochrome P450 warfarin-metabolizing gene CYP2C9. We initially tested each of 325,997 GWAS SNPs for association with warfarin dose by univariate regression and found the strongest statistical signals (p < 10⁻⁷⁸) at SNPs clustering near VKORC1 and the second lowest p-values (p < 10⁻³¹) emanating from CYP2C9. No other SNPs approached genome-wide significance. To enhance detection of weaker effects, we conducted multiple regression adjusting for known influences on warfarin dose (VKORC1, CYP2C9, age, gender) and identified a single SNP (rs2108622) with genome-wide significance (p = 8.3 × 10⁻¹⁰) that alters protein coding of the CYP4F2 gene. We confirmed this result in 588 additional Swedish patients (p < 0.0029) and, during our investigation, a second group provided independent confirmation from a scan of warfarin-metabolizing genes. We also thoroughly investigated copy number variations, haplotypes, and imputed SNPs, but found no additional highly significant warfarin associations.
We present power analysis of our GWAS that is generalizable to other studies, and conclude we had 80% power to detect genome-wide significance for common causative variants or markers explaining at least 1.5% of dose variance. These GWAS results provide further impetus for conducting large-scale trials assessing patient benefit from genotype-based forecasting of warfarin dose.
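The genome-wide threshold and the "proportion of dose variance explained" quoted above can be made concrete with a short sketch. It assumes the threshold arises from a Bonferroni correction of a 0.05 family-wise error rate over the 325,997 tested SNPs, which is consistent with the reported p < 1.5 × 10⁻⁷ but is not stated in the abstract. The helper names and the toy genotype and dose values are hypothetical.

```python
from typing import Sequence


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise error at alpha."""
    return alpha / n_tests


def variance_explained(genotypes: Sequence[float], doses: Sequence[float]) -> float:
    """R^2 of a univariate linear regression of dose on genotype,
    which equals the squared Pearson correlation."""
    n = len(genotypes)
    if n != len(doses) or n < 2:
        raise ValueError("need two equal-length samples of size >= 2")
    mean_g = sum(genotypes) / n
    mean_d = sum(doses) / n
    cov = sum((g - mean_g) * (d - mean_d) for g, d in zip(genotypes, doses))
    var_g = sum((g - mean_g) ** 2 for g in genotypes)
    var_d = sum((d - mean_d) ** 2 for d in doses)
    if var_g == 0 or var_d == 0:
        raise ValueError("genotype and dose must both vary")
    return cov * cov / (var_g * var_d)


# Assumed Bonferroni setting: 0.05 spread over the 325,997 SNPs tested.
threshold = bonferroni_threshold(0.05, 325_997)

# Hypothetical data: minor-allele counts (0/1/2) and weekly doses in mg.
toy_genotypes = [0, 0, 1, 1, 2, 2]
toy_doses = [42.0, 38.0, 33.0, 30.0, 24.0, 21.0]
r_squared = variance_explained(toy_genotypes, toy_doses)
```

Under this reading, a marker explaining 1.5% of dose variance corresponds to an R² of 0.015 from such a regression; the study's own power calculation is not reproduced here.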
Epigenetic modifications such as DNA methylation play a key role in gene regulation and disease susceptibility. However, little is known about the genome-wide frequency, localization, and function of methylation variation and how it is regulated by genetic and environmental factors. We utilized the Multiple Tissue Human Expression Resource (MuTHER) and generated Illumina 450K adipose methylome data from 648 twins. We found that individual CpGs had low variance and that variability was suppressed in promoters. We noted that DNA methylation variation was highly heritable (median h² = 0.34) and that shared environmental effects correlated with metabolic phenotype-associated CpGs. Analysis of methylation quantitative-trait loci (metQTL) revealed that 28% of CpGs were associated with nearby SNPs, and when overlapping them with adipose expression quantitative-trait loci (eQTL) from the same individuals, we found that 6% of the loci played a role in regulating both gene expression and DNA methylation. These associations were bidirectional, but there were pronounced negative associations for promoter CpGs. Integration of metQTL with adipose reference epigenomes and disease associations revealed significant enrichment of metQTL overlapping metabolic-trait or disease loci in enhancers (the strongest effects were for high-density lipoprotein cholesterol and body mass index [BMI]). We followed up with the BMI SNP rs713586, a cg01884057 metQTL that overlaps an enhancer upstream of ADCY3, and used bisulphite sequencing to refine this region. Our results showed that adipose DNA methylation is largely invariant across the population yet sequence dependent, and that incorporating maps of regulatory elements aids in linking CpG variation to gene regulation and disease risk in a tissue-dependent manner.
Using a nontargeted metabolomics approach of 447 fasting plasma metabolites, we searched for novel molecular markers that arise before and after hyperglycemia in a large population-based cohort of 2,204 females (115 type 2 diabetic [T2D] case subjects, 192 individuals with impaired fasting glucose [IFG], and 1,897 control subjects) from TwinsUK. Forty-two metabolites from three major fuel sources (carbohydrates, lipids, and proteins) were found to significantly correlate with T2D after adjusting for multiple testing; of these, 22 were previously reported as associated with T2D or insulin resistance. Fourteen metabolites were found to be associated with IFG. Among the metabolites identified, the branched-chain keto-acid metabolite 3-methyl-2-oxovalerate was the strongest predictive biomarker for IFG after glucose (odds ratio [OR] 1.65 [95% CI 1.39-1.95], P = 8.46 × 10⁻⁹) and was moderately heritable (h² = 0.20). The association was replicated in an independent population (n = 720, OR 1.68 [1.34-2.11], P = 6.52 × 10⁻⁶) and validated in 189 twins with urine metabolomics taken at the same time as plasma (OR 1.87 [1.27-2.75], P = 1 × 10⁻³). Results confirm an important role for catabolism of branched-chain amino acids in T2D and IFG. In conclusion, this T2D-IFG biomarker study has surveyed the broadest panel of nontargeted metabolites to date, revealing both novel and known associated metabolites and providing potential novel targets for clinical prediction and a deeper understanding of causal mechanisms.
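As a rough consistency check on the odds ratios quoted above, a reported OR and its 95% confidence interval can be converted back into an approximate standard error, z statistic and two-sided P value. The sketch assumes a Wald-type interval that is symmetric on the log-odds scale; the abstract does not state how the intervals were computed, and the function name `wald_from_ci` is hypothetical. Because the published bounds are rounded, the recovered P value should only be expected to fall in the same order of magnitude as the reported one.

```python
import math
from typing import Tuple


def wald_from_ci(odds_ratio: float, lower: float, upper: float,
                 z_crit: float = 1.959964) -> Tuple[float, float, float, float]:
    """Recover (log-OR, standard error, z, two-sided P) from an OR and its CI.

    Assumes the interval is exp(beta +/- z_crit * se), i.e. a Wald interval.
    """
    if not (0 < lower < odds_ratio < upper):
        raise ValueError("expected 0 < lower < odds_ratio < upper")
    beta = math.log(odds_ratio)
    # The interval width on the log scale is 2 * z_crit * se.
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = beta / se
    # Two-sided tail probability of a standard normal.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


# Discovery estimate for 3-methyl-2-oxovalerate and IFG, as quoted in the text.
beta, se, z, p = wald_from_ci(1.65, 1.39, 1.95)
```

The same call with the replication values (1.68, 1.34, 2.11) or the urine values (1.87, 1.27, 2.75) gives the corresponding approximate statistics.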