Gallstones are responsible for one of the most common diseases in the Western world and are commonly treated with cholecystectomy. We perform a meta-analysis of two genome-wide association studies of gallstone disease in Iceland and the UK, totaling 27,174 cases and 736,838 controls, uncovering 21 novel gallstone-associated variants at 20 loci. Two distinct low-frequency missense variants in SLC10A2, encoding the apical sodium-dependent bile acid transporter (ASBT), associate with an increased risk of gallstone disease (Pro290Ser: OR = 1.36 [1.25-1.49], P = 2.1 × 10^[…], MAF = 1%; Val98Ile: OR = 1.15 [1.10-1.20], P = 1.8 × 10^[…], MAF = 4%). We demonstrate that lower bile acid transport by ASBT is accompanied by greater risk of gallstone disease and highlight the role of the intestinal compartment of the enterohepatic circulation of bile acids in gallstone disease susceptibility. Additionally, two low-frequency missense variants in SERPINA1 and HNF4A and 17 common variants represent novel associations with gallstone disease.
Assessing thyroid cancer risk using polygenic risk scores Liyanarachchi, Sandya; Gudmundsson, Julius; Ferkingstad, Egil ...
Proceedings of the National Academy of Sciences - PNAS, 03/2020, Volume 117, Issue 11
Journal Article
Peer reviewed
Open access
Genome-wide association studies (GWASs) have identified at least 10 single-nucleotide polymorphisms (SNPs) associated with papillary thyroid cancer (PTC) risk. Most of these SNPs are common variants with small to moderate effect sizes. Here we assessed the combined genetic effects of these variants on PTC risk by using summarized GWAS results to build polygenic risk score (PRS) models in three PTC study groups from Ohio (1,544 patients and 1,593 controls), Iceland (723 patients and 129,556 controls), and the United Kingdom (534 patients and 407,945 controls). A PRS based on the 10 established PTC SNPs showed a stronger predictive power compared with the clinical factors model, with a minimum increase of area under the receiver-operating curve of 5.4 percentage points (P ≤ 1.0 × 10−9). Adding an extended PRS based on 592,475 common variants did not significantly improve the prediction power compared with the 10-SNP model, suggesting that most of the remaining undiscovered genetic risk in thyroid cancer is due to rare, moderate- to high-penetrance variants rather than to common low-penetrance variants. Based on the 10-SNP PRS, individuals in the top decile group of PRSs have a close to sevenfold greater risk (95% CI, 5.4–8.8) compared with the bottom decile group. In conclusion, PRSs based on a small number of common germline variants emphasize the importance of heritable low-penetrance markers in PTC.
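As a rough illustration of the kind of score described above, a polygenic risk score is commonly computed as a weighted sum of risk-allele counts, with the log odds ratio of each SNP as its weight. The sketch below assumes that conventional form; the function names, the example odds ratios, and the rank-based decile assignment are hypothetical and are not taken from the study.

```python
import math


def polygenic_risk_score(dosages, odds_ratios):
    """Weighted sum of risk-allele counts (0, 1 or 2 per SNP).

    Each weight is the natural log of that SNP's odds ratio. This is the
    conventional additive PRS form, assumed here for illustration.
    """
    if len(dosages) != len(odds_ratios):
        raise ValueError("dosages and odds_ratios must have the same length")
    return sum(d * math.log(o) for d, o in zip(dosages, odds_ratios))


def decile_groups(scores):
    """Assign each score a decile from 1 (lowest) to 10 (highest) by rank."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    deciles = [0] * n
    for rank, i in enumerate(order):
        deciles[i] = rank * 10 // n + 1
    return deciles


# Hypothetical per-SNP odds ratios and one person's risk-allele counts.
example_odds_ratios = [1.5, 2.0, 1.2]
example_dosages = [1, 2, 0]
example_score = polygenic_risk_score(example_dosages, example_odds_ratios)
```

Comparing outcome rates between the top and bottom groups returned by `decile_groups` is one simple way to express a relative risk between score extremes.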
Iron is essential for many biological functions and iron deficiency and overload have major health implications. We performed a meta-analysis of three genome-wide association studies from Iceland, the UK and Denmark of blood levels of ferritin (N = 246,139), total iron binding capacity (N = 135,430), iron (N = 163,511) and transferrin saturation (N = 131,471). We found 62 independent sequence variants associating with iron homeostasis parameters at 56 loci, including 46 novel loci. Variants at DUOX2, F5, SLC11A2 and TMPRSS6 associate with iron deficiency anemia, while variants at TF, HFE, TFR2 and TMPRSS6 associate with iron overload. A HBS1L-MYB intergenic region variant associates both with increased risk of iron overload and reduced risk of iron deficiency anemia. The DUOX2 missense variant is present in 14% of the population, associates with all iron homeostasis biomarkers, and increases the risk of iron deficiency anemia by 29%. The associations implicate proteins contributing to the main physiological processes involved in iron homeostasis: iron sensing and storage, inflammation, absorption of iron from the gut, iron recycling, erythropoiesis and bleeding/menstruation.
Predicting all-cause mortality risk is challenging and requires extensive medical data. Recently, large-scale proteomics datasets have proven useful for predicting health-related outcomes. Here, we use measurements of levels of 4,684 plasma proteins in 22,913 Icelanders to develop all-cause mortality predictors for both short- and long-term risk. The participants were 18-101 years old with a mean follow-up of 13.7 (SD 4.7) years. During the study period, 7,061 participants died. In survival prediction, our proposed predictor outperformed a predictor based on conventional mortality risk factors. In a group of 60-80-year-olds, we could identify the 5% at highest risk, of whom 88% died within ten years, and the 5% at lowest risk, of whom only 1% died. Furthermore, the predicted risk of death correlates with measures of frailty in an independent dataset. Our results show that the plasma proteome can be used to assess general health and estimate the risk of death.
Clustering is a popular technique for explorative analysis of data, as it can reveal subgroupings and similarities between data in an unsupervised manner. While clustering is routinely applied to gene expression data, there is a lack of appropriate general methodology for clustering of sequence-level genomic and epigenomic data, e.g. ChIP-based data. We here introduce a general methodology for clustering data sets of coordinates relative to a genome assembly, i.e. genomic tracks. By defining appropriate feature extraction approaches and similarity measures, we allow biologically meaningful clustering to be performed for genomic tracks using standard clustering algorithms. An implementation of the methodology is provided through a tool, ClusTrack, which allows fine-tuned clustering analyses to be specified through a web-based interface. We apply our methods to the clustering of occupancy of the H3K4me1 histone modification in samples from a range of different cell types. The majority of samples form meaningful subclusters, confirming that the definitions of features and similarity capture biological, rather than technical, variation between the genomic tracks. Input data and results are available, and can be reproduced, through a Galaxy Pages document at http://hyperbrowser.uio.no/hb/u/hb-superuser/p/clustrack. The clustering functionality is available as a Galaxy tool, under the menu option "Specialized analyzis of tracks", and the submenu option "Cluster tracks based on genome level similarity", at the Genomic HyperBrowser server: http://hyperbrowser.uio.no/hb/.
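The general idea of extracting features from genomic tracks, defining a similarity measure, and then applying a standard clustering step can be sketched as follows. This is only a minimal illustration under stated assumptions: the binned-coverage feature, the Jaccard similarity, the threshold-based single-linkage grouping, and all function names are hypothetical choices, not the feature definitions or algorithms used by ClusTrack.

```python
def bin_coverage(intervals, genome_length, bin_size):
    """Feature extraction: the set of fixed-size bins a track overlaps.

    Intervals are half-open (start, end) coordinates on one linear genome.
    """
    bins = set()
    for start, end in intervals:
        end = min(end, genome_length)
        if end <= start:
            continue  # skip empty or out-of-range intervals
        bins.update(range(start // bin_size, (end - 1) // bin_size + 1))
    return bins


def jaccard(a, b):
    """Similarity measure: shared bins divided by bins covered by either track."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_tracks(tracks, genome_length, bin_size, min_similarity):
    """Group tracks whose similarity reaches min_similarity (single linkage)."""
    names = list(tracks)
    features = {n: bin_coverage(tracks[n], genome_length, bin_size) for n in names}
    parent = {n: n for n in names}

    def find(n):
        # Union-find root lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if jaccard(features[a], features[b]) >= min_similarity:
                parent[find(a)] = find(b)

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical toy tracks on a 1,000 bp genome.
toy_tracks = {"a": [(0, 100)], "b": [(10, 110)], "c": [(900, 1000)]}
toy_clusters = cluster_tracks(toy_tracks, genome_length=1000, bin_size=100, min_similarity=0.5)
```

Swapping in a different feature (for example, base-pair coverage per bin) or a different similarity only changes the first two functions; the grouping step stays the same.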
Full text
Available for:
DOBA, IZUM, KILJ, NUK, PILJ, PNG, SAZU, SIK, UILJ, UKNU, UL, UM, UPUK
Hemoglobin is the essential oxygen-carrying molecule in humans and is regulated by cellular iron and oxygen sensing mechanisms. To search for novel variants associated with hemoglobin concentration, we performed genome-wide association studies of hemoglobin concentration using a combined set of 684,122 individuals from Iceland and the UK. Notably, we found seven novel variants, six rare coding and one common, at the ACO1 locus associating with either decreased or increased hemoglobin concentration. Of these variants, the missense Cys506Ser and the stop-gained Lys334Ter mutations are specific to eight and ten generation pedigrees, respectively, and have the two largest effects in the study (Effect = -1.61 SD, CI = [-1.98, -1.35]; Effect = 0.63 SD, CI = [0.36, 0.91]). We also find Cys506Ser to associate with increased risk of persistent anemia (OR = 17.1, P = 2 × 10^[…]). The strong bidirectional effects seen in this study implicate ACO1, a known iron sensing molecule, as a major homeostatic regulator of hemoglobin concentration.
The characteristic lobulated nuclear morphology of granulocytes is partially determined by the composition of nuclear envelope proteins. Abnormal nuclear morphology is primarily observed as an increased number of hypolobulated immature neutrophils, called band cells, during infection or in rare envelopathies like Pelger-Huët anomaly. To search for sequence variants affecting nuclear morphology of granulocytes, we performed a genome-wide association study using band neutrophil fraction from 88,101 Icelanders. We describe 13 sequence variants affecting band neutrophil fraction at nine loci. Five of the variants are at the Lamin B receptor (LBR) locus, encoding an inner nuclear membrane protein. Mutations in LBR are linked to Pelger-Huët anomaly. In addition, we identify cosegregation of a rare stop-gain sequence variant in LBR and Pelger-Huët anomaly in an Icelandic eight-generation pedigree, initially reported in 1963. Two of the other loci include genes which, like LBR, play a role in nuclear membrane function and integrity. These GWAS results highlight the importance of inner nuclear membrane proteins for neutrophil nuclear morphology.
The plasma proteome can help bridge the gap between the genome and diseases. Here we describe genome-wide association studies (GWASs) of plasma protein levels measured with 4,907 aptamers in 35,559 Icelanders. We found 18,084 associations between sequence variants and levels of proteins in plasma (protein quantitative trait loci; pQTL), of which 19% were with rare variants (minor allele frequency (MAF) < 1%). We tested plasma protein levels for association with 373 diseases and other traits and identified 257,490 associations. We integrated pQTL and genetic associations with diseases and other traits and found that 12% of 45,334 lead associations in the GWAS Catalog are with variants in high linkage disequilibrium with pQTL. We identified 938 genes encoding potential drug targets with variants that influence levels of possible biomarkers. Combining proteomics, genomics and transcriptomics, we provide a valuable resource that can be used to improve understanding of disease pathogenesis and to assist with drug discovery and development.
Transcription factors in disease-relevant pathways represent potential drug targets, by impacting a distinct set of pathways that may be modulated through gene regulation. The influence of transcription factors is typically studied on a per disease basis, and no current resources provide a global overview of the relations between transcription factors and disease. Furthermore, existing pipelines for related large-scale analysis are tailored for particular sources of input data, and there is a need for generic methodology for integrating complementary sources of genomic information.
We here present a large-scale analysis of multiple diseases versus multiple transcription factors, with a global map of over- and under-representation of 446 transcription factors in 1010 diseases. This map, referred to as the differential disease regulome, provides a first global statistical overview of the complex interrelationships between diseases, genes and controlling elements. Because of its very large size, the map is visualized using the Google map engine, which provides a range of detailed information in a dynamic presentation format. The analysis is achieved through a novel methodology that performs a pairwise, genome-wide comparison on the cartesian product of two distinct sets of annotation tracks, e.g. all combinations of one disease and one TF. The methodology was also used to generate additional maps from alternative data sets related to transcription and disease, as well as data sets related to Gene Ontology classification and histone modifications. We provide a web-based interface that allows users to generate other custom maps, which could be based on precisely specified subsets of transcription factors and diseases, or, in general, on any categorical genome annotation tracks as they are improved or become available.
We have created a first resource that provides a global overview of the complex relations between transcription factors and disease. As the accuracy of the disease regulome depends mainly on the quality of the input data, forthcoming ChIP-seq based binding data for many TFs will provide improved maps. We further believe our approach to genome analysis could allow an advance from the current typical situation of one-time integrative efforts to reproducible and upgradable integrative analysis. The differential disease regulome and its associated methodology are available at http://hyperbrowser.uio.no.
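A pairwise comparison over the cartesian product of two track sets, as described in this abstract, can be sketched in a few lines. The statistic below (observed overlap divided by the overlap expected if the two sets were independent) is an assumed stand-in chosen for illustration; the abstract does not state which test the authors use, and the function name, track names, and data are hypothetical.

```python
from itertools import product


def representation_map(disease_tracks, tf_tracks, universe_size):
    """Score every (disease, TF) pair for over- or under-representation.

    Each track is modeled as a set of element ids (e.g. genes) drawn from a
    universe of universe_size elements. The score is observed overlap
    divided by the overlap expected under independence: values above 1
    suggest over-representation, values below 1 under-representation.
    """
    scores = {}
    for (disease, d_set), (tf, t_set) in product(disease_tracks.items(), tf_tracks.items()):
        observed = len(d_set & t_set)
        expected = len(d_set) * len(t_set) / universe_size
        scores[(disease, tf)] = observed / expected if expected else float("nan")
    return scores


# Hypothetical toy data: element ids from a universe of 10 elements.
toy_diseases = {"D1": {1, 2, 3, 4}, "D2": {5, 6}}
toy_tfs = {"TF1": {1, 2}, "TF2": {5, 7, 8, 9}}
toy_map = representation_map(toy_diseases, toy_tfs, universe_size=10)
```

The result holds one entry per combination of one disease and one TF, which is what makes the full map grow with the product of the two set sizes.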
Full text
Available for:
DOBA, IZUM, KILJ, NUK, PILJ, PNG, SAZU, SIK, UILJ, UKNU, UL, UM, UPUK
Nonalcoholic fatty liver (NAFL) and its sequelae are growing health problems. We performed a genome-wide association study of NAFL, cirrhosis and hepatocellular carcinoma, and integrated the findings with expression and proteomic data. For NAFL, we utilized 9,491 clinical cases and proton density fat fraction extracted from 36,116 liver magnetic resonance images. We identified 18 sequence variants associated with NAFL and 4 with cirrhosis, and found rare, protective, predicted loss-of-function variants in MTARC1 and GPAM, underscoring them as potential drug targets. We leveraged messenger RNA expression, splicing and predicted coding effects to identify 16 putative causal genes, of which many are implicated in lipid metabolism. We analyzed levels of 4,907 plasma proteins in 35,559 Icelanders and 1,459 proteins in 47,151 UK Biobank participants, identifying multiple proteins involved in disease pathogenesis. We show that proteomics can discriminate between NAFL and cirrhosis. The present study provides insights into the development of noninvasive evaluation of NAFL and new therapeutic options.