Prostate cancer (PrCa) is the second most prevalent malignancy in men worldwide. Observational studies have linked the use of low-density lipoprotein cholesterol (LDL-c) lowering therapies with reduced risk of PrCa, although this link may be attributable to confounding factors. In this study, we performed a drug target Mendelian randomisation (MR) analysis to evaluate the association of genetically proxied inhibition of LDL-c-lowering drug targets with risk of PrCa.
Single-nucleotide polymorphisms (SNPs) associated with LDL-c (P < 5 × 10⁻⁸) from the Global Lipids Genetics Consortium genome-wide association study (GWAS) (N = 1,320,016) and located in and around the HMGCR, NPC1L1, and PCSK9 genes were used to proxy the therapeutic inhibition of these targets. Summary-level data regarding the risk of total, advanced, and early-onset PrCa were obtained from the PRACTICAL consortium. Validation analyses were performed using genetic instruments from an LDL-c GWAS conducted on male UK Biobank participants of European ancestry (N = 201,678), as well as instruments selected based on liver-derived gene expression and circulating plasma levels of targets. We also investigated whether putative mediators may play a role in findings for traits previously implicated in PrCa risk (i.e., lipoprotein a (Lp(a)), body mass index (BMI), and testosterone). Applying two-sample MR using the inverse-variance weighted approach provided strong evidence supporting an effect of genetically proxied inhibition of PCSK9 (equivalent to a standard deviation (SD) reduction in LDL-c) on lower risk of total PrCa (odds ratio (OR) = 0.85, 95% confidence interval (CI) = 0.76 to 0.96, P = 9.15 × 10⁻³) and early-onset PrCa (OR = 0.70, 95% CI = 0.52 to 0.95, P = 0.023). Genetically proxied HMGCR inhibition provided a similar central effect estimate on PrCa risk, although with a wider 95% CI (OR = 0.83, 95% CI = 0.62 to 1.13, P = 0.244), whereas genetically proxied NPC1L1 inhibition had an effect on higher PrCa risk with a 95% CI that likewise included the null (OR = 1.34, 95% CI = 0.87 to 2.04, P = 0.180). Analyses using male-stratified instruments provided consistent results.
Secondary MR analyses supported a genetically proxied effect of liver-specific PCSK9 expression (OR = 0.90 per SD reduction in PCSK9 expression, 95% CI = 0.86 to 0.95, P = 5.50 × 10⁻⁵) and circulating plasma levels of PCSK9 (OR = 0.93 per SD reduction in PCSK9 protein levels, 95% CI = 0.87 to 0.997, P = 0.04) on PrCa risk. Colocalization analyses identified strong evidence (posterior probability (PPA) = 81.3%) of a shared genetic variant (rs553741) between liver-derived PCSK9 expression and PrCa risk, whereas weak evidence was found for HMGCR (PPA = 0.33%) and NPC1L1 expression (PPA = 0.38%). Moreover, genetically proxied PCSK9 inhibition was strongly associated with Lp(a) levels (Beta = -0.08, 95% CI = -0.12 to -0.05, P = 1.00 × 10⁻⁵), but not BMI or testosterone, indicating a possible role for Lp(a) in the biological mechanism underlying the association between PCSK9 and PrCa. Notably, we emphasise that our estimates are based on a lifelong exposure that makes direct comparisons with trial results challenging.
Our study supports a strong association between genetically proxied inhibition of PCSK9 and a lower risk of total and early-onset PrCa, potentially through an alternative mechanism other than the on-target effect on LDL-c. Further evidence from clinical studies is needed to confirm this finding as well as the putative mediatory role of Lp(a).
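The inverse-variance weighted (IVW) approach named in the abstract above can be illustrated with a minimal sketch. It assumes uncorrelated instruments and a fixed-effect model, which may differ from the authors' implementation (drug target MR often has to account for correlation between nearby SNPs). The function names and the input values are hypothetical and are not data from the study.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate from per-SNP summary statistics.

    Equivalent to a weighted regression through the origin of the
    SNP-outcome effects on the SNP-exposure effects, with weights
    1 / se_outcome**2. Assumes the SNPs are independent (assumption).
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    if not beta_exposure:
        raise ValueError("at least one SNP is required")
    numerator = sum(
        bx * by / se ** 2
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(
        bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)
    )
    beta = numerator / denominator
    standard_error = math.sqrt(1.0 / denominator)
    return beta, standard_error


def to_odds_ratio(beta, standard_error, z=1.96):
    """Convert a log-odds estimate to an odds ratio with a 95% CI."""
    return (
        math.exp(beta),
        math.exp(beta - z * standard_error),
        math.exp(beta + z * standard_error),
    )


# Hypothetical summary statistics for two SNPs (illustration only).
beta, se = ivw_estimate([0.1, 0.2], [-0.02, -0.04], [0.01, 0.01])
odds_ratio, ci_low, ci_high = to_odds_ratio(beta, se)
```

An odds ratio below 1 with a confidence interval excluding 1 would correspond to the kind of evidence reported for PCSK9 above.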
Mendelian randomization (MR) has been used to estimate the causal effect of body mass index (BMI) on particular traits thought to be affected by BMI. However, BMI may also be a modifiable, causal risk factor for outcomes where there is no prior reason to suggest that a causal effect exists. We performed an MR phenome-wide association study (MR-pheWAS) to search for the causal effects of BMI in UK Biobank (n = 334 968), using the PHESANT open-source phenome scan tool. A subset of identified associations was followed up with a formal two-stage instrumental variable analysis in UK Biobank, to estimate the causal effect of BMI on these phenotypes. Of the 22 922 tests performed, our MR-pheWAS identified 587 associations below a stringent P value threshold corresponding to a 5% estimated false discovery rate. These included many previously identified causal effects, for instance, an adverse effect of higher BMI on risk of diabetes and hypertension. We also identified several novel effects, including protective effects of higher BMI on a set of psychosocial traits, identified initially in our preliminary MR-pheWAS in circa 115,000 UK Biobank participants and replicated in a different subset of circa 223,000 UK Biobank participants. Our comprehensive MR-pheWAS identified potential causal effects of BMI on a large and diverse set of phenotypes. This included both previously identified causal effects, and novel effects such as a protective effect of higher BMI on feelings of nervousness.
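The abstract above reports a P value threshold corresponding to a 5% estimated false discovery rate but does not say how that threshold was derived. As one possible illustration, the sketch below uses the Benjamini-Hochberg step-up procedure, which is an assumption here and not necessarily the authors' method. The function name and the P values are hypothetical.

```python
def benjamini_hochberg_threshold(p_values, fdr=0.05):
    """Return the largest P value declared significant under the
    Benjamini-Hochberg procedure, or None if no test passes.

    The k-th smallest P value passes if p_(k) <= fdr * k / m, and all
    tests with P values at or below the largest passing p_(k) are
    declared significant.
    """
    m = len(p_values)
    threshold = None
    for rank, p in enumerate(sorted(p_values), start=1):
        if p <= fdr * rank / m:
            threshold = p  # keep the largest passing P value
    return threshold


# Hypothetical P values from ten tests (illustration only).
example_p = [0.001, 0.008, 0.039, 0.041, 0.042,
             0.06, 0.074, 0.205, 0.212, 0.216]
cutoff = benjamini_hochberg_threshold(example_p)
significant = [p for p in example_p if cutoff is not None and p <= cutoff]
```

In a phenome scan with tens of thousands of tests, the same function would be applied to the full vector of P values to obtain a single cutoff.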
The human proteome is a major source of therapeutic targets. Recent genetic association analyses of the plasma proteome enable systematic evaluation of the causal consequences of variation in plasma protein levels. Here we estimated the effects of 1,002 proteins on 225 phenotypes using two-sample Mendelian randomization (MR) and colocalization. Of 413 associations supported by evidence from MR, 130 (31.5%) were not supported by results of colocalization analyses, suggesting that genetic confounding due to linkage disequilibrium is widespread in naïve phenome-wide association studies of proteins. Combining MR and colocalization evidence in cis-only analyses, we identified 111 putatively causal effects between 65 proteins and 52 disease-related phenotypes (https://www.epigraphdb.org/pqtl/). Evaluation of data from historic drug development programs showed that target-indication pairs with MR and colocalization support were more likely to be approved, evidencing the value of this approach in identifying and prioritizing potential therapeutic targets.
High blood pressure is a major risk factor for cardiovascular disease and is influenced by both environmental and genetic factors. Epigenetic processes including DNA methylation potentially mediate the relationship between genetic factors, the environment and cardiovascular disease. Despite an increased risk of hypertension and cardiovascular disease in individuals of South Asian ancestry compared with Europeans, it is not clear whether associations between blood pressure and DNA methylation differ between these groups.
We performed an epigenome-wide association study (EWAS) and differentially methylated region (DMR) analysis to identify DNA methylation sites and regions that were associated with systolic blood pressure (SBP), diastolic blood pressure (DBP) and hypertension. We analyzed samples from 364 European and 348 South Asian men (first generation migrants to the UK) from the Southall And Brent REvisited cohort, measuring DNA methylation from blood using the Illumina Infinium® HumanMethylation450 BeadChip.
One CpG site was found to be associated with DBP in trans-ancestry analyses (i.e. both ethnic groups combined), while in Europeans alone seven CpG sites were associated with DBP. No associations were identified between DNA methylation and either SBP or hypertension. Comparison of effect sizes between South Asian and European EWAS for DBP, SBP and hypertension revealed little concordance between analyses. DMR analysis identified several regions with known relationships with cardiovascular disease and its risk factors.
This study identified differentially methylated sites and regions associated with blood pressure and revealed ethnic differences in these associations. These findings may point to molecular pathways that could explain the elevated cardiovascular disease risk experienced by those of South Asian ancestry when compared to Europeans.
Abstract
Summary
We present FATHMM-XF, a method for predicting pathogenic point mutations in the human genome. Drawing on an extensive feature set, FATHMM-XF outperforms competitors on benchmark tests, particularly in non-coding regions where the majority of pathogenic mutations are likely to be found.
Availability and implementation
The FATHMM-XF web server is available at http://fathmm.biocompute.org.uk/fathmm-xf/, and as tracks on the Genome Tolerance Browser: http://gtb.biocompute.org.uk. Predictions are provided for human genome version GRCh37/hg19. The data used for this project can be downloaded from: http://fathmm.biocompute.org.uk/fathmm-xf/
Supplementary information
Supplementary data are available at Bioinformatics online.
GWAS summary statistics are fundamental for a variety of research applications, yet no common storage format has been widely adopted. Existing tabular formats ambiguously or incompletely store information about genetic variants and associations, lack essential metadata and are typically not indexed, yielding poor query performance and increasing the possibility of errors in data interpretation and post-GWAS analyses. To address these issues, we adapted the variant call format to store GWAS summary statistics (GWAS-VCF) and developed open-source tools to use this format in downstream analyses. We provide open access to over 10,000 complete GWAS summary datasets converted to this format (https://gwas.mrcieu.ac.uk).
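To illustrate how storing summary statistics in a VCF-style record might work, the sketch below parses a single tab-separated data line. The abstract does not list the field names, so the per-study FORMAT keys used here (ES for effect size, SE for standard error, LP for the negative log10 P value) are assumptions made for illustration, as are the function names and the example record. For real files, the authors' open-source tools or an established VCF library would be the better choice.

```python
def _to_float(value):
    """Convert a VCF field to float, treating '.' (missing) as None."""
    return None if value in (".", "") else float(value)


def parse_gwas_vcf_line(line):
    """Parse one data line of a VCF-style GWAS record into a dict.

    Assumes the standard VCF column order (CHROM, POS, ID, REF, ALT,
    QUAL, FILTER, INFO, FORMAT, then one column per study) and assumed
    FORMAT keys ES, SE and LP.
    """
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 10:
        raise ValueError("expected at least 10 tab-separated columns")
    # Pair the FORMAT keys with the values in the first study column.
    sample = dict(zip(cols[8].split(":"), cols[9].split(":")))
    neg_log10_p = _to_float(sample.get("LP", "."))
    return {
        "chrom": cols[0],
        "pos": int(cols[1]),
        "rsid": cols[2],
        "ref": cols[3],
        "alt": cols[4],
        "beta": _to_float(sample.get("ES", ".")),
        "se": _to_float(sample.get("SE", ".")),
        "p_value": None if neg_log10_p is None else 10 ** -neg_log10_p,
    }


# Hypothetical record (illustration only, not taken from any dataset).
example_line = "\t".join(
    ["1", "49298", "rs10399793", "T", "C", ".", "PASS", ".",
     "ES:SE:LP", "0.0123:0.0045:2.0"]
)
record = parse_gwas_vcf_line(example_line)
```

Keeping the association statistics in typed, named fields alongside unambiguous reference and alternate alleles is the kind of structure that plain tabular formats often lack.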
Technological advances have enabled the identification of an increasingly large spectrum of single nucleotide variants within the human genome, many of which may be associated with monogenic disease or complex traits. Here, we propose an integrative approach, named FATHMM-MKL, to predict the functional consequences of both coding and non-coding sequence variants. Our method utilizes various genomic annotations, which have recently become available, and learns to weight the significance of each component annotation source.
We show that our method outperforms current state-of-the-art algorithms, CADD and GWAVA, when predicting the functional consequences of non-coding variants. In addition, FATHMM-MKL is comparable to the best of these algorithms when predicting the impact of coding variants. The method includes a confidence measure to rank order predictions.
Abstract
Summary
The field of literature-based discovery is growing in step with the volume of literature being produced. From modern natural language processing algorithms to high quality entity tagging, the methods and their impact are developing rapidly. One annotation object that arises from these approaches, the subject–predicate–object triple, is proving to be very useful in representing knowledge. We have implemented efficient search methods and an application programming interface, to create fast and convenient functions to utilize triples extracted from the biomedical literature by SemMedDB. By refining these data, we have identified a set of triples that focus on the mechanistic aspects of the literature, and provide simple methods to explore both enriched triples from single queries, and overlapping triples across two query lists.
Availability and implementation
https://melodi-presto.mrcieu.ac.uk/.
Supplementary information
Supplementary data are available at Bioinformatics online.
Recent advancements in sequencing technologies have led to the discovery of numerous variants in the human genome. However, understanding their precise roles in diseases remains challenging due to their complex functional mechanisms. Various methodologies have emerged to predict the pathogenic significance of these genetic variants. Typically, these methods employ an integrative approach, leveraging diverse data sources that provide important insights into genomic function. Despite the abundance of publicly available data sources and databases, the process of navigating, extracting, and pre-processing features for machine learning models can be highly challenging and time-consuming. Furthermore, researchers often invest substantial effort in feature extraction, only to later discover that these features lack informativeness.
In this article, we introduce DrivR-Base, an innovative resource that efficiently extracts and integrates molecular information (features) related to single nucleotide variants. These features encompass information about the genomic positions and the associated protein positions of a variant. They are derived from a wide array of databases and tools, including structural properties obtained from AlphaFold, regulatory information sourced from ENCODE, and predicted variant consequences from Variant Effect Predictor. DrivR-Base is easily deployable via a Docker container to ensure reproducibility and ease of access across diverse computational environments. The resulting features can be used as input for machine learning models designed to predict the pathogenic impact of human genome variants in disease. Moreover, these feature sets have applications beyond this, including haploinsufficiency prediction and the development of drug repurposing tools. We describe the resource's development, practical applications, and potential for future expansion and enhancement.
DrivR-Base source code is available at https://github.com/amyfrancis97/DrivR-Base.
Results from genome-wide association studies (GWAS) can be used to infer causal relationships between phenotypes, using a strategy known as 2-sample Mendelian randomization (2SMR) and bypassing the need for individual-level data. However, 2SMR methods are evolving rapidly and GWAS results are often insufficiently curated, undermining efficient implementation of the approach. We therefore developed MR-Base (http://www.mrbase.org): a platform that integrates a curated database of complete GWAS results (no restrictions according to statistical significance) with an application programming interface, web app and R packages that automate 2SMR. The software includes several sensitivity analyses for assessing the impact of horizontal pleiotropy and other violations of assumptions. The database currently comprises 11 billion single nucleotide polymorphism-trait associations from 1673 GWAS and is updated on a regular basis. Integrating data with software ensures more rigorous application of hypothesis-driven analyses and allows millions of potential causal relationships to be efficiently evaluated in phenome-wide association studies.