Biomolecular pathways and networks are dynamic and complex, and the perturbations to them which cause disease are often multiple, heterogeneous and contingent. Pathway and network visualizations, rendered on a computer or published on paper, however, tend to be static, lacking in detail, and ill-equipped to explore the variety and quantities of data available today, and the complex causes we seek to understand.
RCytoscape integrates R (an open-ended programming environment rich in statistical power and data-handling facilities) and Cytoscape (powerful network visualization and analysis software). RCytoscape extends Cytoscape's functionality beyond what is possible with the Cytoscape graphical user interface. To illustrate the power of RCytoscape, a portion of the Glioblastoma multiforme (GBM) data set from the Cancer Genome Atlas (TCGA) is examined. Network visualization reveals previously unreported patterns in the data suggesting heterogeneous signaling mechanisms active in GBM Proneural tumors, with possible clinical relevance.
Progress in bioinformatics and computational biology depends upon exploratory and confirmatory data analysis, upon inference, and upon modeling. These activities will eventually permit the prediction and control of complex biological systems. Network visualizations--molecular maps--created from an open-ended programming environment rich in statistical power and data-handling facilities, such as RCytoscape, will play an essential role in this progression.
CytoscapeRPC is a plugin for Cytoscape which allows users to create, query and modify Cytoscape networks from any programming language which supports XML-RPC. This enables them to access Cytoscape functionality and visualize their data interactively without leaving the programming environment with which they are familiar.
Install through the Cytoscape plugin manager or visit the web page: http://wiki.nbic.nl/index.php/CytoscapeRPC for the user tutorial and download.
Contact: j.j.bot@tudelft.nl
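As a rough illustration of what "any programming language which supports XML-RPC" means in practice, the sketch below builds and decodes an XML-RPC request using only Python's standard library, without contacting any server. The method name `Example.createNetwork` and its argument are hypothetical placeholders and are not taken from the CytoscapeRPC API; the real method names are listed in the plugin's tutorial.

```python
import xmlrpc.client

# Hypothetical method name and arguments, for illustration only.
# They are NOT CytoscapeRPC's documented API.
METHOD = "Example.createNetwork"
ARGS = ("demo network",)

# Serialize the call into the XML body that an XML-RPC client would POST.
request_xml = xmlrpc.client.dumps(ARGS, methodname=METHOD)

# Parse it back, as a server would, to recover the method name and arguments.
decoded_args, decoded_method = xmlrpc.client.loads(request_xml)

print(request_xml)
print(decoded_method, decoded_args)
```

Against a running server, the same call would normally go through `xmlrpc.client.ServerProxy`, which performs this serialization and the HTTP round trip automatically.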
Tumorigenesis is a multi-step process in which normal cells transform into malignant tumors following the accumulation of genetic mutations that enable them to evade the growth control checkpoints that would normally suppress their growth or result in apoptosis. It is therefore important to identify those combinations of mutations that collaborate in cancer development and progression. DNA copy number alterations (CNAs) are one of the ways in which cancer genes are deregulated in tumor cells. We hypothesized that synergistic interactions between cancer genes might be identified by looking for regions of co-occurring gain and/or loss. To this end we developed a scoring framework to separate truly co-occurring aberrations from passenger mutations and dominant single signals present in the data. The resulting regions of high co-occurrence can be investigated for between-region functional interactions. Analysis of high-resolution DNA copy number data from a panel of 95 hematological tumor cell lines correctly identified co-occurring recombinations at the T-cell receptor and immunoglobulin loci in T- and B-cell malignancies, respectively, showing that we can recover truly co-occurring genomic alterations. In addition, our analysis revealed networks of co-occurring genomic losses and gains that are enriched for cancer genes. These networks are also highly enriched for functional relationships between genes. We further examine sub-networks of these networks, core networks, which contain many known cancer genes. The core network for co-occurring DNA losses we find seems to be independent of the canonical cancer genes within the network. Our findings suggest that large-scale, low-intensity copy number alterations may be an important feature of cancer development or maintenance by affecting gene dosage of a large interconnected network of functionally related genes.
We show that epigenome- and transcriptome-wide association studies (EWAS and TWAS) are prone to significant inflation and bias of test statistics, an unrecognized phenomenon introducing spurious findings if left unaddressed. Neither GWAS-based methodology nor state-of-the-art confounder adjustment methods completely remove bias and inflation. We propose a Bayesian method to control bias and inflation in EWAS and TWAS based on estimation of the empirical null distribution. Using simulations and real data, we demonstrate that our method maximizes power while properly controlling the false positive rate. We illustrate the utility of our method in large-scale EWAS and TWAS meta-analyses of age and smoking.
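To make the idea of an empirical null concrete, the following minimal sketch estimates the centre (bias) and spread (inflation) of a set of z-scores and rescales them. It deliberately swaps in a simple robust estimator (median and median absolute deviation) and is not the Bayesian method proposed in the abstract; the function names and the simulated data are assumptions made for illustration.

```python
import random
import statistics


def estimate_empirical_null(z_scores):
    """Robustly estimate the null centre (bias) and spread (inflation)."""
    bias = statistics.median(z_scores)
    mad = statistics.median([abs(z - bias) for z in z_scores])
    # 1.4826 rescales the MAD to a standard deviation under normality.
    inflation = 1.4826 * mad
    return bias, inflation


def correct(z_scores, bias, inflation):
    """Recentre and rescale z-scores so the estimated null is N(0, 1)."""
    return [(z - bias) / inflation for z in z_scores]


# Simulated test statistics with a shifted, inflated null (assumed values).
rng = random.Random(0)
z = [rng.gauss(0.2, 1.3) for _ in range(5000)]

bias, inflation = estimate_empirical_null(z)
z_corrected = correct(z, bias, inflation)
print(f"estimated bias={bias:.3f}, inflation={inflation:.3f}")
```

A median-based estimate stays stable when a small fraction of the statistics are true associations, which is the reason for preferring it over the plain mean and standard deviation in this sketch.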
Different exposures, including diet, physical activity, or external conditions can contribute to genotype-environment interactions (G×E). Although high-dimensional environmental data are increasingly available and multiple exposures have been implicated in G×E at the same loci, multi-environment tests for G×E are not established. Here, we propose the structured linear mixed model (StructLMM), a computationally efficient method to identify and characterize loci that interact with one or more environments. After validating our model using simulations, we applied StructLMM to body mass index in the UK Biobank, where our model yields previously known and novel G×E signals. Finally, in an application to a large blood eQTL dataset, we demonstrate that StructLMM can be used to study interactions with hundreds of environmental variables.
Background:
Smoking-associated DNA methylation levels identified through epigenome-wide association studies (EWASs) are generally ascribed to smoking-reactive mechanisms, but the contribution of a shared genetic predisposition to smoking and DNA methylation levels is typically not accounted for.
Methods:
We exploited a strong within-family design, that is, the discordant monozygotic twin design, to study reactiveness of DNA methylation in blood cells to smoking and reversibility of methylation patterns upon quitting smoking. Illumina HumanMethylation450 BeadChip data were available for 769 monozygotic twin pairs (mean age = 36 years, range = 18–78, 70% female), including pairs discordant or concordant for current or former smoking.
Results:
In pairs discordant for current smoking, 13 differentially methylated CpGs were found between current smoking twins and their genetically identical co-twin who never smoked. Top sites include multiple CpGs in CACNA1D and GNG12, which encode subunits of a calcium voltage-gated channel and G protein, respectively. These proteins interact with the nicotinic acetylcholine receptor, suggesting that methylation levels at these CpGs might be reactive to nicotine exposure. All 13 CpGs have been previously associated with smoking in unrelated individuals and data from monozygotic pairs discordant for former smoking indicated that methylation patterns are to a large extent reversible upon smoking cessation. We further showed that differences in smoking level exposure for monozygotic twins who are both current smokers but differ in the number of cigarettes they smoke are reflected in their DNA methylation profiles.
Conclusions:
In conclusion, by analysing data from monozygotic twins, we robustly demonstrate that DNA methylation level in human blood cells is reactive to cigarette smoking.
Funding:
We acknowledge funding from the National Institute on Drug Abuse grant DA049867, the Netherlands Organization for Scientific Research (NWO): Biobanking and Biomolecular Research Infrastructure (BBMRI-NL, NWO 184.033.111) and the BBMRI-NL-financed BIOS Consortium (NWO 184.021.007), NWO Large Scale infrastructures X-Omics (184.034.019), Genotype/phenotype database for behaviour genetic and genetic epidemiological studies (ZonMw Middelgroot 911-09-032); Netherlands Twin Registry Repository: researching the interplay between genome and environment (NWO-Groot 480-15-001/674); the Avera Institute, Sioux Falls (USA), and the National Institutes of Health (NIH R01 HD042157-01A1, MH081802, Grand Opportunity grants 1RC2 MH089951 and 1RC2 MH089995); epigenetic data were generated at the Human Genomics Facility (HuGe-F) at ErasmusMC Rotterdam. Cotinine assaying was sponsored by the Neuroscience Campus Amsterdam. DIB acknowledges the Royal Netherlands Academy of Science Professor Award (PAH/6635).
The genetic information of people who smoke presents distinctive characteristics. In particular, previous research has revealed differences in patterns of DNA methylation, a type of chemical modification that helps cells switch certain genes on or off. However, most of these studies could not establish for sure whether these changes were caused by smoking, predisposed individuals to smoke, or were driven by underlying genetic variation in the DNA sequence itself.
To investigate this question, van Dongen et al. examined DNA methylation data from the blood cells of over 700 pairs of identical twins. These individuals share the exact same genetic information, making it possible to better evaluate the impact of lifestyle on DNA modifications.
The analyses identified differences in methylation at 13 DNA locations in pairs of twins where one was a current smoker and their sibling had never smoked. Two of the genes code for proteins involved in the response to nicotine, the primary addictive chemical in cigarette smoke. The differences were smaller if one of the twins had stopped smoking, suggesting that quitting can help to reverse some of these changes.
These findings confirm that DNA methylation in blood cells is influenced by cigarette smoke, which could help to better understand smoking-associated diseases. They also demonstrate how useful identical twin studies can be to identify methylation changes that are markers of lifestyle.
Background: Smoking impacts DNA methylation, but data are lacking on smoking-related differential methylation by sex or dietary intake, recent smoking cessation (<1 year), persistence of differential methylation from in utero smoking exposure, and effects of environmental tobacco smoke (ETS). Methods: We meta-analysed data from up to 15,014 adults across 5 cohorts with DNA methylation measured in blood using Illumina's EPIC array for current smoking (2560 exposed), quit < 1 year (500 exposed), in utero (286 exposed), and ETS exposure (676 exposed). We also evaluated the interaction of current smoking with sex or diet (fibre, folate, and vitamin C). Findings: Using false discovery rate (FDR < 0.05), 65,857 CpGs were differentially methylated in relation to current smoking, 4025 with recent quitting, 594 with in utero exposure, and 6 with ETS. Most current smoking CpGs attenuated within a year of quitting. CpGs related to in utero exposure in adults were enriched for those previously observed in newborns. Differential methylation by current smoking at 4–71 CpGs may be modified by sex or dietary intake. Nearly half (35–50%) of differentially methylated CpGs on the 450 K array were associated with blood gene expression. Current smoking and in utero smoking CpGs implicated 3049 and 1067 druggable targets, including chemotherapy drugs. Interpretation: Many smoking-related methylation sites were identified with Illumina’s EPIC array. Most signals revert to levels observed in never smokers within a year of cessation. Many in utero smoking CpGs persist into adulthood. Smoking-related druggable targets may provide insights into cancer treatment response and shared mechanisms across smoking-related diseases.
Funding: Intramural Research Program of the National Institutes of Health, Norwegian Ministry of Health and Care Services and the Ministry of Education and Research, Chief Scientist Office of the Scottish Government Health Directorates and the Scottish Funding Council, Medical Research Council UK and the Wellcome Trust.
Motivation: We propose an efficient method to infer combinatorial association logic networks from multiple genome-wide measurements from the same sample. We demonstrate our method on a genetical genomics dataset, in which we search for Boolean combinations of multiple genetic loci that associate with transcript levels. Results: Our method provably finds the global solution and is very efficient with runtimes of up to four orders of magnitude faster than the exhaustive search. This enables permutation procedures for determining accurate false positive rates and allows selection of the most parsimonious model. When applied to transcript levels measured in myeloid cells from 24 genotyped recombinant inbred mouse strains, we discovered that nine gene clusters are putatively modulated by a logical combination of trait loci rather than a single locus. A literature survey supports and further elucidates one of these findings. Due to our approach, optimal solutions for multi-locus logic models and accurate estimates of the associated false discovery rates become feasible. Our algorithm, therefore, offers a valuable alternative to approaches employing complex, albeit suboptimal optimization strategies to identify complex models. Availability: The MATLAB code of the prototype implementation is available on: http://bioinformatics.tudelft.nl/ or http://bioinformatics.nki.nl/ Contact: m.j.t.reinders@tudelft.nl; l.wessels@nki.nl
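For orientation, the exhaustive search that the abstract uses as its baseline can be sketched for the two-locus case as follows. This is not the authors' efficient algorithm. It assumes binary genotypes (one of two parental alleles per strain), a hypothetical score defined as the absolute difference in mean expression between the two groups induced by the logic function, and toy data invented for the example.

```python
from itertools import combinations
from statistics import mean

# Boolean functions considered for combining two loci.
LOGIC = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
}


def best_two_locus_model(genotypes, expression):
    """Exhaustively score every locus pair under every logic function.

    genotypes:  dict mapping locus name -> list of 0/1 alleles per strain.
    expression: list of transcript levels, one per strain.
    Returns (score, locus_1, logic_name, locus_2) for the best model.
    """
    best = None
    for (l1, g1), (l2, g2) in combinations(genotypes.items(), 2):
        for name, fn in LOGIC.items():
            labels = [fn(a, b) for a, b in zip(g1, g2)]
            on = [e for e, s in zip(expression, labels) if s]
            off = [e for e, s in zip(expression, labels) if not s]
            if not on or not off:
                continue  # the split is degenerate, so nothing to compare
            score = abs(mean(on) - mean(off))
            if best is None or score > best[0]:
                best = (score, l1, name, l2)
    return best


# Toy data (assumed): expression is high only when loci A and B both carry allele 1.
genotypes = {
    "A": [0, 0, 1, 1, 0, 0, 1, 1],
    "B": [0, 1, 0, 1, 0, 1, 0, 1],
    "C": [0, 0, 0, 0, 1, 1, 1, 1],
}
expression = [1.0, 1.0, 1.0, 5.0, 1.0, 1.0, 1.0, 5.0]

best = best_two_locus_model(genotypes, expression)
print(best)
```

The cost of this loop grows with the number of locus pairs times the number of logic functions, and again with every permutation round, which is why a faster exact search matters for estimating false positive rates.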
Motivation: Cancers are caused by an accumulation of multiple independent mutations that collectively deregulate cellular pathways, such as those regulating cell division and cell death. The publicly available Retroviral Tagged Cancer Gene Database (RTCGD) contains the data of many insertional mutagenesis screens, in which the virally induced mutations result in tumor formation in mice. The insertion loci therefore indicate the location of putative cancer genes. Additionally, the presence of multiple independent insertions within one tumor hints towards a cooperation between the insertionally mutated genes. In this study we focus on the detection of statistically significant co-mutations. Results: We propose a two-dimensional Gaussian Kernel Convolution method (2DGKC), a computational technique that identifies the cooperating mutations in insertional mutagenesis data. We define the Common Co-occurrence of Insertions (CCI), signifying the co-mutations that are statistically significant across all different screens in the RTCGD. Significance estimates are made on multiple scales, and the results visualized in a scale space, thereby providing valuable extra information on the putative cooperation. The multidimensional analysis of the insertion data results in the discovery of 86 statistically significant co-mutations, indicating the presence of cooperating oncogenes that play a role in tumor development. Since oncogenes may cooperate with several members of a parallel pathway, we combined the co-occurrence data with gene family information to find significant cooperations between oncogenes and families of genes. We show, for instance, the interchangeable cooperation of Myc insertions with insertions in the Pim family. Availability: A list of the resulting CCIs is available at: http://ict.ewi.tudelft.nl/~jeroen/CCI/CCI_list.txt Contact: m.j.t.reinders@tudelft.nl
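The smoothing step behind a two-dimensional Gaussian kernel convolution can be sketched as below. It places an unnormalized isotropic Gaussian on each pair of co-occurring insertion positions and sums the kernels on a grid, so that clusters of similar pairs form peaks. The positions, grid, and scale are invented for illustration, and the significance testing and multi-scale analysis described in the abstract are not reproduced here.

```python
import math


def gaussian_2d(dx, dy, scale):
    """Unnormalized isotropic 2D Gaussian evaluated at offset (dx, dy)."""
    return math.exp(-(dx * dx + dy * dy) / (2.0 * scale * scale))


def cooccurrence_density(pairs, grid, scale):
    """Sum a Gaussian kernel centred on each pair of co-occurring insertions.

    pairs: list of (pos_a, pos_b) positions mutated together in one tumor.
    grid:  positions at which the density is evaluated on both axes.
    scale: kernel width; larger values merge more distant insertions.
    """
    return [
        [sum(gaussian_2d(x - a, y - b, scale) for a, b in pairs) for y in grid]
        for x in grid
    ]


# Toy data (assumed): three tumors share insertions near (10, 50), one lies elsewhere.
pairs = [(10, 50), (11, 49), (9, 51), (30, 30)]
grid = list(range(0, 61, 10))

density = cooccurrence_density(pairs, grid, scale=2.0)

# Locate the grid point with the highest smoothed co-occurrence.
peak_value, peak = max(
    (density[i][j], (grid[i], grid[j]))
    for i in range(len(grid))
    for j in range(len(grid))
)
print(peak, round(peak_value, 3))
```

Repeating the computation with several values of `scale` gives a crude view of how peaks merge or split across scales, which is the intuition behind inspecting results in a scale space.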
Abstract
Tumorigenesis is a multi-step process of successive (epi)genetic mutations enabling the transformation of normal cells. There has always been a large interest in finding relationships between mutations that interact to promote cancer development. Copy number alterations (CNAs) are one of the ways in which a tumor cell can affect cancer-related genes. We hypothesized that synergistic or redundant oncogenes can be found by looking for regions of co-occurring or mutually exclusive gain and/or loss in array comparative genomic hybridization (aCGH) datasets.
To find interacting CNAs we developed a genome-wide framework to separate truly related aberrations from the noisy passenger and dominant single signals present in the data. We designed a score for both co-occurring and mutually exclusive CNAs. This score was then evaluated for every pair of genomic locations. To reduce noise, we used Gaussian convolution on the 2D score-space. From the peaks in this space, we determine networks of related copy number changes. Since our analysis is done in a 2D genome-by-genome space we applied a distributed computing approach to solve the computational complexity.
We applied our approach to two different aCGH datasets. A dataset of 95 cell lines derived from hematological malignancies was examined for copy number co-occurrences and a dataset of 68 mouse mammary tumors was examined for mutual exclusiveness.
In the hematological cell-line dataset we are able to identify T- and B-cell specific co-occurring losses, showing that we can recover interacting CNAs that are known to be present in the data. Further analyses revealed large networks of copy number loss and gain. The genomic regions associated with these networks were significantly enriched for cancer-associated genes and functionally related genes, hinting at simultaneous deregulation of genes with similar cellular roles. Our findings suggest that large-scale, low-intensity CNAs may be an important feature of cancer. They might affect gene dosages of a large interconnected network of functionally related genes, whose loss or gain is not necessarily driven by a few canonical cancer genes.
Analysis of aCGH data of mouse mammary tumors revealed several highly mutually exclusive DNA amplifications. We were able to identify several candidate genes from these amplicons using gene expression data and protein immuno-histochemistry. Our discovery of several mutually exclusive high-level amplicons shows that there might be a redundant role for amplification of these strong proliferation-associated genes.
Although many single aberrations in tumors have been studied and investigated, our study describes, to our knowledge, the first high-resolution genome-wide search for interacting aberrations. Our approach offers a methodology to reveal novel, functional networks of copy number changes associated with oncogenesis.
Note: This abstract was not presented at the AACR 101st Annual Meeting 2010 because the presenter was unable to attend.
Citation Format: {Authors}. {Abstract title} abstract. In: Proceedings of the 101st Annual Meeting of the American Association for Cancer Research; 2010 Apr 17-21; Washington, DC. Philadelphia (PA): AACR; Cancer Res 2010;70(8 Suppl):Abstract nr 2140.