ASSA-PBN: A Toolbox for Probabilistic Boolean Networks Mizera, Andrzej; Pang, Jun; Su, Cui ...
IEEE/ACM Transactions on Computational Biology and Bioinformatics, July-Aug. 2018, Volume 15, Issue 4
Journal Article
Peer reviewed
As a well-established computational framework, probabilistic Boolean networks (PBNs) are widely used for modelling, simulation, and analysis of biological systems. Analysing the steady-state dynamics of PBNs is of crucial importance for exploring the characteristics of biological systems. However, the analysis of large PBNs, which often arise in systems biology, is prone to the infamous state-space explosion problem. Therefore, the employment of statistical methods often remains the only feasible solution. We present ASSA-PBN, a software toolbox for modelling, simulation, and analysis of PBNs. ASSA-PBN provides efficient statistical methods with three parallel techniques to speed up the computation of steady-state probabilities. Moreover, particle swarm optimisation (PSO) and differential evolution (DE) are implemented for the estimation of PBN parameters. Additionally, we implement in-depth analyses of PBNs, including long-run influence analysis, long-run sensitivity analysis, and the computation and visualisation of one-parameter profile likelihoods. A PBN model of apoptosis is used as a case study to illustrate the main functionalities of ASSA-PBN and to demonstrate its capability to effectively analyse biological systems modelled as PBNs.
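As a rough illustration of the statistical approach mentioned in the abstract above, the following minimal sketch estimates a steady-state probability by simulating one long trajectory of a toy PBN and counting visits to a target set of states. It is not the ASSA-PBN interface: the two-node network, the function names `make_pbn`, `step`, and `estimate_steady_state`, and all parameter values are assumptions made for this example, and the perturbation scheme is the standard per-node random flip.

```python
import random


def make_pbn():
    """Hypothetical two-node PBN: per node, a list of (probability, function)."""
    return [
        [(0.7, lambda s: s[0] and s[1]), (0.3, lambda s: not s[1])],
        [(1.0, lambda s: s[0] or s[1])],
    ]


def step(pbn, state, perturb, rng):
    """One synchronous update of the network.

    Each node is flipped independently with probability `perturb`; if no
    node is flipped, every node is updated by a randomly selected
    Boolean function instead.
    """
    flips = [rng.random() < perturb for _ in state]
    if any(flips):
        return tuple(b != f for b, f in zip(state, flips))
    nxt = []
    for choices in pbn:
        r, acc = rng.random(), 0.0
        chosen = choices[-1][1]  # fallback guards against rounding in the sum
        for p, fn in choices:
            acc += p
            if r < acc:
                chosen = fn
                break
        nxt.append(bool(chosen(state)))
    return tuple(nxt)


def estimate_steady_state(pbn, target, steps=50_000, burn_in=5_000,
                          perturb=0.01, seed=0):
    """Fraction of post-burn-in time steps spent in the `target` state set."""
    rng = random.Random(seed)
    state = tuple(rng.random() < 0.5 for _ in pbn)
    hits = 0
    for t in range(burn_in + steps):
        state = step(pbn, state, perturb, rng)
        if t >= burn_in and state in target:
            hits += 1
    return hits / steps


if __name__ == "__main__":
    pbn = make_pbn()
    print(estimate_steady_state(pbn, {(True, True)}))
```

The fixed burn-in and trajectory length are arbitrary choices here; a real analysis would pick them with a convergence criterion rather than by hand.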
Computational approaches that predict drug-disease associations by integrating gene expression and biological network data provide great insight into the complex relationships among drugs, targets, disease genes, and diseases at a system level. Hepatocellular carcinoma (HCC) is one of the most common malignant tumors, with a high rate of morbidity and mortality. We provide an integrative framework to predict novel drugs for HCC based on multi-source random walk (PD-MRW). First, based on gene expression and a protein interaction network, we construct a gene-gene weighted interaction network (GWIN). Then, based on multi-source random walk in the GWIN, we build a drug-drug similarity network. Finally, based on the known drugs for HCC, we score all drugs in the drug-drug similarity network. The robustness of our predictions, their overlap with those reported in the Comparative Toxicogenomics Database (CTD) and the literature, and their enriched KEGG pathways demonstrate that our approach can effectively identify new drug indications. Specifically, regorafenib (rank 9 in the top-20 list) has proven effective in Phase I and II clinical trials of HCC, and the Phase III trial is ongoing. It also shares 11 pathways with HCC, with lower p-values. We believe that, by focusing on a particular disease, our approach is more accurate and scales better.
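The abstract above does not give the details of PD-MRW, so the following is only a generic sketch of a random walk with restart from several seed nodes on a weighted network, the family of techniques that the name "multi-source random walk" suggests. The function name `random_walk_with_restart`, the dictionary-of-dictionaries graph format, and the restart probability of 0.3 are assumptions made for this example.

```python
def random_walk_with_restart(weights, seeds, restart=0.3, tol=1e-10,
                             max_iter=1000):
    """Score every node by its proximity to a set of seed nodes.

    `weights[u][v]` is the weight of edge u -> v. Every node must appear
    as a key of `weights` (an assumed input format for this sketch).
    """
    nodes = sorted(weights)
    out_sum = {u: sum(weights[u].values()) for u in nodes}
    # Restart distribution: uniform over the seed nodes.
    p0 = {u: (1.0 / len(seeds) if u in seeds else 0.0) for u in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {u: restart * p0[u] for u in nodes}
        for u in nodes:
            if out_sum[u] == 0:
                continue  # isolated node: nothing to propagate
            for v, w in weights[u].items():
                nxt[v] += (1 - restart) * p[u] * w / out_sum[u]
        delta = sum(abs(nxt[u] - p[u]) for u in nodes)
        p = nxt
        if delta < tol:
            break
    return p


if __name__ == "__main__":
    # Toy undirected path A - B - C - D, stored with both edge directions.
    graph = {
        "A": {"B": 1.0},
        "B": {"A": 1.0, "C": 1.0},
        "C": {"B": 1.0, "D": 1.0},
        "D": {"C": 1.0},
    }
    print(random_walk_with_restart(graph, {"A"}))
```

In a drug-repositioning setting the seeds would plausibly be genes or drugs already linked to the disease, but how PD-MRW chooses sources and combines the resulting scores is not stated in the abstract.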
Cervical cancer is the third most common malignancy in women worldwide. It remains a leading cause of cancer-related death for women in developing countries. To contribute to the treatment of cervical cancer, we try to find a few key genes that lead to the disease. Using several bioinformatics tools, we selected 143 differentially expressed genes (DEGs) associated with cervical cancer. The results of bioinformatics analysis show that these DEGs play important roles in the development of cervical cancer. By comparing two differential co-expression networks (DCNs) at two different states, we found a common sub-network and two differential sub-networks, as well as some hub genes in the three sub-networks. Moreover, some of the hub genes have been reported to be related to cervical cancer. The hub genes were analyzed from three aspects: Gene Ontology function enrichment, pathway enrichment, and protein binding. The results can help us understand the development of cervical cancer and guide further experiments.
Mitosis detection plays an important role in the analysis of cell status and behavior and is therefore widely used in biological research and medical applications. In this article, we propose a deep reinforcement learning-based progressive sequence saliency discovery network (PSSD) for mitosis detection in time-lapse phase contrast microscopy images. By discovering the salient frames in which the cell state changes, PSSD can model the mitosis process more effectively for mitosis detection. We formulate the discovery of salient frames as a Markov Decision Process (MDP) that progressively adjusts the selection positions of salient frames in the sequence, and we further leverage deep reinforcement learning to learn the policy in the salient frame discovery process. The proposed method consists of two parts: 1) the saliency discovery module, which selects the salient frames from the input cell image sequence by progressively adjusting their selection positions; 2) the mitosis identification module, which takes a sequence of salient frames and performs temporal information fusion for mitotic sequence classification. Since the policy network of the saliency discovery module is trained under the guidance of the mitosis identification module, PSSD can comprehensively explore the salient frames that are beneficial for mitosis detection. To our knowledge, this is the first work to apply deep reinforcement learning to the mitosis detection problem. We evaluate the proposed method on the largest mitosis detection dataset, C2C12-16. Experimental results show that, compared with state-of-the-art methods, the proposed method achieves significant improvement in both mitosis identification and temporal localization on C2C12-16.
In recent years, a remarkable amount of protein-protein interaction (PPI) data has become available owing to advances in experimental high-throughput technologies. However, experimentally detected PPI data usually contain a large number of spurious links, which can contaminate the analysis of the biological significance of protein links and lead to incorrect biological discoveries, thereby posing new challenges to both computational and biological scientists. In this paper, we develop a new embedding algorithm called local similarity preserving embedding (LSPE) to rank the interaction possibility of protein links. By going beyond the limitations of current geometric embedding methods for network denoising and emphasizing the local information of PPI networks, LSPE avoids the instability of previous methods. We report experimental results on benchmark PPI networks and show that LSPE is the overall leader, outperforming state-of-the-art methods at eliminating topological false links.
RNA-protein binding plays important roles in gene expression. With the development of high-throughput sequencing, several conventional methods and deep learning-based methods have been proposed to predict RNA-protein binding preferences. These methods struggle to account for the dependencies between subsequences and the varying motif lengths of different translation factors (TFs). To overcome these limitations, we propose a predictive model that combines multi-scale convolutional layers with a bidirectional gated recurrent unit (GRU) layer. The multi-scale convolutional layers capture motif features of different lengths, and the bidirectional GRU layer captures the dependencies among subsequences. Experimental results show that the proposed method performs better than four state-of-the-art methods in this field. In addition, we investigate the effect of model structure on model performance by running our proposed method with different convolution layers and different numbers of kernel sizes. We also demonstrate the effectiveness of the bidirectional GRU in improving model performance through comparative experiments.
In this study, to take advantage of complementary information from different types of data for better diagnosis of disease status, we combined gene expression with DNA methylation data and generated a fused network, based on which the stages of Kidney Renal Cell Carcinoma (KIRC) can be better identified. It is well recognized that a network is important for investigating the connectivity of disease groups. We exploited the potential of the network's features to identify the KIRC stage. We first constructed a patient network from each type of data. We then built a fused network using a network fusion method. Based on the link weights of patients, we used a generalized linear model to predict the group of KIRC subjects. Finally, the group prediction method was applied to test the power of network-based features. The performance (e.g., the accuracy of identifying cancer stages) when using the fused network from two types of data is shown to be superior to that when using either patient network built from only one data type. The work provides a good example of using network-based features from multiple data types for a more comprehensive diagnosis.
RNA-protein binding is involved in many different biological processes. With the progress of technology, more and more data are available for research. Based on these data, many methods have been proposed to predict RNA-protein binding preferences. Some of these methods use only RNA sequence features, and others use multiple features. However, their performance is not satisfactory. In this study, we propose an improved capsule network to predict RNA-protein binding preferences, which can use both RNA sequence features and structure features. Experimental results show that our proposed method, iCapsule, performs better than three baseline methods in this field. Because the model uses both RNA sequence features and structure features, we tested the effect of changes to the primary capsule layer on model performance. In addition, we studied the impact of model structure on model performance by running our proposed method with different numbers of convolution layers and different kernel sizes.
In biomedical data, the imbalanced data problem occurs frequently and causes poor prediction performance for minority classes. This is because the trained classifiers are mostly derived from the majority class. In this paper, we describe an ensemble learning method combined with active example selection to resolve the imbalanced data problem. Our method consists of three key components: 1) an active example selection algorithm to choose informative examples for training the classifier, 2) an ensemble learning method to combine variations of classifiers derived by active example selection, and 3) an incremental learning scheme to speed up the iterative training procedure for active example selection. We evaluate the method on six real-world imbalanced data sets in biomedical domains, showing that the proposed method outperforms both random undersampling and the ensemble with undersampling methods. Compared to other approaches to solving the imbalanced data problem, our method is better by 0.03-0.15 points in AUC.
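The first two components listed in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: a nearest-centroid classifier stands in for whatever base classifier the authors use, "informative" is taken to mean closest to the current decision boundary, the incremental learning scheme is omitted (each round retrains from scratch), and the names `toy_data`, `active_ensemble`, and `predict` are hypothetical.

```python
import math
import random


def toy_data():
    """Hypothetical imbalanced set: 100 negatives near 0, 9 positives near 2."""
    neg = [((i * 0.1, j * 0.1), 0) for i in range(10) for j in range(10)]
    pos = [((2 + i * 0.1, 2 + j * 0.1), 1) for i in range(3) for j in range(3)]
    return neg + pos


def centroid(points):
    return [sum(c) / len(points) for c in zip(*points)]


def train(examples):
    """Nearest-centroid model: (positive centroid, negative centroid)."""
    pos = [x for x, y in examples if y == 1]
    neg = [x for x, y in examples if y == 0]
    return centroid(pos), centroid(neg)


def score(model, x):
    """Positive when x is closer to the positive centroid."""
    cpos, cneg = model
    return math.dist(x, cneg) - math.dist(x, cpos)


def active_ensemble(data, rounds=5, batch=4, seed=0):
    """Grow a training set with the most uncertain examples, one model per round."""
    rng = random.Random(seed)
    pos_idx = [i for i, (_, y) in enumerate(data) if y == 1]
    neg_idx = [i for i, (_, y) in enumerate(data) if y == 0]
    # Start from a small balanced seed set.
    chosen = set(rng.sample(pos_idx, 2) + rng.sample(neg_idx, 2))
    models = []
    for _ in range(rounds):
        model = train([data[i] for i in chosen])
        models.append(model)
        pool = [i for i in range(len(data)) if i not in chosen]
        # Smallest |score| = closest to the decision boundary.
        pool.sort(key=lambda i: abs(score(model, data[i][0])))
        chosen.update(pool[:batch])
    return models


def predict(models, x):
    """Majority vote over the per-round classifiers."""
    votes = sum(1 for m in models if score(m, x) > 0)
    return 1 if 2 * votes > len(models) else 0


if __name__ == "__main__":
    models = active_ensemble(toy_data())
    print(predict(models, (2.1, 2.1)), predict(models, (0.5, 0.5)))
```

Starting from a balanced seed set and adding only boundary examples keeps the training set small, which is one plausible way to limit the influence of the majority class.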