Abstract
We develop an auto-reservoir computing framework, Auto-Reservoir Neural Network (ARNN), to efficiently and accurately make multi-step-ahead predictions based on a short-term high-dimensional time series. Unlike traditional reservoir computing, whose reservoir is an external dynamical system unrelated to the target system, ARNN directly transforms the observed high-dimensional dynamics into its reservoir, which maps the high-dimensional/spatial data to the future temporal values of a target variable based on our spatiotemporal information (STI) transformation. Thus, multi-step prediction of the target variable is achieved in an accurate and computationally efficient manner. ARNN is successfully applied to both representative models and real-world datasets, all of which show satisfactory performance in multi-step-ahead prediction, even when the data are perturbed by noise and when the system is time-varying. In effect, the ARNN transformation expands the sample size and thus has great potential in practical applications in artificial intelligence and machine learning.
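To make the target of the STI transformation more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the "future temporal values" of the target variable are arranged as a delay-embedding matrix in which column j holds the target value at time j followed by its next L - 1 values. The function name `delay_matrix`, the made-up series, and the use of `None` for values that have not yet been observed are illustrative choices.

```python
from typing import List, Optional


def delay_matrix(y: List[float], L: int) -> List[List[Optional[float]]]:
    """Return an L x m matrix whose entry (i, j) is y[j + i].

    Entries that would need values beyond the observed series
    (j + i >= m) are marked None: these are the future values that a
    multi-step-ahead predictor would have to fill in.
    """
    m = len(y)
    return [[y[j + i] if j + i < m else None for j in range(m)]
            for i in range(L)]


# Hypothetical short series of m = 5 observed target values.
observed = [0.1, 0.4, 0.9, 1.6, 2.5]
Y = delay_matrix(observed, L=3)

# Count the entries that lie in the unobserved future.
unknown = sum(v is None for row in Y for v in row)
```

With m = 5 and L = 3 the matrix has three unknown entries, but they correspond to only L - 1 = 2 distinct future values, because each future value recurs along an anti-diagonal.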
Single-cell RNA sequencing (scRNA-seq) can give insight into the gene-gene associations or transcriptional networks among cell populations based on the sequencing of a large number of cells. However, traditional network methods are limited to grouped cells rather than each single cell, and thus the heterogeneity of single cells is erased. We present a new method to construct a cell-specific network (CSN) for each single cell from scRNA-seq data (i.e. one network for one cell), which transforms the data from 'unstable' gene expression form to 'stable' gene association form on a single-cell basis. In particular, this is the first time that gene associations/networks can be identified at single-cell resolution. With the CSN method, scRNA-seq data can be analyzed for clustering and pseudo-trajectory from a network perspective by any existing method, which opens a new way to analyze scRNA-seq data. In addition, CSN is able to find differential gene associations for each single cell, and even 'dark' genes that play important roles at the network level but are generally ignored by traditional differential gene expression analyses. CSN can also be applied to construct an individual network for each sample of bulk RNA-seq data. Experiments on various scRNA-seq datasets validated the effectiveness of CSN in terms of accuracy and robustness.
Abstract
Simultaneous profiling of transcriptomic and chromatin accessibility information in the same individual cells offers an unprecedented resolution to understand cell states. However, computationally effective methods for the integration of these inherently sparse and heterogeneous data are lacking. Here, we present a single-cell multimodal variational autoencoder model, which combines three types of joint-learning strategies with a probabilistic Gaussian Mixture Model to learn the joint latent features that accurately represent these multilayer profiles. Studies on both simulated and real datasets demonstrate that it is better at (i) dissecting cellular heterogeneity in the joint-learning space, (ii) denoising and imputing data and (iii) constructing the association between multilayer omics data, which can be used for understanding transcriptional regulatory mechanisms.
Quantitatively identifying direct dependencies between variables is an important task in data analysis, in particular for reconstructing various types of networks and causal relations in science and engineering. One of the most widely used criteria is partial correlation, but it can only measure linearly direct associations and misses nonlinear ones. In contrast, conditional mutual information (CMI), which is based on conditional independence, is able to quantify nonlinearly direct relationships among variables from the observed data and is thus superior to linear measures. However, it suffers from a serious problem of underestimation, in particular for variables with tight associations in a network, which severely limits its applications. In this work, we propose a new concept, “partial independence,” with a new measure, “part mutual information” (PMI), which not only overcomes the problem of CMI but also retains the quantification properties of both mutual information (MI) and CMI. Specifically, we first defined PMI to measure nonlinearly direct dependencies between variables and then derived its relations with MI and CMI. Finally, we used a number of simulated datasets as benchmark examples to numerically demonstrate the features of PMI, and further used real gene expression data from Escherichia coli and yeast to reconstruct gene regulatory networks, all of which validated the advantages of PMI for accurately quantifying nonlinearly direct associations in networks.
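The underestimation problem of CMI can be seen in a small discrete example. The sketch below is illustrative only: the helper names and the toy distribution are assumptions, and it computes MI and CMI, not PMI. It takes three binary variables that are perfect copies of one another, so X and Y are maximally associated, yet conditioning on Z leaves nothing for CMI to measure.

```python
from collections import defaultdict
from math import log2


def marginal(joint, axes):
    """Sum a joint distribution {(x, y, z): p} down to the given axes."""
    out = defaultdict(float)
    for key, p in joint.items():
        out[tuple(key[a] for a in axes)] += p
    return out


def mutual_information(joint):
    """MI(X;Y) in bits, after marginalizing Z out of the joint."""
    pxy = marginal(joint, (0, 1))
    px = marginal(joint, (0,))
    py = marginal(joint, (1,))
    return sum(p * log2(p / (px[(x,)] * py[(y,)]))
               for (x, y), p in pxy.items() if p > 0)


def conditional_mutual_information(joint):
    """CMI(X;Y|Z) in bits: sum of p(x,y,z) * log2(p(x,y,z) p(z) / (p(x,z) p(y,z)))."""
    pz = marginal(joint, (2,))
    pxz = marginal(joint, (0, 2))
    pyz = marginal(joint, (1, 2))
    return sum(p * log2(p * pz[(z,)] / (pxz[(x, z)] * pyz[(y, z)]))
               for (x, y, z), p in joint.items() if p > 0)


# Toy distribution (assumed for illustration): X = Y = Z, each a fair coin.
joint = {(0, 0, 0): 0.5, (1, 1, 1): 0.5}

mi = mutual_information(joint)
cmi = conditional_mutual_information(joint)
```

Here `mi` is 1 bit while `cmi` is 0: because Z already determines both X and Y, CMI reports no dependency between them even though they are identical. This is the kind of tight association for which the text says CMI underestimates.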
A complex disease generally results not from malfunction of individual molecules but from dysfunction of the relevant system or network, which dynamically changes with time and conditions. Thus, estimating a condition-specific network from a single sample is crucial to elucidating the molecular mechanisms of complex diseases at the system level. However, there is currently no effective way to construct such an individual-specific network by expression profiling of a single sample, because computing correlations requires multiple samples. Here we developed a statistical method, the sample-specific network (SSN) method, which allows us to construct individual-specific networks based on the molecular expression of a single sample. Using this method, we can characterize various human diseases at a network level. In particular, such SSNs can lead to the identification of individual-specific disease modules as well as driver genes, even without gene sequencing information. Extensive analysis using The Cancer Genome Atlas data not only demonstrated the effectiveness of the method, but also found new individual-specific driver genes and network patterns for various types of cancer. Biological experiments on drug resistance further validated one important advantage of our method over traditional methods: owing to the additional network information, we can identify drug resistance genes even when they show no clear differential expression between samples with and without the resistance.
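One way to see how a single sample can yield edge-level information despite the multi-sample requirement of correlation is to measure how much that sample perturbs a correlation computed on a reference group. The sketch below illustrates this idea under stated assumptions and should not be read as the SSN procedure itself: the function names, the reference values, and the use of the raw change in the Pearson correlation coefficient as the score are all hypothetical, and any significance test for the change is omitted.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_shift(ref_x, ref_y, new_x, new_y):
    """Change in correlation when one new sample joins the reference group."""
    before = pearson(ref_x, ref_y)
    after = pearson(ref_x + [new_x], ref_y + [new_y])
    return after - before


# Hypothetical reference expression of two genes across four samples.
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [2.0, 4.0, 6.0, 8.0]

# A sample that follows the reference pattern barely moves the correlation.
shift_consistent = correlation_shift(gene_a, gene_b, 5.0, 10.0)
# A sample that breaks the pattern shifts it strongly.
shift_deviant = correlation_shift(gene_a, gene_b, 5.0, 0.0)
```

In this toy case the consistent sample leaves the correlation essentially unchanged, whereas the deviant sample drives it from 1 to 0. A large shift for a gene pair would mark that edge as specific to the added sample.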
Many studies have been carried out for early diagnosis of complex diseases by finding accurate and robust biomarkers specific to respective diseases. In particular, the recent rapid advance of high-throughput technologies provides unprecedentedly rich information to characterize various disease genotypes and phenotypes in a global and dynamical manner, which significantly accelerates the study of biomarkers from both theoretical and clinical perspectives. Traditionally, molecular biomarkers that distinguish disease samples from normal samples are widely adopted in clinical practice due to their ease of data measurement. However, many of them suffer from low coverage and high false-positive or false-negative rates, which seriously limit their further clinical applications. To overcome these difficulties, network biomarkers (or module biomarkers) have attracted much attention and achieve better performance because a network (or subnetwork) is considered to be a more robust form to characterize diseases than individual molecules. However, both molecular biomarkers and network biomarkers mainly distinguish disease samples from normal samples, and they generally cannot reliably identify predisease samples due to their static nature, and thus lack the ability to support early diagnosis. Based on nonlinear dynamical theory and complex network theory, a new concept of dynamical network biomarkers (DNBs, or a dynamical network of biomarkers) has been developed, which differs from traditional static approaches. The DNB is able to distinguish a predisease state from normal and disease states with even a small number of samples, and therefore has great potential to achieve “real” early diagnosis of complex diseases.
In this paper, we comprehensively review the recent advances and developments on molecular biomarkers, network biomarkers, and DNBs in particular, focusing on the biomarkers for early diagnosis of complex diseases considering a small number of samples and high‐throughput data (or big data). Detailed comparisons of various types of biomarkers as well as their applications are also discussed.
Abstract
Motivation
The time evolution or dynamic change of many biological systems during disease progression is not always smooth but occasionally abrupt; that is, there is a tipping point during such a process at which the system state shifts from the normal state to a disease state. It is challenging to predict such a disease state from measured omics data, in particular when only a single sample is available.
Results
In this study, we developed a novel approach, the single-sample landscape entropy (SLE) method, to identify the tipping point during disease progression from the data of only one sample. Specifically, by evaluating the disorder of a network projected from single-sample data, SLE effectively characterizes the criticality of this single-sample network in terms of network entropy, thereby capturing not only the signals of the impending transition but also its leading network, i.e. dynamic network biomarkers. Using this method, we can characterize the sample-specific state during disease progression and thus achieve disease prediction for each individual from only one sample. Our method was validated by successfully identifying the tipping points just before the appearance of serious disease symptoms in four real datasets of individuals or subjects, including influenza virus infection, lung cancer metastasis, prostate cancer and acute lung injury.
Availability and implementation
https://github.com/rabbitpei/SLE.
Supplementary information
Supplementary data are available at Bioinformatics online.
This paper proposes a practical and effective model for forecasting the generation of a wind farm, with an emphasis on its scheduling and trading in a wholesale electricity market. A novel forecasting model is developed based on in-depth investigations of meteorological information. This model adopts a two-stage hybrid network with Bayesian clustering by dynamics and support vector regression. The proposed structure is robust to different input data types and can deal well with the nonstationarity of wind speed and generation series. Once the network is trained, we can straightforwardly predict the 48-h-ahead wind power generation. To demonstrate its effectiveness, the model is applied to and tested on a 74-MW wind farm located in southwestern Oklahoma, United States.
Developing predictive biomarkers that can detect the tipping point before metastasis of hepatocellular carcinoma (HCC) is critical to prevent further irreversible deterioration. To discover such early-warning signals or biomarkers of pulmonary metastasis in HCC, we analyse time-series gene expression data from the HCCLM3-RFP mouse model of spontaneous pulmonary metastasis with our dynamic network biomarker (DNB) method, and identify CALML3 as a core DNB member. All experimental results of gain-of-function and loss-of-function studies show that CALML3 could indicate metastasis initiation and act as a suppressor of metastasis. We also reveal the biological role of CALML3 in metastasis initiation at a network level, including proximal regulation and cascading influences in dysfunctional pathways. Our further experiments and clinical samples show that DNB with CALML3 reduced pulmonary metastasis in liver cancer. Moreover, loss of CALML3 predicts shorter overall and relapse-free survival in postoperative HCC patients, thus providing a prognostic biomarker and therapeutic target in HCC.
The competitive endogenous RNA (ceRNA) hypothesis suggests an intrinsic mechanism to regulate biological processes. However, whether the dynamic changes of ceRNAs can modulate miRNA activities remains controversial. Here, we examine the dynamics of ceRNAs during TGF-β-induced epithelial-to-mesenchymal transition (EMT). We observe that TGFBI, a transcript highly induced during EMT in A549 cells, acts as the ceRNA for miR-21 to modulate EMT. We further identify FN1 as the ceRNA for miR-200c in the canonical SNAIL-ZEB-miR200 circuit in MCF10A cells. Experimental assays and computational simulations demonstrate that the dynamically induced ceRNAs are directly coupled with the canonical double negative feedback loops and are critical to the induction of EMT. These results help to establish the relevance of ceRNA in cancer EMT and suggest that ceRNA is an intrinsic component of the EMT regulatory circuit and may represent a potential target to disrupt EMT during tumorigenesis.