Microarrays are commonly used in biology because of their ability to simultaneously measure the expression of thousands of genes under different conditions. Because such data typically contain a large number of variables but far fewer samples, scalable network analysis techniques are often employed. In particular, consensus approaches that combine multiple microarray studies have recently been used to find more robust networks. The purpose of this paper, however, is to combine multiple microarray studies to automatically identify subnetworks that are distinctive to specific experimental conditions rather than common to them all. To better understand key regulatory mechanisms and how they change under different conditions, we derive unique networks from multiple independent networks built using glasso, which goes beyond standard correlations. This involves calculating cluster prediction accuracies to detect the most predictive genes for a specific set of conditions. We differentiate between accuracies calculated using cross-validation within a selected cluster of studies (the intra prediction accuracy) and those calculated on a set of independent studies belonging to different study clusters (the inter prediction accuracy). Finally, we compare our method's results to related state-of-the-art techniques. We explore how the proposed pipeline performs on both synthetic data and real data (wheat and Fusarium). Our results show that subnetworks specific to subsets of studies can be identified reliably and that these networks reflect key mechanisms that are fundamental to the experimental conditions in each of those subsets.
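The distinction between intra and inter prediction accuracy can be illustrated with a minimal sketch. It rests on several assumptions: a simple lookup-table predictor stands in for the network-based inference used in the paper, the function names (`fit_table`, `intra_accuracy`, `inter_accuracy`) are hypothetical, and the discretised expression states are made-up toy values rather than real data.

```python
from collections import Counter, defaultdict

def fit_table(samples, target, parents):
    """Learn the most frequent target state for each combination of parent states."""
    counts = defaultdict(Counter)
    for s in samples:
        counts[tuple(s[p] for p in parents)][s[target]] += 1
    # Fallback prediction for parent combinations never seen in training.
    default = Counter(s[target] for s in samples).most_common(1)[0][0]
    table = {key: c.most_common(1)[0][0] for key, c in counts.items()}
    return table, default

def accuracy(model, samples, target, parents):
    """Fraction of samples whose target state is predicted correctly."""
    table, default = model
    hits = sum(
        table.get(tuple(s[p] for p in parents), default) == s[target]
        for s in samples
    )
    return hits / len(samples)

def intra_accuracy(cluster, target, parents, k=3):
    """k-fold cross-validation using only samples from one study cluster."""
    scores = []
    for fold in range(k):
        test = cluster[fold::k]
        train = [s for i, s in enumerate(cluster) if i % k != fold]
        model = fit_table(train, target, parents)
        scores.append(accuracy(model, test, target, parents))
    return sum(scores) / k

def inter_accuracy(cluster, others, target, parents):
    """Train on one study cluster, test on samples from different clusters."""
    model = fit_table(cluster, target, parents)
    return accuracy(model, others, target, parents)

# Toy, made-up states: geneB follows geneA in cluster A and opposes it in cluster B.
cluster_a = [{"geneA": s, "geneB": s} for s in ["lo", "hi"] * 3]
cluster_b = [
    {"geneA": s, "geneB": "hi" if s == "lo" else "lo"} for s in ["lo", "hi"] * 3
]

intra = intra_accuracy(cluster_a, "geneB", ["geneA"])
inter = inter_accuracy(cluster_a, cluster_b, "geneB", ["geneA"])
print(intra, inter)
```

In this toy setting a gene that is predicted well within its own cluster but poorly on the other cluster is the kind of candidate the abstract describes as predictive for a specific set of conditions.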
We provide evidence that RAD51AP1 may be important as a potential biomarker, since it is overexpressed in both tissue and peripheral blood of ovarian and lung cancer patients. Silencing of the gene also decreases cell proliferation in vitro, making it a potential therapeutic target.
Abstract
To date, microarray analyses have led to the discovery of numerous individual 'molecular signatures' associated with specific cancers. However, there are serious limitations to the adoption of these multi-gene signatures in the clinical environment for diagnostic or prognostic testing, because studies with greater statistical power still need to be carried out. This may involve larger, richer cohorts and more advanced analyses. In this study, we conduct analyses based on gene regulatory networks to reveal distinct and common biomarkers across cancer types. Using microarray data of triple-negative and medullary breast, ovarian and lung cancers applied to a combination of glasso and Bayesian networks (BNs), we derived a unique network containing genes that are uniquely involved: small proline-rich protein 1A (SPRR1A), follistatin like 1 (FSTL1), collagen type XII alpha 1 (COL12A1) and RAD51 associated protein 1 (RAD51AP1). RAD51AP1 and FSTL1 are significantly overexpressed in ovarian cancer patients, but only RAD51AP1 is upregulated in lung cancer patients compared with healthy controls. The upregulation of RAD51AP1 was mirrored in the blood of both ovarian and lung cancer patients, and Kaplan-Meier (KM) plots predicted poorer overall survival (OS) in patients with high expression of RAD51AP1. Suppression of RAD51AP1 by RNA interference reduced cell proliferation in vitro in ovarian (SKOV3) and lung (A549) cancer cells. This effect appears to be modulated by a decrease in the expression of mTOR-related genes and pro-metastatic candidate genes. Our data describe how an initial in silico approach can generate novel biomarkers that could potentially support current clinical practice and improve long-term outcomes.
A detailed network describing asparagine metabolism in plants was constructed using published data from Arabidopsis (Arabidopsis thaliana), maize (Zea mays), wheat (Triticum aestivum), pea (Pisum sativum), soybean (Glycine max), lupin (Lupinus albus), and other species, including animals. Asparagine synthesis and degradation are a major part of amino acid and nitrogen metabolism in plants. The complexity of asparagine metabolism, including limiting and regulatory factors, was represented in a logical sequence in a pathway diagram built using the yEd graph editor software. The network was used with a Unique Network Identification Pipeline in the analysis of data from 18 publicly available transcriptomic studies. This identified links between genes involved in asparagine metabolism in wheat roots under drought stress, wheat leaves under drought stress, and wheat leaves under conditions of sulfur and nitrogen deficiency. The network represents a powerful aid for interpreting the interactions not only between the genes in the pathway but also among enzymes, metabolites and smaller molecules. It provides a concise, clear understanding of the complexity of asparagine metabolism that could aid the interpretation of data relating to wider amino acid metabolism and other metabolic processes.
Consensus approaches have been widely used to identify Gene Regulatory Networks (GRNs) that are common to multiple studies. However, in this research we develop an application that semi-automatically identifies key mechanisms that are specific to a particular set of conditions. We analyse four different types of cancer to identify gene pathways unique to each of them. To support the reliability of the results, we calculate the prediction accuracy of each gene for the specified conditions and compare it with predictions on other conditions. The most predictive genes are validated using the GeneCards encyclopaedia coupled with a statistical test for validating clusters. Finally, we implement an interface that allows the user to identify unique subnetworks of any selected combination of studies using AND and NOT logic operators. Results show that unique genes and sub-networks can be reliably identified and that they reflect key mechanisms that are fundamental to the cancer types under study.
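One way to picture the AND and NOT combination of studies is as set operations on edge lists. The following is a minimal sketch under stated assumptions: each study's network is reduced to a set of undirected edges, and the helper names (`edge`, `select_edges`), study labels and gene labels are hypothetical placeholders, not the interface described in the paper.

```python
def edge(a, b):
    """Undirected edge between two genes, independent of argument order."""
    return frozenset((a, b))

def select_edges(networks, include, exclude=()):
    """Edges present in every 'include' study (AND) and in no 'exclude' study (NOT)."""
    if not include:
        raise ValueError("at least one study must be combined with AND")
    # set.intersection returns a new set, so the input networks are not modified.
    selected = set.intersection(*(networks[name] for name in include))
    for name in exclude:
        selected -= networks[name]
    return selected

# Made-up toy networks, for illustration only.
networks = {
    "study_1": {edge("g1", "g2"), edge("g2", "g3"), edge("g3", "g4")},
    "study_2": {edge("g1", "g2"), edge("g2", "g3")},
    "study_3": {edge("g1", "g2"), edge("g4", "g5")},
}

# study_1 AND study_2 NOT study_3
result = select_edges(networks, include=["study_1", "study_2"], exclude=["study_3"])
print(result)
```

Here the edge shared by all three studies is dropped by the NOT clause, leaving only the connection that studies 1 and 2 share and study 3 lacks.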
The survival of any organism is determined by the mechanisms triggered in response to the inputs it receives. The underlying mechanisms are described by graphical networks that can be inferred from different types of data, such as microarrays. Deriving robust and reliable networks is complicated by the structure of microarray data, which is characterized by bias, noise and a discrepancy of several orders of magnitude between the number of genes and the number of samples. Researchers overcome this problem by integrating independent data sets and deriving the common mechanisms through consensus network analysis. Different conditions generate different inputs to the organism, which reacts by triggering mechanisms that share similarities but also differ. A lot of effort has been put into identifying the commonalities under different conditions. However, highlighting similarities may overshadow the differences, which often identify the main characteristics of the triggered mechanisms. In this thesis we introduce the concept of a study-specific mechanism. We develop a pipeline to semi-automatically identify study-specific networks, called unique-networks, through a combination of consensus approaches, graphical similarities and network analysis. The main pipeline, called UNIP (Unique Networks Identification Pipeline), takes a set of independent studies, builds gene regulatory networks for each of them, calculates an adaptation of the sensitivity measure based on the graphical similarities of the networks, applies clustering to group the studies that generate the most similar networks into study-clusters, and derives the consensus networks. Once each study-cluster is associated with a consensus-network, we identify the links that appear only in the consensus network under consideration but not in the others (unique-connections). Using the genes involved in the unique-connections, we build Bayesian networks to derive the unique-networks. 
Finally, we exploit the inference tool to calculate the prediction accuracy of each gene across all studies to further refine the unique-networks. The method is validated biologically using different software tools and the literature. UNIP is first applied to a set of synthetic data perturbed with different levels of noise to study its performance and verify its reliability. Then, wheat under stress conditions and different types of cancer are explored. Finally, we develop a user-friendly interface to combine the set of studies by using AND and NOT logic operators. Based on the findings, UNIP is a robust and reliable method to analyse large sets of transcriptomic data. It detects the main complex relationships between the transcriptional expression of genes specific to different conditions and also highlights structures and nodes that could be potential targets for further research.
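The consensus and unique-connection steps of the pipeline can be sketched as follows. This is an illustration under assumptions, not the UNIP implementation: networks are simplified to sets of undirected edges, the consensus rule (an edge must appear in at least a given fraction of a cluster's studies) and the `min_fraction` parameter are hypothetical, and the cluster names and gene labels are made up.

```python
from collections import Counter

def edge(a, b):
    """Undirected edge between two genes, independent of argument order."""
    return frozenset((a, b))

def consensus_edges(networks, min_fraction=1.0):
    """Edges found in at least min_fraction of the networks of one study-cluster."""
    counts = Counter(e for net in networks for e in net)
    needed = min_fraction * len(networks)
    return {e for e, n in counts.items() if n >= needed}

def unique_connections(consensus_by_cluster):
    """For each cluster, keep edges absent from every other cluster's consensus."""
    unique = {}
    for name, edges in consensus_by_cluster.items():
        others = set().union(
            *(e for other, e in consensus_by_cluster.items() if other != name)
        )
        unique[name] = edges - others
    return unique

# Made-up toy study-clusters, each holding the edge sets of its studies.
clusters = {
    "cluster_1": [
        {edge("A", "B"), edge("B", "C")},
        {edge("A", "B"), edge("B", "C"), edge("C", "D")},
    ],
    "cluster_2": [
        {edge("A", "B"), edge("D", "E")},
        {edge("A", "B"), edge("D", "E")},
    ],
}

consensus = {name: consensus_edges(nets) for name, nets in clusters.items()}
unique = unique_connections(consensus)
print(unique)
```

The genes touched by each cluster's unique-connections would then be the input to the Bayesian network step described above, which this sketch does not attempt to reproduce.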
Emotional contagion, the ability to feel what other individuals feel without necessarily understanding the feeling or knowing its source, is thought to be an important element of social life. In humans, emotional contagion has been shown to be stronger in women than in men. Emotional contagion has also been shown to exist in rodents, and a growing number of studies explore the neural basis of emotional contagion in male rats and mice. Here we explore whether there are sex differences in emotional contagion in rats. We use an established paradigm in which a demonstrator rat receives footshocks while freezing is measured in both the demonstrator and an observer rat. The two rats can hear, smell and see each other. By comparing pairs of male rats with pairs of female rats, we found (i) that female demonstrators froze less when subjected to footshocks, but (ii) that the emotional contagion response, i.e. the degree of influence across the rats, did not depend on the sex of the rats. This was true whether emotional contagion was quantified based on the slope of a regression linking demonstrator and observer average freezing, or on Granger causality estimates of moment-to-moment freezing. The lack of sex differences in emotional contagion is compatible with an interpretation of emotional contagion as serving selfish danger detection.
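The first of the two quantifications, the slope of a regression linking demonstrator and observer average freezing, can be sketched with an ordinary least-squares fit. The sketch assumes a simple linear regression of observer freezing on demonstrator freezing; the function name `ols_fit` is hypothetical and the freezing percentages are made-up values constructed to lie on a line, not data from the study.

```python
def ols_fit(x, y):
    """Ordinary least-squares slope and intercept for y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up average freezing (%) per pair: one value per demonstrator/observer pair.
demonstrator = [10.0, 20.0, 40.0, 60.0, 80.0]
observer = [0.5 * d + 5.0 for d in demonstrator]

slope, intercept = ols_fit(demonstrator, observer)
print(slope, intercept)
```

Under this reading, a steeper slope means observer freezing tracks demonstrator freezing more closely, and comparing the slopes fitted separately to male and female pairs is one way the sex comparison could be framed.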
AlphaFold 2 (AF2) has placed molecular biology in a new era in which we can visualize, analyze and interpret the structures and functions of all proteins solely from their primary sequences. We performed AF2 structure predictions for various protein systems, including globular proteins, a multi-domain protein, an intrinsically disordered protein (IDP), a randomized protein, two larger proteins (> 1000 AA), a heterodimer and a homodimer protein complex. Our results show that, along with the three-dimensional (3D) structures, AF2 also decodes protein sequences into residue flexibilities via both the predicted local distance difference test (pLDDT) scores of the models and the predicted aligned error (PAE) maps. We show that PAE maps from AF2 are correlated with the distance variation (DV) matrices from molecular dynamics (MD) simulations, which reveals that the PAE maps can predict the dynamical nature of protein residues. Here, we introduce the AF2-scores, which are simply derived from pLDDT scores and lie in the range [0, 1]. We found that for most protein models, including large proteins and protein complexes, the AF2-scores are highly correlated with the root mean square fluctuations (RMSF) calculated from MD simulations. However, for an IDP and a randomized protein, the AF2-scores do not correlate with the RMSF from MD, especially for the IDP. Our results indicate that the protein structures predicted by AF2 also convey information about residue flexibility, i.e., protein dynamics.
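The abstract does not give the formula for the AF2-scores, so the following sketch assumes one plausible form: since pLDDT ranges from 0 to 100 and higher values indicate more confident (typically more rigid) residues, the assumed score is 1 - pLDDT/100, which lies in [0, 1] and increases with expected flexibility. The function names are hypothetical and the pLDDT and RMSF values are made-up numbers used only to show the comparison step.

```python
import math

def af2_score(plddt):
    """ASSUMED mapping from a pLDDT value (0-100) to a flexibility score in [0, 1]."""
    return 1.0 - plddt / 100.0

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Made-up per-residue values, for illustration only.
plddt = [95.0, 90.0, 70.0, 50.0, 30.0]
rmsf = [0.5, 0.7, 1.5, 2.5, 3.6]  # hypothetical RMSF values from an MD run

scores = [af2_score(p) for p in plddt]
r = pearson(scores, rmsf)
print(scores, r)
```

With real data, `plddt` would come from the per-residue confidence values of an AF2 model and `rmsf` from an MD trajectory of the same protein, aligned residue by residue.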