Abstract
Together with various hosts and environments, ubiquitous microbes interact closely with each other, forming an intertwined system or community. Notably, shifts in the relationships between microbes and their hosts or environments are associated with critical diseases and ecological changes. While advances in high-throughput omics technologies offer a great opportunity for understanding the structures and functions of the microbiome, it is still challenging to analyse and interpret the omics data. Specifically, the heterogeneity and diversity of microbial communities, compounded by the large size of the datasets, make it extremely challenging to elucidate these complex communities mechanistically. Fortunately, network analyses provide an efficient way to tackle this problem, and several network approaches have recently been proposed to improve this understanding. Here, we systematically illustrate the network theories that have been used in biological and biomedical research. Then, we review existing network modelling methods for microbial studies at multiple layers, from metagenomics to metabolomics and further to multi-omics. Lastly, we discuss the limitations of present studies and provide a perspective on future directions in support of the understanding of microbial communities.
Lung cancer remains the most common cause of cancer deaths worldwide, yet there is currently a lack of noninvasive diagnostic biomarkers that could guide treatment decisions. Small molecules (<1,500 Da) were measured in urine collected from 469 patients with lung cancer and 536 population controls using unbiased liquid chromatography/mass spectrometry. Putative clinical diagnostic and prognostic biomarkers were validated by quantitation and normalized to creatinine levels at two different time points and further confirmed in an independent sample set comprising 80 cases and 78 population controls, with demographic and clinical characteristics similar to those of the training set. Creatine riboside (IUPAC name: 2-{2-(2R,3R,4S,5R)-3,4-dihydroxy-5-(hydroxymethyl)-oxolan-2-yl-1-methylcarbamimidamido}acetic acid), a novel molecule identified in this study, and N-acetylneuraminic acid (NANA) were each significantly (P < 0.00001) elevated in non-small cell lung cancer and associated with worse prognosis (HR = 1.81, P = 0.0002, and HR = 1.54, P = 0.025, respectively). Creatine riboside was the strongest classifier of lung cancer status in all cases and in stage I-II cases, which is important for early detection, and was also associated with worse prognosis in stage I-II lung cancer (HR = 1.71, P = 0.048). All measurements were highly reproducible, with intraclass correlation coefficients ranging from 0.82 to 0.99. Both metabolites were significantly (P < 0.03) enriched in tumor tissue compared with adjacent nontumor tissue (N = 48), thus revealing their direct association with tumor metabolism. Creatine riboside and NANA may be robust urinary clinical metabolomic markers that are elevated in tumor tissue and associated with early lung cancer diagnosis and worse prognosis.
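The creatinine normalization and reproducibility check described in the lung cancer abstract above can be illustrated with a minimal sketch. The abstract does not state which form of the intraclass correlation coefficient was used, so the one-way random-effects ICC(1,1) below is an assumption, as are the function names and the toy values, which are invented and are not study data.

```python
from statistics import mean


def normalize_to_creatinine(metabolite_conc, creatinine_conc):
    """Express a urinary metabolite level relative to urinary creatinine."""
    if creatinine_conc <= 0:
        raise ValueError("creatinine concentration must be positive")
    return metabolite_conc / creatinine_conc


def icc_oneway(measurements):
    """One-way random-effects ICC(1,1).

    `measurements` holds one list per subject, each with the same number
    (k >= 2) of repeated measurements, e.g. two time points.
    """
    n = len(measurements)
    k = len(measurements[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in measurements):
        raise ValueError("need >= 2 subjects, each with the same k >= 2 replicates")
    grand_mean = mean(v for row in measurements for v in row)
    subject_means = [mean(row) for row in measurements]
    # Mean squares from a one-way ANOVA with subject as the grouping factor.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum(
        (v - m) ** 2 for row, m in zip(measurements, subject_means) for v in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Invented toy data: (metabolite, creatinine) pairs at two time points per subject.
raw = [
    [(12.0, 4.0), (13.0, 4.1)],
    [(30.0, 5.0), (29.0, 5.2)],
    [(8.0, 3.9), (9.0, 4.0)],
]
normalized = [[normalize_to_creatinine(m, c) for m, c in subject] for subject in raw]
print(f"ICC(1,1) on toy data: {icc_oneway(normalized):.3f}")
```

Values near 1 indicate that most of the variance lies between subjects rather than between repeated measurements of the same subject.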
Drug repurposing is a strategy for identifying new uses of approved or investigational drugs that are outside the scope of the original medical indication. Even though many repurposed drugs have been found serendipitously in the past, the increasing availability of large volumes of biomedical data has enabled more systematic, data-driven approaches for drug candidate identification. At the National Center for Advancing Translational Sciences (NCATS), we develop new methods and make new data and information publicly available to spur innovation and scientific discovery. In this study, we aimed to explore and demonstrate the use of biomedical data generated and collected via two NCATS research programs, the Toxicology in the 21st Century program (Tox21) and the Biomedical Data Translator (Translator), for drug repurposing. These two programs provide complementary types of biomedical data: bioassay screening data from Tox21 uncover underlying biological mechanisms and support chemical clustering, and scientific evidence mined from the Translator enriches the clustered chemicals for drug repurposing. In total, 129 chemical clusters were generated, and three of them were further investigated for drug repurposing candidate identification, as detailed in case studies.
Glioblastoma (GBM) is the most aggressive and common malignant primary brain tumor; however, treatment remains a significant challenge. This study aims to identify drug repurposing or repositioning candidates for GBM by developing an integrative rare disease profile network containing heterogeneous types of biomedical data.
We developed a Glioblastoma-based Biomedical Profile Network (GBPN) by extracting and integrating biomedical information pertinent to GBM-related diseases from the NCATS GARD Knowledge Graph (NGKG). We further clustered the GBPN based on modularity classes which resulted in multiple focused subgraphs, named mc_GBPN. We then identified high-influence nodes by performing network analysis over the mc_GBPN and validated those nodes that could be potential drug repurposing or repositioning candidates for GBM.
We developed the GBPN with 1,466 nodes and 107,423 edges and, consequently, the mc_GBPN with forty-one modularity classes. The ten most influential nodes were identified from the mc_GBPN. These notably include Riluzole, stem cell therapy, cannabidiol, and VK-0214, with proven evidence for treating GBM.
Our GBM-targeted network analysis allowed us to effectively identify potential candidates for drug repurposing or repositioning. Further validation will be conducted using other types of biomedical and clinical data and biological experiments. The findings could lead to less invasive treatments for glioblastoma while significantly reducing research costs by shortening the drug development timeline. Furthermore, this workflow can be extended to other disease areas.
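The two network steps in the glioblastoma abstract above, scoring a partition into modularity classes and then ranking nodes within each class, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which influence measure was used, so node degree stands in for it here, and the graph, node names, and class labels are invented placeholders rather than GBPN content.

```python
from collections import defaultdict


def modularity(edges, module_of):
    """Newman modularity Q of an undirected, unweighted graph for a partition."""
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    intra_edges = defaultdict(int)  # edges with both endpoints in the module
    degree_sum = defaultdict(int)   # summed degree of the module's nodes
    for u, v in edges:
        degree_sum[module_of[u]] += 1
        degree_sum[module_of[v]] += 1
        if module_of[u] == module_of[v]:
            intra_edges[module_of[u]] += 1
    return sum(
        intra_edges[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum
    )


def top_nodes_per_module(edges, module_of, k=1):
    """Rank nodes inside each module by degree (a stand-in influence score)."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    members = defaultdict(list)
    for node, module in module_of.items():
        members[module].append(node)
    # Ties are broken alphabetically so the ranking is deterministic.
    return {
        module: sorted(nodes, key=lambda n: (-degree[n], n))[:k]
        for module, nodes in members.items()
    }


# Invented toy graph: two triangles joined by a single bridge edge (c-d).
EDGES = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
MODULE_OF = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}

print("Q =", round(modularity(EDGES, MODULE_OF), 4))
print("top nodes:", top_nodes_per_module(EDGES, MODULE_OF))
```

In practice a graph library would detect the modularity classes itself; the partition is supplied by hand here to keep the sketch dependency-free.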
As researchers are increasingly able to collect data on a large scale from multiple clinical and omics modalities, multi-omics integration is becoming a critical component of metabolomics research. This introduces a need for increased understanding by the metabolomics researcher of computational and statistical analysis methods relevant to multi-omics studies. In this review, we discuss common types of analyses performed in multi-omics studies and the computational and statistical methods that can be used for each type of analysis. We pinpoint the caveats and considerations for analysis methods, including required parameters, sample size and data distribution requirements, sources of a priori knowledge, and techniques for the evaluation of model accuracy. Finally, for the types of analyses discussed, we provide examples of the applications of corresponding methods to clinical and basic research. We intend this review to serve as a guide for metabolomics researchers in choosing effective techniques for multi-omics analyses relevant to their field of study.
Assigning chromatin states genome-wide (e.g. promoters, enhancers, etc.) is commonly performed to improve functional interpretation of these states. However, computational methods to assign chromatin state suffer from the following drawbacks: they typically require data from multiple assays, which may not be practically feasible to obtain, and they depend on peak calling algorithms, which require careful parameterization and often exclude the majority of the genome. To address these drawbacks, we propose a novel learning technique built upon the Self-Organizing Map (SOM), called Self-Organizing Map with Variable Neighborhoods (SOM-VN), which learns a set of representative shapes from a single, genome-wide chromatin accessibility dataset and associates each shape with a chromatin state assignment in which a particular regulatory element (RE) is prevalent. These shapes can then be used to assign chromatin state using our workflow.
We validate the performance of the SOM-VN workflow on 14 different samples of varying quality, namely one assay each of A549 and GM12878 cell lines and two each of H1 and HeLa cell lines, primary B-cells, and brain, heart, and stomach tissue. We show that SOM-VN learns shapes that are (1) non-random, (2) associated with known chromatin states, (3) generalizable across sets of chromosomes, and (4) associated with magnitude and multimodality. We compare the accuracy of SOM-VN chromatin states against the Clustering Aggregation Tool (CAGT), an unsupervised method that learns chromatin accessibility signal shapes but does not associate these shapes with REs, and we show that overall precision and recall are increased when learning shapes using SOM-VN as compared to CAGT. We further compare enhancer state assignments from SOM-VN in signals above a set threshold to enhancer state assignments from Predicting Enhancers from ATAC-seq Data (PEAS), a deep learning method that assigns enhancer chromatin states to peaks. We show that the precision-recall area under the curve for the assignment of enhancer states is comparable to PEAS.
Our work shows that the SOM-VN workflow can learn relationships between REs and chromatin accessibility signal shape, which is an important step toward the goal of assigning and comparing enhancer state across multiple experiments and phenotypic states.
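The precision and recall comparison reported in the SOM-VN abstract above rests on a simple per-state calculation, sketched below. The region labels are invented for illustration, and the function name is hypothetical; the sketch does not reproduce the SOM-VN, CAGT, or PEAS evaluation pipelines.

```python
def precision_recall(predicted, truth, positive):
    """Precision and recall for one chromatin state, treated as the positive class."""
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must have the same length")
    pairs = list(zip(predicted, truth))
    tp = sum(p == positive and t == positive for p, t in pairs)
    fp = sum(p == positive and t != positive for p, t in pairs)
    fn = sum(p != positive and t == positive for p, t in pairs)
    # By convention here, an undefined ratio (no predictions or no positives) is 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Invented labels for five genomic regions.
predicted = ["enhancer", "promoter", "enhancer", "other", "enhancer"]
truth = ["enhancer", "enhancer", "promoter", "other", "enhancer"]

p, r = precision_recall(predicted, truth, positive="enhancer")
print(f"enhancer precision={p:.2f} recall={r:.2f}")
```

Sweeping a score threshold and recomputing these two values at each step traces the precision-recall curve whose area is compared against PEAS.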
The computational metabolomics field brings together computer scientists, bioinformaticians, chemists, clinicians, and biologists to maximize the impact of metabolomics across a wide array of scientific and medical disciplines. The field continues to expand as modern instrumentation produces datasets with increasing complexity, resolution, and sensitivity. These datasets must be processed, annotated, modeled, and interpreted to enable biological insight. Techniques for visualization, integration (within or between omics), and interpretation of metabolomics data have evolved along with innovation in the databases and knowledge resources required to aid understanding. In this review, we highlight recent advances in the field and reflect on opportunities and innovations in response to the most pressing challenges. This review was compiled from discussions from the 2022 Dagstuhl seminar entitled “Computational Metabolomics: From Spectra to Knowledge”.
• Machine/deep learning enhances information retrieval from complex metabolomics data.
• Increasing diversity and resolution of data offer exciting computational challenges.
• Adoption of standardized methods and nomenclature is critical for interpretation.
• Cross-disciplinary communication and collaboration is fundamental.
• There is an acute need for well characterized datasets to support benchmarking.
Electronic cigarette (e-cig) use continues to increase, particularly among youth never-smokers, and e-cigs are used by some smokers to quit. The acute and chronic toxicity of e-cig use is generally unclear, in the context of increasing reports of inflammatory-type pneumonia in some e-cig users. To assess lung effects of e-cigs without nicotine or flavors, we conducted a pilot study with serial bronchoscopies over 4 weeks in 30 never-smokers, randomized either to a 4-week intervention with the use of e-cigs containing only 50% propylene glycol (PG) and 50% vegetable glycerine or to a no-use control group. Compliance with the e-cig intervention was assessed by participants sending daily puff counts and by urinary PG. Inflammatory cell counts and cytokines were determined in bronchoalveolar lavage (BAL) fluids. Genome-wide miRNA and mRNA expression was determined in bronchial epithelial cells. There were no significant differences in changes of BAL inflammatory cell counts or cytokines between baseline and follow-up, comparing the control and e-cig groups. However, in the intervention but not the control group, change in urinary PG as a marker of e-cig use and inhalation was significantly correlated with change in cell counts (cell concentrations, macrophages, and lymphocytes) and cytokines (IL8, IL13, and TNFα), although the absolute magnitude of changes was small. There were no significant changes in mRNA or miRNA gene expression. Although limited by study size and duration, this is the first experimental demonstration of an impact of e-cig use on inflammation in the human lung among never-smokers.
In prostate cancer (PCa), and many other hormone-dependent cancers, there is clear evidence for distorted transcriptional control as a disease driver mechanism. Defining which transcription factors (TFs) and coregulators are altered and combine to become oncogenic drivers remains a challenge, in part because of the multitude of TFs and coregulators and the diverse genomic space on which they function. The current study was undertaken to identify which TFs and coregulators are commonly altered in PCa. We generated unique lists of TFs (n = 2662), coactivators (COA; n = 766), corepressors (COR; n = 599), and mixed function coregulators (MIXED; n = 511). To address the challenge of defining how these genes are altered, we tested how expression, copy number alterations, and mutation status varied across seven PCa cohorts (three of localized and four of advanced disease). Testing of significant changes was undertaken by bootstrapping approaches, and the most significant changes were identified. For one commonly and significantly altered gene, we stably knocked down expression and undertook cell biology experiments and RNA-Seq to identify differentially altered gene networks and their association with PCa progression risks. COAs, CORs, MIXED, and TFs all displayed significantly down-regulated expression (q.value < 0.1), which correlated with protein expression (r = 0.4-0.55). In localized PCa, stringent expression filtering identified commonly altered TF and coregulator genes, including well-established (e.g. ERG) and underexplored (e.g. PPARGC1A, which encodes PGC1α) genes. Reduced PPARGC1A expression was significantly associated with worse disease-free survival in two cohorts of localized PCa. Stable PGC1α knockdown in LNCaP cells increased growth rates and invasiveness, and RNA-Seq revealed a profound basal impact on gene expression (~ 2300 genes; FDR < 0.05, logFC > 1.5), but only a modest impact on PPARγ responses.
GSEA analyses of the PGC1α transcriptome revealed that it significantly altered the AR-dependent transcriptome and was enriched for epigenetic modifiers. PGC1α-dependent genes overlapped with PGC1α ChIP-Seq genes and were significantly associated in TCGA with higher-grade tumors and worse disease-free survival. These methods and data demonstrate an approach to identify cancer-driver coregulators and show that PGC1α is a clinically significant yet underexplored coregulator in aggressive early-stage PCa.
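The prostate cancer abstract above states that significance was tested by bootstrapping but gives no further detail. One plausible reading, sketched here purely as an assumption, is to compare the mean expression change of a gene class (for example, coactivators) against gene sets of the same size resampled with replacement from a background. The function name, the one-sided test direction, and all values are hypothetical and invented for illustration.

```python
import random
from statistics import mean


def bootstrap_set_pvalue(set_values, background_values, n_boot=2000, seed=0):
    """One-sided empirical p-value that a gene set's mean is unusually low.

    Gene sets of the same size are resampled with replacement from the
    background; the p-value is the add-one-smoothed fraction of resampled
    means that are at or below the observed mean.
    """
    if not set_values or not background_values:
        raise ValueError("both value lists must be non-empty")
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    observed = mean(set_values)
    size = len(set_values)
    at_or_below = 0
    for _ in range(n_boot):
        resampled = rng.choices(background_values, k=size)
        if mean(resampled) <= observed:
            at_or_below += 1
    return (at_or_below + 1) / (n_boot + 1)


# Invented log fold-change values, not taken from any cohort.
background = [0.0, 0.1, -0.1, 0.2, -0.2, 0.05, -0.05, 0.15]
coregulator_set = [-1.0, -1.2, -0.9]

print("empirical p =", bootstrap_set_pvalue(coregulator_set, background))
```

Because many gene classes and cohorts are tested, p-values obtained this way would still need multiple-testing correction, which is consistent with the q-value threshold the abstract reports.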