The availability of increasing volumes of multiomics, imaging, and clinical data in complex diseases such as cancer opens opportunities for the formulation and development of computational imaging genomics methods that can link multiomics, imaging, and clinical data.
Here, we present the Imaging-AMARETTO algorithms and software tools to systematically interrogate regulatory networks derived from multiomics data within and across related patient studies for their relevance to radiography and histopathology imaging features predicting clinical outcomes.
To demonstrate its utility, we applied Imaging-AMARETTO to integrate three patient studies of brain tumors, specifically, multiomics with radiography imaging data from The Cancer Genome Atlas (TCGA) glioblastoma multiforme (GBM) and low-grade glioma (LGG) cohorts and transcriptomics with histopathology imaging data from the Ivy Glioblastoma Atlas Project (IvyGAP) GBM cohort. Our results show that Imaging-AMARETTO recapitulates known key drivers of tumor-associated microglia and macrophage mechanisms, mediated by
,
, and
, and neurodevelopmental and stemness mechanisms, mediated by
. Imaging-AMARETTO provides interpretation of their underlying molecular mechanisms in light of imaging biomarkers of clinical outcomes and uncovers novel master drivers,
and
, that establish relationships across these distinct mechanisms.
Our network-based imaging genomics tools serve as hypothesis generators that facilitate the interrogation of known hypotheses and the uncovering of novel ones for follow-up with experimental validation studies. We anticipate that our Imaging-AMARETTO imaging genomics tools will be useful to the community of biomedical researchers for applications to similar studies of cancer and other complex diseases with available multiomics, imaging, and clinical data.
Current integrative genomics projects are enabling new discoveries in cancer research through the ability to combine multiple modalities of data, e.g. gene expression, copy number, RNAi, exon resequencing, and epigenetics. However, the tools to access and analyze this data have traditionally been out of the reach of clinicians and research biologists, due to the widely distributed nature of the available data and the significant learning curve required to use the analytical tools. This problem is particularly relevant to the multiple myeloma research community because of the heterogeneity of genomic aberrations underlying this disease and the lack of a central repository of multi-modal multiple myeloma data. To address these problems, the Broad Institute has created a pilot Multiple Myeloma Genomics Portal, http://www.broad.mit.edu/mmgp, which serves as an interface between biologists (and clinical investigators), analytical tools, and multiple myeloma datasets. The Portal provides access to a number of advanced gene expression, gene set enrichment, and copy number analyses and visualizations within an easy-to-use Web interface. These analyses can be performed on the datasets hosted on the Portal, which include previously published, curated, high-quality genomic multiple myeloma datasets as well as a new reference collection of multiple myeloma samples. The Portal is continuously updated with new datasets, data types, and analytical capabilities as they become available. The Portal's accessibility allows it to serve as a significant venue for investigators from other fields, engaging a broader range of investigators in exploring these data, and its design is readily adaptable to integrative genomics studies of other cancer types.
LSIDs are providing a standard for distributed data identification and resolution that promotes interoperable and reproducible research in computational biology.
Since the Life Science Identifier (LSID) data identification and access standard made its official debut in late 2004, several organizations have begun to use LSIDs to simplify the methods used to uniquely name, reference and retrieve distributed data objects and concepts. In this review, the authors build on introductory work that describes the LSID standard by documenting how five early adopters have incorporated the standard into their technology infrastructure and by outlining several common misconceptions and difficulties related to LSID use, including the impact of the byte identity requirement for LSID-identified objects and the opacity recommendation for use of the LSID syntax. The review describes several shortcomings of the LSID standard, such as the lack of a specific metadata standard, along with solutions that could be addressed in future revisions of the specification.
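As an illustration of the LSID syntax discussed in the review above, the following is a minimal Python sketch that splits an LSID into its components, assuming the commonly documented URN form `urn:lsid:<authority>:<namespace>:<object>[:<revision>]`. The `LSID` class and `parse_lsid` function are hypothetical names introduced here for illustration and are not part of any LSID library. In keeping with the opacity recommendation, client code would normally treat the full string as an opaque key, so a parser like this is mainly of interest to resolver implementations.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LSID:
    """Components of a Life Science Identifier (hypothetical helper type)."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None  # the revision component is optional

    def __str__(self) -> str:
        # Rebuild the URN form from the stored components.
        parts = ["urn", "lsid", self.authority, self.namespace, self.object_id]
        if self.revision is not None:
            parts.append(self.revision)
        return ":".join(parts)


def parse_lsid(text: str) -> LSID:
    """Split an LSID string into components, raising ValueError if malformed."""
    parts = text.strip().split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"expected 5 or 6 colon-separated fields: {text!r}")
    # The 'urn' and 'lsid' prefixes are matched case-insensitively here.
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID URN: {text!r}")
    authority, namespace, object_id = parts[2], parts[3], parts[4]
    if not (authority and namespace and object_id):
        raise ValueError(f"empty authority, namespace, or object: {text!r}")
    revision = parts[5] if len(parts) == 6 and parts[5] else None
    return LSID(authority, namespace, object_id, revision)


if __name__ == "__main__":
    lsid = parse_lsid("urn:lsid:ncbi.nlm.nih.gov:pubmed:12571434")
    print(lsid.authority, lsid.namespace, lsid.object_id, lsid.revision)
```

Because the byte identity requirement means a given LSID should always resolve to the same bytes, a data provider that changes an object would be expected to issue a new identifier, for example by changing the revision component, rather than reuse the old one.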
GeneCruiser is a web service allowing users to annotate their genomic data by mapping microarray feature identifiers to gene identifiers from databases, such as UniGene, while providing links to web resources, such as the UCSC Genome Browser. It relies on a regularly updated database that retrieves and indexes the mappings between microarray probes and genomic databases. Genes are identified using the Life Science Identifier (LSID) standard. Availability: GeneCruiser is freely available in two forms: as a Web service and Web application at http://www.genecruiser.org, and through our microarray analysis platform, GenePattern (http://www.genepattern.org), into which GeneCruiser access has been integrated. Contact: liefeld@broad.mit.edu
The systematic translation of cancer genomic data into knowledge of tumour biology and therapeutic possibilities remains challenging. Such efforts should be greatly aided by robust preclinical model systems that reflect the genomic diversity of human cancers and for which detailed genetic and pharmacological annotation is available [1]. Here we describe the Cancer Cell Line Encyclopedia (CCLE): a compilation of gene expression, chromosomal copy number and massively parallel sequencing data from 947 human cancer cell lines. When coupled with pharmacological profiles for 24 anticancer drugs across 479 of the cell lines, this collection allowed identification of genetic, lineage, and gene-expression-based predictors of drug sensitivity. In addition to known predictors, we found that plasma cell lineage correlated with sensitivity to IGF1 receptor inhibitors; AHR expression was associated with MEK inhibitor efficacy in NRAS-mutant lines; and SLFN11 expression predicted sensitivity to topoisomerase inhibitors. Together, our results indicate that large, annotated cell-line collections may help to enable preclinical stratification schemata for anticancer agents. The generation of genetic predictions of drug response in the preclinical setting and their incorporation into cancer clinical trial design could speed the emergence of 'personalized' therapeutic regimens [2].
The Multiple Myeloma Research Consortium (MMRC) Genomics Initiative is a three-year program to analyze tumor tissue from hundreds of multiple myeloma (MM) patients via gene expression profiling (GEP), array comparative genomic hybridization (aCGH), and exon re-sequencing. In addition, RNAi knockdown of selected genes in MM tumor cell lines is being evaluated to identify potential new targets. All genomic data generated is scheduled for placement in an open-access Multiple Myeloma Genomics Portal pre-publication and in near real-time (www.broad.mit.edu/mmgp). Additionally, samples are also destined for drug validation and correlative science on clinical protocols as this study moves forward. This comprehensive project is spearheaded by the MMRC and conducted via collaboration with the Eli and Edythe L. Broad Institute of MIT and Harvard, the Translational Genomics Research Institute (TGen), Mayo Clinic Arizona, and The Dana-Farber Cancer Center. The study is supported by the collection from member institutions of the MMRC of bone marrow aspirates and matched peripheral blood samples from over 1000 patients. Specific genomic technologies that are currently being employed across this sample set include GEP using Affymetrix Human Genome U133A 2.0 Plus Arrays, and, in parallel, efforts to identify regions of genomic gain and loss are using Agilent Human Genome CGH arrays. In contrast to other large-scale genomic projects based on exon-sequencing of targeted gene sets, this project will be the first to perform genome-scale single molecule sequencing (SMS) of DNA from patient specimens. Results will be targeted against candidate classes of genes (e.g. kinases, phosphatases, known oncogenes and tumor suppressors), and genes from GEP or within candidate regions of copy gain or loss identified by the aCGH experiments. Mutations will be further validated in an independent set of patient specimens.
Finally, we will attempt to identify points of vulnerability of MM through systematic loss-of-function screens in myeloma cell lines using high-throughput RNA interference (using both shRNA and siRNA platforms). Importantly, data generated from this genomics initiative will ultimately be made public pre-publication through the established MMRC Multiple Myeloma Genomics Portal. Data from all aspects of this project (sample collection and analyte isolation, GEP, aCGH, SMS, RNAi and bioinformatics) will be described in this presentation. The power of this study is the comprehensive collection of gene expression, CGH, and genome sequencing on a single reference set of clinically annotated samples. The addition of RNAi screens makes this a very important and unique data resource, which we hope will help expedite the discovery of novel targeted agents for the MM scientific community.