Microarray technology has become a standard molecular biology tool. Experimental data have been generated on a huge number of organisms, tissue types, treatment conditions and disease states. The Gene Expression Omnibus (Barrett et al., 2005), developed by the National Center for Biotechnology Information (NCBI) at the National Institutes of Health, is a repository of nearly 140 000 gene expression experiments. The BioConductor project (Gentleman et al., 2004) is an open-source and open-development software project built in the R statistical programming environment (R Development Core Team, 2005) for the analysis and comprehension of genomic data. The tools contained in the BioConductor project represent many state-of-the-art methods for the analysis of microarray and genomics data. We have developed a software tool that allows access to the wealth of information within GEO directly from BioConductor, eliminating many of the formatting and parsing problems that have made such analyses labor-intensive in the past. The software, called GEOquery, effectively establishes a bridge between GEO and BioConductor. Easy access to GEO data from BioConductor will likely lead to new analyses of GEO data using novel and rigorous statistical and bioinformatic tools. Facilitating analyses and meta-analyses of microarray data will increase the efficiency with which biologically important conclusions can be drawn from published genomic data. Availability: GEOquery is available as part of the BioConductor project. Contact: sdavis2@mail.nih.gov
Diffuse intrinsic pontine glioma (DIPG) is a fatal childhood cancer. We performed a chemical screen in patient-derived DIPG cultures along with RNA-seq analyses and integrated computational modeling to identify potentially effective therapeutic strategies. The multi-histone deacetylase inhibitor panobinostat demonstrated therapeutic efficacy both in vitro and in DIPG orthotopic xenograft models. Combination testing of panobinostat and the histone demethylase inhibitor GSK-J4 revealed that the two had synergistic effects. Together, these data suggest a promising therapeutic strategy for DIPG.
The Sequence Read Archive (SRA) is the largest public repository of sequencing data from the next generation of sequencing platforms, including Illumina (Genome Analyzer, HiSeq, MiSeq, etc.), Roche 454 GS System, Applied Biosystems SOLiD System, Helicos Heliscope, PacBio RS, and others.
SRAdb is an attempt to make queries of the metadata associated with SRA submission, study, sample, experiment and run more robust and precise, and to make access to sequencing data in the SRA easier. We have parsed all the SRA metadata into a SQLite database that is routinely updated and can be easily distributed. The SRAdb R/Bioconductor package then utilizes this SQLite database for querying and accessing metadata. Full text search functionality makes querying metadata very flexible and powerful. FASTQ files associated with query results can be downloaded easily for local analysis. The package also includes an interface from R to a popular genome browser, the Integrative Genomics Viewer.
The SRAdb Bioconductor package provides a convenient and integrated framework to query and access SRA metadata quickly and powerfully from within R.
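The general idea of keeping sequencing metadata in a SQLite file and searching it by keyword can be sketched in a few lines of Python. This is only an illustration of the technique, not the SRAdb package or its schema: the table names (run_meta, run_fts), the column names, and the demo rows are all hypothetical. The sketch assumes the local SQLite build includes the FTS5 full-text extension and falls back to a plain LIKE match when it does not.

```python
import sqlite3

# Hypothetical, made-up rows; real SRA metadata has many more fields.
DEMO_ROWS = [
    ("DEMO0001", "Breast cancer transcriptome profiling", "ILLUMINA"),
    ("DEMO0002", "Soil metagenome survey", "LS454"),
    ("DEMO0003", "Lung cancer exome sequencing", "ILLUMINA"),
]


def build_demo_db():
    """Create an in-memory metadata table and, if possible, an FTS5 index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE run_meta (run_accession TEXT, study_title TEXT, platform TEXT)"
    )
    conn.executemany("INSERT INTO run_meta VALUES (?, ?, ?)", DEMO_ROWS)
    try:
        # FTS5 is an optional SQLite extension; not every build ships it.
        conn.execute(
            "CREATE VIRTUAL TABLE run_fts USING fts5(run_accession, study_title)"
        )
        conn.execute(
            "INSERT INTO run_fts SELECT run_accession, study_title FROM run_meta"
        )
        has_fts = True
    except sqlite3.OperationalError:
        has_fts = False
    return conn, has_fts


def search_runs(conn, has_fts, keyword):
    """Return run accessions whose metadata matches a single keyword."""
    if has_fts:
        sql = "SELECT run_accession FROM run_fts WHERE run_fts MATCH ? ORDER BY run_accession"
        params = (keyword,)
    else:
        # Fallback: substring match on the title only (not true full-text search).
        sql = "SELECT run_accession FROM run_meta WHERE study_title LIKE ? ORDER BY run_accession"
        params = (f"%{keyword}%",)
    return [row[0] for row in conn.execute(sql, params)]


conn, has_fts = build_demo_db()
print(search_runs(conn, has_fts, "cancer"))  # ['DEMO0001', 'DEMO0003']
```

Keeping the metadata in a single SQLite file is what makes such a resource easy to distribute and update: a client only needs the file and a SQLite driver, and keyword searches run locally without a server.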
Lung cancer is the leading cancer diagnosis worldwide and the number one cause of cancer deaths. Exposure to cigarette smoke, the primary risk factor in lung cancer, reduces epithelial barrier integrity and increases susceptibility to infections. Herein, we hypothesize that somatic mutations together with cigarette smoke generate a dysbiotic microbiota that is associated with lung carcinogenesis. Using lung tissue from 33 controls and 143 cancer cases, we conduct 16S ribosomal RNA (rRNA) bacterial gene sequencing, with RNA-sequencing data from lung cancer cases in The Cancer Genome Atlas serving as the validation cohort.
Overall, we demonstrate a lower alpha diversity in normal lung as compared to non-tumor adjacent or tumor tissue. In squamous cell carcinoma specifically, a separate group of taxa are identified, in which Acidovorax is enriched in smokers. Acidovorax temporans is identified within tumor sections by fluorescent in situ hybridization and confirmed by two separate 16S rRNA strategies. Further, these taxa, including Acidovorax, exhibit higher abundance among the subset of squamous cell carcinoma cases with TP53 mutations, an association not seen in adenocarcinomas.
The results of this comprehensive study show both microbiome-gene and microbiome-exposure interactions in squamous cell carcinoma lung cancer tissue. Specifically, tumors harboring TP53 mutations, which can impair epithelial function, have a unique bacterial consortium that is higher in relative abundance in smoking-associated tumors of this type. Given the significant need for clinical diagnostic tools in lung cancer, this study may provide novel biomarkers for early detection.
The RecQ DNA helicase WRN is a synthetic lethal target for cancer cells with microsatellite instability (MSI), a form of genetic hypermutability that arises from impaired mismatch repair. Depletion of WRN induces widespread DNA double-strand breaks in MSI cells, leading to cell cycle arrest and/or apoptosis. However, the mechanism by which WRN protects MSI-associated cancers from double-strand breaks remains unclear. Here we show that TA-dinucleotide repeats are highly unstable in MSI cells and undergo large-scale expansions, distinct from previously described insertion or deletion mutations of a few nucleotides. Expanded TA repeats form non-B DNA secondary structures that stall replication forks, activate the ATR checkpoint kinase, and require unwinding by the WRN helicase. In the absence of WRN, the expanded TA-dinucleotide repeats are susceptible to cleavage by the MUS81 nuclease, leading to massive chromosome shattering. These findings identify a distinct biomarker that underlies the synthetic lethal dependence on WRN, and support the development of therapeutic agents that target WRN for MSI-associated cancers.
To understand the genetic mechanisms driving variant and IGHV4-34-expressing hairy-cell leukemias, we performed whole-exome sequencing of leukemia samples from ten affected individuals, including six with matched normal samples. We identified activating mutations in the MAP2K1 gene (encoding MEK1) in 5 of these 10 samples and in 10 of 21 samples in a validation set (overall frequency of 15/31), suggesting potential new strategies for treating individuals with these diseases.
CellMiner-SCLC (https://discover.nci.nih.gov/SclcCellMinerCDB/) integrates drug sensitivity and genomic data, including high-resolution methylome and transcriptome from 118 patient-derived small cell lung cancer (SCLC) cell lines, providing a resource for research into this “recalcitrant cancer.” We demonstrate the reproducibility and stability of data from multiple sources and validate the SCLC consensus nomenclature on the basis of expression of master transcription factors NEUROD1, ASCL1, POU2F3, and YAP1. Our analyses reveal transcription networks linking SCLC subtypes with MYC and its paralogs and the NOTCH and HIPPO pathways. SCLC subsets express specific surface markers, providing potential opportunities for antibody-based targeted therapies. YAP1-driven SCLCs are notable for differential expression of the NOTCH pathway, epithelial-mesenchymal transition (EMT), and antigen-presenting machinery (APM) genes and sensitivity to mTOR and AKT inhibitors. These analyses provide insights into SCLC biology and a framework for future investigations into subtype-specific SCLC vulnerabilities.
• SCLC-CellMiner is an extensive cell line genomic and pharmacology resource
• SCLC cell lines show a methylome consistent with their plasticity and lineage
• Transcriptome analyses reveal lineage transcriptional networks and drug predictions
• SCLC-Y cells differ from other subgroups by transcriptome and potential therapeutics
Tlemsani et al. provide a unique resource, SCLC-CellMiner, integrating drug sensitivity and multi-omics data from 118 small cell lung cancer (SCLC) cell lines. They demonstrate that SCLCs have differential transcriptional networks driven by lineage-specific transcription factors (NEUROD1, ASCL1, POU2F3, and YAP1). Furthermore, YAP1-driven SCLCs have distinct drug sensitivity profiles.
Sequence polymorphisms linked to human diseases and phenotypes in genome-wide association studies often affect noncoding regions. A SNP within an intron of the gene encoding Interferon Regulatory Factor 4 (IRF4), a transcription factor with no known role in melanocyte biology, is strongly associated with sensitivity of skin to sun exposure, freckles, blue eyes, and brown hair color. Here, we demonstrate that this SNP lies within an enhancer of IRF4 transcription in melanocytes. The allele associated with this pigmentation phenotype impairs binding of the TFAP2A transcription factor that, together with the melanocyte master regulator MITF, regulates activity of the enhancer. Assays in zebrafish and mice reveal that IRF4 cooperates with MITF to activate expression of Tyrosinase (TYR), an essential enzyme in melanin synthesis. Our findings provide a clear example of a noncoding polymorphism that affects a phenotype by modulating a developmental gene regulatory network.
• Sequence variation in IRF4 is associated with pigmentation features including freckles
• This sequence is located in an enhancer element that affects expression of IRF4
• The transcription factors MITF and TFAP2A regulate expression from this element
• Together, MITF and IRF4 affect regulation of the pigmentation enzyme TYR
A polymorphism in a noncoding region of IRF4, a transcription factor involved in immune signaling, is found to affect human pigmentation. This polymorphism affects the ability of the transcription factors TFAP2A and MITF to regulate IRF4 levels, which in turn results in reduced levels of the pigmentation enzyme Tyrosinase.
We conducted a basket clinical trial to assess the feasibility of such a design strategy and to independently evaluate the effects of multiple targeted agents against specific molecular aberrations in multiple histologic subtypes concurrently.
We enrolled patients with advanced non-small-cell lung cancer (NSCLC), small-cell lung cancer, and thymic malignancies who underwent genomic characterization of oncogenic drivers. Patients were enrolled onto a not-otherwise-specified arm and treated with standard-of-care therapies or one of the following five biomarker-matched treatment groups: erlotinib for EGFR mutations; selumetinib for KRAS, NRAS, HRAS, or BRAF mutations; MK2206 for PIK3CA, AKT, or PTEN mutations; lapatinib for ERBB2 mutations or amplifications; and sunitinib for KIT or PDGFRA mutations or amplification.
Six hundred forty-seven patients were enrolled, and 88% had their tumors tested for at least one gene. EGFR mutation frequency was 22.1% in NSCLC, and erlotinib achieved a response rate of 60% (95% CI, 32.3% to 83.7%). KRAS mutation frequency was 24.9% in NSCLC, and selumetinib failed to achieve its primary end point, with a response rate of 11% (95% CI, 0% to 48%). Completion of accrual to all other arms was not feasible. In NSCLC, patients with EGFR mutations had the longest median survival (3.51 years; 95% CI, 2.89 to 5.5 years), followed by those with ALK rearrangements (2.94 years; 95% CI, 1.66 to 4.61 years), those with KRAS mutations (2.3 years; 95% CI, 2.3 to 2.17 years), those with other genetic abnormalities (2.17 years; 95% CI, 1.3 to 2.74 years), and those without an actionable mutation (1.85 years; 95% CI, 1.61 to 2.13 years).
This basket trial design was not feasible for many of the arms with rare mutations, but it allowed the study of the genetics of less common malignancies.