Visual inspection of histopathology slides is one of the main methods used by pathologists to assess the stage, type and subtype of lung tumors. Adenocarcinoma (LUAD) and squamous cell carcinoma (LUSC) are the most prevalent subtypes of lung cancer, and their distinction requires visual inspection by an experienced pathologist. In this study, we trained a deep convolutional neural network (Inception v3) on whole-slide images obtained from The Cancer Genome Atlas to accurately and automatically classify them into LUAD, LUSC or normal lung tissue. The performance of our method is comparable to that of pathologists, with an average area under the curve (AUC) of 0.97. Our model was validated on independent datasets of frozen tissues, formalin-fixed paraffin-embedded tissues and biopsies. Furthermore, we trained the network to predict the ten most commonly mutated genes in LUAD. We found that six of them (STK11, EGFR, FAT1, SETBP1, KRAS and TP53) can be predicted from pathology images, with AUCs from 0.733 to 0.856 as measured on a held-out population. These findings suggest that deep-learning models can assist pathologists in the detection of cancer subtype or gene mutations. Our approach can be applied to any cancer type, and the code is available at https://github.com/ncoudray/DeepPATH.
Somatic mutations have been extensively characterized in breast cancer, but the effects of these genetic alterations on the proteomic landscape remain poorly understood. Here we describe quantitative mass-spectrometry-based proteomic and phosphoproteomic analyses of 105 genomically annotated breast cancers, of which 77 provided high-quality data. Integrated analyses provided insights into the somatic cancer genome including the consequences of chromosomal loss, such as the 5q deletion characteristic of basal-like breast cancer. Interrogation of the 5q trans-effects against the Library of Integrated Network-based Cellular Signatures connected loss of CETN3 and SKP1 to elevated expression of epidermal growth factor receptor (EGFR), and SKP1 loss also to increased SRC tyrosine kinase. Global proteomic data confirmed a stromal-enriched group of proteins in addition to basal and luminal clusters, and pathway analysis of the phosphoproteome identified a G-protein-coupled receptor cluster that was not readily identified at the mRNA level. In addition to ERBB2, other amplicon-associated highly phosphorylated kinases were identified, including CDK12, PAK1, PTK2, RIPK2 and TLK2. We demonstrate that proteogenomic analysis of breast cancer elucidates the functional consequences of somatic mutations, narrows candidate nominations for driver genes within large deletions and amplified regions, and identifies therapeutic targets.
Integration of the CPTAC mass spectrometry-based proteomics data into the cBioPortal, consisting of 77 breast, 95 colorectal, and 174 ovarian tumors that already have been profiled by TCGA for mutations, copy number alterations, gene expression, and DNA methylation.
Highlights
•Support for mass spectrometry-based proteomics in cBioPortal.
•User-friendly web interface, a web API, and an R client to query proteogenomic data.
•Integration of Clinical Proteomic Tumor Analysis Consortium data with cBioPortal.
The Clinical Proteomic Tumor Analysis Consortium (CPTAC) has produced extensive mass spectrometry-based proteomics data for selected breast, colon, and ovarian tumors from The Cancer Genome Atlas (TCGA). We have incorporated the CPTAC proteomics data into the cBioPortal to support easy exploration and integrative analysis of these proteomic datasets in the context of the clinical and genomics data from the same tumors. cBioPortal is an open source platform for exploring, visualizing, and analyzing multidimensional cancer genomics and clinical data. The public instance of the cBioPortal (http://cbioportal.org/) hosts more than 200 cancer genomics studies, including all of the data from TCGA. Its biologist-friendly interface provides many rich analysis features, including a graphical summary of gene-level data across multiple platforms, correlation analysis between genes or other data types, survival analysis, and per-patient data visualization. Here, we present the integration of the CPTAC mass spectrometry-based proteomics data into the cBioPortal, consisting of 77 breast, 95 colorectal, and 174 ovarian tumors that already have been profiled by TCGA for mutations, copy number alterations, gene expression, and DNA methylation. As a result, the CPTAC data can now be easily explored and analyzed in the cBioPortal in the context of clinical and genomics data. By integrating CPTAC data into cBioPortal, limitations of TCGA proteomics array data can be overcome while also providing a user-friendly web interface, a web API, and an R client to query the mass spectrometry data together with genomic, epigenomic, and clinical data.
The genomes of virtually all organisms contain repetitive sequences that are generated by the activity of transposable elements (transposons). Transposons are mobile genetic elements that can move from one genomic location to another; in this process, they amplify and increase their presence in genomes, sometimes to very high copy numbers. In this Review we discuss new evidence and ideas that the activity of retrotransposons, a major subgroup of transposons overall, influences and even promotes the process of ageing and age-related diseases in complex metazoan organisms, including humans. Retrotransposons have been coevolving with their host genomes since the dawn of life. This relationship has been largely competitive, and transposons have earned epithets such as 'junk DNA' and 'molecular parasites'. Much of our knowledge of the evolution of retrotransposons reflects their activity in the germline and is evident from genome sequence data. Recent research has provided a wealth of information on the activity of retrotransposons in somatic tissues during an individual lifespan, the molecular mechanisms that underlie this activity, and the manner in which these processes intersect with our own physiology, health and well-being.
We describe the design, rapid assembly, and characterization of synthetic yeast Sc2.0 chromosome VI (synVI). A mitochondrial defect in the synVI strain mapped to synonymous coding changes within PRE4, encoding an essential proteasome subunit; Sc2.0 coding changes reduced Pre4 protein accumulation by half. Completing Sc2.0 specifies consolidation of 16 synthetic chromosomes into a single strain. We investigated phenotypic, transcriptional, and proteome-wide consequences of Sc2.0 chromosome consolidation in poly-synthetic strains. Another "bug" was discovered through proteomic analysis, associated with alteration of the transcription start due to transfer RNA deletion and loxPsym site insertion. Despite extensive genetic alterations across 6% of the genome, no major global changes were detected in the poly-synthetic strain "omics" analyses. This work sets the stage for completion of a designer, synthetic eukaryotic genome.
Abdominal aortic aneurysms (AAA) are characterized by extensive extracellular matrix (ECM) fragmentation and inflammation. However, the mechanisms by which these events are coupled, thereby fueling focal vascular damage, are undefined. Here we report through single-cell RNA-sequencing of diseased aorta that the neuronal guidance cue netrin-1 can act at the interface of macrophage-driven injury and ECM degradation. Netrin-1 expression peaks in human and murine aneurysmal macrophages. Targeted deletion of netrin-1 in macrophages protects mice from developing AAA. Through its receptor neogenin-1, netrin-1 induces a robust intracellular calcium flux necessary for the transcriptional regulation and persistent catalytic activation of matrix metalloproteinase-3 (MMP3) by vascular smooth muscle cells. Deficiency in MMP3 reduces ECM damage and the susceptibility of mice to develop AAA. Here, we establish netrin-1 as a major signal that mediates the dynamic crosstalk between inflammation and chronic erosion of the ECM in AAA.
Investigating how chromatin organization determines cell-type-specific gene expression remains challenging. Experimental methods for measuring three-dimensional chromatin organization, such as Hi-C, are costly and have technical limitations, restricting their broad application, particularly in high-throughput genetic perturbations. We present C.Origami, a multimodal deep neural network that performs de novo prediction of cell-type-specific chromatin organization using DNA sequence and two cell-type-specific genomic features: CTCF binding and chromatin accessibility. C.Origami enables in silico experiments to examine the impact of genetic changes on chromatin interactions. We further developed an in silico genetic screening approach to assess how individual DNA elements may contribute to chromatin organization and to identify putative cell-type-specific trans-acting regulators that collectively determine chromatin architecture. Applying this approach to leukemia cells and normal T cells, we demonstrate that cell-type-specific in silico genetic screening, enabled by C.Origami, can be used to systematically discover novel chromatin regulation circuits in both normal and disease-related biological systems.
Although DNA replication is a fundamental aspect of biology, it is not known what determines where DNA replication starts and stops in the human genome. We directly identified and quantitatively compared sites of replication initiation and termination in untransformed human cells. We found that replication preferentially initiates at the transcription start site of genes occupied by high levels of RNA polymerase II, and terminates at their polyadenylation sites, thereby ensuring global co-directionality of transcription and replication, particularly at gene 5' ends. During replication stress, replication initiation is stimulated downstream of genes and termination is redistributed to gene bodies; this globally reorients replication relative to transcription around gene 3' ends. These data suggest that replication initiation and termination are coupled to transcription in human cells, and propose a model for the impact of replication stress on genome integrity.
Association of aberrant glycosylation with melanoma progression is based mainly on analyses of cell lines. Here we present a systems-based study of glycomic changes and corresponding enzymes associated with melanoma metastasis in patient samples. Upregulation of core fucosylation (FUT8) and downregulation of α-1,2 fucosylation (FUT1, FUT2) were identified as features of metastatic melanoma. Using both in vitro and in vivo studies, we demonstrate that FUT8 is a driver of melanoma metastasis which, when silenced, suppresses invasion and tumor dissemination. Glycoprotein targets of FUT8 were enriched in cell migration proteins including the adhesion molecule L1CAM. Core fucosylation impacted L1CAM cleavage and the ability of L1CAM to support melanoma invasion. FUT8 and its targets represent therapeutic targets in melanoma metastasis.
•Primary and metastatic melanoma display distinct core and α-1,2 fucosylation
•FUT8 promotes melanoma metastasis
•FUT8 is transcriptionally controlled by TGIF2
•FUT8-mediated core fucosylation alters L1CAM proteolytic cleavage and cell invasion
Using a systems-based approach to assess glycosylation in matched primary and metastatic melanoma samples, Agrawal et al. find increased core fucosylation mediated by FUT8 in metastatic melanoma. FUT8 facilitates invasion and tumor dissemination, in part due to reduced cleavage of core-fucosylated L1CAM.
Mycobacterium tuberculosis (Mtb) encodes five type VII secretion systems (T7SS), designated ESX-1–ESX-5, that are critical for growth and pathogenesis. The best characterized is ESX-1, which profoundly impacts host cell interactions. In contrast, the ESX-3 T7SS is implicated in metal homeostasis, but efforts to define its function have been limited by an inability to recover deletion mutants. We overcame this impediment using medium supplemented with various iron complexes to recover mutants with deletions encompassing select genes within esx-3 or the entire operon. The esx-3 mutants were defective in uptake of siderophore-bound iron and dramatically accumulated cell-associated mycobactin siderophores. Proteomic analyses of culture filtrate revealed that secretion of EsxG and EsxH was codependent and that EsxG–EsxH also facilitated secretion of several members of the proline-glutamic acid (PE) and proline-proline-glutamic acid (PPE) protein families (named for conserved PE and PPE N-terminal motifs). Substrates that depended on EsxG–EsxH for secretion included PE5, encoded within the esx-3 locus, and the evolutionarily related PE15–PPE20 encoded outside the esx-3 locus. In vivo characterization of the mutants unexpectedly showed that the ESX-3 secretion system plays both iron-dependent and -independent roles in Mtb pathogenesis. PE5–PPE4 was found to be critical for the siderophore-mediated iron-acquisition functions of ESX-3. The importance of this iron-acquisition function was dependent upon host genotype, suggesting a role for ESX-3 secretion in counteracting host defense mechanisms that restrict iron availability. Further, we demonstrate that the ESX-3 T7SS secretes certain effectors that are important for iron uptake while additional secreted effectors modulate virulence in an iron-independent fashion.