Understanding and interpreting classification decisions of automated image classification systems is of high value in many applications, as it allows the reasoning of the system to be verified and provides additional information to the human expert. Although machine learning methods solve a plethora of tasks very successfully, in most cases they have the disadvantage of acting as a black box, providing no information about what made them arrive at a particular decision. This work proposes a general solution to the problem of understanding classification decisions by pixel-wise decomposition of nonlinear classifiers. We introduce a methodology that allows the contributions of single pixels to predictions to be visualized for kernel-based classifiers over Bag of Words features and for multilayered neural networks. These pixel contributions can be visualized as heatmaps and are provided to a human expert, who can not only intuitively verify the validity of the classification decision but also focus further analysis on regions of potential interest. We evaluate our method for classifiers trained on PASCAL VOC 2009 images, synthetic image data containing geometric shapes, the MNIST handwritten digits data set, and for the pre-trained ImageNet model available as part of the Caffe open source package.
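To make the idea of a pixel-wise decomposition concrete, here is a minimal sketch for a tiny two-layer ReLU network without biases. The abstract does not state the redistribution rule, so the proportional rule used here (each unit passes its relevance back to its inputs in proportion to their contributions, with a small stabiliser `eps`) is an assumption, and all function names, weights, and pixel values are hypothetical.

```python
def forward(pixels, hidden_weights, output_weights):
    """Run a two-layer ReLU network without biases; return hidden activations and the score."""
    hidden = [max(0.0, sum(w * p for w, p in zip(row, pixels))) for row in hidden_weights]
    score = sum(w * h for w, h in zip(output_weights, hidden))
    return hidden, score


def redistribute(activations, weight_rows, relevances, eps=1e-9):
    """Pass the relevance of each upper-layer unit back to its inputs,
    proportionally to the contribution w * a of every input (assumed rule)."""
    incoming = [0.0] * len(activations)
    for row, relevance in zip(weight_rows, relevances):
        contributions = [w * a for w, a in zip(row, activations)]
        total = sum(contributions)
        total += eps if total >= 0 else -eps  # stabiliser avoids division by zero
        for i, c in enumerate(contributions):
            incoming[i] += relevance * c / total
    return incoming


def pixel_relevance(pixels, hidden_weights, output_weights):
    """Decompose the classifier score into one relevance value per input pixel."""
    hidden, score = forward(pixels, hidden_weights, output_weights)
    hidden_relevance = redistribute(hidden, [output_weights], [score])
    return redistribute(pixels, hidden_weights, hidden_relevance), score


# Hypothetical 4-pixel input and weights, chosen only for illustration.
pixels = [1.0, 0.5, -0.2, 0.8]
hidden_weights = [[0.5, -0.3, 0.8, 0.1], [-0.6, 0.9, 0.2, 0.4], [0.3, 0.3, -0.5, -0.7]]
output_weights = [1.0, -0.5, 0.7]

relevance, score = pixel_relevance(pixels, hidden_weights, output_weights)
print("score:", score)
print("per-pixel relevance:", relevance)
print("sum of relevances:", sum(relevance))
```

In this sketch the per-pixel relevances sum to the classifier score up to the stabiliser, which is the property that lets them be rendered as a heatmap over the input.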
Recent developments in immuno-oncology demonstrate that not only cancer cells, but also the tumor microenvironment can guide precision medicine. A comprehensive and in-depth characterization of the tumor microenvironment is challenging since its cell populations are diverse and can be important even if scarce. To identify clinically relevant microenvironmental and cancer features, we applied single-cell RNA sequencing to ten human lung adenocarcinomas and ten normal control tissues. Our analyses revealed heterogeneous carcinoma cell transcriptomes reflecting histological grade and oncogenic pathway activities, and two distinct microenvironmental patterns. The immune-activated CP²E microenvironment was composed of cancer-associated myofibroblasts, proinflammatory monocyte-derived macrophages, plasmacytoid dendritic cells and exhausted CD8+ T cells, and was prognostically unfavorable. In contrast, the inert N³MC microenvironment was characterized by normal-like myofibroblasts, non-inflammatory monocyte-derived macrophages, NK cells, myeloid dendritic cells and conventional T cells, and was associated with a favorable prognosis. Microenvironmental marker genes and signatures identified in single-cell profiles had prognostic value in bulk tumor profiles. In summary, single-cell RNA profiling of lung adenocarcinoma provides additional prognostic information based on the microenvironment, and may help to predict therapy response and to reveal possible target cell populations for future therapeutic approaches.
Formalin-fixed paraffin-embedded (FFPE) tissues are a valuable resource for retrospective clinical studies. Here, we evaluate the feasibility of (phospho-)proteomics on FFPE lung tissue regarding protein extraction, quantification, pre-analytics, and sample size. After comparing protein extraction protocols, we use the best-performing protocol for the acquisition of deep (phospho-)proteomes from lung squamous cell and adenocarcinoma with >8,000 quantified proteins and >14,000 phosphosites with a tandem mass tag (TMT) approach. With a microscaled approach, we quantify 7,000 phosphosites, enabling the analysis of FFPE biopsies with limited tissue amounts. We also investigate the influence of pre-analytical variables including fixation time and heat-assisted de-crosslinking on protein extraction efficiency and proteome coverage. Our improved workflows provide quantitative information on protein abundance and phosphosite regulation for the most relevant oncogenes, tumor suppressors, and signaling pathways in lung cancer. Finally, we present general guidelines on which methods are best suited for different applications, highlighting TMT methods for comprehensive (phospho-)proteome profiling in focused clinical studies and label-free methods for large cohorts.
Deep learning has recently gained popularity in digital pathology due to its high prediction quality. However, the medical domain requires explanation and insight for a better understanding beyond standard quantitative performance evaluation. Recently, many explanation methods have emerged. This work shows how heatmaps generated by these explanation methods help resolve common challenges encountered in deep learning-based digital histopathology analyses. We elaborate on biases which are typically inherent in histopathological image data. In the binary classification task of tumour tissue discrimination in publicly available haematoxylin-eosin-stained images of various tumour entities, we investigate three types of biases: (1) biases which affect the entire dataset, (2) biases which are by chance correlated with class labels and (3) sampling biases. While standard analyses focus on patch-level evaluation, we advocate pixel-wise heatmaps, which offer a more precise and versatile diagnostic instrument. This insight is shown to be helpful not only to detect but also to remove the effects of common hidden biases, which improves generalisation within and across datasets. For example, we could see a trend of improved area under the receiver operating characteristic (ROC) curve by 5% when reducing a labelling bias. Explanation techniques are thus demonstrated to be a helpful and highly relevant tool for the development and the deployment phases within the life cycle of real-world applications in digital pathology.
Genetic investigation of tumor heterogeneity and clonal evolution in solid cancers could be assisted by the analysis of liquid biopsies. However, tumors of various entities might release different quantities of circulating tumor cells (CTCs) and cell-free DNA (cfDNA) into the bloodstream, potentially limiting the diagnostic potential of liquid biopsy in distinct tumor histologies. Patients with advanced colorectal cancer (CRC), head and neck squamous cell carcinoma (HNSCC), and melanoma (MEL) were enrolled in the study, representing tumors with different metastatic patterns. Mutation profiles of cfDNA, CTCs, and tumor tissue were assessed by panel sequencing, targeting 327 cancer-related genes. In total, 30 tissue, 18 cfDNA, and 7 CTC samples from 18 patients were sequenced. The best concordance between the mutation profiles of tissue and cfDNA was achieved in CRC and MEL (63% and 55%, respectively), whereas concordance in HNSCC was low (11%), possibly due to the remarkable heterogeneity of HNSCC. Concordance especially depended on the amount of cfDNA used for library preparation. While 21 of 27 (78%) tissue mutations were retrieved in high-input cfDNA samples (30-100 ng, N = 8), only 4 of 65 (6%) could be detected in low-input samples (<30 ng, N = 10). CTCs were detected in 13 of 18 patients (72%). However, downstream analysis was limited by poor DNA quality, allowing targeted sequencing of only seven CTC samples isolated from four patients. Only one CTC sample reflected the mutation profile of the respective tumor. Private mutations, which were detected in CTCs but not in tissue, suggested the presence of rare subclones. Our pilot study demonstrated the superiority of cfDNA-based over CTC-based mutation profiling. It was further shown that CTCs may serve as an additional means to detect rare subclones possibly involved in treatment resistance. Both findings require validation in a larger patient cohort.
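The percentages quoted in this abstract follow directly from the reported counts. The short check below recomputes them; the helper name `percent` is hypothetical, and the counts are taken from the abstract as given.

```python
def percent(part, whole):
    """Return part / whole as a percentage; whole must be positive."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Counts as reported in the abstract.
reported = {
    "tissue mutations retrieved, high-input cfDNA": (21, 27),
    "tissue mutations retrieved, low-input cfDNA": (4, 65),
    "patients with detectable CTCs": (13, 18),
}

for label, (part, whole) in reported.items():
    print(f"{label}: {part}/{whole} = {percent(part, whole):.0f}%")
```

The rounded values are 78%, 6% and 72%, matching the figures in the text.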
Gene or protein expression data are usually represented by metric or at least ordinal variables. In order to translate a continuous variable into a clinical decision, it is necessary to determine a cutoff point and to stratify patients into two groups, each requiring a different kind of treatment. Currently, there is no standard method or standard software for biomarker cutoff determination. Therefore, we developed Cutoff Finder, a bundle of optimization and visualization methods for cutoff determination that is accessible online. While one of the methods for cutoff optimization is based solely on the distribution of the marker under investigation, other methods optimize the correlation of the dichotomization with respect to an outcome or survival variable. We illustrate the functionality of Cutoff Finder by the analysis of the gene expression of estrogen receptor (ER) and progesterone receptor (PgR) in breast cancer tissues. The distribution of these important markers is analyzed and correlated with immunohistologically determined ER status and distant metastasis free survival. Cutoff Finder is expected to fill a relevant gap in the available biometric software repertoire and will enable faster optimization of new diagnostic biomarkers. The tool can be accessed at http://molpath.charite.de/cutoff.
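As an illustration of what optimizing a cutoff against a binary outcome can look like, the sketch below scans candidate cutoffs and keeps the one that maximizes Youden's J (sensitivity + specificity - 1). This criterion is an assumption chosen for brevity and is not claimed to be one of Cutoff Finder's methods; the function name and the marker values are hypothetical.

```python
def best_cutoff(values, outcomes):
    """Return (cutoff, youden_j) for the cutoff that maximizes
    sensitivity + specificity - 1. Samples with value > cutoff are called positive.
    outcomes must contain 0/1 labels with both classes present."""
    positives = sum(outcomes)
    negatives = len(outcomes) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both outcome classes are required")
    distinct = sorted(set(values))
    if len(distinct) < 2:
        raise ValueError("at least two distinct marker values are required")
    # Candidate cutoffs: midpoints between neighbouring distinct marker values.
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    best = None
    for cutoff in candidates:
        true_pos = sum(1 for v, o in zip(values, outcomes) if v > cutoff and o == 1)
        true_neg = sum(1 for v, o in zip(values, outcomes) if v <= cutoff and o == 0)
        j = true_pos / positives + true_neg / negatives - 1
        if best is None or j > best[1]:
            best = (cutoff, j)
    return best


# Hypothetical marker expression values and binary outcome labels.
expression = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
status = [0, 0, 0, 1, 1, 1]
print(best_cutoff(expression, status))
```

For a survival endpoint, the same scan could score each candidate cutoff with a two-group survival comparison instead of Youden's J.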
Follicular dendritic cells (FDCs) regulate B cell function and development of high affinity antibody responses but little is known about their biology. FDCs associate in intricate cellular networks within secondary lymphoid organs. In vitro and ex vivo methods, therefore, allow only limited understanding of the genuine immunobiology of FDCs in their native habitat. Herein, we used various multicolor fate mapping systems to investigate the ontogeny and dynamics of lymph node (LN) FDCs in situ. We show that LN FDC networks arise from the clonal expansion and differentiation of marginal reticular cells (MRCs), a population of lymphoid stromal cells lining the LN subcapsular sinus. We further demonstrate that during an immune response, FDCs accumulate in germinal centers and that neither the recruitment of circulating progenitors nor the division of local mature FDCs significantly contributes to this accumulation. Rather, we provide evidence that newly generated FDCs also arise from the proliferation and differentiation of MRCs, thus unraveling a critical function of this poorly defined stromal cell population.
Pulmonary enteric adenocarcinoma is a rare non-small cell lung cancer subtype. It is poorly characterized and cannot be distinguished from metastatic colorectal or upper gastrointestinal adenocarcinomas by means of routine pathological methods. As DNA methylation patterns are known to be highly tissue specific, we aimed to develop a methylation-based algorithm to differentiate these entities. To this end, genome-wide methylation profiles of 600 primary pulmonary, colorectal, and upper gastrointestinal adenocarcinomas obtained from The Cancer Genome Atlas and the Gene Expression Omnibus database were used as a reference cohort to train a machine learning algorithm. The resulting classifier correctly classified all samples from a validation cohort consisting of 680 primary pulmonary, colorectal and upper gastrointestinal adenocarcinomas, demonstrating the ability of the algorithm to reliably distinguish these three entities. We then analyzed methylation data of 15 pulmonary enteric adenocarcinomas as well as four pulmonary metastases and four primary colorectal adenocarcinomas with the algorithm. All 15 pulmonary enteric adenocarcinomas were reliably classified as primary pulmonary tumors and all four metastases as well as all four primary colorectal cancer samples were identified as colorectal adenocarcinomas. In a t-distributed stochastic neighbor embedding analysis, the pulmonary enteric adenocarcinoma samples did not form a separate methylation subclass but rather diffusely intermixed with other pulmonary cancers. Additional characterization of the pulmonary enteric adenocarcinoma series using fluorescence in situ hybridization, next-generation sequencing and copy number analysis revealed KRAS mutations in nine of 15 samples (60%) and a high number of structural chromosomal changes. Except for an unusually high rate of chromosome 20 gain (67%), the molecular data was mostly reminiscent of standard pulmonary adenocarcinomas.
In conclusion, we provide sound evidence of the pulmonary origin of pulmonary enteric adenocarcinomas and in addition provide a publicly available machine learning-based algorithm to reliably distinguish these tumors from metastatic colorectal cancer.
Head and neck squamous cell carcinoma (HNSC) patients are at risk of suffering from either pulmonary metastases or a second squamous cell carcinoma of the lung (LUSC). Differentiating pulmonary metastases from primary lung cancers is of high clinical importance, but not possible in most cases with current diagnostics. To address this, we performed DNA methylation profiling of primary tumors and trained three different machine learning methods to distinguish metastatic HNSC from primary LUSC. We developed an artificial neural network that correctly classified 96.4% of the cases in a validation cohort of 279 patients with HNSC and LUSC as well as normal lung controls, outperforming support vector machines (95.7%) and random forests (87.8%). Prediction accuracies of more than 99% were achieved for 92.1% (neural network), 90% (support vector machine), and 43% (random forest) of these cases by applying thresholds to the resulting probability scores and excluding samples with low confidence. As independent clinical validation of the approach, we analyzed a series of 51 patients with a history of HNSC and a second lung tumor, demonstrating the correct classifications based on clinicopathological properties. In summary, our approach may facilitate the reliable diagnostic differentiation of pulmonary metastases of HNSC from primary LUSC to guide therapeutic decisions.
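The trade-off described above, higher accuracy on fewer cases after excluding low-confidence predictions, can be sketched as follows. The abstract does not give the thresholds or the scoring details, so the function name, the rule of thresholding the highest class probability, and the example scores are all assumptions for illustration.

```python
def accuracy_with_rejection(probabilities, labels, threshold):
    """Keep only samples whose highest class probability reaches the threshold.
    Return (accuracy on kept samples, fraction of samples kept)."""
    if not labels or len(probabilities) != len(labels):
        raise ValueError("probabilities and labels must be non-empty and equally long")
    kept = []
    for scores, label in zip(probabilities, labels):
        if max(scores) >= threshold:
            predicted = max(range(len(scores)), key=scores.__getitem__)
            kept.append(predicted == label)
    coverage = len(kept) / len(labels)
    accuracy = sum(kept) / len(kept) if kept else float("nan")
    return accuracy, coverage


# Hypothetical two-class probability scores and true labels.
scores = [[0.95, 0.05], [0.60, 0.40], [0.10, 0.90], [0.55, 0.45]]
truth = [0, 1, 1, 0]

for threshold in (0.5, 0.9):
    accuracy, coverage = accuracy_with_rejection(scores, truth, threshold)
    print(f"threshold {threshold}: accuracy {accuracy:.2f} on {coverage:.0%} of cases")
```

Raising the threshold in this toy example removes the one misclassified sample along with a correctly classified one, so accuracy rises while coverage falls.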
Automated image analysis of cells and tissues has been an active research field in medical informatics for decades but has recently attracted increased attention due to developments in computer and microscopy hardware and the awareness that scientific and diagnostic pathology require novel approaches to perform objective quantitative analyses of cellular and tissue specimens. Model-based approaches use a priori information on cell shape features to obtain the segmentation, which may introduce a bias favouring the detection of cell nuclei only with certain properties. In this study we present a novel contour-based "minimum-model" cell detection and segmentation approach that uses minimal a priori information and detects contours independent of their shape. This approach avoids a segmentation bias with respect to shape features and allows for an accurate segmentation (precision = 0.908; recall = 0.859; validation based on ∼8000 manually-labeled cells) of a broad spectrum of normal and disease-related morphological features without the requirement of prior training.
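Precision and recall are often summarized by their harmonic mean, the F1 score. The abstract reports only precision and recall, so the value computed below is derived here and is not a figure stated by the authors; the function name is hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
f1 = f1_score(0.908, 0.859)
print(f"F1 = {f1:.3f}")
```

With the reported values this gives an F1 of about 0.88.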