Quantitative analysis of brain MRI is routine for many neurological diseases and conditions and relies on accurate segmentation of structures of interest. Deep learning-based segmentation approaches for brain MRI are gaining interest due to their self-learning and generalization ability over large amounts of data. As deep learning architectures become more mature, they gradually outperform previous state-of-the-art classical machine learning algorithms. This review aims to provide an overview of current deep learning-based segmentation approaches for quantitative brain MRI. First, we review the current deep learning architectures used for segmentation of anatomical brain structures and brain lesions. Next, the performance, speed, and properties of deep learning approaches are summarized and discussed. Finally, we provide a critical assessment of the current state and identify likely future developments and trends.
Detecting microsatellite instability (MSI) in colorectal cancer is crucial for clinical decision making, as it identifies patients with differential treatment response and prognosis. Universal MSI testing is recommended, but many patients remain untested. A critical need exists for broadly accessible, cost-efficient tools to aid patient selection for testing. Here, we investigate the potential of a deep learning-based system for automated MSI prediction directly from haematoxylin and eosin (H&E)-stained whole-slide images (WSIs).
Our deep learning model (MSINet) was developed using 100 H&E-stained WSIs (50 with microsatellite stability [MSS] and 50 with MSI) scanned at 40× magnification, each from a patient randomly selected in a class-balanced manner from the pool of 343 patients who underwent primary colorectal cancer resection at Stanford University Medical Center (Stanford, CA, USA; internal dataset) between Jan 1, 2015, and Dec 31, 2017. We internally validated the model on a holdout test set (15 H&E-stained WSIs from 15 patients; seven cases with MSS and eight with MSI) and externally validated the model on 484 H&E-stained WSIs (402 cases with MSS and 77 with MSI; 479 patients) from The Cancer Genome Atlas, containing WSIs scanned at 40× and 20× magnification. Performance was primarily evaluated using the sensitivity, specificity, negative predictive value (NPV), and area under the receiver operating characteristic curve (AUROC). We compared the model's performance with that of five gastrointestinal pathologists on a class-balanced, randomly selected subset of 40× magnification WSIs from the external dataset (20 with MSS and 20 with MSI).
The MSINet model achieved an AUROC of 0·931 (95% CI 0·771–1·000) on the holdout test set from the internal dataset and 0·779 (0·720–0·838) on the external dataset. On the external dataset, using a sensitivity-weighted operating point, the model achieved an NPV of 93·7% (95% CI 90·3–96·2), sensitivity of 76·0% (64·8–85·1), and specificity of 66·6% (61·8–71·2). On the reader experiment (40 cases), the model achieved an AUROC of 0·865 (95% CI 0·735–0·995). The mean AUROC performance of the five pathologists was 0·605 (95% CI 0·453–0·757).
Our deep learning model exceeded the performance of experienced gastrointestinal pathologists at predicting MSI on H&E-stained WSIs. Within the current universal MSI testing paradigm, such a model might contribute value as an automated screening tool to triage patients for confirmatory testing, potentially reducing the number of tested patients, thereby resulting in substantial test-related labour and cost savings.
Stanford Cancer Institute and Stanford Departments of Pathology and Biomedical Data Science.
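As a rough illustration of the "sensitivity-weighted operating point" mentioned in the results above, the sketch below picks the decision threshold that first reaches a target sensitivity on a labelled score set, then reports the sensitivity and specificity that threshold yields. The scores, labels, and helper names (`pick_threshold`, `sensitivity_specificity`) are made up for illustration and are not the authors' code.

```python
# Hypothetical sketch: selecting a sensitivity-weighted operating point.
# scores: model outputs (higher = more likely MSI); labels: 1 = MSI, 0 = MSS.

def pick_threshold(scores, labels, target_sensitivity):
    """Return the highest threshold whose sensitivity >= target_sensitivity."""
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    n_pos = len(positives)
    # Lower the threshold one positive score at a time until enough
    # true positives are captured.
    for k, thr in enumerate(positives, start=1):
        if k / n_pos >= target_sensitivity:
            return thr
    return min(positives)

def sensitivity_specificity(scores, labels, thr):
    """Compute sensitivity and specificity at a fixed threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= thr)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < thr)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < thr)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= thr)
    return tp / (tp + fn), tn / (tn + fp)

# Toy example: 4 MSI cases, 4 MSS cases.
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
labels = [1,   1,   1,   1,   0,   0,   0,   0]
thr = pick_threshold(scores, labels, target_sensitivity=0.75)
sens, spec = sensitivity_specificity(scores, labels, thr)
```

In a screening setting like the one described, the target sensitivity would be set high so that the NPV stays high and few MSI cases are missed, at the cost of specificity.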
Lung cancer is the most prevalent cancer worldwide, and histopathological assessment is indispensable for its diagnosis. However, human evaluation of pathology slides cannot accurately predict patients' prognoses. In this study, we obtain 2,186 haematoxylin and eosin stained histopathology whole-slide images of lung adenocarcinoma and squamous cell carcinoma patients from The Cancer Genome Atlas (TCGA), and 294 additional images from the Stanford Tissue Microarray (TMA) Database. We extract 9,879 quantitative image features and use regularized machine-learning methods to select the top features and to distinguish shorter-term survivors from longer-term survivors with stage I adenocarcinoma (P<0.003) or squamous cell carcinoma (P=0.023) in the TCGA data set. We validate the survival prediction framework with the TMA cohort (P<0.036 for both tumour types). Our results suggest that automatically derived image features can predict the prognosis of lung cancer patients and thereby contribute to precision oncology. Our methods are extensible to histopathology images of other organs.
We sought to investigate associations between dynamic contrast-enhanced (DCE) magnetic resonance imaging (MRI) features and tumor-infiltrating lymphocytes (TILs) in breast cancer, as well as to study whether MRI features are complementary to molecular markers of TILs.
In this retrospective study, we extracted 17 computational DCE-MRI features to characterize tumor and parenchyma in The Cancer Genome Atlas cohort (n = 126). The percentage of stromal TILs was evaluated on H&E-stained histological whole-tumor sections. We first evaluated associations between individual imaging features and TILs. Multiple-hypothesis testing was corrected by the Benjamini-Hochberg method using false discovery rate (FDR). Second, we implemented LASSO (least absolute shrinkage and selection operator) and linear regression nested with tenfold cross-validation to develop an imaging signature for TILs. Next, we built a composite prediction model for TILs by combining imaging signature with molecular features. Finally, we tested the prognostic significance of the TIL model in an independent cohort (I-SPY 1; n = 106).
Four imaging features were significantly associated with TILs (P < 0.05 and FDR < 0.2), including tumor volume, cluster shade of signal enhancement ratio (SER), mean SER of tumor-surrounding background parenchymal enhancement (BPE), and proportion of BPE. Among molecular and clinicopathological factors, only cytolytic score was correlated with TILs (ρ = 0.51; 95% CI, 0.36-0.63; P = 1.6E-9). An imaging signature that linearly combines five features showed correlation with TILs (ρ = 0.40; 95% CI, 0.24-0.54; P = 4.2E-6). A composite model combining the imaging signature and cytolytic score improved correlation with TILs (ρ = 0.62; 95% CI, 0.50-0.72; P = 9.7E-15). The composite model successfully distinguished low vs high, intermediate vs high, and low vs intermediate TIL groups, with AUCs of 0.94, 0.76, and 0.79, respectively. During validation (I-SPY 1), the predicted TILs from the imaging signature separated patients into two groups with distinct recurrence-free survival (RFS), with log-rank P = 0.042 among triple-negative breast cancer (TNBC). The composite model further improved stratification of patients with distinct RFS (log-rank P = 0.0008), where TNBC with no/minimal TILs had a worse prognosis.
Specific MRI features of tumor and parenchyma are associated with TILs in breast cancer, and imaging may play an important role in the evaluation of TILs by providing key complementary information in equivocal cases or situations that are prone to sampling bias.
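The Benjamini-Hochberg correction used in the association analysis above can be sketched as a short, generic implementation; this is a self-contained illustration of the procedure, not the study's code, and the raw p-values below are invented.

```python
def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg-adjusted p-values (q-values) controlling the FDR."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling by m / rank and
    # enforcing monotonicity of the adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Illustrative raw p-values for four imaging-feature association tests.
raw = [0.001, 0.02, 0.03, 0.5]
q = benjamini_hochberg(raw)
```

Features whose adjusted value falls below the chosen FDR ceiling (0.2 in the study above) would be retained as significant.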
• Rough feature extraction and clustering is used to reveal the diversity of large data.
• Independent evaluation of representative regions avoids signal dilution.
• Signal-based aggregation of data subsets gives strong predictions in noisy data.
• A diverse feature set allows for better modeling of pathology images.
Computerized analysis of digital pathology images offers the potential of improving clinical care (e.g. automated diagnosis) and catalyzing research (e.g. discovering disease subtypes). There are two key challenges thwarting computerized analysis of digital pathology images: first, whole slide pathology images are massive, making computerized analysis inefficient, and second, diverse tissue regions in whole slide images that are not directly relevant to the disease may mislead computerized diagnosis algorithms. We propose a method to overcome both of these challenges that utilizes a coarse-to-fine analysis of the localized characteristics in pathology images. An initial surveying stage analyzes the diversity of coarse regions in the whole slide image. This includes extraction of spatially localized features of shape, color and texture from tiled regions covering the slide. Dimensionality reduction of the features assesses the image diversity in the tiled regions and clustering creates representative groups. A second stage provides a detailed analysis of a single representative tile from each group. An Elastic Net classifier produces a diagnostic decision value for each representative tile. A weighted voting scheme aggregates the decision values from these tiles to obtain a diagnosis at the whole slide level. We evaluated our method by automatically classifying 302 brain cancer cases into two possible diagnoses (glioblastoma multiforme (N = 182) versus lower grade glioma (N = 120)) with an accuracy of 93.1% (p << 0.001). We also evaluated our method in the dataset provided for the 2014 MICCAI Pathology Classification Challenge, in which our method, trained and tested using 5-fold cross-validation, produced a classification accuracy of 100% (p << 0.001). Our method showed high stability and robustness to parameter variation, with accuracy varying between 95.5% and 100% when evaluated for a wide range of parameters.
Our approach may be useful to automatically differentiate between the two cancer subtypes.
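The final aggregation step described above, lifting per-tile decision values to a slide-level diagnosis via weighted voting, might look roughly like the sketch below. The decision values, weights, and class names are illustrative stand-ins, not the paper's actual Elastic Net outputs.

```python
def slide_diagnosis(tile_decisions, tile_weights):
    """Aggregate per-tile classifier decision values into a slide-level call.

    tile_decisions: signed decision values for each representative tile
                    (positive = GBM-like, negative = LGG-like).
    tile_weights:   e.g. the fraction of the slide covered by each tile's cluster.
    """
    score = sum(w * d for w, d in zip(tile_weights, tile_decisions))
    return ("GBM" if score > 0 else "LGG"), score

decisions = [1.2, -0.4, 0.8]   # three representative tiles (made-up values)
weights   = [0.5, 0.3, 0.2]    # cluster coverage fractions (made-up values)
label, score = slide_diagnosis(decisions, weights)
```

Weighting by cluster coverage lets large, homogeneous tissue regions dominate the vote while small, atypical regions contribute proportionally less.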
Deep learning has become a promising approach for automated support for clinical diagnosis. When medical data samples are limited, collaboration among multiple institutions is necessary to achieve high algorithm performance. However, sharing patient data often has limitations due to technical, legal, or ethical concerns. In this study, we propose methods of distributing deep learning models as an attractive alternative to sharing patient data.
We simulate the distribution of deep learning models across 4 institutions using various training heuristics and compare the results with a deep learning model trained on centrally hosted patient data. The training heuristics investigated include ensembling single institution models, single weight transfer, and cyclical weight transfer. We evaluate these approaches for image classification in 3 independent image collections (retinal fundus photos, mammography, and ImageNet).
We find that cyclical weight transfer results in performance comparable to that of a model trained on centrally hosted patient data, and that the performance of the cyclical weight transfer heuristic improves with a higher frequency of weight transfer.
We show that distributing deep learning models is an effective alternative to sharing patient data. This finding has implications for any collaborative deep learning study.
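A minimal sketch of the cyclical weight transfer heuristic, assuming a toy one-parameter "model" (1-D least squares) in place of a deep network: weights are trained briefly at each simulated institution and then handed to the next, cycling many times, so no raw patient data ever leaves a site. The data, sizes, and `local_step` helper are all invented for illustration.

```python
import random

def local_step(w, data, lr=0.1, steps=10):
    """A few gradient steps of 1-D least squares (toy stand-in for local training)."""
    for _ in range(steps):
        grad = sum(w - x for x in data) / len(data)
        w -= lr * grad
    return w

random.seed(0)
# Four simulated "institutions", each holding private samples around a shared truth.
institutions = [[5.0 + random.gauss(0, 1) for _ in range(50)] for _ in range(4)]

w = 0.0
for cycle in range(25):            # frequent transfer: many short local visits
    for data in institutions:      # train locally, then pass weights to the next site
        w = local_step(w, data)

# Reference point: what centrally hosted (pooled) data would estimate.
pooled_mean = sum(sum(d) for d in institutions) / sum(len(d) for d in institutions)
```

With frequent cycling, the travelling weights settle close to the pooled-data solution, which mirrors the finding above that a high transfer frequency narrows the gap to centrally hosted training.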
Brain glioma is the most common primary malignant brain tumor in adults, with distinct pathologic subtypes: Lower Grade Glioma (LGG) Grade II, Lower Grade Glioma (LGG) Grade III, and Glioblastoma Multiforme (GBM) Grade IV. Survival and treatment options are highly dependent on glioma grade. We propose a deep learning-based, modular classification pipeline for automated grading of gliomas using digital pathology images. Whole tissue digitized images of pathology slides obtained from The Cancer Genome Atlas (TCGA) were used to train our deep learning modules. Our modular pipeline provides diagnostic quality statistics, such as precision, sensitivity and specificity, of the individual deep learning modules, and (1) facilitates training given the limited data in this domain, (2) enables exploration of different deep learning structures for each module, (3) leads to developing less complex modules that are simpler to analyze, and (4) provides flexibility, permitting use of single modules within the framework or use of other modeling or machine learning applications, such as probabilistic graphical models or support vector machines. Our modular approach helps us meet the requirements of minimum accuracy levels that are demanded by the context of different decision points within a multi-class classification scheme. A Convolutional Neural Network was trained for each sub-task, each achieving more than 90% classification accuracy on the validation data set; on an independent data set of new patients from the multi-institutional repository, the pipeline achieved 96% accuracy for GBM vs LGG classification and 71% accuracy for further grading LGG into Grade II or Grade III.
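The modular cascade could be sketched as below, with simple threshold rules standing in for the trained CNN modules: a first module separates GBM from LGG, and only LGG cases are routed to a second module that grades them as II or III. The feature names (`necrosis_score`, `mitosis_score`) and thresholds are hypothetical, chosen purely to make the control flow concrete.

```python
# Stand-in modules; in the pipeline described above, each would be a trained CNN.
def gbm_vs_lgg(features):
    return "GBM" if features["necrosis_score"] > 0.5 else "LGG"

def lgg_grade(features):
    return "Grade III" if features["mitosis_score"] > 0.5 else "Grade II"

def grade_slide(features, modules=(gbm_vs_lgg, lgg_grade)):
    """Run the two-stage cascade: coarse GBM/LGG call, then LGG sub-grading."""
    first, second = modules
    coarse = first(features)
    # Only LGG slides reach the finer-grained module.
    return coarse if coarse == "GBM" else f"LGG {second(features)}"

print(grade_slide({"necrosis_score": 0.9, "mitosis_score": 0.1}))  # GBM
print(grade_slide({"necrosis_score": 0.2, "mitosis_score": 0.7}))  # LGG Grade III
```

Because the modules only interact through their labels, any single module can be retrained or swapped (e.g. for an SVM or a probabilistic graphical model) without touching the rest of the cascade, which is the flexibility the abstract emphasizes.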
Published research results are difficult to replicate due to the lack of a standard evaluation data set in the area of decision support systems in mammography; most computer-aided diagnosis (CADx) and detection (CADe) algorithms for breast cancer in mammography are evaluated on private data sets or on unspecified subsets of public databases. This causes an inability to directly compare the performance of methods or to replicate prior results. We seek to resolve this substantial challenge by releasing an updated and standardized version of the Digital Database for Screening Mammography (DDSM) for evaluation of future CADx and CADe systems (sometimes referred to generally as CAD) research in mammography. Our data set, the CBIS-DDSM (Curated Breast Imaging Subset of DDSM), includes decompressed images, data selection and curation by trained mammographers, updated mass segmentation and bounding boxes, and pathologic diagnosis for training data, formatted similarly to modern computer vision data sets. The data set contains 753 calcification cases and 891 mass cases, providing a data-set size capable of analyzing decision support systems in mammography.