The diagnosis of most cancers is made by a board-certified pathologist based on a tissue biopsy examined under the microscope. Recent research reveals high discordance between individual pathologists. For melanoma, the literature reports 25–26% discordance in classifying a benign nevus versus malignant melanoma. A recent study indicated the potential of deep learning to lower this discordance. However, the performance of deep learning in classifying histopathologic melanoma images has never been compared directly with that of human experts. The aim of this study is to perform the first such direct comparison.
A total of 695 lesions were classified by an expert histopathologist in accordance with current guidelines (350 nevi/345 melanoma). Only the haematoxylin & eosin (H&E) slides of these lesions were digitised via a slide scanner and then randomly cropped. A total of 595 of the resulting images were used to train a convolutional neural network (CNN). The additional 100 H&E image sections were used to test the results of the CNN in comparison to 11 histopathologists. Three combined McNemar tests comparing the results of the CNN's test runs in terms of sensitivity, specificity and accuracy were predefined to test for significance (p < 0.05).
The CNN achieved a mean sensitivity/specificity/accuracy of 76%/60%/68% over 11 test runs. In comparison, the 11 pathologists achieved a mean sensitivity/specificity/accuracy of 51.8%/66.5%/59.2%. Thus, the CNN was significantly (p = 0.016) superior in classifying the cropped images.
With limited image information available, a CNN was able to outperform 11 histopathologists in the classification of histopathological melanoma images and thus shows promise to assist human melanoma diagnoses.
• A convolutional neural network (CNN) was trained with 595 histopathologic images of melanoma and nevi.
• In a direct comparison, the CNN and 11 histopathologists classified a test set of 100 additional histopathologic images (1:1 melanoma/nevi).
• The CNN systematically outperformed the 11 histopathologists in terms of overall accuracy, sensitivity and specificity (p = 0.016).
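The metrics and the paired test named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the confusion counts are hypothetical values chosen to match the reported CNN means on a 50/50 test set, the discordant counts passed to the test are made up, and the abstract does not describe how the three McNemar tests were combined, so only a single exact two-sided test is shown.

```python
from math import comb


def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion counts."""
    sensitivity = tp / (tp + fn)          # melanomas correctly called melanoma
    specificity = tn / (tn + fp)          # nevi correctly called nevus
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value for paired classifications.

    b: images only rater A classified correctly
    c: images only rater B classified correctly
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Binomial(n, 0.5) tail probability of seeing k or fewer discordant wins
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical counts: 50 melanomas and 50 nevi, matching 76%/60%/68%
sens, spec, acc = diagnostic_metrics(tp=38, fn=12, tn=30, fp=20)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")

# Hypothetical discordant pairs between two raters on the same images
print(f"p = {mcnemar_exact(b=10, c=2):.4f}")
```

The exact form is used here because it needs only the standard library and stays valid when the number of discordant pairs is small.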
For clear cell renal cell carcinoma (ccRCC), risk-dependent diagnostic and therapeutic algorithms are routinely implemented in clinical practice. Artificial intelligence-based image analysis has the potential to improve outcome prediction and thereby risk stratification. Thus, we investigated whether a convolutional neural network (CNN) can extract relevant image features from a representative hematoxylin and eosin-stained slide to predict 5-year overall survival (5y-OS) in ccRCC. The CNN was trained to predict 5y-OS in a binary manner using slides from TCGA and validated using an independent in-house cohort. Multivariable logistic regression was used to combine the CNN's prediction with clinicopathological parameters. A mean balanced accuracy of 72.0% (standard deviation [SD] = 7.9%), sensitivity of 72.4% (SD = 10.6%), specificity of 71.7% (SD = 11.9%) and area under the receiver operating characteristic curve (AUROC) of 0.75 (SD = 0.07) were achieved on the TCGA training set (n = 254 patients / WSIs) using 10-fold cross-validation. On the external validation cohort (n = 99 patients / WSIs), mean accuracy, sensitivity, specificity and AUROC were 65.5% (95% confidence interval [CI]: 62.9–68.1%), 86.2% (95% CI: 81.8–90.5%), 44.9% (95% CI: 40.2–49.6%), and 0.70 (95% CI: 0.69–0.71), respectively. A multivariable model including age, tumor stage and metastasis yielded an AUROC of 0.75 on the TCGA cohort. The inclusion of the CNN-based classification (odds ratio = 4.86, 95% CI: 2.70–8.75, p < 0.01) raised the AUROC to 0.81. On the validation cohort, both models showed an AUROC of 0.88. In univariable Cox regression, the CNN showed a hazard ratio of 3.69 (95% CI: 2.60–5.23, p < 0.01) on TCGA and 2.13 (95% CI: 0.92–4.94, p = 0.08) on external validation. The results demonstrate that the CNN's image-based prediction of survival is promising and that this widely applicable technique should be further investigated with the aim of improving existing risk stratification in ccRCC.
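As a quick arithmetic check, balanced accuracy is the mean of sensitivity and specificity. The sketch below applies that definition to the rounded means reported above; the function name is illustrative and the inputs are the abstract's figures, not recomputed patient-level results.

```python
def balanced_accuracy(sensitivity, specificity):
    """Mean of sensitivity and specificity (same units as the inputs)."""
    return (sensitivity + specificity) / 2


tcga = balanced_accuracy(72.4, 71.7)      # reported cross-validation means
external = balanced_accuracy(86.2, 44.9)  # reported external-validation means
print(f"TCGA: {tcga:.2f}%  external: {external:.2f}%")
```

Both values agree with the reported 72.0% and 65.5% to within rounding, which suggests, without confirming, that the external "accuracy" is also a balanced accuracy.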
Background:
Artificial intelligence (AI) has shown promise in numerous experimental studies, particularly in skin cancer diagnostics. Translation of these findings into the clinic is the logical next step. This translation can only be successful if patients' concerns and questions are addressed suitably. We therefore conducted a survey to evaluate the patients' view of artificial intelligence in melanoma diagnostics in Germany, with a particular focus on patients with a history of melanoma.
Participants and Methods:
A web-based questionnaire was designed using LimeSurvey, sent by e-mail to university hospitals and melanoma support groups and advertised on social media. The anonymous questionnaire evaluated patients' expectations and concerns toward artificial intelligence in general as well as their attitudes toward different application scenarios. Descriptive analysis was performed with expression of categorical variables as percentages and 95% confidence intervals. Statistical tests were performed to investigate associations between sociodemographic data and selected items of the questionnaire.
Results:
A total of 298 individuals (154 with a melanoma diagnosis, 143 without) responded to the questionnaire. About 94% (95% CI = 0.91–0.97) of respondents supported the use of artificial intelligence in medical approaches, and 88% (95% CI = 0.85–0.92) would even make their own health data anonymously available for the further development of AI-based applications in medicine. Only 41% (95% CI = 0.35–0.46) of respondents were amenable to the use of artificial intelligence as a stand-alone system, whereas 94% (95% CI = 0.92–0.97) were amenable to its use as an assistance system for physicians. In sub-group analyses, only minor differences were detectable. Respondents with a previous history of melanoma were more amenable to the use of AI applications for early detection, even at home. They would prefer an application scenario where physician and AI classify the lesions independently. With respect to AI-based applications in medicine, patients were concerned about insufficient data protection, impersonality and susceptibility to errors, but expected faster, more precise and unbiased diagnostics, fewer diagnostic errors and support for physicians.
Conclusions:
The vast majority of participants exhibited a positive attitude toward the use of artificial intelligence in melanoma diagnostics, especially as an assistance system.
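The abstract does not state how its 95% confidence intervals were computed. As a hedged illustration, the normal-approximation (Wald) interval below reproduces the reported 0.91–0.97 for 94% of 298 respondents. Not every reported interval is reproduced exactly from the rounded percentages, so the authors may have used a different method or different denominators.

```python
from math import sqrt


def wald_ci(p, n, z=1.96):
    """Normal-approximation confidence interval for a proportion p out of n."""
    half_width = z * sqrt(p * (1 - p) / n)
    # Clamp to the valid probability range
    return max(0.0, p - half_width), min(1.0, p + half_width)


low, high = wald_ci(0.94, 298)
print(f"94% of 298 respondents: 95% CI = {low:.2f} to {high:.2f}")
```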
Background
Recent years have witnessed a substantial improvement in the accuracy of skin cancer classification using convolutional neural networks (CNNs). CNNs perform on par with or better than dermatologists with respect to the classification tasks of single images. However, in clinical practice, dermatologists also use other patient data beyond the visual aspects present in a digitized image, further increasing their diagnostic accuracy. Several pilot studies have recently investigated the effects of integrating different subtypes of patient data into CNN-based skin cancer classifiers.
Objective
This systematic review focuses on the current research investigating the impact of merging information from image features and patient data on the performance of CNN-based skin cancer image classification. This study aims to explore the potential in this field of research by evaluating the types of patient data used, the ways in which the nonimage data are encoded and merged with the image features, and the impact of the integration on the classifier performance.
Methods
Google Scholar, PubMed, MEDLINE, and ScienceDirect were screened for peer-reviewed studies published in English that dealt with the integration of patient data within a CNN-based skin cancer classification. The search terms skin cancer classification, convolutional neural network(s), deep learning, lesions, melanoma, metadata, clinical information, and patient data were combined.
Results
A total of 11 publications fulfilled the inclusion criteria. All of them reported an overall improvement in different skin lesion classification tasks with patient data integration. The most commonly used patient data were age, sex, and lesion location. The patient data were mostly one-hot encoded. The studies differed in the complexity of the deep learning methods used to process the encoded patient data before and after fusing them with the image features for a combined classifier.
Conclusions
This study indicates the potential benefits of integrating patient data into CNN-based diagnostic algorithms. However, how exactly the individual patient data enhance classification performance, especially in the case of multiclass classification problems, is still unclear. Moreover, a substantial fraction of patient data used by dermatologists remains to be analyzed in the context of CNN-based skin cancer classification. Further exploratory analyses in this promising field may optimize patient data integration into CNN-based skin cancer diagnostics for patients’ benefits.
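One-hot encoding of patient data and its fusion with image features can be sketched as below. This is a minimal illustration under stated assumptions: the category lists, the age scaling, the stand-in image feature vector and simple concatenation as the fusion step are all hypothetical choices, not the method of any particular reviewed study.

```python
# Hypothetical category vocabularies for illustration only
SEXES = ["female", "male"]
LOCATIONS = ["head/neck", "trunk", "upper limb", "lower limb"]


def one_hot(value, categories):
    """Return a vector with 1.0 at the position of value and 0.0 elsewhere."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1.0 if value == c else 0.0 for c in categories]


def encode_patient(age, sex, location, max_age=100.0):
    """Encode age (scaled to [0, 1]), sex and lesion location as one vector."""
    scaled_age = min(age, max_age) / max_age
    return [scaled_age] + one_hot(sex, SEXES) + one_hot(location, LOCATIONS)


def fuse(image_features, patient_vector):
    """Fuse by concatenation; a classifier head would consume the result."""
    return list(image_features) + list(patient_vector)


image_features = [0.12, 0.80, 0.05]  # stand-in for CNN-derived features
fused = fuse(image_features, encode_patient(63, "male", "trunk"))
print(fused)
```

In practice the patient vector is often passed through further layers before or after concatenation, which is the difference in processing complexity the review describes.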
Recent studies have shown that deep learning is capable of classifying dermatoscopic images at least as well as dermatologists. However, many studies in skin cancer classification utilize non-biopsy-verified training images. This imperfect ground truth introduces a systematic error, but the effects on classifier performance are currently unknown. Here, we systematically examine the effects of label noise by training and evaluating convolutional neural networks (CNNs) with 804 images of melanoma and nevi labeled either by dermatologists or by biopsy. The CNNs are evaluated on a test set of 384 images by means of 4-fold cross-validation, comparing the outputs with either the corresponding dermatological or the biopsy-verified diagnosis. With identical ground truths of training and test labels, high accuracies of 75.03% (95% CI: 74.39–75.66%) for dermatological and 73.80% (95% CI: 73.10–74.51%) for biopsy-verified labels can be achieved. However, if the CNN is trained and tested with different ground truths, accuracy drops significantly to 64.53% (95% CI: 63.12–65.94%, P < 0.01) on a non-biopsy-verified and to 64.24% (95% CI: 62.66–65.83%, P < 0.01) on a biopsy-verified test set. In conclusion, deep learning methods for skin cancer classification are highly sensitive to label noise, and future work should use biopsy-verified training images to mitigate this problem.
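The effect of scoring the same predictions against two different ground truths can be illustrated with a toy example. The labels below are invented for illustration and are unrelated to the study data.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that agree with the given reference labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Hypothetical toy data: 1 = melanoma, 0 = nevus
predictions          = [1, 0, 1, 1, 0, 0, 1, 0]
dermatologist_labels = [1, 0, 1, 1, 0, 0, 0, 0]
biopsy_labels        = [1, 0, 0, 1, 0, 1, 0, 0]

print("vs dermatologist labels:", accuracy(predictions, dermatologist_labels))
print("vs biopsy labels:       ", accuracy(predictions, biopsy_labels))
```

A model that has learned one labelling convention is penalised wherever the other reference disagrees with it, even though its outputs have not changed.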
Early detection of melanoma can be lifesaving but this remains a challenge. Recent diagnostic studies have revealed the superiority of artificial intelligence (AI) in classifying dermoscopic images of melanoma and nevi, concluding that these algorithms should assist a dermatologist's diagnoses.
The aim of this study was to investigate whether AI support improves the accuracy and overall diagnostic performance of dermatologists in the dichotomous image-based discrimination between melanoma and nevus.
Twelve board-certified dermatologists were each presented with a disjoint set of 100 unique dermoscopic images of melanomas and nevi (1200 unique images in total). They classified the images first based on personal experience alone (part I) and then with the support of a trained convolutional neural network (CNN, part II). Additionally, dermatologists were asked to rate their confidence in their final decision for each image.
While the mean specificity of the dermatologists based on personal experience alone remained almost unchanged (70.6% vs 72.4%; P=.54) with AI support, the mean sensitivity and mean accuracy increased significantly (59.4% vs 74.6%; P=.003 and 65.0% vs 73.6%; P=.002, respectively) with AI support. Out of the 10% (10/94; 95% CI 8.4%-11.8%) of cases where dermatologists were correct and AI was incorrect, dermatologists on average changed to the incorrect answer for 39% (4/10; 95% CI 23.2%-55.6%) of cases. When dermatologists were incorrect and AI was correct (25/94, 27%; 95% CI 24.0%-30.1%), dermatologists changed their answers to the correct answer for 46% (11/25; 95% CI 33.1%-58.4%) of cases. Additionally, the dermatologists' average confidence in their decisions increased when the CNN confirmed their decision and decreased when the CNN disagreed, even when the dermatologists were correct. Reported values are based on the mean of all participants. Whenever absolute values are shown, the denominator and numerator are approximations as every dermatologist ended up rating a varying number of images due to a quality control step.
The findings of our study show that AI support can improve the overall accuracy of the dermatologists in the dichotomous image-based discrimination between melanoma and nevus. This supports the argument for AI-based tools to aid clinicians in skin lesion classification and provides a rationale for studies of such classifiers in real-life settings, wherein clinicians can integrate additional information such as patient age and medical history into their decisions.
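The two switching rates reported above (changing to a wrong answer after incorrect AI advice, and changing to the right answer after correct AI advice) could be tallied from per-image records roughly as follows. The record layout and the toy data are hypothetical; the study's own quality-control step and per-dermatologist averaging are not reproduced here.

```python
def switch_rates(records):
    """Return (harmful_switch_rate, helpful_switch_rate).

    records: iterable of (truth, first_answer, ai_answer, final_answer).
    A rate is None when no case of that type occurred.
    """
    harmful = helpful = n_harm_cases = n_help_cases = 0
    for truth, first, ai, final in records:
        if first == truth and ai != truth:
            # Clinician was right, AI was wrong: did they switch to wrong?
            n_harm_cases += 1
            harmful += final != truth
        elif first != truth and ai == truth:
            # Clinician was wrong, AI was right: did they switch to right?
            n_help_cases += 1
            helpful += final == truth
    harm_rate = harmful / n_harm_cases if n_harm_cases else None
    help_rate = helpful / n_help_cases if n_help_cases else None
    return harm_rate, help_rate


# Hypothetical toy records
records = [
    ("mel", "mel", "nev", "nev"),  # followed wrong AI advice
    ("mel", "mel", "nev", "mel"),  # kept correct answer
    ("nev", "mel", "nev", "nev"),  # corrected by AI
    ("nev", "mel", "nev", "mel"),  # ignored correct AI advice
    ("mel", "mel", "mel", "mel"),  # agreement, not counted
]
print(switch_rates(records))
```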
Objective
To develop a new digital biomarker based on the analysis of primary tumour tissue by a convolutional neural network (CNN) to predict lymph node metastasis (LNM) in a cohort matched for already established risk factors.
Patients and Methods
Haematoxylin and eosin (H&E) stained primary tumour slides from 218 patients (102 N+; 116 N0), matched for Gleason score, tumour size, venous invasion, perineural invasion and age, who underwent radical prostatectomy were selected to train a CNN and evaluate its ability to predict LN status.
Results
With 10 models trained with the same data, a mean area under the receiver operating characteristic curve (AUROC) of 0.68 (95% confidence interval [CI] 0.678–0.682) and a mean balanced accuracy of 61.37% (95% CI 60.05–62.69%) were achieved. The mean sensitivity and specificity were 53.09% (95% CI 49.77–56.41%) and 69.65% (95% CI 68.21–71.1%), respectively. These results were confirmed via cross‐validation. The probability score for LNM prediction was significantly higher on image sections from N+ samples (mean ± SD probability score 0.58 ± 0.17 for N+ vs 0.47 ± 0.15 for N0, P = 0.002). In multivariable analysis, the probability score of the CNN (odds ratio [OR] 1.04 per percentage point of probability, 95% CI 1.02–1.08; P = 0.04) and lymphovascular invasion (OR 11.73, 95% CI 3.96–35.7; P < 0.001) proved to be independent predictors for LNM.
Conclusion
In our present study, CNN‐based image analyses showed promising results as a potential novel low‐cost method to extract relevant prognostic information directly from H&E histology to predict the LN status of patients with prostate cancer. Our ubiquitously available technique might contribute to an improved LN status prediction.
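An odds ratio quoted per unit of a predictor compounds multiplicatively under a logistic model. The sketch below shows what an OR of 1.04 per percentage point would imply for a larger change in the CNN score, assuming the effect is log-linear across the range. The function names and the 10-point example are illustrative and not taken from the study.

```python
def odds_ratio_for_increase(or_per_point, points):
    """Odds ratio implied by an increase of several units of the predictor."""
    return or_per_point ** points


def apply_odds_ratio(prob, odds_ratio):
    """Convert a baseline probability to the probability after scaling its odds."""
    odds = prob / (1 - prob)
    new_odds = odds * odds_ratio
    return new_odds / (1 + new_odds)


or_10_points = odds_ratio_for_increase(1.04, 10)
print(f"OR for a 10-point higher score: {or_10_points:.2f}")
print(f"Baseline 30% risk becomes: {apply_odds_ratio(0.30, or_10_points):.1%}")
```

The baseline risk of 30% is an arbitrary example; the odds ratio itself says nothing about absolute risk without one.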
Sentinel lymph node status is a central prognostic factor for melanomas. However, the surgical excision involves some risks for affected patients. In this study, we therefore aimed to develop a digital biomarker that can predict lymph node metastasis non-invasively from digitised H&E slides of primary melanoma tumours.
A total of 415 H&E slides from primary melanoma tumours with known sentinel node (SN) status from three German university hospitals and one private pathological practice were digitised (150 SN positive/265 SN negative). Two hundred ninety-one slides were used to train artificial neural networks (ANNs). The remaining 124 slides were used to test the ability of the ANNs to predict sentinel status. ANNs were trained and/or tested on data sets that were matched or not matched between SN-positive and SN-negative cases for patient age, ulceration, and tumour thickness, factors that are known to correlate with lymph node status.
The best performance, an area under the receiver operating characteristic curve (AUROC) of 61.8% ± 0.2%, was achieved by an ANN that was trained and tested on unmatched cases. In contrast, ANNs that were trained and/or tested on matched cases achieved an AUROC of 55.0% ± 3.5% or less.
Our results indicate that the image classifier can predict lymph node status to some, albeit so far not clinically relevant, extent. It may do so by mostly detecting equivalents of factors on histological slides that are already known to correlate with lymph node status. Our results provide a basis for future research with larger data cohorts.
• Lymph node analysis conveys relevant prognostic information.
• Surgical removal of lymph nodes can be associated with considerable morbidity.
• Thus, biomarkers that allow prediction of lymph node status are needed.
• We trained a CNN on H&E slides of node-positive and node-negative primary melanomas.
• This classifier could predict sentinel node status to some extent.
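Matching SN-positive and SN-negative cases on known risk factors can be done in several ways; the abstract does not say which was used. The sketch below is one simple option, a greedy 1:1 match requiring identical ulceration status and bounded gaps in age and tumour thickness. The field names, tolerances and toy cases are all hypothetical.

```python
def match_cases(positives, negatives, max_age_gap=5, max_thickness_gap=0.5):
    """Greedy 1:1 matching of positive to negative cases.

    Each case is a dict with keys "age", "ulceration" and "thickness_mm".
    Positives without an eligible partner are left unmatched.
    """
    unused = list(negatives)
    pairs = []
    for pos in positives:
        candidates = [
            neg for neg in unused
            if neg["ulceration"] == pos["ulceration"]
            and abs(neg["age"] - pos["age"]) <= max_age_gap
            and abs(neg["thickness_mm"] - pos["thickness_mm"]) <= max_thickness_gap
        ]
        if not candidates:
            continue
        # Prefer the closest tumour thickness, then the closest age
        best = min(
            candidates,
            key=lambda neg: (
                abs(neg["thickness_mm"] - pos["thickness_mm"]),
                abs(neg["age"] - pos["age"]),
            ),
        )
        unused.remove(best)
        pairs.append((pos, best))
    return pairs


# Hypothetical toy cases
positives = [
    {"id": "P1", "age": 60, "ulceration": True, "thickness_mm": 2.0},
    {"id": "P2", "age": 45, "ulceration": False, "thickness_mm": 1.0},
]
negatives = [
    {"id": "N1", "age": 62, "ulceration": True, "thickness_mm": 2.2},
    {"id": "N2", "age": 44, "ulceration": True, "thickness_mm": 1.1},
    {"id": "N3", "age": 80, "ulceration": False, "thickness_mm": 1.0},
]
for pos, neg in match_cases(positives, negatives):
    print(pos["id"], "matched with", neg["id"])
```

Matching removes the risk-factor signal that separates the groups, which is consistent with the lower AUROC the abstract reports on matched cases.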