• Demonstrate a bias in a commonly used publicly available brain MRI dataset.
• Use several tests to isolate and explain this bias.
• This work shows the importance of understanding machine learning models.
Machine learning is revolutionising medical image analysis, and clearly the future of the field lies in this direction. However, with increasing automation there is a danger of misunderstanding or misinterpreting models. In this paper, we expose an underlying bias in a commonly used publicly available brain tumour MRI dataset. We propose that this is due to implicit radiologist input in the selection of the 2D slices. Through several experiments we show how this bias allows us to achieve a high tumour classification accuracy, even with no information regarding the tumour itself. No other papers that use the dataset mention this bias. These findings demonstrate the importance of understanding machine learning models and their medical context, and the perils of not doing so.
Textural and shape analysis is gaining considerable interest in medical imaging, particularly to identify parameters characterizing tumor heterogeneity and to feed radiomic models. Here, we present a free, multiplatform, and easy-to-use freeware called LIFEx, which enables the calculation of conventional, histogram-based, textural, and shape features from PET, SPECT, MR, CT, and US images, or from any combination of imaging modalities. The application does not require any programming skills and was developed for medical imaging professionals. The goal is that independent and multicenter evidence of the usefulness and limitations of radiomic features for characterization of tumor heterogeneity and subsequent patient management can be gathered. Many options are offered for interactive textural index calculation and for increasing the reproducibility among centers. The software already benefits from a large user community (more than 800 registered users), and interactions within that community are part of the development strategy.
This study presents a user-friendly, multi-platform freeware to extract radiomic features from PET, SPECT, MR, CT, and US images, or any combination of imaging modalities.
The use of texture indices to characterize tumor heterogeneity from PET images is being increasingly investigated in retrospective studies, yet the interpretation of PET-derived texture index values has not been thoroughly reported. Furthermore, the calculation of texture indices lacks a standardized methodology, making it difficult to compare published results. To allow for texture index value interpretation, we investigated the changes in value of 6 texture indices computed from simulated and real patient data.
Ten sphere models mimicking different activity distribution patterns and the 18F-FDG PET images from 54 patients with breast cancer were used. For each volume of interest, 6 texture indices were measured. The values of texture indices and how they changed as a function of the activity distribution were assessed and compared with the visual assessment of tumor heterogeneity.
Using the sphere models and real tumors, we identified 2 sets of texture indices reflecting different types of uptake heterogeneity. Set 1 included homogeneity, entropy, short-run emphasis, and long-run emphasis, all of which were sensitive to the presence of uptake heterogeneity but did not distinguish between hyper- and hyposignal within an otherwise uniform activity distribution. Set 2 comprised high-gray-level-zone emphasis and low-gray-level-zone emphasis, which were mostly sensitive to the average uptake rather than to the local uptake heterogeneity. Four of 6 texture indices significantly differed between homogeneous and heterogeneous lesions as defined by 2 nuclear medicine physicians (P < 0.05). All texture index values were sensitive to voxel size (variations up to 85.8% for the most homogeneous sphere models) and edge effects (variations up to 29.1%).
Unlike a previous report, our study found that variations in texture indices were intuitive in the sphere models and real tumors: the most homogeneous uptake distribution exhibited the highest homogeneity and lowest entropy. Two families of texture index reflecting different types of uptake patterns were identified. Variability in texture index values as a function of voxel size and inclusion of tumor edges was demonstrated, calling for a standardized calculation methodology. This study provides guidance for nuclear medicine physicians in interpreting texture indices in future studies and clinical practice.
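The intuition stated above (the most homogeneous distribution has the highest homogeneity and the lowest entropy) can be checked on toy data. The following is a minimal sketch, not the implementation used in the study: it assumes a 2D image already resampled to integer gray levels, a single horizontal pixel offset, one common definition of homogeneity (the inverse difference form), and base-2 entropy. The function names and the synthetic images are hypothetical.

```python
import numpy as np


def glcm(image, levels, offset=(0, 1)):
    """Symmetric, normalised co-occurrence matrix for a single pixel offset."""
    dy, dx = offset
    rows, cols = image.shape
    counts = np.zeros((levels, levels), dtype=float)
    for y in range(rows):
        for x in range(cols):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < rows and 0 <= x2 < cols:
                i, j = image[y, x], image[y2, x2]
                counts[i, j] += 1  # count each pair in both directions
                counts[j, i] += 1
    return counts / counts.sum()


def homogeneity(p):
    """Inverse difference: sum of p(i, j) / (1 + |i - j|). Assumed definition."""
    i, j = np.indices(p.shape)
    return float(np.sum(p / (1.0 + np.abs(i - j))))


def entropy(p):
    """Shannon entropy of the co-occurrence probabilities, in bits (assumed base)."""
    nonzero = p[p > 0]
    return float(-np.sum(nonzero * np.log2(nonzero)))


# Synthetic stand-ins for a uniform and a heterogeneous uptake pattern.
uniform_p = glcm(np.full((8, 8), 3), levels=8)
noisy_image = np.random.default_rng(0).integers(0, 8, size=(8, 8))
noisy_p = glcm(noisy_image, levels=8)

print("uniform:", homogeneity(uniform_p), entropy(uniform_p))
print("noisy:  ", homogeneity(noisy_p), entropy(noisy_p))
```

In this toy setting the uniform image concentrates all co-occurrences in one cell, so its homogeneity is maximal and its entropy is zero. The values for real images depend on the gray-level resampling, which is consistent with the call for a standardized calculation methodology.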
Several reports have shown that radiomic features are affected by acquisition and reconstruction parameters, thus hampering multicenter studies. We propose a method that, by removing the center effect while preserving patient-specific effects, standardizes features measured from PET images obtained using different imaging protocols.
Pretreatment 18F-FDG PET images of patients with breast cancer were included. In one nuclear medicine department (department A), 63 patients were scanned on a time-of-flight PET/CT scanner, and 16 lesions were triple-negative (TN). In another nuclear medicine department (department B), 74 patients underwent PET/CT on a different brand of scanner and a different reconstruction protocol, and 15 lesions were TN. The images from department A were smoothed using a gaussian filter to mimic data from a third department (department A-S). The primary lesion was segmented to obtain a lesion volume of interest (VOI), and a spheric VOI was set in healthy liver tissue. Three SUVs and 6 textural features were computed in all VOIs. A harmonization method initially described for genomic data was used to estimate the department effect based on the observed feature values. Feature distributions in each department were compared before and after harmonization.
In healthy liver tissue, the distributions significantly differed for 4 of 9 features between departments A and B and for 6 of 9 between departments A and A-S (P < 0.05, Wilcoxon test). After harmonization, none of the 9 feature distributions significantly differed between 2 departments (P > 0.1). The same trend was observed in lesions, with a realignment of feature distributions between the departments after harmonization. Identification of TN lesions was largely enhanced after harmonization when the cutoffs were determined on data from one department and applied to data from the other department.
The proposed harmonization method is efficient at removing the multicenter effect for textural features and SUVs. The method is easy to use, retains biologic variations not related to a center effect, and does not require any feature recalculation. Such harmonization allows for multicenter studies and for external validation of radiomic models or cutoffs and should facilitate the use of radiomic models in clinical practice.
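The idea of estimating a department effect from the observed feature values and removing it can be illustrated with a deliberately simplified location and scale adjustment. This sketch is not the published method: it omits the empirical Bayes estimation and any covariates, handles a single feature, and runs on synthetic values. The function name `harmonize_location_scale` and the simulated numbers are hypothetical; only the group sizes (63 and 74) echo the abstract.

```python
import numpy as np


def harmonize_location_scale(values, batch):
    """Shift and rescale each batch to the pooled mean and pooled within-batch spread.

    Simplified illustration only: no empirical Bayes shrinkage, no covariates.
    Each batch needs at least two distinct values.
    """
    values = np.asarray(values, dtype=float)
    batch = np.asarray(batch)
    pooled_mean = values.mean()

    # Remove each batch's own mean (the additive department effect).
    centered = values.copy()
    for b in np.unique(batch):
        mask = batch == b
        centered[mask] -= values[mask].mean()
    pooled_std = centered.std()

    # Remove each batch's own spread (the multiplicative department effect).
    harmonized = np.empty_like(values)
    for b in np.unique(batch):
        mask = batch == b
        harmonized[mask] = pooled_mean + pooled_std * centered[mask] / centered[mask].std()
    return harmonized


# Synthetic feature values for two departments with different offsets and spreads.
rng = np.random.default_rng(0)
values = np.concatenate([rng.normal(5.0, 1.0, 63), rng.normal(8.0, 2.0, 74)])
department = np.array(["A"] * 63 + ["B"] * 74)
harmonized = harmonize_location_scale(values, department)

for name in ("A", "B"):
    subset = harmonized[department == name]
    print(name, round(subset.mean(), 3), round(subset.std(), 3))
```

Because each department undergoes a positive linear transformation, the ranking of patients within a department is unchanged and no feature needs to be recalculated from the images. Unlike the published method, this toy version would also erase genuine biologic differences between department populations, which is one reason the full approach is preferable.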
The impact of PET image acquisition and reconstruction parameters on SUV measurements or radiomic feature values is widely documented. This scanner effect is detrimental to the design and validation of predictive or prognostic models and limits the use of large multicenter cohorts. To reduce the impact of this scanner effect, the ComBat method has been proposed and is now used in various contexts. The purpose of this article is to explain and illustrate the use of ComBat based on practical examples. We also give examples in which the ComBat assumptions are not met and, thus, in which ComBat should not be used.
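For orientation, the ComBat model is usually written as below for a single feature. The notation is an assumption, not taken from the abstract: y_ij is the value measured in region j on scanner i, α is the average feature value, X_ij β collects optional covariates, γ_i and δ_i are the additive and multiplicative scanner effects, and ε_ij is an error term. Hats denote estimates.

```latex
% ComBat model for one feature (notation assumed, see lead-in)
y_{ij} = \alpha + X_{ij}\beta + \gamma_i + \delta_i\,\varepsilon_{ij}

% Harmonized value once the scanner effects have been estimated
y_{ij}^{\mathrm{ComBat}} = \frac{y_{ij} - \hat{\alpha} - X_{ij}\hat{\beta} - \hat{\gamma}_i}{\hat{\delta}_i} + \hat{\alpha} + X_{ij}\hat{\beta}
```

Written this way, the assumption is explicit: the scanner effect must be describable as a shift and a scaling of the feature distribution. Situations where that does not hold are among those in which ComBat should not be used.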