Machine learning, and especially deep learning, is rapidly gaining acceptance and clinical usage in a wide range of image analysis applications and is regarded as providing high performance in detecting anatomical structures and in identifying and classifying patterns of disease in medical images. However, there are many roadblocks to the widespread implementation of machine learning in clinical image analysis, including differences in data capture leading to different measurements, the high dimensionality of imaging and other medical data, and the black-box nature of machine learning, with a lack of insight into relevant features. Techniques such as radiomics have been used in traditional machine learning approaches to model the mathematical relationships between adjacent pixels in an image and provide an explainable framework for clinicians and researchers. Newer paradigms, such as topological data analysis (TDA), have recently been adopted to design and develop innovative image analysis schemes that go beyond the abilities of pixel-to-pixel comparisons. TDA can automatically construct filtrations of topological shapes of image texture through a technique known as persistent homology (PH); these features can then be fed into machine learning models that provide explainable outputs and can distinguish different image classes in a computationally more efficient way than other currently used methods. The aim of this review is to introduce PH and its variants and to review TDA's recent successes in medical imaging studies.
Key points
Topological data analysis (TDA) provides information on the shape of data.
In radiology, the shape of 2D and 3D images contains additional information.
TDA can be combined with other applications, such as textural analysis.
Persistent homology can provide a visual representation of extracted TDA data.
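As an illustrative sketch of the filtration idea behind persistent homology (not any specific pipeline from the studies reviewed here), the function below computes the 0-dimensional persistence barcode of a grayscale image's sublevel-set filtration using a union-find structure: each connected component of pixels is "born" at its minimum intensity and "dies" when it merges into an older component (the elder rule); zero-persistence pairs are dropped. All names are illustrative.

```python
import numpy as np

def persistence_0d(img):
    """0-dimensional persistent homology of the sublevel-set filtration of a
    2D grayscale image. Returns sorted (birth, death) bars; the component
    containing the global minimum never dies (death = infinity)."""
    _, w = img.shape
    order = np.argsort(img, axis=None)   # flat pixel indices, by intensity
    parent, birth, bars = {}, {}, []

    def find(p):                          # union-find root with path halving
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for idx in order:
        y, x = divmod(int(idx), w)
        parent[(y, x)] = (y, x)
        birth[(y, x)] = img[y, x]
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            n = (y + dy, x + dx)
            if n in parent:               # neighbor already entered the filtration
                ra, rb = find((y, x)), find(n)
                if ra != rb:
                    if birth[ra] > birth[rb]:
                        ra, rb = rb, ra   # elder rule: younger component dies
                    if birth[rb] < img[y, x]:   # skip zero-persistence pairs
                        bars.append((birth[rb], img[y, x]))
                    parent[rb] = ra
    bars.append((min(birth.values()), np.inf))
    return sorted(bars)
```

Barcode statistics (number of bars, bar lengths, birth/death distributions) computed this way are the kind of features that can then be passed to a downstream classifier.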
Deep-learning algorithms typically fall within the domain of supervised artificial intelligence and are designed to “learn” from annotated data. Deep-learning models require large, diverse training datasets for optimal model convergence. The effort to curate these datasets is widely regarded as a barrier to the development of deep-learning systems. We developed RIL-Contour to accelerate medical image annotation for and with deep learning. A major goal driving the development of the software was to create an environment that enables clinically oriented users to utilize deep-learning models to rapidly annotate medical imaging. RIL-Contour supports fully automated deep-learning methods, semi-automated methods, and manual methods to annotate medical imaging with voxel and/or text annotations. To reduce annotation error, RIL-Contour promotes the standardization of image annotations across a dataset. RIL-Contour accelerates medical imaging annotation through the process of annotation by iterative deep learning (AID). The underlying concept of AID is to iteratively annotate, train, and utilize deep-learning models during the process of dataset annotation and model development. To enable this, RIL-Contour supports workflows in which multiple image analysts annotate medical images, radiologists approve the annotations, and data scientists utilize these annotations to train deep-learning models. To automate the feedback loop between data scientists and image analysts, RIL-Contour provides mechanisms that enable data scientists to push newly trained deep-learning models to other users of the software. RIL-Contour and the AID methodology accelerate dataset annotation and model development by facilitating rapid collaboration between analysts, radiologists, and engineers.
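The AID loop can be caricatured with a toy numeric stand-in: a trivial threshold "model" replaces the deep network, and ground-truth labels stand in for a human reviewer's corrections. This is not RIL-Contour's API; every name here is hypothetical.

```python
# Toy sketch of annotation by iterative deep learning (AID): annotate a
# seed, train, pre-annotate the next batch, let a reviewer correct it,
# fold the corrected labels back in, and repeat.

def train(annotated):
    # threshold midway between the two class means (toy stand-in for training)
    lo = [x for x, y in annotated if y == 0]
    hi = [x for x, y in annotated if y == 1]
    return (sum(lo) / len(lo) + sum(hi) / len(hi)) / 2

def aid_loop(pool, truth, batch=4):
    annotated = list(zip(pool[:2], truth[:2]))   # small manually annotated seed
    corrections = []
    i = 2
    while i < len(pool):
        thr = train(annotated)                   # (re)train on all labels so far
        items, labels = pool[i:i + batch], truth[i:i + batch]
        pre = [int(x > thr) for x in items]      # model pre-annotates the batch
        corrections.append(sum(p != t for p, t in zip(pre, labels)))  # reviewer fixes
        annotated += list(zip(items, labels))    # corrected labels join the dataset
        i += batch
    return train(annotated), corrections
```

The point of the pattern is that the per-batch correction count shrinks as the model improves, so reviewer effort is spent on fixing errors rather than annotating from scratch.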
Purpose
Distinguishing stage 1–2 adrenocortical carcinoma (ACC) and large, lipid poor adrenal adenoma (LPAA) via imaging is challenging due to overlapping imaging characteristics. This study investigated the ability of deep learning to distinguish ACC and LPAA on single time-point CT images.
Methods
Retrospective cohort study from 1994 to 2022. Imaging studies of patients with adrenal masses who had adequate CT studies and histology as the reference standard (obtained by adrenal biopsy and/or adrenalectomy) were included, as were four patients with LPAA determined by stability or regression on follow-up imaging. Forty-eight (48) subjects with pathology-proven, stage 1–2 ACC and 43 subjects with adrenal adenoma >3 cm in size demonstrating a mean non-contrast CT attenuation >20 Hounsfield Units centrally were included. We used annotated single time-point contrast-enhanced CT images of these adrenal masses as input to a 3D Densenet121 model for classification as ACC or LPAA with five-fold cross-validation. For each fold, two checkpoints were reported: the checkpoint with the highest accuracy, with ties broken by sensitivity (accuracy-focused), and the checkpoint with the highest sensitivity, with ties broken by accuracy (sensitivity-focused).
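The two checkpoint-selection criteria can be expressed as a small helper: pick the epoch maximizing accuracy with sensitivity as the tie-breaker, and vice versa. This is a sketch of the stated selection rule, not the authors' code.

```python
def select_checkpoints(metrics):
    """metrics: list of (epoch, accuracy, sensitivity) tuples.
    Returns the (accuracy-focused, sensitivity-focused) checkpoints."""
    acc_focused = max(metrics, key=lambda m: (m[1], m[2]))   # accuracy, then sensitivity
    sens_focused = max(metrics, key=lambda m: (m[2], m[1]))  # sensitivity, then accuracy
    return acc_focused, sens_focused
```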
Results
We trained a deep learning model (3D Densenet121) to predict ACC versus LPAA. The sensitivity-focused model achieved a mean accuracy of 87.2% and a mean sensitivity of 100%. The accuracy-focused model achieved a mean accuracy of 91% and a mean sensitivity of 96%.
Conclusion
Deep learning demonstrates promising results in distinguishing between ACC and large LPAA using single time-point CT images. Before this approach is widely adopted in clinical practice, multicentric and external validation are needed.
Preoperative MR imaging in endometrial cancer patients provides valuable information on local tumor extent, which routinely guides choice of surgical procedure and adjuvant therapy. Furthermore, whole-volume tumor analyses of MR images may provide radiomic tumor signatures potentially relevant for better individualization and optimization of treatment. We apply a convolutional neural network for automatic tumor segmentation in endometrial cancer patients, enabling automated extraction of tumor texture parameters and tumor volume. The network was trained, validated and tested on a cohort of 139 endometrial cancer patients based on preoperative pelvic imaging. The algorithm was able to retrieve tumor volumes comparable to human expert level (likelihood-ratio test, Formula: see text). The network was also able to provide a set of segmentation masks with human agreement not different from inter-rater agreement of human experts (Wilcoxon signed rank test, Formula: see text, Formula: see text, and Formula: see text). An automatic tool for tumor segmentation in endometrial cancer patients enables automated extraction of tumor volume and whole-volume tumor texture features. This approach represents a promising method for automatic radiomic tumor profiling with potential relevance for better prognostication and individualization of therapeutic strategy in endometrial cancer.
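Agreement between an automatic segmentation mask and an expert mask is commonly scored with the Dice coefficient, the same kind of overlap measure that underlies inter-rater comparisons like the one above. A minimal version (an illustration, not the study's evaluation code):

```python
import numpy as np

def dice(a, b):
    """Dice similarity between two binary masks (1.0 = identical):
    twice the overlap divided by the total foreground of both masks."""
    a, b = a.astype(bool), b.astype(bool)
    inter = np.logical_and(a, b).sum()
    total = a.sum() + b.sum()
    return 1.0 if total == 0 else 2.0 * inter / total
```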
Establishing imaging registries for large patient cohorts is challenging because manual labeling is tedious and relying solely on DICOM (digital imaging and communications in medicine) metadata can result in errors. We endeavored to establish an automated hip and pelvic radiography registry of total hip arthroplasty (THA) patients by utilizing deep-learning pipelines. The aims of the study were (1) to utilize these automated pipelines to identify all pelvic and hip radiographs with appropriate annotation of laterality and presence or absence of implants, and (2) to automatically measure acetabular component inclination and version for THA images.
We retrospectively retrieved 846,988 hip and pelvic radiography DICOM files from 20,378 patients who underwent primary or revision THA performed at our institution from 2000 to 2020. Metadata for the files were screened followed by extraction of imaging data. Two deep-learning algorithms (an EfficientNetB3 classifier and a YOLOv5 object detector) were developed to automatically determine the radiographic appearance of all files. Additional deep-learning algorithms were utilized to automatically measure the acetabular angles on anteroposterior pelvic and lateral hip radiographs. Algorithm performance was compared with that of human annotators on a random test sample of 5,000 radiographs.
Deep-learning algorithms enabled appropriate exclusion of 209,332 DICOM files (24.7%) as misclassified non-hip/pelvic radiographs or having corrupted pixel data. The final registry was automatically curated and annotated in <8 hours and included 168,551 anteroposterior pelvic, 176,890 anteroposterior hip, 174,637 lateral hip, and 117,578 oblique hip radiographs. The algorithms achieved 99.9% accuracy, 99.6% precision, 99.5% recall, and a 99.6% F1 score in determining the radiograph appearance.
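The reported accuracy, precision, recall, and F1 score derive from the confusion-matrix counts in the usual way; as a quick reference (not the study's code):

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)       # of predicted positives, how many are real
    recall = tp / (tp + fn)          # of real positives, how many were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return accuracy, precision, recall, f1
```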
We developed a highly accurate series of deep-learning algorithms to rapidly curate and annotate THA patient radiographs. This efficient pipeline can be utilized by other institutions or registries to construct radiography databases for patient care, longitudinal surveillance, and large-scale research. The stepwise approach for establishing a radiography registry can further be utilized as a workflow guide for other anatomic areas.
Diagnostic Level IV. See Instructions for Authors for a complete description of levels of evidence.
Curating and integrating data from sources are bottlenecks to procuring robust training datasets for artificial intelligence (AI) models in healthcare. While numerous applications can process discrete types of clinical data, it is still time-consuming to integrate heterogeneous data types. Therefore, there exists a need for more efficient retrieval and storage of curated patient data from dissimilar sources, such as biobanks, health records, and sensors. We describe a customizable, modular data retrieval application (RIL-workflow), which integrates clinical notes, images, and prescription data, and show its feasibility applied to research at our institution. It uses the workflow automation platform Camunda (Camunda Services GmbH, Berlin, Germany) to collect internal data from Fast Healthcare Interoperability Resources (FHIR) and Digital Imaging and Communications in Medicine (DICOM) sources. Using the web-based graphical user interface (GUI), the workflow runs tasks to completion according to its visual representation, retrieving and storing results for patients meeting study inclusion criteria while segregating errors for human review. We showcase RIL-workflow with its library of ready-to-use modules, enabling researchers to specify human input or automation at fixed steps. We validated our workflow by demonstrating its capability to aggregate, curate, and handle errors related to data from multiple sources to generate a multimodal database for clinical AI research. Further, we solicited user feedback to highlight the pros and cons associated with RIL-workflow. The source code is available at github.com/magnooj/RIL-workflow.
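The segregate-errors-for-review pattern can be sketched in a few lines; `fetch` below is a hypothetical stand-in for a FHIR/DICOM retrieval module, not part of RIL-workflow's actual interface.

```python
# Minimal sketch: run one retrieval task per patient, keep successes, and
# route failures to a review queue rather than aborting the whole run.

def run_workflow(patient_ids, fetch):
    results, review_queue = {}, []
    for pid in patient_ids:
        try:
            results[pid] = fetch(pid)
        except Exception as exc:                 # segregate errors for human review
            review_queue.append((pid, str(exc)))
    return results, review_queue
```

The design choice is that a single corrupted record costs one review-queue entry instead of a failed batch job over thousands of patients.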
Context:
Pituitary stalk lesions have various etiologies, often not clinically apparent. Pathological samples from these lesions are rarely obtained because of the critical location and function of the hypophyseal stalk.
Objectives:
The purpose of this study was to characterize the etiological spectrum of pituitary stalk lesions seen at Mayo Clinic Rochester over 20 years and to determine whether specific magnetic resonance imaging (MRI) characteristics could provide clinician guidance with regard to the etiology of infundibular lesions.
Design:
A retrospective review of patients with pituitary stalk lesions seen at Mayo Clinic Rochester between 1987 and 2006 was conducted. Demographic, clinical presentation, imaging, laboratory, operative, and pathology data were reviewed and are reported using descriptive statistics.
Results:
Of the 152 pituitary stalk lesions included, 49 (32%) were neoplastic, 30 (20%) were inflammatory, 13 (9%) were congenital anomalies, and 60 (39%) were of unclear etiology. Diabetes insipidus was diagnosed in 43 (28%) of the 152 patients, and 49 (32%) patients had at least one anterior pituitary hormone deficit. Secondary hypogonadism was the most common endocrine deficiency. Eleven of 13 congenital lesions were round in appearance and 5 of 7 patients with neurosarcoidosis confirmed by pathology had a uniformly thickened pituitary stalk on MRI. There were no statistically significant correlations between hypopituitarism and the pattern of enhancement or size of the lesion.
Conclusions:
Findings on MRI remain key in guiding the diagnosis of pituitary stalk lesions, particularly when used in conjunction with other clinical clues. There are no good imaging predictors for hypopituitarism, making clinical evaluation of all patients with pituitary stalk lesions crucial.
Predicting O6-methylguanine methyltransferase (MGMT) gene methylation status from MRI is of high importance since it is a predictor of response and prognosis in brain tumors. In this study, we compare three different residual deep neural network (ResNet) architectures to evaluate their ability to predict MGMT methylation status without the need for a distinct tumor segmentation step. We found that the ResNet50 (50 layers) architecture was the best performing model, achieving an accuracy of 94.90% (±3.92%) on the test set (classification of a slice as no tumor, methylated MGMT, or non-methylated). ResNet34 (34 layers) achieved 80.72% (±13.61%), while ResNet18 (18 layers) accuracy was 76.75% (±20.67%). ResNet50 performance was statistically significantly better than both the ResNet18 and ResNet34 architectures (p < 0.001). We report a method that alleviates the need for extensive preprocessing and acts as a proof of concept that deep neural architectures can be used to predict molecular biomarkers from routine medical images.
Revision total hip arthroplasty (THA) requires preoperatively identifying in situ implants, a time-consuming and sometimes unachievable task. Although deep learning (DL) tools have been developed to automate this process, existing approaches are limited: they classify few femoral and no acetabular components, operate only on anterior-posterior (AP) radiographs, and do not report prediction uncertainty or flag outlier data.
This study introduces Total Hip Arthroplasty Automated Implant Detector (THA-AID), a DL tool trained on 241,419 radiographs that identifies common designs of 20 femoral and 8 acetabular components from AP, lateral, or oblique views, reports prediction uncertainty using conformal prediction, and flags outlier data using a custom framework. We evaluated THA-AID using internal, external, and out-of-domain test sets and compared its performance with that of human experts.
THA-AID achieved internal test set accuracies of 98.9% for both femoral and acetabular components with no significant differences based on radiographic view. The femoral classifier also achieved 97.0% accuracy on the external test set. Adding conformal prediction increased true label prediction by 0.1% for acetabular and 0.7 to 0.9% for femoral components. More than 99% of out-of-domain and >89% of in-domain outlier data were correctly identified by THA-AID.
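Split conformal prediction, as used here for uncertainty reporting, can be sketched generically (this is the standard recipe, not THA-AID's implementation): calibrate a nonconformity-score threshold on held-out labeled data, then emit, for each new image, the set of all classes within that threshold.

```python
import numpy as np

def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Split conformal calibration: nonconformity score is 1 minus the
    model's probability for the true class; the threshold is the
    finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(cal_labels)
    scores = np.sort(1.0 - cal_probs[np.arange(n), cal_labels])
    k = min(int(np.ceil((n + 1) * (1 - alpha))) - 1, n - 1)
    return scores[k]

def prediction_set(probs, threshold):
    """All classes whose nonconformity score falls within the threshold."""
    return [k for k, p in enumerate(probs) if 1.0 - p <= threshold]
```

Under the usual exchangeability assumption, the resulting sets contain the true label with probability at least 1 − alpha; confident predictions yield singleton sets, while ambiguous images yield larger (or empty) sets that can be flagged for review.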
THA-AID is an automated tool for implant identification from radiographs with exceptional performance on internal and external test sets and no decrement in performance based on radiographic view. To our knowledge, this is the first study in orthopedics to include uncertainty quantification and outlier detection for a DL model.
Idiopathic pulmonary fibrosis (IPF) is a progressive, often fatal form of interstitial lung disease (ILD) characterized by the absence of a known cause and a usual interstitial pneumonitis (UIP) pattern on chest CT imaging and/or histopathology. Distinguishing UIP/IPF from other ILD subtypes is essential given different treatments and prognosis. Lung biopsy is necessary when noninvasive data are insufficient to render a confident diagnosis.
Can we improve noninvasive diagnosis of UIP by using deep learning to predict ILD histopathology from CT scans?
This study retrospectively identified a cohort of 1,239 patients in a multicenter database with pathologically proven ILD who had chest CT imaging. Each case was assigned a label based on histopathologic diagnosis (UIP or non-UIP). A custom deep learning model was trained to predict class labels from CT images (training set, n = 894) and was evaluated on a 198-patient test set. Separately, two subspecialty-trained radiologists manually labeled each CT scan in the test set according to the 2018 American Thoracic Society IPF guidelines. The performance of the model in predicting histopathologic class was compared against radiologists’ performance by using area under the receiver-operating characteristic curve as the primary metric. Deep learning model reproducibility was compared against intra-rater and inter-rater radiologist reproducibility.
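The primary metric, area under the receiver-operating characteristic curve, equals the probability that a randomly chosen positive case is scored above a randomly chosen negative one. A minimal rank-based computation (illustrative only, not the study's code):

```python
def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity:
    the fraction of positive/negative pairs where the positive case is
    scored higher, counting ties as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```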
For the entire cohort, mean patient age was 62 ± 12 years, and 605 patients were female (49%). Deep learning performance was superior to visual analysis in predicting histopathologic diagnosis (area under the receiver-operating characteristic curve, 0.87 vs 0.80, respectively; P < .05). Deep learning model reproducibility was significantly greater than radiologist inter-rater and intra-rater reproducibility (95% CI for difference in Krippendorff’s alpha did not include zero).
Deep learning may be superior to visual assessment in predicting UIP/IPF histopathology from CT imaging and may serve as an alternative to invasive lung biopsy.