Objectives
Diagnostic accuracy of artificial intelligence (AI) pneumothorax (PTX) detection in chest radiographs (CXR) is limited by the noisy annotation quality of public training data and confounding thoracic tubes (TT). We hypothesize that in-image annotations of the dehiscent visceral pleura for algorithm training boost the algorithm’s performance and suppress confounders.
Methods
Our single-center evaluation cohort of 3062 supine CXRs includes 760 PTX-positive cases with radiological annotations of PTX size and inserted TTs. Three step-by-step improved algorithms (differing in algorithm architecture, training data from public datasets/clinical sites, and in-image annotations included in algorithm training) were characterized by area under the receiver operating characteristic curve (AUROC) in detailed subgroup analyses and referenced to the well-established “CheXNet” algorithm.
Results
Performances of established algorithms exclusively trained on publicly available data without in-image annotations are limited to AUROCs of 0.778 and are strongly biased towards TTs that can completely eliminate the algorithm’s discriminative power in individual subgroups. In contrast, our final “algorithm 2”, which was trained on a lower number of images but additionally with in-image annotations of the dehiscent pleura, achieved an overall AUROC of 0.877 for unilateral PTX detection with a significantly reduced TT-related confounding bias.
Conclusions
We demonstrated strong limitations of an established PTX-detecting AI algorithm that can be significantly reduced by designing an AI system capable of learning to both classify and localize PTX. Our results are aimed at drawing attention to the necessity of high-quality in-image localization in training data to reduce the risks of unintentionally biasing the training process of pathology-detecting AI algorithms.
Key Points
• Established pneumothorax-detecting artificial intelligence algorithms trained on public training data are strongly limited and biased by confounding thoracic tubes.
• We used high-quality in-image annotated training data to effectively boost algorithm performance and suppress the impact of confounding thoracic tubes.
• Based on our results, we hypothesize that even hidden confounders might be effectively addressed by in-image annotations of pathology-related image features.
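The subgroup analysis described in the Methods can be illustrated with a small sketch. It is not the authors' evaluation code: the record fields (`ptx`, `tube`, `score`), the helper names, and the toy scores are hypothetical. AUROC is computed here as the fraction of (PTX-positive, PTX-negative) image pairs in which the positive image receives the higher score, with ties counting half, which equals the area under the ROC curve.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def subgroup_auroc(records, tube):
    """AUROC restricted to images with (tube=True) or without (tube=False) a thoracic tube."""
    subset = [r for r in records if r["tube"] == tube]
    return auroc([r["ptx"] for r in subset], [r["score"] for r in subset])


# Hypothetical toy cohort: the score reacts only to the tube, not to the pneumothorax.
records = (
    [{"ptx": 1, "tube": True, "score": 0.9}] * 3
    + [{"ptx": 1, "tube": False, "score": 0.1}]
    + [{"ptx": 0, "tube": True, "score": 0.9}]
    + [{"ptx": 0, "tube": False, "score": 0.1}] * 3
)

overall = auroc([r["ptx"] for r in records], [r["score"] for r in records])
with_tube = subgroup_auroc(records, tube=True)
without_tube = subgroup_auroc(records, tube=False)
```

In this toy cohort the overall AUROC is 0.75 although the score carries no PTX information within either stratum (both subgroup AUROCs are 0.5). This is the kind of tube-related confounding that only a subgroup analysis exposes.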
The evaluation of deep-learning (DL) systems typically relies on the Area under the Receiver-Operating-Curve (AU-ROC) as a performance metric. However, AU-ROC, in its holistic form, does not sufficiently consider performance within specific ranges of sensitivity and specificity, which are critical for the intended operational context of the system. Consequently, two systems with identical AU-ROC values can exhibit significantly divergent real-world performance. This issue is particularly pronounced in the context of anomaly detection tasks, a commonly employed application of DL systems across various research domains, including medical imaging, industrial automation, manufacturing, cyber security, fraud detection, and drug research, among others. The challenge arises from the heavy class imbalance in training datasets, with the abnormality class often incurring a considerably higher misclassification cost compared to the normal class. Traditional DL systems address this by adjusting the weighting of the cost function or optimizing for specific points along the ROC curve. While these approaches yield reasonable results in many cases, they do not actively seek to maximize performance for the desired operating point. In this study, we introduce a novel technique known as AUCReshaping, designed to reshape the ROC curve exclusively within the specified sensitivity and specificity range, by optimizing sensitivity at a predetermined specificity level. This reshaping is achieved through an adaptive and iterative boosting mechanism that allows the network to focus on pertinent samples during the learning process. We primarily investigated the impact of AUCReshaping in the context of abnormality detection tasks, specifically in Chest X-Ray (CXR) analysis, followed by breast mammogram and credit card fraud detection tasks. The results reveal a substantial improvement, ranging from 2 to 40%, in sensitivity at high-specificity levels for binary classification tasks.
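The operating-point metric the abstract refers to, sensitivity at a predetermined specificity, can be sketched as follows. This is only the evaluation metric, not the AUCReshaping training procedure. The function name, the tie-handling rule (an image is called abnormal only when its score is strictly above the threshold), and the toy scores are assumptions.

```python
import math


def sensitivity_at_specificity(labels, scores, target_specificity):
    """Return (sensitivity, threshold) at the lowest threshold meeting the target specificity.

    A case is predicted abnormal when its score is strictly greater than the threshold.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = sorted(s for y, s in zip(labels, scores) if y == 0)
    if not pos or not neg:
        raise ValueError("need at least one abnormal and one normal case")
    # Number of normal cases that must score at or below the threshold.
    # The small epsilon guards against floating-point round-up in the product.
    k = math.ceil(target_specificity * len(neg) - 1e-9)
    threshold = neg[k - 1] if k > 0 else float("-inf")
    detected = sum(1 for s in pos if s > threshold)
    return detected / len(pos), threshold


# Hypothetical scores: four normal cases followed by four abnormal cases.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.8, 0.25, 0.6, 0.7, 0.9]

sens_75, thr_75 = sensitivity_at_specificity(labels, scores, 0.75)
sens_100, thr_100 = sensitivity_at_specificity(labels, scores, 1.0)
```

On these toy scores, demanding 100% instead of 75% specificity drops sensitivity from 0.75 to 0.25, a difference that a single whole-curve AU-ROC value does not report.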
Airspace disease as seen on chest X-rays is an important point in triage for patients initially presenting to the emergency department (ED) with suspected COVID-19 infection. The purpose of this study is to evaluate a previously trained interpretable deep learning algorithm for the diagnosis and prognosis of COVID-19 pneumonia from chest X-rays obtained in the ED. This retrospective study included 2456 (50% RT-PCR positive for COVID-19) adult patients who received both a chest X-ray and SARS-CoV-2 RT-PCR test from January 2020 to March 2021 in the emergency department at a single U.S. institution. A total of 2000 patients were included as an additional training cohort and 456 patients in the randomized internal holdout testing cohort for a previously trained Siemens AI-Radiology Companion deep learning convolutional neural network algorithm. Three cardiothoracic fellowship-trained radiologists systematically evaluated each chest X-ray and generated an airspace disease area-based severity score which was compared against the same score produced by artificial intelligence. The interobserver agreement, diagnostic accuracy, and predictive capability for inpatient outcomes were assessed. Principal statistical tests used in this study include both univariate and multivariate logistic regression. Overall ICC was 0.820 (95% CI 0.790-0.840). The diagnostic AUC for SARS-CoV-2 RT-PCR positivity was 0.890 (95% CI 0.861-0.920) for the neural network and 0.936 (95% CI 0.918-0.960) for radiologists. Airspace opacities score by AI alone predicted ICU admission (AUC = 0.870) and mortality (AUC = 0.829) in all patients. Addition of age and BMI into a multivariate log model improved mortality prediction (AUC = 0.906). The deep learning algorithm provides an accurate and interpretable assessment of the disease burden in COVID-19 pneumonia on chest radiographs. The reported severity scores correlate with expert assessment and accurately predict important clinical outcomes. The algorithm contributes additional prognostic information not currently incorporated into patient management.
Positron Emission Tomography (PET), a non-invasive functional imaging method at the molecular level, images the distribution of biologically targeted radiotracers with high sensitivity. PET imaging provides detailed quantitative information about many diseases and is often used to evaluate inflammation, infection, and cancer by detecting emitted photons from a radiotracer localized to abnormal cells. In order to differentiate abnormal tissue from surrounding areas in PET images, image segmentation methods play a vital role; therefore, accurate image segmentation is often necessary for proper disease detection, diagnosis, treatment planning, and follow-ups. In this review paper, we present state-of-the-art PET image segmentation methods, as well as the recent advances in image segmentation techniques. In order to make this manuscript self-contained, we also briefly explain the fundamentals of PET imaging, the challenges of diagnostic PET image analysis, and the effects of these challenges on the segmentation results.
The computer-based process of identifying the boundaries of lung from surrounding thoracic tissue on computed tomographic (CT) images, which is called segmentation, is a vital first step in radiologic pulmonary image analysis. Many algorithms and software platforms provide image segmentation routines for quantification of lung abnormalities; however, nearly all of the current image segmentation approaches apply well only if the lungs exhibit minimal or no pathologic conditions. When moderate to high amounts of disease or abnormalities with a challenging shape or appearance exist in the lungs, computer-aided detection systems may be highly likely to fail to depict those abnormal regions because of inaccurate segmentation methods. In particular, abnormalities such as pleural effusions, consolidations, and masses often cause inaccurate lung segmentation, which greatly limits the use of image processing methods in clinical and research contexts. In this review, a critical summary of the current methods for lung segmentation on CT images is provided, with special emphasis on the accuracy and performance of the methods in cases with abnormalities and cases with exemplary pathologic findings. The currently available segmentation methods can be divided into five major classes: (a) thresholding-based, (b) region-based, (c) shape-based, (d) neighboring anatomy-guided, and (e) machine learning-based methods. The feasibility of each class and its shortcomings are explained and illustrated with the most common lung abnormalities observed on CT images. In an overview, practical applications and evolving technologies combining the presented approaches for the practicing radiologist are detailed.
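A minimal sketch of class (a), thresholding-based segmentation, shows why such methods struggle with pathology. It is an illustration under stated assumptions, not a method taken from the review: the cutoff of -400 Hounsfield units, the removal of air connected to the image border, and the toy slice are all hypothetical choices.

```python
from collections import deque


def threshold_lung_mask(slice_hu, threshold=-400):
    """Label pixels darker than `threshold` (assumed HU cutoff) that are not outside air."""
    rows, cols = len(slice_hu), len(slice_hu[0])
    air = [[value < threshold for value in row] for row in slice_hu]

    # Flood-fill from the image border to find air that lies outside the body.
    outside = [[False] * cols for _ in range(rows)]
    queue = deque()
    for r in range(rows):
        for c in range(cols):
            on_border = r in (0, rows - 1) or c in (0, cols - 1)
            if on_border and air[r][c]:
                outside[r][c] = True
                queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and air[nr][nc] and not outside[nr][nc]:
                outside[nr][nc] = True
                queue.append((nr, nc))

    # Remaining low-density pixels are treated as lung.
    return [
        [1 if air[r][c] and not outside[r][c] else 0 for c in range(cols)]
        for r in range(rows)
    ]


A, T, L, C = -1000, 40, -800, 60  # outside air, soft tissue, aerated lung, consolidation
toy_slice = [
    [A, A, A, A, A],
    [T, T, T, T, T],
    [T, L, L, C, T],
    [T, T, T, T, T],
    [T, T, T, T, T],
]
mask = threshold_lung_mask(toy_slice)
```

The two aerated pixels are labeled as lung, but the consolidated pixel `C` has soft-tissue density and is excluded from the mask, which mirrors the failure mode the review describes for dense abnormalities.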
In this study, we propose a novel pathological lung segmentation method that takes into account neighbor prior constraints and a novel pathology recognition system. Our proposed framework has two stages; during stage one, we adapted the fuzzy connectedness (FC) image segmentation algorithm to perform initial lung parenchyma extraction. In parallel, we estimate the lung volume using rib-cage information without explicitly delineating lungs. This rudimentary, but intelligent lung volume estimation system allows comparison of volume differences between rib cage and FC based lung volume measurements. Significant volume difference indicates the presence of pathology, which invokes the second stage of the proposed framework for the refinement of segmented lung. In stage two, texture-based features are utilized to detect abnormal imaging patterns (consolidations, ground glass, interstitial thickening, tree-in-bud, honeycombing, nodules, and micro-nodules) that might have been missed during the first stage of the algorithm. This refinement stage is further completed by a novel neighboring anatomy-guided segmentation approach to include abnormalities with weak textures, and pleura regions. We evaluated the accuracy and efficiency of the proposed method on more than 400 CT scans with the presence of a wide spectrum of abnormalities. To the best of our knowledge, this is the first study to evaluate all abnormal imaging patterns in a single segmentation framework. The quantitative results show that our pathological lung segmentation method improves on current standards because of its high sensitivity and specificity and may have considerable potential to enhance the performance of routine clinical tasks.
Computer-aided diagnosis (CAD) techniques for lung field segmentation from chest radiographs (CXR) have been proposed for adult cohorts, but rarely for pediatric subjects. Statistical shape models (SSMs), the workhorse of most state-of-the-art CXR-based lung field segmentation methods, do not efficiently accommodate shape variation of the lung field during the pediatric developmental stages. The main contributions of our work are: 1) a generic lung field segmentation framework from CXR accommodating large shape variation for adult and pediatric cohorts; 2) a deep representation learning detection mechanism, ensemble space learning, for robust object localization; and 3) marginal shape deep learning for the shape deformation parameter estimation. Unlike the iterative approach of conventional SSMs, the proposed shape learning mechanism transforms the parameter space into marginal subspaces that are solvable efficiently using the recursive representation learning mechanism. Furthermore, our method is the first to include the challenging retro-cardiac region in the CXR-based lung segmentation for accurate lung capacity estimation. The framework is evaluated on 668 CXRs of patients between 3 months and 89 years of age. We obtain a mean Dice similarity coefficient of 0.96 ± 0.03 (including the retro-cardiac region). For a given accuracy, the proposed approach is also found to be faster than conventional SSM-based iterative segmentation methods. The computational simplicity of the proposed generic framework could be similarly applied to the fast segmentation of other deformable objects.
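The Dice similarity coefficient reported above is defined for two binary masks A and B as 2|A ∩ B| / (|A| + |B|). A minimal sketch follows; the function name, the nested-list mask representation, and the convention of returning 1.0 when both masks are empty are assumptions for illustration.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient of two binary 2-D masks given as nested lists."""
    flat_a = [bool(v) for row in mask_a for v in row]
    flat_b = [bool(v) for row in mask_b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("masks must contain the same number of pixels")
    intersection = sum(1 for a, b in zip(flat_a, flat_b) if a and b)
    total = sum(flat_a) + sum(flat_b)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


# Hypothetical 2x2 masks: the automatic result misses one of two reference pixels.
reference = [[1, 1], [0, 0]]
automatic = [[1, 0], [0, 0]]
score = dice_coefficient(reference, automatic)  # 2 * 1 / (2 + 1)
```

A value of 1.0 means perfect overlap and 0.0 means no overlap, so the reported mean of 0.96 indicates that the automatic and reference lung fields are nearly identical.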
Analysis of cranial nerve systems, such as the anterior visual pathway (AVP), from MRI sequences is challenging due to their thin, long architecture, structural variations along the path, and low contrast with adjacent anatomic structures. Segmentation of a pathologic AVP (e.g., with low-grade gliomas) poses additional challenges. In this work, we propose a fully automated partitioned shape model segmentation mechanism for AVP steered by multiple MRI sequences and deep learning features. Employing deep learning feature representation, this framework presents a joint partitioned statistical shape model able to deal with healthy and pathological AVP. The deep learning assistance is particularly useful in the poor contrast regions, such as optic tracts and pathological areas. Our main contributions are: 1) a fast and robust shape localization method using conditional space deep learning, 2) a volumetric multiscale curvelet transform-based intensity normalization method for robust statistical model, and 3) optimally partitioned statistical shape and appearance models based on regional shape variations for greater local flexibility. Our method was evaluated on MRI sequences obtained from 165 pediatric subjects. A mean Dice similarity coefficient of 0.779 was obtained for the segmentation of the entire AVP (optic nerve only = 0.791) using the leave-one-out validation. Results demonstrated that the proposed localized shape and sparse appearance-based learning approach significantly outperforms current state-of-the-art segmentation approaches and is as robust as the manual segmentation.
• An accurate and efficient computational framework for airway quantification is proposed.
• A novel hybrid approach for precise segmentation of the lumen is presented.
• Two novel methods, both in 2-D and 3-D, to estimate the airway walls are explored.
• Better identification of airway surfaces than the widely applied methods is reported.
Inflammatory and infectious lung diseases commonly involve bronchial airway structures and morphology, and these abnormalities are often analyzed non-invasively through high resolution computed tomography (CT) scans. Assessing airway wall surfaces and the lumen is of great importance for diagnosing pulmonary diseases. However, obtaining high accuracy from a complete 3-D airway tree structure can be quite challenging. The airway tree structure has spiculated shapes with multiple branches and bifurcation points, as opposed to solid single organ or tumor segmentation tasks in other applications; hence, it is more complex to segment manually than other structures. For computerized methods, a fundamental challenge in airway tree segmentation is the highly variable intensity levels in the lumen area, which often cause a segmentation method to leak into adjacent lung parenchyma through blurred airway walls or soft boundaries. Moreover, outer wall definition can be difficult due to similar intensities of the airway walls and nearby structures such as vessels. In this paper, we propose a computational framework to accurately quantify airways through (i) a novel hybrid approach for precise segmentation of the lumen, and (ii) two novel methods (a spatially constrained Markov random walk method (pseudo 3-D) and a relative fuzzy connectedness method (3-D)) to estimate the airway wall thickness. We evaluate the performance of our proposed methods in comparison with the most commonly used algorithms using human chest CT images. Our results demonstrate that, on publicly available data sets and using standard evaluation criteria, the proposed airway segmentation method is accurate and efficient as compared with the state-of-the-art methods, and the airway wall estimation algorithms identified the inner and outer airway surfaces more accurately than the most widely applied methods, namely full width at half maximum and phase congruency.
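Full width at half maximum (FWHM), one of the baseline methods named above, estimates wall thickness from a 1-D intensity profile sampled along a ray that crosses the airway wall. The sketch below is a generic FWHM computation, not the implementation compared in the paper. It assumes a single peak, takes the profile minimum as the baseline, and interpolates linearly at the half-maximum crossings; the function name and the sample profile are hypothetical.

```python
def full_width_half_maximum(profile, spacing=1.0):
    """Width of the peak in `profile` at half its height above the minimum, in units of `spacing`."""
    baseline = min(profile)
    peak_index = max(range(len(profile)), key=profile.__getitem__)
    if profile[peak_index] == baseline:
        raise ValueError("profile is flat; no peak to measure")
    half = baseline + (profile[peak_index] - baseline) / 2.0

    def crossing(step):
        # Walk away from the peak while the next sample is still above half maximum.
        i = peak_index
        while 0 <= i + step < len(profile) and profile[i + step] > half:
            i += step
        j = i + step
        if not 0 <= j < len(profile):
            return float(i)  # the profile never falls to half maximum on this side
        # Linear interpolation between sample i (above half) and sample j (at or below half).
        fraction = (profile[i] - half) / (profile[i] - profile[j])
        return i + step * fraction

    return (crossing(+1) - crossing(-1)) * spacing


# Hypothetical wall profile: lumen, wall peak, parenchyma.
wall_profile = [0, 0, 10, 20, 10, 0, 0]
width_in_samples = full_width_half_maximum(wall_profile)
width_in_mm = full_width_half_maximum(wall_profile, spacing=0.5)  # assumed 0.5 mm sampling
```

For this symmetric profile the half-maximum level is 10, crossed at samples 2 and 4, giving a width of 2 samples (1.0 mm at the assumed 0.5 mm spacing). When nearby vessels raise the baseline on one side, a fixed half-maximum level shifts the crossings, which is the kind of bias that motivates the alternative methods in the abstract.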
The aim of this study was to leverage volumetric quantification of airspace disease (AD) derived from a superior modality (computed tomography, CT) serving as ground truth, projected onto digitally reconstructed radiographs (DRRs) to (1) train a convolutional neural network (CNN) to quantify AD on paired chest radiographs (CXRs) and CTs, and (2) compare the DRR-trained CNN to expert human readers in the CXR evaluation of patients with confirmed COVID-19.
We retrospectively selected a cohort of 86 COVID-19 patients (with positive reverse transcriptase-polymerase chain reaction test results) from March to May 2020 at a tertiary hospital in the northeastern United States, who underwent chest CT and CXR within 48 hours. The ground-truth volumetric percentage of COVID-19-related AD (POv) was established by manual AD segmentation on CT. The resulting 3-dimensional masks were projected into 2-dimensional anterior-posterior DRR to compute area-based AD percentage (POa). A CNN was trained with DRR images generated from a larger-scale CT dataset of COVID-19 and non-COVID-19 patients, automatically segmenting lungs, AD, and quantifying POa on CXR. The CNN POa results were compared with POa quantified on CXR by 2 expert readers and to the POv ground truth, by computing correlations and mean absolute errors.
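The relation between the volumetric percentage (POv) and the area-based percentage (POa) can be illustrated on a toy volume. This is a sketch under assumptions, not the study's DRR pipeline: masks are nested lists indexed `[z][y][x]`, the anterior-posterior direction is taken to be the `y` axis, and a projected pixel counts as affected if any voxel along its ray is affected.

```python
def count_3d(mask):
    """Number of set voxels in a [z][y][x] binary mask."""
    return sum(v for slab in mask for row in slab for v in row)


def count_2d(image):
    """Number of set pixels in a 2-D binary image."""
    return sum(v for row in image for v in row)


def project_ap(mask):
    """Collapse a [z][y][x] mask along y: a pixel is 1 if any voxel on its ray is 1."""
    return [
        [1 if any(row[x] for row in slab) else 0 for x in range(len(slab[0]))]
        for slab in mask
    ]


def percentage(part, whole):
    if whole == 0:
        raise ValueError("empty lung mask")
    return 100.0 * part / whole


# Hypothetical 2x2x2 lung with a single affected voxel.
lung = [[[1, 1], [1, 1]], [[1, 1], [1, 1]]]
disease = [[[1, 0], [0, 0]], [[0, 0], [0, 0]]]

pov = percentage(count_3d(disease), count_3d(lung))                          # volumetric
poa = percentage(count_2d(project_ap(disease)), count_2d(project_ap(lung)))  # area-based
```

Here one affected voxel out of eight gives POv = 12.5%, but its projection covers one of four lung pixels, so POa = 25%. The two quantities are related but not interchangeable, which is why the study reports both errors and correlations between them.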
Bootstrap mean absolute error and correlations between POa and POv were 11.98% (11.05%-12.47%) and 0.77 (0.70-0.82) for average of expert readers and 9.56% to 9.78% (8.83%-10.22%) and 0.78 to 0.81 (0.73-0.85) for the CNN, respectively.
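The agreement statistics quoted above can be sketched as follows. The abstract does not state which correlation coefficient or bootstrap variant was used, so Pearson correlation and a percentile bootstrap over patients are assumptions, as are the function names, the fixed random seed, and the toy percentages.

```python
import math
import random


def mean_absolute_error(predicted, truth):
    return sum(abs(p - t) for p, t in zip(predicted, truth)) / len(predicted)


def pearson_correlation(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / math.sqrt(var_x * var_y)


def bootstrap_mae_interval(predicted, truth, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the MAE, resampling patients with replacement."""
    rng = random.Random(seed)
    n = len(predicted)
    estimates = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(
            mean_absolute_error([predicted[i] for i in idx], [truth[i] for i in idx])
        )
    estimates.sort()
    lower = estimates[int((alpha / 2) * (n_resamples - 1))]
    upper = estimates[int((1 - alpha / 2) * (n_resamples - 1))]
    return lower, upper


# Hypothetical percentages for four patients (not study data).
poa_values = [10.0, 20.0, 30.0, 40.0]
pov_values = [12.0, 18.0, 33.0, 37.0]

mae = mean_absolute_error(poa_values, pov_values)
correlation = pearson_correlation(poa_values, pov_values)
mae_low, mae_high = bootstrap_mae_interval(poa_values, pov_values)
```

With only four toy patients the interval is wide and purely illustrative; in the study, the intervals in parentheses play the same role for the 86-patient cohort.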
Our CNN trained with DRR using CT-derived airspace quantification achieved expert radiologist level of accuracy in the quantification of AD on CXR in patients with positive reverse transcriptase-polymerase chain reaction test results for COVID-19.