Objectives
To investigate the predictive value of quantifiable imaging and inflammatory biomarkers in patients with hepatocellular carcinoma (HCC) for the clinical outcome after drug-eluting bead transarterial chemoembolization (DEB-TACE), measured as volumetric tumor response and progression-free survival (PFS).
Methods
This retrospective study included 46 patients with treatment-naïve HCC who received DEB-TACE. Laboratory work-up prior to treatment included complete and differential blood count, liver function, and alpha-fetoprotein levels. Neutrophil-to-lymphocyte ratio (NLR) and platelet-to-lymphocyte ratio (PLR) were correlated with radiomic features extracted from pretreatment contrast-enhanced magnetic resonance imaging (MRI) and with tumor response according to quantitative European Association for the Study of the Liver (qEASL) criteria and progression-free survival (PFS) after DEB-TACE. Radiomic features included single nodular tumor growth measured as sphericity, dynamic contrast uptake behavior, arterial hyperenhancement, and homogeneity of contrast uptake. Statistics included univariate and multivariate linear regression, Cox regression, and Kaplan–Meier analysis.
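As a minimal sketch of how the two inflammatory ratios and the sphericity feature described above could be computed: the ratios divide absolute cell counts from the same differential blood count, and sphericity uses the common geometric definition (1.0 for a perfect sphere, lower for irregular shapes). The function names and the sphericity formula are illustrative assumptions; the study's own extraction pipeline is not specified in this abstract.

```python
import math


def nlr(neutrophils: float, lymphocytes: float) -> float:
    """Neutrophil-to-lymphocyte ratio; both counts in the same unit (e.g. 10^9/L)."""
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return neutrophils / lymphocytes


def plr(platelets: float, lymphocytes: float) -> float:
    """Platelet-to-lymphocyte ratio; both counts in the same unit (e.g. 10^9/L)."""
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return platelets / lymphocytes


def sphericity(volume: float, surface_area: float) -> float:
    """Assumed definition: pi^(1/3) * (6V)^(2/3) / A, equal to 1.0 for a sphere."""
    if volume <= 0 or surface_area <= 0:
        raise ValueError("volume and surface area must be positive")
    return (math.pi ** (1 / 3)) * (6 * volume) ** (2 / 3) / surface_area


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(nlr(4.0, 2.0), plr(250.0, 2.0))
    # A unit sphere (V = 4/3 * pi, A = 4 * pi) yields a sphericity of about 1.0.
    print(sphericity(4 / 3 * math.pi, 4 * math.pi))
```

Under this definition, a low sphericity value corresponds to the non-spherical growth pattern discussed in the results.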
Results
Accounting for laboratory and clinical parameters, high baseline NLR and PLR were predictive of poorer tumor response (p = 0.014 and p = 0.004) and shorter PFS (p = 0.002 and p < 0.001). When compared to baseline imaging, high NLR and PLR correlated with non-spherical tumor growth (p = 0.001 and p < 0.001).
Conclusions
This study establishes the prognostic value of quantitative inflammatory biomarkers associated with aggressive non-spherical tumor growth and predictive of poorer tumor response and shorter PFS after DEB-TACE.
Key Points
• In treatment-naïve hepatocellular carcinoma (HCC), high baseline platelet-to-lymphocyte ratio (PLR) and neutrophil-to-lymphocyte ratio (NLR) are associated with non-nodular tumor growth measured as low tumor sphericity.
• High PLR and NLR are predictive of poorer volumetric enhancement-based tumor response and PFS after DEB-TACE in HCC.
• This set of readily available, quantitative immunologic biomarkers can easily be implemented in clinical guidelines providing a paradigm to guide and monitor the personalized application of loco-regional therapies in HCC.
Purpose
The Liver Imaging Reporting and Data System (LI-RADS) uses multiphasic contrast-enhanced imaging for hepatocellular carcinoma (HCC) diagnosis. The goal of this feasibility study was to establish a proof-of-principle concept towards automating the application of LI-RADS, using a deep learning algorithm trained to segment the liver and delineate HCCs on MRI automatically.
Methods
In this retrospective single-center study, multiphasic contrast-enhanced MRIs using T1-weighted breath-hold sequences acquired from 2010 to 2018 were used to train a deep convolutional neural network (DCNN) with a U-Net architecture. The U-Net was trained (using 70% of all data), validated (15%) and tested (15%) on 174 patients with 231 lesions. Manual 3D segmentations of the liver and HCC served as the ground truth. The Dice similarity coefficient (DSC) was measured between manual and DCNN methods. Postprocessing using a random forest (RF) classifier employing radiomic features and thresholding (TR) of the mean neural activation was used to reduce the average false positive rate (AFPR).
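The two evaluation metrics named above can be sketched as follows. This is an illustrative example, not the study's implementation: masks are represented as sets of voxel coordinates, the helper names are hypothetical, and the default threshold of 0.2 mirrors the lesion detection criterion reported in the results.

```python
from typing import Iterable, Sequence, Tuple

Voxel = Tuple[int, int, int]


def dice(mask_a: Iterable[Voxel], mask_b: Iterable[Voxel]) -> float:
    """Dice similarity coefficient: 2 * |A and B| / (|A| + |B|)."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        return 1.0  # convention assumed here: two empty masks agree perfectly
    return 2 * len(a & b) / (len(a) + len(b))


def is_detected(manual: Iterable[Voxel], predicted: Iterable[Voxel],
                threshold: float = 0.2) -> bool:
    """A lesion counts as detected when its DSC exceeds the threshold."""
    return dice(manual, predicted) > threshold


def average_false_positive_rate(false_positives_per_patient: Sequence[int]) -> float:
    """Mean number of false-positive detections per patient."""
    if not false_positives_per_patient:
        raise ValueError("at least one patient is required")
    return sum(false_positives_per_patient) / len(false_positives_per_patient)


if __name__ == "__main__":
    # Hypothetical toy masks sharing two of three voxels each.
    manual = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
    predicted = {(0, 0, 1), (0, 1, 0), (1, 1, 1)}
    print(dice(manual, predicted), is_detected(manual, predicted))
```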
Results
73 and 75% of HCCs were detected on validation and test sets, respectively, using > 0.2 DSC criterion between individual lesions and their corresponding segmentations. Validation set AFPRs were 2.81, 0.77, 0.85 for U-Net, U-Net + RF, and U-Net + TR, respectively. Combining both RF and TR with the U-Net improved the AFPR to 0.62 and 0.75 for the validation and test sets, respectively. Mean DSC between automatically detected lesions using the DCNN + RF + TR and corresponding manual segmentations was 0.64/0.68 (validation/test), and 0.91/0.91 for liver segmentations.
Conclusion
Our DCNN approach can segment the liver and HCCs automatically. This could enable a more workflow-efficient and clinically realistic implementation of LI-RADS.
Deep learning-based algorithms have demonstrated strong performance in the segmentation of medical images. We collected a dataset of multiparametric MRI and contour data acquired for use in radiosurgery, to evaluate the performance of deep convolutional neural networks (DCNN) in automatic segmentation of brain metastases (BM).
A conventional U-Net (cU-Net), a modified U-Net (moU-Net) and a U-Net trained only on BM smaller than 0.4 ml (sU-Net) were implemented. Performance was assessed on a separate test set employing sensitivity, specificity, average false positive rate (AFPR), the dice similarity coefficient (DSC), Bland-Altman analysis and the concordance correlation coefficient (CCC).
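For readers less familiar with these agreement metrics, the sketch below shows one way to compute sensitivity, specificity, and the concordance correlation coefficient. It assumes Lin's formulation of the CCC with population (1/n) moments; the abstract does not state which variant was used, and the function names are illustrative.

```python
from typing import Sequence


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of actual positives that were detected."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of actual negatives that were correctly rejected."""
    return true_neg / (true_neg + false_pos)


def concordance_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Lin's CCC: 2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))^2)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((v - mean_x) ** 2 for v in x) / n
    var_y = sum((v - mean_y) ** 2 for v in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2 * cov / (var_x + var_y + (mean_x - mean_y) ** 2)


if __name__ == "__main__":
    # Hypothetical tumor volumes in ml (manual vs. automatic), illustration only.
    manual = [1.0, 2.0, 3.0]
    automatic = [2.0, 3.0, 4.0]
    print(concordance_correlation(manual, automatic))
```

Unlike Pearson correlation, the CCC penalizes a systematic offset between the two measurements, which is why a constant shift in the toy data lowers the value below 1.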
A dataset of 509 patients (1223 BM) was split into a training set (469 pts) and a test set (40 pts). A combination of all trained networks was the most sensitive (0.82) while maintaining a specificity of 0.83. The same model achieved a sensitivity of 0.97 and a specificity of 0.94 when considering only lesions larger than 0.06 ml (75% of all lesions). Type of primary cancer had no significant influence on the mean DSC per lesion (p = 0.60). Agreement between manually and automatically assessed tumor volumes, as quantified by a CCC of 0.87 (95% CI, 0.77-0.93), was excellent.
Using a dataset which properly captured the variation in imaging appearance observed in clinical practice, we were able to conclude that DCNNs reach clinically relevant performance for most lesions. Clinical applicability is currently limited by the size of the target lesion. Further studies should address if small targets are accurately represented in the test data.
Purpose
Personalized interpretation of medical images is critical for optimum patient care, but the tools currently available to physicians for real-time quantitative analysis of patients' medical images are significantly limited. In this work, we describe a novel platform within PACS for volumetric analysis of images, which enables the development of large expert-annotated datasets in parallel with the radiologist's reading; such datasets are critically needed for the development of clinically meaningful AI algorithms. Specifically, we implemented a deep learning-based algorithm for automated brain tumor segmentation and radiomics extraction, and embedded it into PACS to accelerate a supervised, end-to-end workflow for image annotation and radiomic feature extraction.
Materials and methods
An algorithm was trained to segment whole primary brain tumors on FLAIR images from the multi-institutional glioma BraTS 2021 dataset. The algorithm was validated using an internal dataset from Yale New Haven Health (YNHH) and compared (by Dice similarity coefficient, DSC) to radiologist manual segmentation. A UNETR deep-learning algorithm was embedded into the Visage 7 (Visage Imaging, Inc., San Diego, CA, United States) diagnostic workstation. The automatically segmented brain tumor could be modified manually. PyRadiomics (Harvard Medical School, Boston, MA) was natively embedded into Visage 7 for feature extraction from the brain tumor segmentations.
Results
UNETR brain tumor segmentation took on average 4 s and the median DSC was 86%, which is similar to published literature but lower than results from the RSNA ASNR MICCAI BraTS challenge 2021. Finally, extraction of 106 radiomic features within PACS took on average 5.8 ± 0.01 s. The extracted radiomic features did not vary with the time of extraction or with whether they were extracted within or outside of PACS. The ability to perform segmentation and feature extraction before the radiologist opens the study was made available in the workflow. Opening the study in PACS allows radiologists to verify the segmentation and thus annotate the study.
Conclusion
Integration of image processing algorithms for tumor auto-segmentation and feature extraction into PACS allows curation of large datasets of annotated medical images and can accelerate translation of research into development of personalized medicine applications in the clinic. The ability to use familiar clinical tools to revise the AI segmentations, together with native embedding of the segmentation and radiomic feature extraction tools on the diagnostic workstation, accelerates the generation of ground-truth data.
Objectives
To evaluate the prognostic potential of Lipiodol distribution for the pharmacokinetic (PK) profiles of doxorubicin (DOX) and doxorubicinol (DOXOL) after conventional transarterial chemoembolization (cTACE).
Methods
This prospective clinical trial (ClinicalTrials.gov: NCT02753881) included 30 consecutive participants with liver malignancies treated with cTACE (5/2016–10/2018) using 50 mg DOX/10 mg mitomycin C emulsified 1:2 with ethiodized oil (Lipiodol). Peripheral blood was sampled at 10 timepoints for standard non-compartmental analysis of peak concentrations (Cmax) and area under the curve (AUC) with dose normalization (DN). Imaging markers included Lipiodol distribution on post-cTACE CT for patient stratification into 1 segment (n = 10), ≥ 2 segments (n = 10), and lobar cTACE (n = 10), and baseline enhancing tumor volume (ETV). Adverse events (AEs) and tumor response on MRI were recorded 3–4 weeks post-cTACE. Statistics included repeated measurement ANOVA (RM-ANOVA), Mann-Whitney, Kruskal-Wallis, Fisher’s exact test, and Pearson correlation.
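The non-compartmental quantities named above can be illustrated with a short sketch. It assumes the linear trapezoidal rule for the AUC and dose normalization by division by the administered dose in mg; the abstract only states that a standard non-compartmental analysis was used, so these choices and all names below are assumptions.

```python
from typing import Dict, Sequence


def nca_summary(times_h: Sequence[float], conc_ng_ml: Sequence[float],
                dose_mg: float) -> Dict[str, float]:
    """Peak concentration, trapezoidal AUC, and their dose-normalized values."""
    if len(times_h) != len(conc_ng_ml) or len(times_h) < 2:
        raise ValueError("need at least two matching time/concentration points")
    if any(t1 <= t0 for t0, t1 in zip(times_h, times_h[1:])):
        raise ValueError("sampling times must be strictly increasing")
    if dose_mg <= 0:
        raise ValueError("dose must be positive")

    c_max = max(conc_ng_ml)
    # Linear trapezoidal rule: sum of interval width times mean concentration.
    auc = sum(
        (t1 - t0) * (c0 + c1) / 2
        for t0, t1, c0, c1 in zip(times_h, times_h[1:], conc_ng_ml, conc_ng_ml[1:])
    )
    return {
        "c_max": c_max,                # ng/mL
        "auc": auc,                    # ng*h/mL
        "dn_c_max": c_max / dose_mg,   # ng/mL/mg
        "dn_auc": auc / dose_mg,       # ng*h/mL/mg
    }


if __name__ == "__main__":
    # Hypothetical concentration-time profile, for illustration only.
    print(nca_summary([0.0, 1.0, 2.0], [0.0, 10.0, 0.0], dose_mg=50.0))
```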
Results
Hepatocellular (n = 26), cholangiocarcinoma (n = 1), and neuroendocrine metastases (n = 3) were included. Stratified according to Lipiodol distribution, DOX-Cmax increased from 1 segment (DOX-Cmax, 83.94 ± 75.09 ng/mL; DN-DOX-Cmax, 2.67 ± 2.02 ng/mL/mg) to ≥ 2 segments (DOX-Cmax, 139.66 ± 117.73 ng/mL; DN-DOX-Cmax, 3.68 ± 4.20 ng/mL/mg) to lobar distribution (DOX-Cmax, 334.35 ± 215.18 ng/mL; DN-DOX-Cmax, 7.11 ± 4.24 ng/mL/mg; p = 0.036). While differences in DN-DOX-AUC remained insignificant, RM-ANOVA revealed significant separation of time concentration curves for DOX (p = 0.023) and DOXOL (p = 0.041) comparing 1, ≥ 2 segments, and lobar cTACE. Additional indicators of higher DN-DOX-Cmax were high ETV (p = 0.047) and Child-Pugh B (p = 0.009). High ETV and tumoral Lipiodol coverage also correlated with tumor response. AE occurred less frequently after segmental cTACE.
Conclusions
This prospective clinical trial provides updated PK data revealing Lipiodol distribution as an imaging marker predictive of DOX-Cmax and tumor response after cTACE in liver cancer.
Key Points
• Prospective pharmacokinetic analysis after conventional TACE revealed Lipiodol distribution (1 vs. ≥ 2 segments vs. lobar) as an imaging marker predictive of doxorubicin peak concentrations (Cmax).
• Child-Pugh B class and tumor hypervascularization, measurable as enhancing tumor volume (ETV) at baseline, were identified as additional predictors for higher dose-normalized doxorubicin Cmax after conventional TACE.
• ETV at baseline and tumoral Lipiodol coverage can serve as predictors of volumetric tumor response after conventional TACE according to quantitative European Association for the Study of the Liver (qEASL) criteria.
Gliomas with CDKN2A mutations are known to have worse prognosis but imaging features of these gliomas are unknown. Our goal is to identify CDKN2A specific qualitative imaging biomarkers in glioblastomas using a new informatics workflow that enables rapid analysis of qualitative imaging features with Visually AcceSAble Rembrandt Images (VASARI) for large datasets in PACS. Sixty-nine patients undergoing GBM resection with CDKN2A status determined by whole-exome sequencing were included. GBMs on magnetic resonance images were automatically 3D segmented using deep learning algorithms incorporated within PACS. VASARI features were assessed using FHIR forms integrated within PACS. GBMs without CDKN2A alterations were significantly larger (64 vs. 30%, p = 0.007) compared to tumors with homozygous deletion (HOMDEL) and heterozygous loss (HETLOSS). Lesions larger than 8 cm were four times more likely to have no CDKN2A alteration (OR: 4.3; 95% CI 1.5-12.1; p < 0.001). We developed a novel integrated PACS informatics platform for the assessment of GBM molecular subtypes and show that tumors with HOMDEL are more likely to have radiographic evidence of pial invasion and less likely to have deep white matter invasion or subependymal invasion. These imaging features may allow noninvasive identification of CDKN2A allele status.
To generate and validate state-of-the-art radiomics models for prediction of radiation-induced lung injury and oncologic outcome in non-small cell lung cancer (NSCLC) patients treated with robotic stereotactic body radiation therapy (SBRT).
Radiomics models were generated from the planning CT images of 110 patients with primary, inoperable stage I/IIa NSCLC who were treated with robotic SBRT using a risk-adapted fractionation scheme at the University Hospital Cologne (training cohort). In total, 199 uncorrelated radiomic features fulfilling the standards of the Image Biomarker Standardization Initiative (IBSI) were extracted from the outlined gross tumor volume (GTV). Regularized models (Coxnet and Gradient Boost) for the development of local lung fibrosis (LF), local tumor control (LC), disease-free survival (DFS) and overall survival (OS) were built from either clinical/dosimetric variables, radiomics features or a combination thereof and validated in a comparable cohort of 71 patients treated by robotic SBRT at the Radiosurgery Center in Northern Germany (test cohort).
Oncologic outcome did not differ significantly between the two cohorts (OS at 36 months 56% vs. 43%, p = 0.065; median DFS 25 months vs. 23 months, p = 0.43; LC at 36 months 90% vs. 93%, p = 0.197). Local lung fibrosis developed in 33% vs. 35% of the patients (p = 0.75); all events were observed within 36 months. In the training cohort, radiomics models were able to predict OS, DFS and LC (concordance index 0.77-0.99, p < 0.005), but failed to generalize to the test cohort. In contrast, models for the development of lung fibrosis could be generated from both clinical/dosimetric factors and radiomic features or combinations thereof, which were predictive both in the training set (concordance index 0.71-0.79, p < 0.005) and in the test set (concordance index 0.59-0.66, p < 0.05). The best performing model included 4 clinical/dosimetric variables (GTV-D, PTV-D, Lung-D, age) and 7 radiomic features (concordance index 0.66, p < 0.03).
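The concordance index used to score these models can be sketched as below. This is a simplified version of Harrell's C-index under stated assumptions: a pair is comparable only when the patient with the shorter follow-up time had an observed event, pairs with tied times are skipped, and tied risk scores count as half-concordant. The function name and the toy data are illustrative and are not taken from the study.

```python
from typing import Sequence


def concordance_index(times: Sequence[float], events: Sequence[bool],
                      risk_scores: Sequence[float]) -> float:
    """Fraction of comparable pairs in which the higher-risk patient fails first."""
    if not (len(times) == len(events) == len(risk_scores)):
        raise ValueError("inputs must have equal length")
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier failure
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Hypothetical follow-up times (months), event flags, and model risk scores.
    times = [1.0, 2.0, 3.0]
    events = [True, True, False]
    print(concordance_index(times, events, [3.0, 2.0, 1.0]))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which helps put the reported training and test values in context.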
Despite the obvious difficulties in generalizing predictive models for oncologic outcome and toxicity, this analysis shows that carefully designed radiomics models for prediction of local lung fibrosis after SBRT of early stage lung cancer perform well across different institutions.
Amino acid PET using the tracer O-(2-[18F]fluoroethyl)-L-tyrosine (FET) has attracted considerable interest in neurooncology. Furthermore, initial studies suggested the additional diagnostic value of FET PET radiomics in brain tumor patient management. However, the conclusiveness of radiomics models strongly depends on feature generalizability. We here evaluated the repeatability of feature-based FET PET radiomics. A test-retest analysis based on equivalent but statistically independent subsamples of FET PET images was performed in 50 newly diagnosed and histomolecularly characterized glioma patients. A total of 1,302 radiomics features were calculated from semi-automatically segmented tumor volumes-of-interest (VOIs). Furthermore, to investigate the influence of the spatial resolution of PET on repeatability, spherical VOIs of different sizes were positioned in the tumor and healthy brain tissue. Feature repeatability was assessed by calculating the intraclass correlation coefficient (ICC). To further investigate the influence of the isocitrate dehydrogenase (IDH) genotype on feature repeatability, a hierarchical cluster analysis was performed. For tumor VOIs, 73% of first-order features and 71% of features extracted from the gray level co-occurrence matrix showed high repeatability (ICC 95% confidence interval, 0.91-1.00). In the largest spherical tumor VOIs, 67% of features showed high repeatability, significantly decreasing towards smaller VOIs. The IDH genotype did not affect feature repeatability. Based on 297 repeatable features, two clusters were identified separating patients with IDH-wildtype glioma from those with an IDH mutation. Our results suggest that robust features can be obtained from routinely acquired FET PET scans, which are valuable for further standardization of radiomics analyses in neurooncology.
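To make the repeatability measure concrete, the sketch below computes an intraclass correlation coefficient for one feature measured twice per patient. It assumes the one-way random-effects, single-measurement form, ICC(1,1); the abstract does not state which ICC form was used, and the confidence interval mentioned above is not computed here. Names and data are illustrative.

```python
from typing import Sequence


def icc_one_way(measurements: Sequence[Sequence[float]]) -> float:
    """ICC(1,1) = (MSB - MSW) / (MSB + (k - 1) * MSW).

    `measurements` holds one row per subject with k repeated values each.
    """
    n = len(measurements)
    if n < 2:
        raise ValueError("at least two subjects are required")
    k = len(measurements[0])
    if k < 2 or any(len(row) != k for row in measurements):
        raise ValueError("each subject needs the same number (>= 2) of values")

    grand_mean = sum(sum(row) for row in measurements) / (n * k)
    subject_means = [sum(row) / k for row in measurements]

    # Between-subject and within-subject mean squares from one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum(
        (value - mean) ** 2
        for row, mean in zip(measurements, subject_means)
        for value in row
    ) / (n * (k - 1))

    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("ICC is undefined when all values are identical")
    return (ms_between - ms_within) / denominator


if __name__ == "__main__":
    # Hypothetical test/retest values of one radiomic feature in three patients.
    print(icc_one_way([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]))
```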
To systematically review, assess the reporting quality of, and discuss improvement opportunities for studies describing machine learning (ML) models for glioma grade prediction.
This study followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses of Diagnostic Test Accuracy (PRISMA-DTA) statement. A systematic search was performed in September 2020, and repeated in January 2021, on four databases: Embase, Medline, CENTRAL, and Web of Science Core Collection. Publications were screened in Covidence, and reporting quality was measured against the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Statement. Descriptive statistics were calculated using GraphPad Prism 9.
The search identified 11,727 candidate articles with 1,135 articles undergoing full text review and 85 included in analysis. 67 (79%) articles were published between 2018 and 2021. The mean prediction accuracy of the best performing model in each study was 0.89 ± 0.09. The most common algorithm for conventional machine learning studies was Support Vector Machine (mean accuracy: 0.90 ± 0.07) and for deep learning studies was Convolutional Neural Network (mean accuracy: 0.91 ± 0.10). Only one study used both a large training dataset (n > 200) and external validation (accuracy: 0.72) for their model. The mean adherence rate to TRIPOD was 44.5% ± 11.1%, with poor reporting adherence for model performance (0%), abstracts (0%), and titles (0%).
The application of ML to glioma grade prediction has grown substantially, with ML model studies reporting high predictive accuracies but lacking essential metrics and characteristics for assessing model performance. Several domains, including generalizability and reproducibility, warrant further attention to enable translation into clinical practice.
PROSPERO, identifier CRD42020209938.
Glioma and brain metastasis can be difficult to distinguish on conventional magnetic resonance imaging (MRI) due to the similarity of imaging features in specific clinical circumstances. Multiple studies have investigated the use of machine learning (ML) models for non-invasive differentiation of glioma from brain metastasis. Many of the studies report promising classification results; however, to date, none have been implemented into clinical practice. After a screening of 12,470 studies, we included 29 eligible studies in our systematic review. From each study, we aggregated data on model design, development, and best classifiers, as well as quality of reporting according to the TRIPOD statement. In a subset of eligible studies, we conducted a meta-analysis of the reported AUC. Data predominantly originated from single-center institutions (n = 25/29) and only two studies performed external validation. The median TRIPOD adherence was 0.48, indicating insufficient quality of reporting among surveyed studies. Our findings illustrate that despite promising classification results, reliable model assessment is limited by poor reporting of study design and lack of algorithm validation and generalizability. Therefore, adherence to quality guidelines and validation on outside datasets are critical for the clinical translation of ML for the differentiation of glioma and brain metastasis.