The pace of innovation in radiation oncology is high and the window of opportunity for evaluation narrow. Financial incentives, industry pressure, and patients' demand for high-tech treatments have led to widespread implementation of innovations before, or even without, the generation of robust evidence of improved outcomes. The standard phase I-IV framework for drug evaluation is not the most efficient or desirable framework for assessing technological innovations. To provide a standard methodology for clinical evaluation of innovations in radiotherapy, we adapted the surgical IDEAL framework to fit the radiation oncology setting (R-IDEAL). As in surgery, clinical evaluation of innovations in radiation oncology is complicated by continuous technical development, team and operator dependence, and differences in quality control. Unlike surgical innovations, radiotherapy innovations may be used in various ways, e.g., at different tumor sites and with different aims, such as radiation volume reduction and dose escalation. Also, the effect of radiation treatment can be modeled, allowing better prediction of potential benefits and improved patient selection. Key distinctive features of R-IDEAL include the important role of predicate and modeling studies (Stage 0), randomization at an early stage in the development of the technology, and long-term follow-up for late toxicity. We implemented R-IDEAL for clinical evaluation of a recent innovation in radiation oncology, the MRI-guided linear accelerator (MR-Linac). MR-Linac combines a radiotherapy linear accelerator with a 1.5-T MRI, aiming for improved targeting, dose escalation, and margin reduction, and is expected to increase the use of hypofractionation and improve tumor control, leading to higher cure rates and less toxicity.
An international consortium, with participants from seven large cancer institutes from Europe and North America, has adopted the R-IDEAL framework to work toward coordinated, evidence-based introduction of the MR-Linac. R-IDEAL holds the promise for timely, evidence-based introduction of radiotherapy innovations with proven superior effectiveness, while preventing unnecessary exposure of patients to potentially harmful interventions.
Summary
Support vector regression (SVR) is particularly beneficial when the outcome and predictors are nonlinearly related. However, when many covariates are available, the method’s flexibility can lead to overfitting and an overall loss in predictive accuracy. To overcome this drawback, we develop a feature selection method for SVR based on a genetic algorithm that iteratively searches across potential subsets of covariates to find those that yield the best performance according to a user-defined fitness function. We evaluate the performance of our feature selection method for SVR, comparing it to alternative methods including LASSO and random forest, in a simulation study. We find that our method yields higher predictive accuracy than SVR without feature selection. Our method outperforms LASSO when the relationship between covariates and outcome is nonlinear. Random forest performs equivalently to our method in some scenarios, but more poorly when covariates are correlated. We apply our method to predict donor kidney function 1 year after transplant using data from the United Network for Organ Sharing national registry.
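The genetic-algorithm search over covariate subsets described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name ga_select, the population size, the mutation rate, the single-point crossover, and the toy fitness function are all assumptions chosen for illustration. In practice the fitness function would be something like a cross-validated SVR accuracy score for the selected covariates.

```python
import random

def ga_select(n_features, fitness, pop_size=20, generations=30,
              mutation_rate=0.1, seed=0):
    """Search boolean feature masks with a simple genetic algorithm.

    `fitness` is a user-defined function mapping a mask (tuple of bools)
    to a score; higher is better. All defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Random initial masks, plus the full feature set so the result is never
    # worse than the "no feature selection" baseline.
    population = [tuple(rng.random() < 0.5 for _ in range(n_features))
                  for _ in range(pop_size - 1)]
    population.append(tuple([True] * n_features))
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        next_gen = ranked[:2]               # elitism: keep the two best masks
        parents = ranked[:pop_size // 2]    # truncation selection
        while len(next_gen) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)   # single-point crossover
            child = a[:cut] + b[cut:]
            # Flip each gene with probability mutation_rate.
            child = tuple((not g) if rng.random() < mutation_rate else g
                          for g in child)
            next_gen.append(child)
        population = next_gen
    return max(population, key=fitness)

def toy_fitness(mask):
    """Hypothetical stand-in for a cross-validated SVR score:
    features 0 and 2 help, every other selected feature costs 0.5."""
    informative = {0, 2}
    hits = sum(1 for i, keep in enumerate(mask) if keep and i in informative)
    noise = sum(1 for i, keep in enumerate(mask) if keep and i not in informative)
    return hits - 0.5 * noise

best = ga_select(6, toy_fitness)
print(best, toy_fitness(best))
```

Because the two best masks survive each generation and the full covariate set is in the initial population, the returned subset scores at least as well as using all covariates under the chosen fitness function.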
Given the limitations of extant models for normal tissue complication probability estimation for osteoradionecrosis (ORN) of the mandible, the purpose of this study was to enrich statistical inference by exploiting structural properties of data and provide a clinically reliable model for ORN risk evaluation through an unsupervised-learning analysis that incorporates the whole radiation dose distribution on the mandible.
The analysis was conducted on retrospective data of 1259 patients with head and neck cancer treated at The University of Texas MD Anderson Cancer Center between 2005 and 2015. During a minimum 12-month posttherapy follow-up period, 173 patients in this cohort (13.7%) developed ORN (grades I to IV). The (structural) clusters of mandibular dose-volume histograms (DVHs) for these patients were identified using the K-means clustering method. A soft-margin support vector machine was used to determine the cluster borders and partition the dose-volume space. The risk of ORN for each dose-volume region was calculated based on incidence rates and other clinical risk factors.
The K-means clustering method identified 6 clusters among the DVHs. Based on the first 5 clusters, the dose-volume space was partitioned by the soft-margin support vector machine into distinct regions with different risk indices. The sixth cluster entirely overlapped with the others; the region of this cluster was determined by its envelopes. For each region, the ORN incidence rate per preradiation dental extraction status (a statistically significant, nondose related risk factor for ORN) was reported as the corresponding risk index.
This study presents an unsupervised-learning analysis of a large-scale data set to evaluate the risk of mandibular ORN among patients with head and neck cancer. The results provide a visual risk-assessment tool for ORN (based on the whole DVH and preradiation dental extraction status) as well as a range of constraints for dose optimization under different risk levels.
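The clustering step described above (grouping dose-volume histograms with K-means, then reporting an incidence rate per cluster) can be sketched as follows. This is a toy illustration under stated assumptions: each DVH is reduced to three hypothetical volume fractions, the data and ORN labels are invented, the centroids are initialized from the first k points for determinism, and the soft-margin support vector machine used to draw cluster borders is not shown.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=50):
    """Plain Lloyd's algorithm; centroids start at the first k points
    (an assumption made here to keep the example deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        # Update step: mean of the members of each cluster.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append([sum(c) / len(members)
                                      for c in zip(*members)])
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical DVHs: fraction of mandible volume receiving at least
# 20, 40, and 60 Gy (invented numbers, not study data).
dvhs = [
    [0.90, 0.60, 0.20],
    [0.30, 0.10, 0.00],
    [0.95, 0.70, 0.30],
    [0.25, 0.05, 0.00],
    [0.85, 0.65, 0.25],
    [0.35, 0.10, 0.00],
]
orn = [1, 0, 1, 0, 0, 0]  # hypothetical ORN events (1 = developed ORN)

labels, centroids = kmeans(dvhs, k=2)

# Incidence rate per cluster, used here as a simple risk index.
rates = {}
for j in set(labels):
    events = [e for e, lab in zip(orn, labels) if lab == j]
    rates[j] = sum(events) / len(events)
print(labels, rates)
```

In a fuller analysis the incidence rate would also be stratified by clinical risk factors such as preradiation dental extraction status, as the abstract describes.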
• Repeatability and reproducibility of MR-linac DWI sequences were within 2.22% and 4.37% of MR sim DWI sequences, respectively.
• Of the MR-linac DWI sequences, SPLICE generally outperformed EPI and TSE in terms of repeatability/reproducibility, ADC bias, and SNR.
• Spatial dependence of phantom ADC values was observed for the MR-linac but not for the MR sim, which may be due to uncorrected gradient non-linearities.
• MR-linac DWI sequences are robust and worthy of further clinical evaluation for treatment response assessment and biological image-guided ART in HNC.
Diffusion-weighted imaging (DWI) on MRI-linear accelerator (MR-linac) systems can potentially be used for monitoring treatment response and adaptive radiotherapy in head and neck cancers (HNC) but requires extensive validation. We performed technical validation to compare a total of six DWI sequences on an MR-linac and an MR simulator (MR sim) in patients, volunteers, and phantoms.
Ten human papillomavirus-positive oropharyngeal cancer patients and ten healthy volunteers underwent DWI on a 1.5 T MR-linac with three DWI sequences: echo planar imaging (EPI), split acquisition of fast spin echo signals (SPLICE), and turbo spin echo (TSE). Volunteers were also imaged on a 1.5 T MR sim with three sequences: EPI, BLADE (vendor tradename), and readout segmentation of long variable echo trains (RESOLVE). Participants underwent two scan sessions per device and two repeats of each sequence per session. Repeatability and reproducibility within-subject coefficient of variation (wCV) of mean ADC were calculated for tumors and lymph nodes (patients) and parotid glands (volunteers). ADC bias, repeatability/reproducibility metrics, SNR, and geometric distortion were quantified using a phantom.
In vivo repeatability/reproducibility wCV for parotids were 5.41%/6.72%, 3.83%/8.80%, 5.66%/10.03%, 3.44%/5.70%, 5.04%/5.66%, and 4.23%/7.36% for EPI (MR-linac), SPLICE, TSE, EPI (MR sim), BLADE, and RESOLVE, respectively. Repeatability/reproducibility wCV for EPI (MR-linac), SPLICE, and TSE were 9.64%/10.28%, 7.84%/8.96%, and 7.60%/11.68% for tumors and 7.80%/9.95%, 7.23%/8.48%, and 10.82%/10.44% for nodes, respectively. All sequences except TSE had phantom ADC biases within ± 0.1 × 10⁻³ mm²/s for most vials (EPI (MR-linac), SPLICE, and BLADE had 2, 3, and 1 vials out of 13 with larger biases, respectively). SNR of b = 0 images was 87.3, 180.5, 161.3, 171.0, 171.9, and 130.2 for EPI (MR-linac), SPLICE, TSE, EPI (MR sim), BLADE, and RESOLVE, respectively.
MR-linac DWI sequences demonstrated near-comparable performance to MR sim sequences and warrant further clinical validation for treatment response assessment in HNC.
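For readers unfamiliar with the within-subject coefficient of variation (wCV) reported above, the sketch below shows one common way to compute it from repeated mean ADC measurements. The abstract does not state the exact formula used, so the root-mean-square definition here (per-subject variance divided by the squared per-subject mean, averaged over subjects) is an assumption, as are the function name and the example values.

```python
from math import sqrt
from statistics import mean, variance

def within_subject_cv(repeats_by_subject):
    """wCV as the root mean square of per-subject coefficients of variation.

    `repeats_by_subject` holds one list of repeated measurements per subject
    (at least two values each). The definition is an assumption; other
    formulations of wCV exist.
    """
    ratios = [variance(r) / mean(r) ** 2 for r in repeats_by_subject]
    return sqrt(mean(ratios))

# Hypothetical mean ADC values (x 10^-3 mm^2/s), two repeats per subject.
adc_repeats = [
    [1.00, 1.20],
    [0.95, 0.97],
    [1.10, 1.05],
]
wcv = within_subject_cv(adc_repeats)
print(f"wCV = {100 * wcv:.2f}%")
```

Under this definition, repeatability would use repeats acquired within one scan session and reproducibility would use measurements from different sessions; only the grouping of the input changes.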
Automated segmentation templates can save clinicians time compared to de novo segmentation but may still take substantial time to review and correct. It has not been thoroughly investigated which automated segmentation-corrected segmentation similarity metrics best predict clinician correction time. Bilateral thoracic cavity volumes in 329 CT scans were segmented by a UNet-inspired deep learning segmentation tool and subsequently corrected by a fourth-year medical student. Eight spatial similarity metrics were calculated between the automated and corrected segmentations and associated with correction times using Spearman’s rank correlation coefficients. Nine clinical variables were also associated with metrics and correction times using Spearman’s rank correlation coefficients or Mann–Whitney U tests. The added path length, false negative path length, and surface Dice similarity coefficient correlated better with correction time than traditional metrics, including the popular volumetric Dice similarity coefficient (respectively ρ = 0.69, ρ = 0.65, ρ = −0.48 versus ρ = −0.25; correlation p values < 0.001). Clinical variables poorly represented in the autosegmentation tool’s training data were often associated with decreased accuracy but not necessarily with prolonged correction time. Metrics used to develop and evaluate autosegmentation tools should correlate with clinical time saved. To our knowledge, this is only the second investigation of which metrics correlate with time saved. Validation of our findings is indicated in other anatomic sites and clinical workflows. Novel spatial similarity metrics may be preferable to traditional metrics for developing and evaluating autosegmentation tools that are intended to save clinicians time.
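The association between a similarity metric and correction time reported above relies on Spearman's rank correlation, which is the Pearson correlation of the ranked values. A minimal sketch follows; the function names and the example numbers are hypothetical and are not taken from the study.

```python
from math import sqrt
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Hypothetical added path lengths (voxels) and correction times (minutes).
added_path_length = [120, 340, 90, 500, 410]
correction_minutes = [4.0, 9.5, 5.0, 14.0, 8.0]
rho = spearman_rho(added_path_length, correction_minutes)
print(round(rho, 2))
```

Working on ranks makes the coefficient insensitive to the scale of either variable, which suits comparisons between metrics measured in different units.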
MR-linac devices offer the potential for advancements in radiotherapy (RT) treatment of head and neck cancer (HNC) by using daily MR imaging performed at the time and setup of treatment delivery. This article aims to present a review of current adaptive RT (ART) methods on MR-linac devices directed towards the sparing of organs at risk (OAR) and a view of future adaptive techniques seeking to improve the therapeutic ratio. This ratio expresses the relationship between the probability of tumor control and the probability of normal tissue damage and is thus an important conceptual metric of success in the sparing of OARs. Increasing spatial conformity of dose distributions to target volume and OARs is an initial step in achieving therapeutic improvements, followed by the use of imaging and clinical biomarkers to inform the clinical decision-making process in an ART paradigm. Pre-clinical and clinical findings support the incorporation of biomarkers into ART protocols and investment into further research to explore imaging biomarkers by taking advantage of the daily MR imaging workflow. A coherent understanding of this road map for RT in HNC is critical for directing future research efforts related to sparing OARs using image-guided radiotherapy (IGRT).
Artificial intelligence (AI) has exceptional potential to positively impact the field of radiation oncology. However, large curated datasets, often involving imaging data and corresponding annotations, are required to develop radiation oncology AI models. Importantly, the recent establishment of Findable, Accessible, Interoperable, Reusable (FAIR) principles for scientific data management has enabled an increasing number of radiation oncology related datasets to be disseminated through data repositories, thereby acting as a rich source of data for AI model building. This manuscript reviews the current and future state of radiation oncology data dissemination, with a particular emphasis on published imaging datasets, AI data challenges, and associated infrastructure. Moreover, we provide historical context of FAIR data dissemination protocols, difficulties in the current distribution of radiation oncology data, and recommendations regarding data dissemination for eventual utilization in AI models. Through FAIR principles and standardized approaches to data dissemination, radiation oncology AI research has nothing to lose and everything to gain.
Sarcopenia is prognostic for survival in patients with head and neck cancer (HNC). However, identification of this high-risk feature remains challenging without computed tomography (CT) imaging of the abdomen or thorax. Herein, we establish sarcopenia thresholds at the C3 level and determine if C3 sarcopenia is associated with survival in patients with HNC.
This retrospective cohort study was conducted in consecutive patients with a squamous cell carcinoma of the head and neck with cross-sectional abdominal or neck imaging within 60 days prior to treatment. Measurement of cross-sectional muscle area at L3 and C3 levels was performed from CT imaging. Primary study outcome was overall survival.
Skeletal muscle area at C3 was strongly correlated with the L3 level in both men (n = 188; r = 0.77; p < 0.001) and women (n = 65; r = 0.80; p < 0.001), and C3 sarcopenia thresholds of 14.0 cm²/m² (men) and 11.1 cm²/m² (women) were best predictive of L3 sarcopenia thresholds. Applying these C3 thresholds to a cohort of patients with neck imaging alone revealed that C3 sarcopenia was independently associated with reduced overall survival in men (HR = 2.63; 95% CI, 1.79, 3.85) but not women (HR = 1.18; 95% CI, 0.76, 1.85).
This study identifies sarcopenia thresholds at the C3 level that best predict L3 sarcopenia in men and women. In HNC, C3-defined sarcopenia is associated with poor survival outcomes in men, but not women, suggesting sarcopenia may differentially affect men and women with HNC.
• mpMRI coupled to deep learning can generate reasonable OPC tumor segmentations.
• Using multiple input channels may positively impact segmentation performance.
• Deep learning segmentations were non-inferior to ground truth as per Turing test.
Oropharyngeal cancer (OPC) primary gross tumor volume (GTVp) segmentation is crucial for radiotherapy. Multiparametric MRI (mpMRI) is increasingly used for OPC adaptive radiotherapy but relies on manual segmentation. Therefore, we constructed mpMRI deep learning (DL) OPC GTVp auto-segmentation models and determined the impact of input channels on segmentation performance.
GTVp ground truth segmentations were manually generated for 30 OPC patients from a clinical trial. We evaluated five mpMRI input channels (T2, T1, ADC, Ktrans, Ve). 3D Residual U-net models were developed and assessed using leave-one-out cross-validation. A baseline T2 model was compared to mpMRI models (T2 + T1, T2 + ADC, T2 + Ktrans, T2 + Ve, all five channels ALL) primarily using the Dice similarity coefficient (DSC). False-negative DSC (FND), false-positive DSC, sensitivity, positive predictive value, surface DSC, Hausdorff distance (HD), 95% HD, and mean surface distance were also assessed. For the best model, ground truth and DL-generated segmentations were compared through a blinded Turing test using three physician observers.
Models yielded mean DSCs from 0.71 ± 0.12 (ALL) to 0.73 ± 0.12 (T2 + T1). Compared to the T2 model, performance was significantly improved for FND, sensitivity, surface DSC, HD, and 95% HD for the T2 + T1 model (p < 0.05) and for FND for the T2 + Ve and ALL models (p < 0.05). No model demonstrated significant correlations between tumor size and DSC (p > 0.05). Most models demonstrated significant correlations between tumor size and HD or surface DSC (p < 0.05), except those that included ADC or Ve as input channels (p > 0.05). On average, there were no significant differences between ground truth and DL-generated segmentations for any observer (p > 0.05).
DL using mpMRI provides reasonably accurate segmentations of OPC GTVp that may be comparable to ground truth segmentations generated by clinical experts. Incorporating additional mpMRI channels may increase the performance of FND, sensitivity, surface DSC, HD, and 95% HD, and improve model robustness to tumor size.
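The primary evaluation metric above, the Dice similarity coefficient (DSC), together with sensitivity and positive predictive value, can be computed directly from the overlap of two binary segmentations. The sketch below represents each segmentation as a set of voxel coordinates; the function names and the toy masks are assumptions for illustration and do not reproduce the study's evaluation code.

```python
def dice(pred, truth):
    """Volumetric Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|)."""
    pred, truth = set(pred), set(truth)
    if not pred and not truth:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2 * len(pred & truth) / (len(pred) + len(truth))

def sensitivity(pred, truth):
    """Fraction of ground-truth voxels that the prediction recovers."""
    pred, truth = set(pred), set(truth)
    return len(pred & truth) / len(truth)

def positive_predictive_value(pred, truth):
    """Fraction of predicted voxels that lie inside the ground truth."""
    pred, truth = set(pred), set(truth)
    return len(pred & truth) / len(pred)

# Toy 3D masks as sets of (z, y, x) voxel indices (invented data).
ground_truth = {(0, 0, 0), (0, 0, 1), (0, 1, 1)}
dl_segmentation = {(0, 0, 1), (0, 1, 1), (1, 1, 1)}

print(round(dice(dl_segmentation, ground_truth), 3))
print(round(sensitivity(dl_segmentation, ground_truth), 3))
print(round(positive_predictive_value(dl_segmentation, ground_truth), 3))
```

Surface-based metrics such as surface DSC and Hausdorff distance require boundary extraction and distance computations and are not covered by this sketch.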