Purpose
To validate a machine learning approach to Virtual intensity‐modulated radiation therapy (IMRT) quality assurance (QA) for accurately predicting gamma passing rates using different measurement approaches at different institutions.
Methods
A Virtual IMRT QA framework was previously developed using a machine learning algorithm based on 498 IMRT plans, in which QA measurements were performed at Institution 1 using diode‐array detectors and a 3% local/3 mm gamma criterion with a 10% threshold. An independent set of 139 IMRT measurements from a different institution, Institution 2, with QA data based on portal dosimetry using the same gamma index, was used to test the mathematical framework. Only pixels with ≥10% of the maximum calibrated units (CU) or dose were included in the comparison. Plans were characterized by 90 different complexity metrics. A weighted Poisson regression with Lasso regularization was trained to predict passing rates using the complexity metrics as input.
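The gamma passing rate that the framework predicts can be illustrated with a minimal sketch. It assumes a 1D dose profile on a uniform grid, a brute-force search without interpolation, and a threshold applied to the measured profile; the function name `gamma_passing_rate` and its parameters are hypothetical and are not taken from the study, which analyzed 2D measurements with clinical software.

```python
import math


def gamma_passing_rate(measured, calculated, spacing_mm=1.0,
                       dose_tol=0.03, dist_tol_mm=3.0, threshold=0.10):
    """Percentage of evaluated points with gamma <= 1 on a 1D profile.

    Assumptions (illustrative only): local dose normalization, points
    below `threshold` times the maximum measured value are excluded,
    and the search visits grid points only (no interpolation).
    """
    cutoff = threshold * max(measured)
    evaluated = 0
    passed = 0
    for i, d_meas in enumerate(measured):
        if d_meas <= 0 or d_meas < cutoff:
            continue  # low-dose point, excluded from the comparison
        evaluated += 1
        dose_scale = dose_tol * d_meas  # local normalization (3% of local dose)
        best_sq = math.inf
        for j, d_calc in enumerate(calculated):
            dist = (j - i) * spacing_mm
            gamma_sq = (dist / dist_tol_mm) ** 2 + ((d_calc - d_meas) / dose_scale) ** 2
            best_sq = min(best_sq, gamma_sq)
        if math.sqrt(best_sq) <= 1.0:
            passed += 1
    if evaluated == 0:
        raise ValueError("no points above the dose threshold")
    return 100.0 * passed / evaluated


# Toy profiles, not measured data.
profile = [0, 5, 50, 100, 50, 5, 0]
print(gamma_passing_rate(profile, profile))
```

With local normalization the dose tolerance shrinks in low-dose regions, which is one reason such regions tend to fail more readily than under global normalization.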
Results
The methodology predicted passing rates within 3% for all composite plans measured using diode‐array detectors at Institution 1, and within 3.5% for 120 of 139 plans using portal dosimetry measurements performed on a per‐beam basis at Institution 2. The remaining 19 measurements had large areas of low CU, where portal dosimetry shows larger disagreement with the calculated dose, so the failure was expected. These beams need further modeling in the treatment planning system to correct the under‐response in low‐dose regions. Important features selected by Lasso to predict gamma passing rates were as follows: complete irradiated area outline (CIAO), jaw position, fraction of MLC leaves with gaps smaller than 20 or 5 mm, fraction of area receiving less than 50% of the total CU, fraction of the area receiving dose from penumbra, weighted average irregularity factor, and duty cycle.
Conclusions
We have demonstrated that Virtual IMRT QA can predict passing rates using different measurement techniques and across multiple institutions. Prediction of QA passing rates can have profound implications for the current IMRT process.
•Among an extensive set of 32 clinical and dosimetric features, lung V20, mean lung dose, lung V10 and lung V5 are the best individual predictors of radiation pneumonitis in stage II–III LA-NSCLC.
•Combining radiation pneumonitis predictors such as maximum esophagus dose, lung V20, mean lung dose, pack-year, lung V5 and lung V10 using random forest improves on the performance of individual predictors by up to 24.6%.
•Lung V20, maximum esophagus dose and mean lung dose are consistently selected as the most important predictors of radiation pneumonitis by the machine learning algorithms random forest, RUSBoost and CART.
Radiation pneumonitis (RP) is a radiotherapy dose-limiting toxicity for locally advanced non-small cell lung cancer (LA-NSCLC). Prior studies have proposed relevant dosimetric constraints to limit this toxicity. Using machine learning algorithms, we performed analyses of contributing factors in the development of RP to uncover previously unidentified criteria and elucidate the relative importance of individual factors.
We evaluated 32 clinical features per patient in a cohort of 203 stage II–III LA-NSCLC patients treated with definitive chemoradiation to a median dose of 66.6 Gy in 1.8 Gy daily fractions at our institution from 2008 to 2016. Of this cohort, 17.7% of patients developed grade ≥2 RP. Univariate analysis was performed using trained decision stumps to individually analyze statistically significant predictors of RP and perform feature selection. Applying Random Forest, we performed multivariate analysis to assess the combined performance of important predictors of RP.
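A decision stump of the kind used for the univariate analysis can be sketched as a single-threshold classifier. This is a minimal illustration that assumes the threshold is chosen by maximizing sensitivity + specificity - 1 (Youden's J); the abstract does not state the criterion actually used, and the function name `fit_stump` and the toy data are hypothetical.

```python
def fit_stump(values, labels):
    """Fit a one-feature threshold rule: predict positive when value >= threshold.

    Returns (threshold, sensitivity, specificity) for the threshold that
    maximizes sensitivity + specificity - 1. The criterion is an assumption
    made for illustration.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    best = None
    for threshold in sorted(set(values)):
        true_pos = sum(1 for v, y in zip(values, labels) if v >= threshold and y == 1)
        true_neg = sum(1 for v, y in zip(values, labels) if v < threshold and y == 0)
        sensitivity = true_pos / positives
        specificity = true_neg / negatives
        score = sensitivity + specificity - 1
        if best is None or score > best[0]:
            best = (score, threshold, sensitivity, specificity)
    return best[1], best[2], best[3]


# Toy values standing in for a dosimetric feature; not patient data.
feature = [10, 15, 20, 25, 30, 35]
outcome = [0, 0, 0, 1, 1, 1]
print(fit_stump(feature, outcome))
```

Running one stump per feature gives a threshold and a sensitivity/specificity pair for each, which is what allows features to be ranked individually before any multivariate model is built.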
On univariate analysis, lung V20, lung mean, lung V10 and lung V5 were found to be significant RP predictors with the greatest balance of specificity and sensitivity. On multivariate analysis, Random Forest (AUC = 0.66, p = 0.0005) identified esophagus max (20.5%), lung V20 (16.4%), lung mean (15.7%) and pack-year (14.9%) as the most common primary differentiators of RP.
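For readers unfamiliar with the AUC figure reported here, it equals the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the function name `auc` and the toy scores are illustrative and unrelated to the study data.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


print(auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.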
We highlight Random Forest as an accurate machine learning method to identify known and new predictors of symptomatic RP. Furthermore, this analysis confirms the importance of lung V20, lung mean and pack-year as predictors of RP while also introducing esophagus max as an important RP predictor.
Meningiomas are stratified according to tumor grade and extent of resection, often in isolation from other clinical variables. Here, we use machine learning (ML) to integrate demographic, clinical, radiographic and pathologic data to develop predictive models for meningioma outcomes.
We developed a comprehensive database containing information from 235 patients who underwent surgery for 257 meningiomas at a single institution from 1990 to 2015. The median follow-up was 4.3 years, and resection specimens were re-evaluated according to current diagnostic criteria, revealing 128 WHO grade I, 104 grade II and 25 grade III meningiomas. A series of ML algorithms were trained and tuned by nested resampling to create models based on preoperative features, conventional postoperative features, or both. We compared different algorithms' accuracy as well as the unique insights they offered into the data. Machine learning models restricted to preoperative information, such as patient demographics and radiographic features, had similar accuracy for predicting local failure (AUC = 0.74) or overall survival (AUC = 0.68) as models based on meningioma grade and extent of resection (AUC = 0.73 and AUC = 0.72, respectively). Integrated models incorporating all available demographic, clinical, radiographic and pathologic data provided the most accurate estimates (AUC = 0.78 and AUC = 0.74, respectively). From these models, we developed decision trees and nomograms to estimate the risks of local failure or overall survival for meningioma patients.
Clinical information has been historically underutilized in the prediction of meningioma outcomes. Predictive models trained on preoperative clinical data perform comparably to conventional models trained on meningioma grade and extent of resection. Combination of all available information can help stratify meningioma patients more accurately.
To develop a patient-specific 'big data' clinical decision tool to predict pneumonitis in stage I non-small cell lung cancer (NSCLC) patients after stereotactic body radiation therapy (SBRT).
Sixty-one features were recorded for 201 consecutive patients with stage I NSCLC treated with SBRT, of whom 8 (4.0%) developed radiation pneumonitis. Pneumonitis thresholds were found for each feature individually using decision stumps. The performance of three different algorithms (Decision Trees, Random Forests, RUSBoost) was evaluated. Learning curves were developed and the training error analyzed and compared to the testing error in order to evaluate the factors needed to obtain a cross-validated error smaller than 0.1. These included the addition of new features, increasing the complexity of the algorithm and enlarging the sample size and number of events.
In the univariate analysis, the most important feature selected was the diffusion capacity of the lung for carbon monoxide (DLCO adj%). On multivariate analysis, the three most important features selected were the dose to 15 cc of the heart, dose to 4 cc of the trachea or bronchus, and race. Higher accuracy could be achieved if the RUSBoost algorithm was used with regularization. To predict radiation pneumonitis within an error smaller than 10%, we estimate that a sample size of 800 patients is required.
Clinically relevant thresholds that put patients at risk of developing radiation pneumonitis were determined in a cohort of 201 stage I NSCLC patients treated with SBRT. The consistency of these thresholds can provide radiation oncologists with an estimate of their reliability and may inform treatment planning and patient counseling. The accuracy of the classification is limited by the number of patients in the study and not by the features gathered or the complexity of the algorithm.
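With only 8 events among 201 patients, class imbalance is the central difficulty. RUSBoost addresses it by combining random undersampling of the majority class with boosting. The sketch below shows only the undersampling step for a binary outcome, assuming a 1:1 class balance after sampling; the function name `random_undersample` is hypothetical, the target ratio is an assumption, and the boosting rounds are omitted.

```python
import random


def random_undersample(features, labels, seed=0):
    """Keep all minority-class cases and an equal-sized random subset of the majority.

    Binary labels only. The 1:1 target ratio is an assumption for
    illustration; RUSBoost implementations typically expose it as a parameter.
    """
    rng = random.Random(seed)
    minority_label = min(set(labels), key=labels.count)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    kept = minority + rng.sample(majority, len(minority))
    rng.shuffle(kept)  # avoid an ordered block of minority cases
    return [features[i] for i in kept], [labels[i] for i in kept]


# Cohort sizes from the abstract: 201 patients, 8 events. Feature values are placeholders.
labels = [1] * 8 + [0] * 193
features = list(range(201))
sampled_x, sampled_y = random_undersample(features, labels)
print(len(sampled_y), sum(sampled_y))
```

In a boosting setting the undersampling is repeated with a fresh random draw in every round, so different majority-class cases are seen across rounds even though each round trains on a small balanced set.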
Expert-augmented machine learning
Gennatas, Efstathios D.; Friedman, Jerome H.; Ungar, Lyle H. ...
Proceedings of the National Academy of Sciences (PNAS), 03/2020, Volume 117, Issue 9
Journal Article, Peer reviewed, Open access
Machine learning is proving invaluable across disciplines. However, its success is often limited by the quality and quantity of available data, while its adoption is limited by the level of trust afforded by given models. Human vs. machine performance is commonly compared empirically to decide whether a certain task should be performed by a computer or an expert. In reality, the optimal learning strategy may involve combining the complementary strengths of humans and machines. Here, we present expert-augmented machine learning (EAML), an automated method that guides the extraction of expert knowledge and its integration into machine-learned models. We used a large dataset of intensive-care patient data to derive 126 decision rules that predict hospital mortality. Using an online platform, we asked 15 clinicians to assess the relative risk of the subpopulation defined by each rule compared to the total sample. We compared the clinician-assessed risk to the empirical risk and found that, while clinicians agreed with the data in most cases, there were notable exceptions where they overestimated or underestimated the true risk. Studying the rules with greatest disagreement, we identified problems with the training data, including one miscoded variable and one hidden confounder. Filtering the rules based on the extent of disagreement between clinician-assessed risk and empirical risk, we improved performance on out-of-sample data and were able to train with less data. EAML provides a platform for automated creation of problem-specific priors, which help build robust and dependable machine-learning models in critical applications.
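The rule-filtering step can be sketched as follows. This is a simplified illustration, not the published EAML procedure: it assumes that clinician-assessed and empirical risk have already been placed on a common scale, measures disagreement as their absolute difference, and uses hypothetical names (`Rule`, `filter_rules`, `max_disagreement`) and invented toy rules.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    name: str              # hypothetical rule identifier
    empirical_risk: float  # risk estimated from the training data
    clinician_risk: float  # risk assessed by clinicians, same scale (assumption)


def filter_rules(rules, max_disagreement):
    """Keep rules whose clinician-assessed and empirical risks differ by at most the cutoff."""
    return [
        rule for rule in rules
        if abs(rule.empirical_risk - rule.clinician_risk) <= max_disagreement
    ]


# Invented example rules; the values do not come from the paper.
rules = [
    Rule("age > 80", empirical_risk=2.0, clinician_risk=1.5),
    Rule("miscoded variable", empirical_risk=0.5, clinician_risk=3.0),
    Rule("low blood pressure", empirical_risk=2.5, clinician_risk=2.5),
]
kept = filter_rules(rules, max_disagreement=1.0)
print([rule.name for rule in kept])
```

Rules that fall outside the cutoff are the ones worth inspecting by hand, since large disagreement may point to data problems such as the miscoded variable and hidden confounder mentioned above.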
Machine learning algorithms that are both interpretable and accurate are essential in applications such as medicine, where errors can have dire consequences. Unfortunately, there is currently a tradeoff between accuracy and interpretability among state-of-the-art methods. Decision trees are interpretable and are therefore used extensively throughout medicine for stratifying patients. Current decision tree algorithms, however, are consistently outperformed in accuracy by other, less-interpretable machine learning models, such as ensemble methods. We present MediBoost, a novel framework for constructing decision trees that retain interpretability while having accuracy similar to ensemble methods, and compare MediBoost's performance to that of conventional decision trees and ensemble methods on 13 medical classification problems. MediBoost significantly outperformed current decision tree algorithms in 11 out of 13 problems, giving accuracy comparable to ensemble methods. The resulting trees are of the same type as decision trees used throughout clinical practice but have the advantage of improved accuracy. Our algorithm thus gives the best of both worlds: it grows a single, highly interpretable tree that has the high accuracy of ensemble methods.
Clinical decision support systems are a growing class of tools with the potential to impact healthcare. This study investigates the construction of a decision support system through which clinicians can efficiently identify which previously approved historical treatment plans are achievable for a new patient to aid in selection of therapy.
Treatment data were collected for early-stage lung and postoperative oropharyngeal cancers treated using photon (lung and head and neck) and proton (head and neck) radiotherapy. Machine-learning classifiers were constructed using patient-specific feature-sets and a library of historical plans. Model accuracy was analyzed using learning curves, and historical treatment plan matching was investigated.
Learning curves demonstrate that for these datasets, approximately 45, 60, and 30 patients are needed for a sufficiently accurate classification model for radiotherapy for early-stage lung, postoperative oropharyngeal photon, and postoperative oropharyngeal proton, respectively. The resulting classification model provides a database of previously approved treatment plans that are achievable for a new patient. An example case is provided that highlights the tradeoff between heart and chest wall dose in two historical plans with target dose held constant.
We report on the first artificial-intelligence based clinical decision support system that connects patients to past discrete treatment plans in radiation oncology and demonstrate for the first time how this tool can enable clinicians to use past decisions to help inform current assessments. Clinicians can be informed of dose tradeoffs between critical structures early in the treatment process, enabling more time spent on finding the optimal course of treatment for individual patients.
•AI is beginning to transform treatment planning for head and neck patients.
•The complexity and novelty of AI algorithms make them susceptible to misuse.
•AI algorithms are distinct in their advantages and potential applications.
•Raising the level of AI competence will allow benefits to be realized in a controlled and safe manner.
Artificial intelligence (AI) is beginning to transform IMRT treatment planning for head and neck patients. However, the complexity and novelty of AI algorithms make them susceptible to misuse by researchers and clinicians. Understanding nuances of new technologies could serve to mitigate potential clinical implementation pitfalls. This article is intended to facilitate integration of AI into the radiotherapy clinic by providing an overview of AI algorithms, including support vector machines (SVMs), random forests (RF), gradient boosting (GB), and several variations of deep learning. This document describes current AI algorithms that have been applied to head and neck IMRT planning and identifies rapidly growing branches of AI in industry that have potential applications to head and neck cancer patients receiving IMRT. AI algorithms have great clinical potential if used correctly but can also cause harm if misused, so it is important to raise the level of AI competence within radiation oncology so that the benefits can be realized in a controlled and safe manner.
The expansion of machine learning to high-stakes application domains such as medicine, finance, and criminal justice, where making informed decisions requires clear understanding of the model, has increased the interest in interpretable machine learning. The widely used Classification and Regression Trees (CART) have played a major role in health sciences, due to their simple and intuitive explanation of predictions. Ensemble methods like gradient boosting can improve the accuracy of decision trees, but at the expense of the interpretability of the generated model. Additive models, such as those produced by gradient boosting, and full interaction models, such as CART, have been investigated largely in isolation. We show that these models exist along a spectrum, revealing previously unseen connections between these approaches. This paper introduces a rigorous formalization for the additive tree, an empirically validated learning technique for creating a single decision tree, and shows that this method can produce models equivalent to CART or gradient boosted stumps at the extremes by varying a single parameter. Although the additive tree is designed primarily to provide both the model interpretability and predictive performance needed for high-stakes applications like medicine, it also can produce decision trees represented by hybrid models between CART and boosted stumps that can outperform either of these approaches.