Advances in Auto-Segmentation Cardenas, Carlos E.; Yang, Jinzhong; Anderson, Brian M. ...
Seminars in Radiation Oncology, July 2019, Volume 29, Issue 3
Journal Article
Peer reviewed
Manual image segmentation is a time-consuming task routinely performed in radiotherapy to identify each patient's targets and anatomical structures. The efficacy and safety of the radiotherapy plan require accurate segmentations, as these regions of interest are generally used to optimize and assess the quality of the plan. However, reports have shown that this process can be subject to significant inter- and intraobserver variability. Furthermore, the quality of the radiotherapy treatment, and subsequent analyses (ie, radiomics, dosimetric), can be subject to the accuracy of these manual segmentations. Automatic segmentation (or auto-segmentation) of targets and normal tissues is, therefore, preferable as it would address these challenges. Previously, auto-segmentation techniques have been clustered into 3 generations of algorithms, with multiatlas-based and hybrid techniques (third generation) being considered the state of the art. More recently, however, the field of medical image segmentation has seen accelerated growth driven by advances in computer vision, particularly through the application of deep learning algorithms, suggesting we have entered the fourth generation of auto-segmentation algorithm development. In this paper, the authors review traditional (non-deep learning) algorithms particularly relevant for applications in radiotherapy. Concepts from deep learning are introduced, focusing on convolutional neural networks and fully convolutional networks, which are generally used for segmentation tasks. Furthermore, the authors provide a summary of deep learning auto-segmentation radiotherapy applications reported in the literature. Lastly, considerations for clinical deployment (commissioning and QA) of auto-segmentation software are provided.
Marfan syndrome (MFS) is an autosomal dominant disorder of the connective tissue caused by mutations in the FBN1 (fibrillin-1) gene encoding a large glycoprotein in the extracellular matrix called fibrillin-1. The major complication of this connective tissue disorder is the risk of developing thoracic aortic aneurysm. To date, no effective pharmacologic therapies have been identified for the management of thoracic aortic disease, and the only options capable of preventing aneurysm rupture are endovascular repair or open surgery. Here, we have studied the role of mitochondrial dysfunction in the progression of thoracic aortic aneurysm and mitochondrial boosting strategies as a potential treatment for managing aortic aneurysms.
Combining transcriptomics and metabolic analysis of aortas from an MFS mouse model and MFS patients, we have identified mitochondrial dysfunction alongside mtDNA depletion as a new hallmark of aortic aneurysm disease in MFS. To demonstrate the importance of mitochondrial decline in the development of aneurysms, we generated a conditional mouse model with mitochondrial dysfunction specifically in vascular smooth muscle cells (VSMC) by conditionally depleting Tfam (mitochondrial transcription factor A). We used a mouse model of MFS to test for drugs that can revert aortic disease by enhancing Tfam levels and mitochondrial respiration.
The main canonical pathways highlighted in the transcriptomic analysis in aortas from MFS mice were those related to metabolic function, such as mitochondrial dysfunction. Mitochondrial complexes, whose transcription depends on Tfam and mitochondrial DNA content, were reduced in aortas from young MFS mice. In vitro, Fbn1-silenced VSMCs presented increased lactate production and decreased oxygen consumption. Similar results were found in MFS patients. VSMCs seeded in matrices produced by Fbn1-deficient VSMCs underwent mitochondrial dysfunction. Conditional Tfam-deficient VSMC mice lost their contractile capacity, showed aortic aneurysms, and died prematurely. Restoring mitochondrial metabolism with the NAD precursor nicotinamide riboside rapidly reversed aortic aneurysm in MFS mice.
Mitochondrial function of VSMCs is controlled by the extracellular matrix and drives the development of aortic aneurysm in Marfan syndrome. Targeting vascular metabolism is a new therapeutic strategy available for managing aortic aneurysms associated with genetic disorders.
• Clinical target volume definition in radiotherapy is challenging.
• The contribution of computational methods is discussed.
• Goals are automation, consistency, and ultimately improvements.
• Image segmentation algorithms can automate the process in parts.
• Mathematical models may quantitatively describe tumor progression patterns.
Treatment planning in radiotherapy distinguishes three target volume concepts: the gross tumor volume (GTV), the clinical target volume (CTV), and the planning target volume (PTV). Over time, GTV definition and PTV margins have improved through the development of novel imaging techniques and better image guidance, respectively. CTV definition is sometimes considered the weakest element in the planning process. CTV definition is particularly complex since the extension of microscopic disease cannot be seen using currently available in-vivo imaging techniques. Instead, CTV definition has to incorporate knowledge of the patterns of tumor progression. While CTV delineation has largely been considered the domain of radiation oncologists, this paper, arising from a 2019 ESTRO Physics research workshop, discusses the contributions that medical physics and computer science can make by developing computational methods to support CTV definition. First, we overview the role of image segmentation algorithms, which may in part automate CTV delineation through segmentation of lymph node stations or normal tissues representing anatomical boundaries of microscopic tumor progression. The recent success of deep convolutional neural networks has also enabled learning entire CTV delineations from examples. Second, we discuss the use of mathematical models of tumor progression for CTV definition, using as example the application of glioma growth models to facilitate GTV-to-CTV expansion for glioblastoma that is consistent with neuroanatomy. We further consider statistical machine learning models to quantify lymphatic metastatic progression of tumors, which may eventually improve elective CTV definition. Lastly, we discuss approaches to incorporate uncertainty in CTV definition into treatment plan optimization as well as general limitations of the CTV concept in the case of infiltrating tumors without natural boundaries.
Purpose
To develop a head and neck normal structures autocontouring tool that could be used to automatically detect the errors in autocontours from a clinically validated autocontouring tool.
Methods
An autocontouring tool based on convolutional neural networks (CNN) was developed for 16 normal structures of the head and neck and tested to identify the contour errors from a clinically validated multiatlas‐based autocontouring system (MACS). The computed tomography (CT) scans and clinical contours from 3495 patients were semiautomatically curated and used to train and validate the CNN‐based autocontouring tool. The final accuracy of the tool was evaluated by calculating the Sørensen–Dice similarity coefficients (DSC) and Hausdorff distances between the automatically generated contours and physician‐drawn contours on 174 internal and 24 external CT scans. Lastly, the CNN‐based tool was evaluated on 60 patients' CT scans to investigate the feasibility of detecting contouring failures. The contouring failures on these patients were classified as either minor or major errors. The criteria to detect contouring errors were determined by analyzing the DSC between the CNN‐ and MACS‐based contours under two independent scenarios: (a) contours with minor errors are clinically acceptable and (b) contours with minor errors are clinically unacceptable.
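The error-detection idea described in these Methods can be illustrated with a minimal sketch, assuming each contour is available as a set of voxel indices. The function names and the 0.8 threshold are hypothetical and chosen only for illustration; the abstract does not report the per-structure DSC criteria the authors derived.

```python
def dice(a, b):
    """Sørensen–Dice similarity coefficient between two voxel sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty contours agree trivially
    return 2.0 * len(a & b) / (len(a) + len(b))


def flag_contour(cnn_voxels, macs_voxels, threshold=0.8):
    """Flag a MACS contour for review when its DSC with the CNN contour is low.

    `threshold` is a hypothetical value, not one taken from the abstract.
    """
    return dice(cnn_voxels, macs_voxels) < threshold


# Toy 2D voxel indices for a single structure.
cnn = {(0, 0), (0, 1), (1, 0), (1, 1)}
macs_good = {(0, 0), (0, 1), (1, 0), (1, 1), (2, 1)}
macs_bad = {(1, 1), (2, 2), (3, 3), (4, 4)}

print(flag_contour(cnn, macs_good))  # False: DSC is 8/9
print(flag_contour(cnn, macs_bad))   # True: DSC is 0.25
```

Under scenario (a) or (b) the same comparison would simply be run with a different threshold per structure.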
Results
The average DSC and Hausdorff distance of our CNN‐based tool were 98.4%/1.23 cm for brain, 89.1%/0.42 cm for eyes, 86.8%/1.28 cm for mandible, 86.4%/0.88 cm for brainstem, 83.4%/0.71 cm for spinal cord, 82.7%/1.37 cm for parotids, 80.7%/1.08 cm for esophagus, 71.7%/0.39 cm for lenses, 68.6%/0.72 cm for optic nerves, 66.4%/0.46 cm for cochleas, and 40.7%/0.96 cm for optic chiasm. With the error detection tool, the proportions of the clinically unacceptable MACS contours that were correctly detected were 0.99/0.80 on average, except for the optic chiasm, when contours with minor errors are clinically acceptable/unacceptable, respectively. The proportions of the clinically acceptable MACS contours that were correctly detected were 0.81/0.60 on average, except for the optic chiasm, when contours with minor errors are clinically acceptable/unacceptable, respectively.
Conclusion
Our CNN‐based autocontouring tool performed well on both the publicly available and the internal datasets. Furthermore, our results show that CNN‐based algorithms are able to identify ill‐defined contours from a clinically validated and used multiatlas‐based autocontouring tool. Therefore, our CNN‐based tool can effectively perform automatic verification of MACS contours.
Purpose
Radiation therapy treatment planning is a time‐consuming and iterative manual process. Consequently, plan quality varies greatly between and within institutions. Artificial intelligence shows great promise in improving plan quality and reducing planning times. This technical note describes our participation in the American Association of Physicists in Medicine Open Knowledge‐Based Planning Challenge (OpenKBP), a competition to accurately predict radiation therapy dose distributions.
Methods
A three‐dimensional (3D) densely connected U‐Net with dilated convolutions was developed to predict 3D dose distributions given contoured CT images of head and neck patients as input. While traditional augmentation techniques such as rotations and translations were explored, it was found that training on random patches alone resulted in the greatest model performance. A custom‐weighted mean squared error loss function was employed. Finally, an ensemble of best‐performing networks was used to generate the final challenge predictions.
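Two of the training ingredients named here, random patch sampling and a weighted mean squared error, can be sketched in plain Python. This is an illustration under stated assumptions, not the team's implementation: the abstract does not specify how the custom weights were defined, so the `weights` argument and all function names below are hypothetical.

```python
import random


def weighted_mse(pred, target, weights):
    """Weighted mean squared error over flattened voxel lists."""
    if not (len(pred) == len(target) == len(weights)):
        raise ValueError("inputs must have equal length")
    squared = sum(w * (p - t) ** 2 for p, t, w in zip(pred, target, weights))
    return squared / sum(weights)


def random_patch_origin(volume_shape, patch_shape, rng=random):
    """Pick a random corner such that the patch fits inside the volume."""
    return tuple(rng.randint(0, v - p) for v, p in zip(volume_shape, patch_shape))


def extract_patch(volume, origin, patch_shape):
    """Crop a patch from a volume stored as nested lists indexed [z][y][x]."""
    z0, y0, x0 = origin
    dz, dy, dx = patch_shape
    return [[row[x0:x0 + dx] for row in plane[y0:y0 + dy]]
            for plane in volume[z0:z0 + dz]]


# Toy 4x4x4 volume whose voxel value encodes its own position.
volume = [[[z * 16 + y * 4 + x for x in range(4)] for y in range(4)]
          for z in range(4)]
rng = random.Random(0)  # seeded only to make the example repeatable
origin = random_patch_origin((4, 4, 4), (2, 2, 2), rng)
patch = extract_patch(volume, origin, (2, 2, 2))

# Hypothetical weighting: the third voxel counts twice as much as the others.
loss = weighted_mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0], [1.0, 1.0, 2.0])
print(loss)  # (2 * 2**2) / 4 = 2.0
```

In a real pipeline the patches and loss would be handled by a deep learning framework; the sketch only shows the arithmetic.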
Results
Our team (SuperPod) placed second in the dose stream of the OpenKBP challenge. The average mean absolute difference between the predicted and clinical dose distributions of the testing dataset was 2.56 Gy. On average, the predicted normalized target DVH metrics were within 3% of the clinical plans, and the predicted organ at risk DVH metrics were within 2 Gy of the clinical plans.
Conclusions
The developed 3D dense dilated U‐Net architecture can accurately predict 3D radiotherapy dose distributions and can be used as part of a fully automated radiation therapy planning pipeline.
Purpose
To investigate the impact of computed tomography (CT) image acquisition and reconstruction parameters, including slice thickness, pixel size, and dose, on automatic contouring algorithms.
Methods
Eleven scans from patients with head‐and‐neck cancer were reconstructed with varying slice thicknesses and pixel sizes. CT dose was varied by adding noise using low‐dose simulation software. The impact of these imaging parameters on two in‐house auto‐contouring algorithms, one convolutional neural network (CNN)‐based and one multiatlas‐based system (MACS), was investigated for 183 reconstructed scans. For each algorithm, auto‐contours for organs‐at‐risk were compared with auto‐contours from scans with 3 mm slice thickness, 0.977 mm pixel size, and 100% CT dose using Dice similarity coefficient (DSC), Hausdorff distance (HD), and mean surface distance (MSD).
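The two distance metrics named in these Methods can be sketched as follows, assuming each contour surface is given as a list of 3D points in millimetres. The function names are hypothetical, and the MSD shown is one common symmetric definition; the abstract does not state which variant the authors used.

```python
import math


def directed_distances(points_a, points_b):
    """For each point in A, the distance to its nearest point in B."""
    return [min(math.dist(a, b) for b in points_b) for a in points_a]


def hausdorff_distance(points_a, points_b):
    """Symmetric (maximum) Hausdorff distance between two point sets."""
    return max(max(directed_distances(points_a, points_b)),
               max(directed_distances(points_b, points_a)))


def mean_surface_distance(points_a, points_b):
    """Mean of the nearest-neighbour distances taken in both directions."""
    d_ab = directed_distances(points_a, points_b)
    d_ba = directed_distances(points_b, points_a)
    return (sum(d_ab) + sum(d_ba)) / (len(d_ab) + len(d_ba))


# Toy surfaces: a baseline contour and one from a modified reconstruction.
baseline = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
modified = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]

print(hausdorff_distance(baseline, modified))     # 4.0
print(mean_surface_distance(baseline, modified))  # 1.75
```

The brute-force nearest-neighbour search is quadratic in the number of points, which is acceptable for a toy example but not for full-resolution surfaces.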
Results
Increasing the slice thickness from baseline value of 3 mm gave a progressive reduction in DSC and an increase in HD and MSD on average for all structures. Reducing the CT dose only had a relatively minimal effect on DSC and HD. The rate of change with respect to dose for both auto‐contouring methods is approximately 0. Changes in pixel size had a small effect on DSC and HD for CNN‐based auto‐contouring with differences in DSC being within 0.07. Small structures had larger deviations from the baseline values than large structures for DSC. The relative differences in HD and MSD between the large and small structures were small.
Conclusions
Auto‐contours can deviate substantially with changes in CT acquisition and reconstruction parameters, especially slice thickness and pixel size. The CNN was less sensitive to changes in pixel size and dose level than the MACS. The results contraindicated the use of parameter values more restrictive than those of a typical head‐and‐neck imaging protocol.
To develop a deep learning model that generates consistent, high-quality lymph node clinical target volume (CTV) contours for head and neck cancer (HNC) patients, as an integral part of a fully automated radiation treatment planning workflow.
Computed tomography (CT) scans from 71 HNC patients were retrospectively collected and split into training (n = 51), cross-validation (n = 10), and test (n = 10) data sets. All had target volume delineations covering lymph node levels Ia through V (Ia-V), Ib through V (Ib-V), II through IV (II-IV), and retropharyngeal (RP) nodes, which were previously approved by a radiation oncologist specializing in HNC. Volumes of interest (VOIs) about nodal levels were automatically identified using computer vision techniques. The VOI (cropped CT image) and approved contours were used to train a U-Net autosegmentation model. Each lymph node level was trained independently, with model parameters optimized by assessing performance on the cross-validation data set. Once optimal model parameters were identified, overlap and distance metrics were calculated between ground truth and autosegmentations on the test set. Lastly, this final model was used on 32 additional patient scans (not included in original 71 cases) and autosegmentations visually rated by 3 radiation oncologists as being “clinically acceptable without requiring edits,” “requiring minor edits,” or “requiring major edits.”
When comparing ground truths to autosegmentations on the test data set, median Dice Similarity Coefficients were 0.90, 0.90, 0.89, and 0.81, and median mean surface distance values were 1.0 mm, 1.0 mm, 1.1 mm, and 1.3 mm for node levels Ia-V, Ib-V, II-IV, and RP nodes, respectively. Qualitative scoring varied among physicians. Overall, 99% of autosegmented target volumes were either scored as being clinically acceptable or requiring minor edits (ie, stylistic recommendations, <2 minutes).
We developed a fully automated artificial intelligence approach to autodelineate nodal CTVs for patients with intact HNC. Most autosegmentations were found to be clinically acceptable after qualitative review when considering recommended stylistic edits. This promising work automatically delineates nodal CTVs in a robust and consistent manner; this approach can be implemented in ongoing efforts for fully automated radiation treatment planning.
The future of artificial intelligence (AI) heralds unprecedented change for the field of radiation oncology. Commercial vendors and academic institutions have created AI tools for radiation oncology, but such tools have not yet been widely adopted into clinical practice. In addition, numerous discussions have prompted careful thoughts about AI's impact upon the future landscape of radiation oncology: How can we preserve innovation, creativity, and patient safety? When will AI-based tools be widely adopted into the clinic? Will the need for clinical staff be reduced? How will these devices and tools be developed and regulated?
In this work, we examine how deep learning, a rapidly emerging subset of AI, fits into the broader historical context of advancements made in radiation oncology and medical physics. In addition, we examine a representative set of deep learning-based tools that are being made available for use in external beam radiotherapy treatment planning and how these deep learning-based tools and other AI-based tools will impact members of the radiation treatment planning team. Key Messages: Compared to past transformative innovations explored in this article, such as the Monte Carlo method or intensity-modulated radiotherapy, the development and adoption of deep learning-based tools is occurring at faster rates and promises to transform practices of the radiation treatment planning team. However, accessibility to these tools will be determined by each clinic's access to the internet, web-based solutions, or high-performance computing hardware. As seen by the trends exhibited by many technologies, high dependence on new technology can result in harm should the product fail in an unexpected manner, be misused by the operator, or if the mitigation to an expected failure is not adequate. Thus, the need for developers and researchers to rigorously validate deep learning-based tools, for users to understand how to operate tools appropriately, and for professional bodies to develop guidelines for their use and maintenance is essential. Given that members of the radiation treatment planning team perform many tasks that are automatable, the use of deep learning-based tools, in combination with other automated treatment planning tools, may refocus tasks performed by the treatment planning team and may potentially reduce resource-related burdens for clinics with limited resources.
Automating and standardizing the contouring of clinical target volumes (CTVs) can reduce interphysician variability, which is one of the largest sources of uncertainty in head and neck radiation therapy. Aside from the use of uniform margin expansions to auto-delineate high-risk CTVs, very little work has been performed to provide patient- and disease-specific high-risk CTVs. The aim of the present study was to develop a deep neural network for the auto-delineation of high-risk CTVs.
Fifty-two oropharyngeal cancer patients were selected for the present study. All patients were treated at The University of Texas MD Anderson Cancer Center from January 2006 to August 2010 and had previously contoured gross tumor volumes and CTVs. We developed a deep learning algorithm using deep auto-encoders to identify physician contouring patterns at our institution. These models use distance map information from surrounding anatomic structures and the gross tumor volume as input parameters and conduct voxel-based classification to identify voxels that are part of the high-risk CTV. In addition, we developed a novel probability threshold selection function, based on the Dice similarity coefficient (DSC), to improve the generalization of the predicted volumes. The DSC-based function is implemented during an inner cross-validation loop, and probability thresholds are selected a priori during model parameter optimization. We performed a volumetric comparison between the predicted and manually contoured volumes to assess our model.
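The idea of choosing a probability threshold by its Dice similarity coefficient can be sketched as below. This is a simplified stand-in, not the authors' selection function: it assumes the rule is to pick the candidate threshold with the highest mean DSC over the inner cross-validation cases, and all names are hypothetical.

```python
def dice_at_threshold(probs, truth, threshold):
    """DSC between thresholded voxel probabilities and a binary ground truth."""
    pred = [p >= threshold for p in probs]
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    size = sum(pred) + sum(truth)
    return 1.0 if size == 0 else 2.0 * overlap / size


def select_threshold(cases, candidates):
    """Return the candidate threshold with the highest mean DSC over cases.

    `cases` is a list of (probabilities, ground_truth) pairs; the averaging
    rule is an assumption made for this illustration.
    """
    def mean_dsc(threshold):
        scores = [dice_at_threshold(p, t, threshold) for p, t in cases]
        return sum(scores) / len(scores)

    return max(candidates, key=mean_dsc)


# Toy validation cases: per-voxel probabilities and manual labels.
cases = [
    ([0.9, 0.8, 0.4, 0.1], [1, 1, 0, 0]),
    ([0.7, 0.6, 0.55, 0.2], [1, 1, 1, 0]),
]
best = select_threshold(cases, [0.25, 0.5, 0.75])
print(best)  # 0.5
```

Fixing the threshold on validation folds, before the test patients are seen, is what makes the selection a priori in the sense the Methods describe.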
The predicted volumes had a median DSC value of 0.81 (range 0.62-0.90), median mean surface distance of 2.8 mm (range 1.6-5.5), and median 95th-percentile Hausdorff distance of 7.5 mm (range 4.7-17.9) when comparing our predicted high-risk CTVs with the physician manual contours.
These predicted high-risk CTVs provided close agreement to the ground-truth compared with current interobserver variability. The predicted contours could be implemented clinically, with only minor or no changes.
Purpose
To develop a tool for the automatic contouring of clinical target volumes (CTVs) and normal tissues for radiotherapy treatment planning in cervical cancer patients.
Methods
An auto‐contouring tool based on convolutional neural networks (CNN) was developed to delineate three cervical CTVs and 11 normal structures (seven OARs, four bony structures) in cervical cancer treatment for use with the Radiation Planning Assistant, a web‐based automatic plan generation system. A total of 2254 retrospective clinical computed tomography (CT) scans from a single cancer center and 210 CT scans from a segmentation challenge were used to train and validate the CNN‐based auto‐contouring tool. The accuracy of the tool was evaluated by calculating the Sørensen–Dice similarity coefficient (DSC) and mean surface and Hausdorff distances between the automatically generated contours and physician‐drawn contours on 140 internal CT scans. A radiation oncologist scored the automatically generated contours on 30 external CT scans from three South African hospitals.
Results
The average DSC, mean surface distance, and Hausdorff distance of our CNN‐based tool were 0.86/0.19 cm/2.02 cm for the primary CTV, 0.81/0.21 cm/2.09 cm for the nodal CTV, 0.76/0.27 cm/2.00 cm for the PAN CTV, 0.89/0.11 cm/1.07 cm for the bladder, 0.81/0.18 cm/1.66 cm for the rectum, 0.90/0.06 cm/0.65 cm for the spinal cord, 0.94/0.06 cm/0.60 cm for the left femur, 0.93/0.07 cm/0.66 cm for the right femur, 0.94/0.08 cm/0.76 cm for the left kidney, 0.95/0.07 cm/0.84 cm for the right kidney, 0.93/0.05 cm/1.06 cm for the pelvic bone, 0.91/0.07 cm/1.25 cm for the sacrum, 0.91/0.07 cm/0.53 cm for the L4 vertebral body, and 0.90/0.08 cm/0.68 cm for the L5 vertebral body. On average, 80% of the CTVs, 97% of the organs at risk, and 98% of the bony structure contours in the external test dataset were clinically acceptable based on physician review.
Conclusions
Our CNN‐based auto‐contouring tool performed well on both internal and external datasets and had a high rate of clinical acceptability.