Mammography is a very well-established imaging modality for the early detection and diagnosis of breast cancer. However, since the introduction of digital imaging to the realm of radiology, more advanced, and especially tomographic, imaging methods have been made possible. One of these methods, breast tomosynthesis, has finally been introduced to the clinic for routine everyday use, with the potential to replace mammography for breast cancer screening in the future. In this two-part paper, the extensive research performed during the development of breast tomosynthesis is reviewed, with a focus on the research addressing the medical physics aspects of this imaging modality. This first paper will review the research performed on the issues relevant to the image acquisition process, including system design, optimization of geometry and technique, x-ray scatter, and radiation dose. The companion to this paper will review all other aspects of breast tomosynthesis imaging, including the reconstruction process.
Many important post-acquisition aspects of breast tomosynthesis imaging can impact its clinical performance. Chief among them is the reconstruction algorithm that generates the representation of the three-dimensional breast volume from the acquired projections. But even after reconstruction, additional processes, such as artifact reduction algorithms and computer-aided detection and diagnosis, among others, can also impact the performance of breast tomosynthesis in the clinical realm. In this two-part paper, a review of breast tomosynthesis research is performed, with an emphasis on its medical physics aspects. In the companion paper, the first part of this review, the research performed relevant to the image acquisition process is examined. This second part will review the research on the post-acquisition aspects, including reconstruction, image processing, and analysis, as well as the advanced applications being investigated for breast tomosynthesis.
Objectives
To evaluate the technical performance of an ultra-high-resolution CT (UHRCT) system.
Methods
The physico-technical capabilities of a novel commercial UHRCT system were assessed and compared with those of a current-generation multi-detector (MDCT) system. The super-high-resolution (SHR) mode of the system uses 0.25 mm (at isocentre) detector elements (dels) in the in-plane and longitudinal directions, while the high-resolution (HR) mode bins two dels in the longitudinal direction. The normal-resolution (NR) mode bins dels 2 × 2, resulting in a del-size equivalent to that of the MDCT system. In general, standard procedures and phantoms were used to perform these assessments.
Results
The UHRCT MTF (10% MTF 4.1 lp/mm) is twice as high as that of the MDCT (10% MTF 1.9 lp/mm), which is comparable to the MTF in the NR mode (10% MTF 1.7 lp/mm). The width of the slice sensitivity profile in the SHR mode (FWHM 0.45 mm) is about 60% of that of the MDCT (FWHM 0.77 mm). Uniformity and CT numbers are within the expected range. Noise in the high-resolution modes has a higher magnitude and higher frequency components compared with MDCT. Low-contrast visibility is lower for the NR, HR and SHR modes compared with MDCT, but a dose increase of about 14% for NR and 23% for HR and SHR gives the same results.
Conclusions
HR and SHR mode scanning results in double the spatial resolution, with about a 23% increase in dose required to achieve the same low-contrast detectability.
Key Points
• Resolution on UHRCT is up to twice as high as for the tested MDCT.
• With abdominal settings, UHRCT needs higher dose for the same low-contrast detectability as MDCT, but dose is still below achievable levels as defined by current diagnostic reference levels.
• The UHRCT system used in normal-resolution mode yields resolution and noise characteristics comparable to those of the MDCT system.
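The slice sensitivity profile widths quoted in the Results are full widths at half maximum (FWHM). As a minimal sketch of how such a width might be extracted from a sampled profile, the Python function below locates the two half-maximum crossings by linear interpolation. The function name `fwhm` and the toy triangular profile are illustrative assumptions, not the measurement procedure used in the study.

```python
def fwhm(positions, values):
    """Full width at half maximum of a single-peaked sampled profile.

    The half-maximum crossings on each side of the peak are found by
    linear interpolation between neighbouring samples.
    """
    peak = max(values)
    half = peak / 2.0
    i_peak = values.index(peak)

    def crossing(i_above, i_below):
        # Interpolate between a sample at/above half maximum and one below it.
        x0, y0 = positions[i_above], values[i_above]
        x1, y1 = positions[i_below], values[i_below]
        return x0 + (half - y0) * (x1 - x0) / (y1 - y0)

    # Walk left from the peak while samples stay at or above half maximum.
    i = i_peak
    while i > 0 and values[i - 1] >= half:
        i -= 1
    if i == 0:
        raise ValueError("profile does not fall below half maximum on the left")
    left = crossing(i, i - 1)

    # Walk right from the peak in the same way.
    j = i_peak
    while j < len(values) - 1 and values[j + 1] >= half:
        j += 1
    if j == len(values) - 1:
        raise ValueError("profile does not fall below half maximum on the right")
    right = crossing(j, j + 1)

    return right - left


# Toy (hypothetical) triangular profile sampled every 0.5 mm.
toy_positions = [-1.0, -0.5, 0.0, 0.5, 1.0]
toy_values = [0.0, 0.25, 1.0, 0.25, 0.0]
print(f"FWHM of toy profile: {fwhm(toy_positions, toy_values):.3f} mm")
```

The ratio of the two reported widths can be checked directly: 0.45 / 0.77 is approximately 0.58, consistent with the "about 60%" stated in the Results.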
To comprehensively characterize the dosimetric properties of a clinical digital breast tomosynthesis (DBT) system for the acquisition of mammographic and tomosynthesis images.
Compressible water-oil mixture phantoms were created and imaged by using the automatic exposure control (AEC) of the Selenia Dimensions system (Hologic, Bedford, Mass) in both DBT and full-field digital mammography (FFDM) mode. Empirical measurements of the x-ray tube output were performed with a dosimeter to measure the air kerma for the range of tube current-exposure time product settings and to develop models of the automatically selected x-ray spectra. A Monte Carlo simulation of the system was developed and used in conjunction with the AEC-chosen settings and spectra models to compute and compare the mean glandular dose (MGD) resulting from both imaging modalities for breasts of varying sizes and glandular compositions.
Acquisition of a single craniocaudal view resulted in an MGD ranging from 0.309 to 5.26 mGy in FFDM mode and from 0.657 to 3.52 mGy in DBT mode. For a breast with a compressed thickness of 5.0 cm and a 50% glandular fraction, a DBT acquisition resulted in an MGD only 8% higher than that of an FFDM acquisition (1.30 and 1.20 mGy, respectively). For a breast with a compressed thickness of 6.0 cm and a 14.3% glandular fraction, a DBT acquisition resulted in an MGD 83% higher than that of an FFDM acquisition (2.12 and 1.16 mGy, respectively).
For two-dimensional-three-dimensional fusion imaging with the Selenia Dimensions system, the MGD for a 5-cm-thick 50% glandular breast is 2.50 mGy, which is less than the Mammography Quality Standards Act limit for a two-view screening mammography study.
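The percentage differences and the combined dose quoted above follow from simple arithmetic on the reported MGD values. The short Python sketch below reproduces them; the helper name `percent_increase` is illustrative, and treating the fusion dose as the plain sum of one FFDM and one DBT acquisition is an assumption that happens to reproduce the 2.50 mGy figure.

```python
def percent_increase(new, reference):
    """Relative increase of `new` over `reference`, in percent."""
    return 100.0 * (new - reference) / reference


# MGD values in mGy as quoted in the text: (DBT, FFDM).
cases = {
    "5.0 cm, 50% glandular": (1.30, 1.20),
    "6.0 cm, 14.3% glandular": (2.12, 1.16),
}

for label, (dbt, ffdm) in cases.items():
    print(f"{label}: DBT MGD is {percent_increase(dbt, ffdm):.0f}% higher than FFDM")

# Assumption: fusion (2D + 3D) dose is the sum of the two single-view doses.
dbt, ffdm = cases["5.0 cm, 50% glandular"]
print(f"Combined FFDM + DBT MGD: {dbt + ffdm:.2f} mGy")
```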
Although computers have had a role in interpretation of mammograms for at least two decades, their impact on performance has not lived up to expectations. However, in the last five years, the field of medical image analysis has undergone a revolution due to the introduction of deep learning convolutional neural networks, a form of artificial intelligence (AI). Because of their considerably higher performance compared to conventional computer-aided detection methods, these AI algorithms have resulted in renewed interest in their potential for interpreting breast images in stand-alone mode. For this, first the actual capability of the algorithms, compared to breast radiologists, needs to be well understood. Although early studies have pointed to the comparable performance between AI systems and breast radiologists in interpreting mammograms, these comparisons have been performed in laboratory conditions with limited, enriched datasets. AI algorithms with performance comparable to breast radiologists could be used in a number of different ways, the most impactful being pre-selection, or triaging, of normal screening mammograms that would not need human interpretation. Initial studies evaluating this proposed use have shown very promising results, with the resulting accuracy of the complete screening process not being affected, but with a significant reduction in workload. There is a need to perform additional studies, especially prospective ones, with large screening data sets, to gauge both the actual stand-alone performance of these new algorithms and the impact of the different implementation possibilities on screening programs.
• AI-based mammography interpretation systems are feasible for stand-alone mode use.
• Studies to date have shown that their performance approximates that of radiologists.
• Larger scale, prospective screening trials are needed to determine their impact.
• Once proven, AI identification of normal cases could reduce the radiologist workload.
Objectives
To evaluate image quality and reconstruction times of a commercial deep learning reconstruction algorithm (DLR) compared to hybrid-iterative reconstruction (Hybrid-IR) and model-based iterative reconstruction (MBIR) algorithms for cerebral non-contrast CT (NCCT).
Methods
Cerebral NCCT acquisitions of 50 consecutive patients were reconstructed using DLR, Hybrid-IR and MBIR with a clinical CT system. Image quality, in terms of six subjective characteristics (noise, sharpness, grey-white matter differentiation, artefacts, natural appearance and overall image quality), was scored by five observers. As objective metrics of image quality, the noise magnitude and signal-difference-to-noise ratio (SDNR) of the grey and white matter were calculated. Mean values for the image quality characteristics scored by the observers were estimated using a general linear model to account for multiple readers. The estimated means for the reconstruction methods were pairwise compared. Calculated measures were compared using paired t tests.
Results
For all image quality characteristics, DLR images were scored significantly higher than MBIR images. Compared to Hybrid-IR, perceived noise and grey-white matter differentiation were better with DLR, while no difference was detected for other image quality characteristics. Noise magnitude was lower for DLR compared to Hybrid-IR and MBIR (5.6, 6.4 and 6.2, respectively) and SDNR higher (2.4, 1.9 and 2.0, respectively). Reconstruction times were 27 s, 44 s and 176 s for Hybrid-IR, DLR and MBIR respectively.
Conclusions
With a slight increase in reconstruction time, DLR results in lower noise and improved tissue differentiation compared to Hybrid-IR. Image quality of MBIR is significantly lower compared to DLR with much longer reconstruction times.
Key Points
• Deep learning reconstruction of cerebral non-contrast CT results in lower noise and improved tissue differentiation compared to hybrid-iterative reconstruction.
• Deep learning reconstruction of cerebral non-contrast CT results in better image quality in all aspects evaluated compared to model-based iterative reconstruction.
• Deep learning reconstruction only needs a slight increase in reconstruction time compared to hybrid-iterative reconstruction, while model-based iterative reconstruction requires considerably longer processing time.
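The objective metrics described in the Methods (SDNR between grey and white matter, compared across reconstructions with paired t tests) can be illustrated with a short Python sketch. The abstract does not give the exact SDNR formula, so the definition used here, the absolute difference of the region-of-interest means divided by the root mean square of the two ROI standard deviations, is an assumption, as are the function names and the toy pixel values.

```python
import math
import statistics


def sdnr(roi_a, roi_b):
    """Signal-difference-to-noise ratio between two ROIs (pixel values in HU).

    Assumed definition: |mean_a - mean_b| divided by the root mean square
    of the two ROI standard deviations.
    """
    mean_a = statistics.fmean(roi_a)
    mean_b = statistics.fmean(roi_b)
    noise = math.sqrt((statistics.variance(roi_a) + statistics.variance(roi_b)) / 2.0)
    return abs(mean_a - mean_b) / noise


def paired_t(x, y):
    """Paired t statistic and degrees of freedom for matched samples x and y."""
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.fmean(diffs) / standard_error, n - 1


# Toy (hypothetical) grey- and white-matter ROI samples.
grey = [36.0, 38.0, 40.0]
white = [28.0, 30.0, 32.0]
print(f"SDNR: {sdnr(grey, white):.2f}")

# Toy (hypothetical) per-patient SDNR values for two reconstructions.
t_stat, dof = paired_t([2.0, 3.0, 4.0, 5.0], [1.0, 1.0, 3.0, 3.0])
print(f"paired t = {t_stat:.3f} with {dof} degrees of freedom")
```

Converting the t statistic to a p value requires the Student t distribution, which is not in the Python standard library; a statistics package would normally be used for that step.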
Studies involving Monte Carlo simulations are common in both diagnostic and therapy medical physics research, as well as other fields of basic and applied science. As with all experimental studies, the conditions and parameters used for Monte Carlo simulations impact their scope, validity, limitations, and generalizability. Unfortunately, many published peer‐reviewed articles involving Monte Carlo simulations do not provide the level of detail needed for the reader to be able to properly assess the quality of the simulations. The American Association of Physicists in Medicine Task Group #268 developed guidelines to improve reporting of Monte Carlo studies in medical physics research. By following these guidelines, manuscripts submitted for peer‐review will include a level of relevant detail that will increase the transparency, the ability to reproduce results, and the overall scientific value of these studies. The guidelines include a checklist of the items that should be included in the Methods, Results, and Discussion sections of manuscripts submitted for peer‐review. These guidelines do not attempt to replace the journal reviewer, but rather to be a tool during the writing and review process. Given the varied nature of Monte Carlo studies, it is up to the authors and the reviewers to use this checklist appropriately, being conscious of how the different items apply to each particular scenario. It is envisioned that this list will be useful both for authors and for reviewers, to help ensure the adequate description of Monte Carlo studies in the medical physics literature.
A deep learning (DL) network for 2D-based breast mass segmentation in unenhanced dedicated breast CT images was developed and validated, and its robustness in radiomic feature stability and diagnostic performance compared to manual annotations of multiple radiologists was investigated. A total of 93 mass-like lesions were extensively augmented and used to train the network (n = 58 masses), which was then tested (n = 35 masses) against the manual ground truth of a qualified breast radiologist with experience in breast CT imaging using the Conformity coefficient (with a value equal to 1 indicating a perfect performance). Stability and diagnostic power of 672 radiomic descriptors were investigated between the computerized segmentation and 4 radiologists' annotations for the 35 test set cases. Feature stability and diagnostic performance in the discrimination between benign and malignant cases were quantified using intraclass correlation (ICC) and multivariate analysis of variance (MANOVA), performed for each segmentation case (4 radiologists and DL algorithm). DL-based segmentation resulted in a Conformity of 0.85 ± 0.06 against the annotated ground truth. For the stability analysis, although modest agreement was found among the four annotations performed by radiologists (Conformity 0.78 ± 0.03), over 90% of all radiomic features were found to be stable (ICC > 0.75) across multiple segmentations. All MANOVA analyses were statistically significant (p ≤ 0.05), with all dimensions equal to 1, and Wilks’ lambda ≤ 0.35. In conclusion, DL-based mass segmentation in dedicated breast CT images can achieve high segmentation performance, and was demonstrated to provide stable radiomic descriptors with discriminative power in the classification of benign and malignant tumors comparable to that of expert radiologist annotation.
• The validity of engineered solutions for breast mass segmentation is investigated in the perspective of radiomic analyses.
• A deep learning network for breast mass segmentation in dedicated breast CT imaging was developed and validated.
• Radiomic feature stability was investigated between the annotations of multiple radiologists and the deep learning network.
• The majority of radiomic features were found to be stable across multiple mass segmentations.
• Deep learning-based breast lesion segmentation has the potential to substitute manual annotation in radiomic analyses.
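The abstract reports segmentation agreement with the Conformity coefficient but does not state its formula. The Python sketch below assumes the commonly used definition, Conformity = 1 − (FP + FN) / TP, which is algebraically related to the Dice coefficient by Conformity = (3·Dice − 2) / Dice. Masks are represented as sets of pixel coordinates, and all function names and the toy masks are illustrative.

```python
def overlap_counts(pred, truth):
    """True-positive, false-positive and false-negative pixel counts.

    Both masks are sets of (row, col) tuples marking segmented pixels.
    """
    tp = len(pred & truth)
    fp = len(pred - truth)
    fn = len(truth - pred)
    return tp, fp, fn


def dice(pred, truth):
    """Dice similarity coefficient: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn = overlap_counts(pred, truth)
    if tp + fp + fn == 0:
        raise ValueError("both masks are empty")
    return 2 * tp / (2 * tp + fp + fn)


def conformity(pred, truth):
    """Conformity coefficient under the assumed definition 1 - (FP + FN) / TP."""
    tp, fp, fn = overlap_counts(pred, truth)
    if tp == 0:
        raise ValueError("Conformity is undefined when the masks do not overlap")
    return 1 - (fp + fn) / tp


# Toy (hypothetical) masks: a 4x4 square and the same square shifted by one column.
truth_mask = {(r, c) for r in range(4) for c in range(4)}
pred_mask = {(r, c) for r in range(4) for c in range(1, 5)}

d = dice(pred_mask, truth_mask)
print(f"Dice = {d:.3f}, Conformity = {conformity(pred_mask, truth_mask):.3f}")
print(f"Conformity from Dice = {(3 * d - 2) / d:.3f}")
```

Under this assumed definition, Conformity penalizes errors more strongly than Dice: the reported Conformity of 0.85 would correspond to a Dice of 2 / (3 − 0.85), or roughly 0.93.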
• The use of deep learning and ultra-high resolution CT for personalized cochlear implant surgery is investigated for the first time.
• An algorithm for automated cochlea segmentation and measurements is proposed.
• The proposed algorithm can achieve accurate cochlea segmentation results, thanks to extensive data augmentation and to the combination of multiple networks in cascade, which allow optimized learning from a limited training set.
• The algorithm highlighted a large variability in cochlea size in a large patient cohort.
• The proposed approach could help in future personalized cochlear implant surgery, potentially allowing the size of cochlear implant electrodes to be adapted to each individual patient based on image-based cochlear measurements.
Performing patient-specific, pre-operative cochlea CT-based measurements could be helpful to positively affect the outcome of cochlear surgery in terms of intracochlear trauma and loss of residual hearing. Therefore, we propose a method to automatically segment and measure the human cochlea in clinical ultra-high-resolution (UHR) CT images, and investigate differences in cochlea size for personalized implant planning.
A total of 123 temporal bone CT scans were acquired with two UHR-CT scanners, and used to develop and validate a deep learning-based system for automated cochlea segmentation and measurement. The segmentation algorithm is composed of two major steps (detection and pixel-wise classification) in cascade, and aims at combining the results of a multi-scale computer-aided detection scheme with a U-Net-like architecture for pixel-wise classification. The segmentation results were used as an input to the measurement algorithm, which provides automatic cochlear measurements (volume, basal diameter, and cochlear duct length (CDL)) through the combined use of convolutional neural networks and thinning algorithms. Automatic segmentation was validated against manual annotation by means of Dice similarity, Boundary-F1 (BF) score, and maximum and average Hausdorff distances, while measurement errors were calculated between the automatic results and the corresponding manually obtained ground truth on a per-patient basis. Finally, the developed system was used to investigate the differences in cochlea size within our patient cohort, to relate the measurement errors to the actual variation in cochlear size across different patients.
Automatic segmentation resulted in a Dice of 0.90 ± 0.03, BF score of 0.95 ± 0.03, and maximum and average Hausdorff distance of 3.05 ± 0.39 and 0.32 ± 0.07 against manual annotation. Automatic cochlear measurements resulted in errors of 8.4% (volume), 5.5% (CDL), 7.8% (basal diameter). The cochlea size varied broadly, ranging between 0.10 and 0.28 ml (volume), 1.3 and 2.5 mm (basal diameter), and 27.7 and 40.1 mm (CDL).
The proposed algorithm could successfully segment and analyze the cochlea on UHR-CT images, resulting in accurate measurements of cochlear anatomy. Given the wide variation in cochlear size found in our patient cohort, it may find application as a pre-operative tool in cochlear implant surgery, potentially helping elaborate personalized treatment strategies based on patient-specific, image-based anatomical measurements.
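Among the validation metrics listed above, the maximum and average Hausdorff distances compare the boundaries of the automatic and manual segmentations. The Python sketch below shows one way these might be computed for two sets of boundary points. The abstract does not define the average variant, so the symmetric form used here (the mean of the two directed average nearest-neighbour distances) is an assumption, as are the function names and toy points.

```python
import math


def directed_distances(a, b):
    """For each point in `a`, the Euclidean distance to its nearest point in `b`."""
    return [min(math.dist(p, q) for q in b) for p in a]


def hausdorff(a, b):
    """Maximum and average Hausdorff distances between two point sets.

    Maximum: the largest nearest-neighbour distance in either direction.
    Average (assumed definition): the mean of the two directed averages.
    """
    d_ab = directed_distances(a, b)
    d_ba = directed_distances(b, a)
    maximum = max(max(d_ab), max(d_ba))
    average = (sum(d_ab) / len(d_ab) + sum(d_ba) / len(d_ba)) / 2.0
    return maximum, average


# Toy (hypothetical) boundary points that differ in a single location.
auto_boundary = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
manual_boundary = [(0.0, 0.0), (1.0, 0.0), (2.0, 1.0)]

max_hd, avg_hd = hausdorff(auto_boundary, manual_boundary)
print(f"maximum Hausdorff = {max_hd:.3f}, average Hausdorff = {avg_hd:.3f}")
```

This brute-force version compares every pair of points, which is adequate for small contours; larger surfaces would usually call for a spatial index or a distance transform.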
Objectives
Digital breast tomosynthesis (DBT) increases sensitivity of mammography and is increasingly implemented in breast cancer screening. However, the large volume of images increases reading time and the risk of reading errors. This study aims to investigate whether the accuracy of breast radiologists reading wide-angle DBT increases with the aid of an artificial intelligence (AI) support system. Also, the impact on reading time was assessed and the stand-alone performance of the AI system in the detection of malignancies was compared to the average radiologist.
Methods
A multi-reader multi-case study was performed with 240 bilateral DBT exams (71 breasts with cancer lesions, 70 breasts with benign findings, 339 normal breasts). Exams were interpreted by 18 radiologists, with and without AI support, providing cancer suspicion scores per breast. Using AI support, radiologists were shown examination-based and region-based cancer likelihood scores. Area under the receiver operating characteristic curve (AUC) and reading time per exam were compared between reading conditions using mixed-models analysis of variance.
Results
On average, the AUC was higher using AI support (0.863 vs 0.833; p = 0.0025). Using AI support, reading time per DBT exam was reduced (p < 0.001) from 41 s (95% CI = 39–42 s) to 36 s (95% CI = 35–37 s). The AUC of the stand-alone AI system was non-inferior to the AUC of the average radiologist (+0.007, p = 0.8115).
Conclusions
Radiologists improved their cancer detection and reduced reading time when evaluating DBT examinations using an AI reading support system.
Key Points
• Radiologists improved their cancer detection accuracy in digital breast tomosynthesis (DBT) when using an AI system for support, while simultaneously reducing reading time.
• The stand-alone breast cancer detection performance of an AI system is non-inferior to the average performance of radiologists for reading digital breast tomosynthesis exams.
• The use of an AI support system could make advanced and more reliable imaging techniques more accessible and could allow for more cost-effective breast screening programs with DBT.
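The study's primary endpoint, the area under the ROC curve computed from per-breast suspicion scores, can be illustrated with a short Python sketch. It uses the standard rank-based interpretation of the AUC as the probability that a randomly chosen cancer case receives a higher score than a randomly chosen non-cancer case, with ties counted as one half. The function name and the toy scores are illustrative, and the sketch does not reproduce the mixed-models analysis of variance used to compare reading conditions.

```python
def auc(positive_scores, negative_scores):
    """Area under the ROC curve from suspicion scores.

    Computed as the fraction of (positive, negative) pairs in which the
    positive case scores higher, counting tied pairs as one half.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Toy (hypothetical) suspicion scores for breasts with and without cancer.
cancer_scores = [0.9, 0.8, 0.4]
non_cancer_scores = [0.7, 0.4, 0.2, 0.1]
print(f"AUC = {auc(cancer_scores, non_cancer_scores):.3f}")
```

In the study itself, the positive class would be the 71 breasts with cancer and the negative class the remaining 409 breasts (70 with benign findings and 339 normal), which together make up the 480 breasts of the 240 bilateral exams.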