Recent Developments in Speeding up Prostate MRI
Mir, Nida; Fransen, Stefan J.; Wolterink, Jelmer M. ...
Journal of Magnetic Resonance Imaging, September 2024, Volume 60, Issue 3
Journal Article
Peer reviewed
Open access
The increasing incidence of prostate cancer cases worldwide has led to a tremendous demand for multiparametric MRI (mpMRI). In order to relieve the pressure on healthcare, reducing mpMRI scan time is necessary. This review focuses on recent techniques proposed for faster mpMRI acquisition, specifically shortening T2W and DWI sequences while adhering to the PI‐RADS (Prostate Imaging Reporting and Data System) guidelines. Speeding up techniques in the reviewed studies rely on more efficient sampling of data, ranging from the acquisition of fewer averages or b‐values to adjustment of the pulse sequence. Novel acquisition methods based on undersampling techniques are often followed by suitable reconstruction methods typically incorporating synthetic priori information. These reconstruction methods often use artificial intelligence for various tasks such as denoising, artifact correction, improvement of image quality, and in the case of DWI, for the generation of synthetic high b‐value images or apparent diffusion coefficient maps. Reduction of mpMRI scan time is possible, but it is crucial to maintain diagnostic quality, confirmed through radiological evaluation, to integrate the proposed methods into the standard mpMRI protocol. Additionally, before clinical integration, prospective studies are recommended to validate undersampling techniques to avoid potentially inaccurate results demonstrated by retrospective analysis. This review provides an overview of recently proposed techniques, discussing their implementation, advantages, disadvantages, and diagnostic performance according to PI‐RADS guidelines compared to conventional methods.
Level of Evidence: 3
Technical Efficacy: Stage 3
Background: Deep learning (DL)-based models have demonstrated an ability to automatically diagnose clinically significant prostate cancer (PCa) on MRI scans and are regularly reported to approach expert performance. The aim of this work was to systematically review the literature comparing deep learning (DL) systems to radiologists in order to evaluate the comparative performance of current state-of-the-art deep learning models and radiologists. Methods: This systematic review was conducted in accordance with the 2020 Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) checklist. Studies investigating DL models for diagnosing clinically significant (cs) PCa on MRI were included. The quality and risk of bias of each study were assessed using the checklist for AI in medical imaging (CLAIM) and QUADAS-2, respectively. Patient level and lesion-based diagnostic performance were separately evaluated by comparing the sensitivity achieved by DL and radiologists at an identical specificity and the false positives per patient, respectively. Results: The final selection consisted of eight studies with a combined 7337 patients. The median study quality with CLAIM was 74.1% (IQR: 70.6–77.6). DL achieved an identical patient-level performance to the radiologists for PI-RADS ≥ 3 (both 97.7%, SD = 2.1%). DL had a lower sensitivity for PI-RADS ≥ 4 (84.2% vs. 88.8%, p = 0.43). The sensitivity of DL for lesion localization was also between 2% and 12.5% lower than that of the radiologists. Conclusions: DL models for the diagnosis of csPCa on MRI appear to approach the performance of experts but currently have a lower sensitivity compared to experienced radiologists. There is a need for studies with larger datasets and for validation on external data.
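The patient-level comparison described above, reading off the sensitivity of a model at the specificity achieved by radiologists, can be sketched in a few lines. This is only an illustration under assumptions: the function name `sensitivity_at_specificity` and the toy scores are hypothetical and do not come from the reviewed studies, which may have matched operating points differently.

```python
def sensitivity_at_specificity(scores, labels, target_specificity):
    """Best sensitivity among thresholds whose specificity is >= the target.

    scores: model outputs per patient (higher means more suspicious).
    labels: 1 for clinically significant cancer, 0 otherwise.
    """
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    positives = [s for s, y in zip(scores, labels) if y == 1]
    best = 0.0
    # A patient is called positive when score >= threshold.
    for threshold in sorted(set(scores)) + [float("inf")]:
        specificity = sum(s < threshold for s in negatives) / len(negatives)
        sensitivity = sum(s >= threshold for s in positives) / len(positives)
        if specificity >= target_specificity:
            best = max(best, sensitivity)
    return best


# Hypothetical toy data: eight patients, four with cancer.
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]
print(sensitivity_at_specificity(toy_scores, toy_labels, 0.75))
```

Fixing the specificity first puts both readers on the same operating point, so the remaining difference in sensitivity can be compared directly.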
Deep learning (DL) MRI reconstruction enables fast scan acquisition with good visual quality, but the diagnostic impact is often not assessed because of large reader study requirements. This study used existing diagnostic DL to assess the diagnostic quality of reconstructed images.
A retrospective multisite study of 1535 patients assessed biparametric prostate MRI between 2016 and 2020. Likely clinically significant prostate cancer (csPCa) lesions (PI-RADS ≥ 4) were delineated by expert radiologists. T2-weighted scans were retrospectively undersampled, simulating accelerated protocols. DL reconstruction (DLRecon) and diagnostic DL detection (DLDetect) were developed. The effects on the partial area under the Free-Response Operating Characteristic (FROC) curve (pAUC) and on the structural similarity (SSIM) were compared as metrics for diagnostic and visual quality, respectively. DLDetect was validated with a reader concordance analysis. Statistical analysis included Wilcoxon, permutation, and Cohen's kappa tests for visual quality, diagnostic performance, and reader concordance.
DLRecon improved visual quality at 4- and 8-fold (R4, R8) subsampling rates, with SSIM (range: -1 to 1) improved to 0.78 ± 0.02 (p < 0.001) and 0.67 ± 0.03 (p < 0.001) from 0.68 ± 0.03 and 0.51 ± 0.03, respectively. However, diagnostic performance at R4 showed a pAUC FROC of 1.33 (CI 1.28-1.39) for DL and 1.29 (CI 1.23-1.35) for naïve reconstructions, both significantly lower than the fully sampled pAUC of 1.58 (DL: p = 0.024, naïve: p = 0.02). Similar trends were noted for R8.
DL reconstruction produces visually appealing images but may reduce diagnostic accuracy. Incorporating diagnostic AI into the assessment framework offers a clinically relevant metric essential for adopting reconstruction models into clinical practice.
In clinical settings, caution is warranted when using DL reconstruction for MRI scans. While it recovered visual quality, it failed to match the prostate cancer detection rates observed in scans not subjected to acceleration and DL reconstruction.
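Retrospective undersampling of the kind mentioned in this abstract is usually simulated by discarding lines in k-space and reconstructing from what remains. The sketch below is only an illustration under assumptions: the regular every-R-th-line mask, the fully sampled centre band, the `center_fraction` parameter, and the function name are all hypothetical, since the abstract does not state the sampling pattern used. The zero-filled inverse FFT stands in for a naïve reconstruction; it is not the DLRecon model.

```python
import numpy as np


def undersample_zero_filled(image, acceleration, center_fraction=0.08):
    """Drop phase-encode lines in k-space and reconstruct by zero filling.

    Returns the magnitude image and the boolean line mask that was applied.
    """
    kspace = np.fft.fftshift(np.fft.fft2(image))
    n_lines = image.shape[0]
    mask = np.zeros(n_lines, dtype=bool)
    mask[::acceleration] = True  # assumed pattern: keep every R-th line
    n_center = max(1, int(round(center_fraction * n_lines)))
    start = (n_lines - n_center) // 2
    mask[start:start + n_center] = True  # assumed: fully sampled centre band
    masked = kspace * mask[:, None]  # zero out the discarded lines
    recon = np.fft.ifft2(np.fft.ifftshift(masked))
    return np.abs(recon), mask


rng = np.random.default_rng(0)
phantom = rng.random((64, 64))  # stand-in for a T2-weighted slice
recon_r4, mask_r4 = undersample_zero_filled(phantom, acceleration=4)
print(mask_r4.sum(), "of", mask_r4.size, "lines kept")
```

Because the centre band is kept in addition to every fourth line, the effective acceleration in this sketch is somewhat lower than the nominal factor.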
Adequate communication of scientific findings is crucial to enhance knowledge transfer. This study aimed to determine the key features of a good scientific oral presentation on artificial intelligence (AI) in medical imaging.
A total of 26 oral presentations dealing with original research on AI studies in medical imaging at the 2023 RSNA annual meeting were included and systematically assessed by three observers. The presentation quality of the research question, inclusion criteria, reference standard, method, results, clinical impact, presentation clarity, presenter engagement, and the presentation's quality of knowledge transfer were assessed using five-point Likert scales. The number of slides, the average number of words per slide, the number of interactive slides, the number of figures, and the number of tables were also determined for each presentation. Mixed-effects ordinal regression was used to assess the association between the above-mentioned variables and the quality of knowledge transfer of the presentation.
A significant positive association was found between the quality of the presentation of the research question and the presentation's quality of knowledge transfer (odds ratio [OR]: 2.5, P = 0.005). The average number of words per slide was significantly negatively associated with the presentation's quality of knowledge transfer (OR: 0.9, P < 0.001). No other significant associations were found.
Researchers who orally present their scientific findings in the field of AI and medical imaging should pay attention to clearly communicating their research question and minimizing the number of words per slide to maximize the value of their presentation.
• Artificial intelligence (AI) in medical imaging is a booming field of research.
• Presentations at scientific meetings are an important means of knowledge transfer.
• A good research question is essential for presentations on AI in medical imaging.
• Presentations on AI in medical imaging improve by reducing the words per slide.
• AI, as an expert reader, may be helpful in other tasks than assisting diagnosis.
• Diagnostic AI can objectively assess the diagnostic efficacy of MRI sequences.
• DWI scan time for prostate cancer detection can be reduced by a factor of two.
To explore diagnostic deep learning for optimizing the prostate MRI protocol by assessing the diagnostic efficacy of MRI sequences.
This retrospective study included 840 patients with a biparametric prostate MRI scan. The MRI protocol included a T2-weighted image, three DWI sequences (b50, b400, and b800 s/mm²), a calculated ADC map, and a calculated b1400 sequence. Two accelerated MRI protocols were simulated, using only two acquired b-values to calculate the ADC and b1400. Deep learning models were trained to detect prostate cancer lesions on accelerated and full protocols. The diagnostic performances of the protocols were compared on the patient-level with the area under the receiver operating characteristic (AUROC), using DeLong's test, and on the lesion-level with the partial area under the free response operating characteristic (pAUFROC), using a permutation test. Validation of the results was performed among expert radiologists.
No significant differences in diagnostic performance were found between the accelerated protocols and the full bpMRI baseline. Omitting b800 reduced DWI scan time by 53%, with a performance difference of +0.01 AUROC (p = 0.20) and −0.03 pAUFROC (p = 0.45). Omitting b400 reduced DWI scan time by 32%, with a performance difference of −0.01 AUROC (p = 0.65) and +0.01 pAUFROC (p = 0.73). Multiple expert radiologists underlined the findings.
This study shows that deep learning can assess the diagnostic efficacy of MRI sequences by comparing prostate MRI protocols on diagnostic accuracy. Omitting either the b400 or the b800 DWI sequence can optimize the prostate MRI protocol by reducing scan time without compromising diagnostic quality.
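To make concrete how an ADC value and a calculated high b-value image can be derived from only two acquired b-values, the sketch below uses the standard mono-exponential diffusion model S(b) = S0 · exp(−b · ADC). It is an illustration under assumptions: the abstract does not say which fitting method the study used, and the function names and example signal values are hypothetical. Units are assumed to be s/mm² for b-values and mm²/s for ADC.

```python
import math


def adc_from_two_b_values(s_low, s_high, b_low, b_high):
    """ADC from two signals under the mono-exponential model."""
    return math.log(s_low / s_high) / (b_high - b_low)


def calculated_b_value_signal(s_low, b_low, adc, b_target=1400.0):
    """Extrapolate the signal at b_target from the signal at b_low."""
    return s_low * math.exp(-(b_target - b_low) * adc)


# Hypothetical voxel: S0 = 1000 (arbitrary units), true ADC = 1.0e-3 mm²/s.
s0, true_adc = 1000.0, 1.0e-3
s_b50 = s0 * math.exp(-50 * true_adc)
s_b400 = s0 * math.exp(-400 * true_adc)

# Accelerated protocol that omits b800: only b50 and b400 are available.
adc = adc_from_two_b_values(s_b50, s_b400, 50, 400)
s_b1400 = calculated_b_value_signal(s_b50, 50, adc)
print(adc, s_b1400)
```

With noiseless mono-exponential data, two b-values recover the ADC exactly; with real data the choice of which pair to keep changes noise sensitivity, which is the trade-off the study evaluates through diagnostic performance.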
Magnetic Resonance Imaging (MRI) offers strong soft tissue contrast but suffers from long acquisition times and requires tedious annotation from radiologists. Traditionally, these challenges have been addressed separately with reconstruction and image analysis algorithms. To see if performance could be improved by treating both as end-to-end, we hosted the K2S challenge, in which challenge participants segmented knee bones and cartilage from 8× undersampled k-space. We curated the 300-patient K2S dataset of multicoil raw k-space and radiologist quality-checked segmentations. 87 teams registered for the challenge and there were 12 submissions, varying in methodologies from serial reconstruction and segmentation to end-to-end networks to another that eschewed a reconstruction algorithm altogether. Four teams produced strong submissions, with the winner having a weighted Dice Similarity Coefficient of 0.910 ± 0.021 across knee bones and cartilage. Interestingly, there was no correlation between reconstruction and segmentation metrics. Further analysis showed the top four submissions were suitable for downstream biomarker analysis, largely preserving cartilage thicknesses and key bone shape features with respect to ground truth. K2S thus showed the value in considering reconstruction and image analysis as end-to-end tasks, as this leaves room for optimization while more realistically reflecting the long-term use case of tools being developed by the MR community.
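The Dice Similarity Coefficient reported for the challenge measures the overlap between a predicted and a reference segmentation. The sketch below shows the plain, unweighted coefficient for a single structure. The weighting across knee bones and cartilage used for the challenge ranking is not given in the abstract and is not reproduced here. Representing masks as sets of voxel coordinates and the function name are choices made for this illustration.

```python
def dice_coefficient(predicted, reference):
    """Unweighted Dice overlap between two sets of voxel coordinates."""
    predicted, reference = set(predicted), set(reference)
    if not predicted and not reference:
        return 1.0  # convention: two empty masks agree perfectly
    overlap = len(predicted & reference)
    return 2 * overlap / (len(predicted) + len(reference))


# Hypothetical 2D masks with three voxels each, two of them shared.
predicted_mask = {(0, 0), (0, 1), (1, 1)}
reference_mask = {(0, 1), (1, 1), (2, 2)}
print(dice_coefficient(predicted_mask, reference_mask))
```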
Artificial intelligence (AI) systems can potentially aid the diagnostic pathway of prostate cancer by alleviating the increasing workload, preventing overdiagnosis, and reducing the dependence on experienced radiologists. We aimed to investigate the performance of AI systems at detecting clinically significant prostate cancer on MRI in comparison with radiologists using the Prostate Imaging—Reporting and Data System version 2.1 (PI-RADS 2.1) and the standard of care in multidisciplinary routine practice at scale.
In this international, paired, non-inferiority, confirmatory study, we trained and externally validated an AI system (developed within an international consortium) for detecting Gleason grade group 2 or greater cancers using a retrospective cohort of 10 207 MRI examinations from 9129 patients. Of these examinations, 9207 cases from three centres (11 sites) based in the Netherlands were used for training and tuning, and 1000 cases from four centres (12 sites) based in the Netherlands and Norway were used for testing. In parallel, we facilitated a multireader, multicase observer study with 62 radiologists (45 centres in 20 countries; median 7 [IQR 5–10] years of experience in reading prostate MRI) using PI-RADS (2.1) on 400 paired MRI examinations from the testing cohort. Primary endpoints were the sensitivity, specificity, and the area under the receiver operating characteristic curve (AUROC) of the AI system in comparison with that of all readers using PI-RADS (2.1) and in comparison with that of the historical radiology readings made during multidisciplinary routine practice (ie, the standard of care with the aid of patient history and peer consultation). Histopathology and at least 3 years (median 5 [IQR 4–6] years) of follow-up were used to establish the reference standard. The statistical analysis plan was prespecified with a primary hypothesis of non-inferiority (considering a margin of 0·05) and a secondary hypothesis of superiority towards the AI system, if non-inferiority was confirmed. This study was registered at ClinicalTrials.gov, NCT05489341.
Of the 10 207 examinations included from Jan 1, 2012, through Dec 31, 2021, 2440 cases had histologically confirmed Gleason grade group 2 or greater prostate cancer. In the subset of 400 testing cases in which the AI system was compared with the radiologists participating in the reader study, the AI system showed a statistically superior and non-inferior AUROC of 0·91 (95% CI 0·87–0·94; p<0·0001), in comparison to the pool of 62 radiologists with an AUROC of 0·86 (0·83–0·89), with a lower boundary of the two-sided 95% Wald CI for the difference in AUROC of 0·02. At the mean PI-RADS 3 or greater operating point of all readers, the AI system detected 6·8% more cases with Gleason grade group 2 or greater cancers at the same specificity (57·7%, 95% CI 51·6–63·3), or 50·4% fewer false-positive results and 20·0% fewer cases with Gleason grade group 1 cancers at the same sensitivity (89·4%, 95% CI 85·3–92·9). In all 1000 testing cases where the AI system was compared with the radiology readings made during multidisciplinary practice, non-inferiority was not confirmed, as the AI system showed lower specificity (68·9% [95% CI 65·3–72·4] vs 69·0% [65·5–72·5]) at the same sensitivity (96·1%, 94·0–98·2) as the PI-RADS 3 or greater operating point. The lower boundary of the two-sided 95% Wald CI for the difference in specificity (−0·04) was greater than the non-inferiority margin (−0·05) and a p value below the significance threshold was reached (p<0·001).
An AI system was superior to radiologists using PI-RADS (2.1), on average, at detecting clinically significant prostate cancer and comparable to the standard of care. Such a system shows the potential to be a supportive tool within a primary diagnostic setting, with several associated benefits for patients and radiologists. Prospective validation is needed to test clinical applicability of this system.
Health~Holland and EU Horizon 2020.
To report, from a retrospective analysis of prospectively collected data, on the feasibility, outcome, toxicity, and voice-handicap index (VHI) of patients with T1a glottic cancer treated by a novel intensity modulated radiation therapy technique developed at our institution to treat only the involved vocal cord: single vocal cord irradiation (SVCI).
Thirty patients with T1a glottic cancer were treated by means of SVCI. Dose prescription was set to 16 × 3.63 Gy (total dose 58.08 Gy). The clinical target volume was the entire vocal cord. Setup verification was done by means of an online correction protocol using cone beam computed tomography. Data for voice quality assessment were collected prospectively at baseline, end of treatment, and 4, 6, and 12 weeks and 6, 12, and 18 months after treatment using VHI questionnaires.
After a median follow-up of 30 months (range, 7-50 months), the 2-year local control and overall survival rates were 100% and 90%, respectively, because no local recurrence was reported and 3 patients died because of comorbidity. All patients completed the intended treatment schedule; no treatment interruptions and no grade 3 acute toxicity were reported. Grade 2 acute dermatitis or dysphagia was reported in only 5 patients (17%). No serious late toxicity was reported; only 1 patient developed temporary grade 2 laryngeal edema, which responded to a short course of corticosteroids. The VHI improved significantly, from 33.5 at baseline to 9.5 and 10 at 6 weeks and 18 months, respectively (P<.001). The control group, treated to the whole larynx, had comparable local control rates (92.2% vs 100%, P=.24) but more acute toxicity (66% vs 17%, P<.0001) and higher VHI scores (23.8 and 16.7 at 6 weeks and 18 months, respectively, P<.0001).
Single vocal cord irradiation is feasible and resulted in maximal local control rate at 2 years. The deterioration in VHI scores was slight and temporary and subsequently improved to normal levels. Long-term follow-up is needed to consolidate these promising results.
Developmental and epileptic encephalopathies (DEEs) represent a large, clinically and genetically heterogeneous group of neurodevelopmental diseases. The identification of pathogenic genetic variants in DEEs remains crucial for deciphering this complex group and for accurately caring for affected individuals (clinical diagnosis, genetic counseling, impacting medical, precision therapy, clinical trials, etc.). Whole-exome sequencing and intensive data sharing identified a recurrent de novo PACS2 heterozygous missense variant in 14 unrelated individuals. Their phenotype was characterized by epilepsy, global developmental delay with or without autism, common cerebellar dysgenesis, and facial dysmorphism. Mixed focal and generalized epilepsy occurred in the neonatal period, controlled with difficulty in the first year, but many improved in early childhood. PACS2 is an important PACS1 paralog and encodes a multifunctional sorting protein involved in nuclear gene expression and pathway traffic regulation. Both proteins harbor cargo(furin)-binding regions (FBRs) that bind cargo proteins, sorting adaptors, and cellular kinase. Compared to the defined PACS1 recurrent variant series, individuals with the PACS2 variant more consistently have neonatal/early-infantile-onset epilepsy that can be challenging to control. Cerebellar abnormalities may be similar but PACS2 individuals exhibit a pattern of clear dysgenesis ranging from mild to severe. Functional studies demonstrated that the PACS2 recurrent variant reduces the ability of the predicted autoregulatory domain to modulate the interaction between the PACS2 FBR and client proteins, which may disturb cellular function. These findings support the causality of this recurrent de novo PACS2 heterozygous missense variant in DEEs with facial dysmorphism and cerebellar dysgenesis.
Objectives
To compare the prognostic value of the World Health Organization (WHO) 1973 and 2004 classification systems for grade in T1 bladder cancer (T1‐BC), as both are currently recommended in international guidelines.
Patients and Methods
Three uro‐pathologists re‐revised slides of 601 primary (first diagnosis) T1‐BCs, initially managed conservatively (bacille Calmette–Guérin) in four hospitals. Grade was defined according to WHO1973 (Grade 1–3) and WHO2004 (low‐grade [LG] and high‐grade [HG]). This resulted in a lack of Grade 1 tumours, 188 (31%) Grade 2, and 413 (69%) Grade 3 tumours. There were 47 LG (8%) vs 554 (92%) HG tumours. We determined the prognostic value for progression‐free survival (PFS) and cancer‐specific survival (CSS) in Cox‐regression models and corrected for age, sex, multiplicity, size and concomitant carcinoma in situ.
Results
At a median follow‐up of 5.9 years, 148 patients showed progression and 94 died from BC. The WHO1973 Grade 3 was negatively associated with PFS (hazard ratio [HR] 2.1) and CSS (HR 3.4), whilst WHO2004 grade was not prognostic. On multivariable analysis, WHO1973 grade was the only prognostic factor for progression (HR 2.0). Grade 3 tumours (HR 3.0), older age (HR 1.03) and tumour size >3 cm (HR 1.8) were all independently associated with worse CSS.
Conclusion
The WHO1973 classification system for grade has strong prognostic value in T1‐BC, compared to the WHO2004 system. Our present results suggest that WHO1973 grade cannot be replaced by the WHO2004 classification in non‐muscle‐invasive BC guidelines.