E-resources
Full text
Peer reviewed
  • Machine learning for the id...
    Cuocolo, Renato; Cipullo, Maria Brunella; Stanzione, Arnaldo; Romeo, Valeria; Green, Roberta; Cantoni, Valeria; Ponsiglione, Andrea; Ugga, Lorenzo; Imbriaco, Massimo

    European radiology, 12/2020, Volume: 30, Issue: 12
    Journal Article

    Objectives: The aim of this study was to systematically review the literature and perform a meta-analysis of machine learning (ML) diagnostic accuracy studies focused on clinically significant prostate cancer (csPCa) identification on MRI.

    Methods: Multiple medical databases were systematically searched for studies on ML applications in csPCa identification up to July 31, 2019. Two reviewers screened all papers independently for eligibility. The area under the receiver operating characteristic curves (AUC) was pooled to quantify predictive accuracy. A random-effects model estimated overall effect size while statistical heterogeneity was assessed with the I² value. A funnel plot was used to investigate publication bias. Subgroup analyses were performed based on reference standard (biopsy or radical prostatectomy) and ML type (deep and non-deep).

    Results: After the final revision, 12 studies were included in the analysis. Statistical heterogeneity was high both in overall and in subgroup analyses. The overall pooled AUC for ML in csPCa identification was 0.86, with 0.81–0.91 95% confidence intervals (95% CI). The biopsy subgroup (n = 9) had a pooled AUC of 0.85 (95% CI = 0.79–0.91) while the radical prostatectomy one (n = 3) of 0.88 (95% CI = 0.76–0.99). Deep learning ML (n = 4) had a 0.78 AUC (95% CI = 0.69–0.86) while the remaining 8 had AUC = 0.90 (95% CI = 0.85–0.94).

    Conclusions: ML pipelines using prostate MRI to identify csPCa showed good accuracy and should be further investigated, possibly with better standardisation in design and reporting of results.

    Key Points
    • Overall pooled AUC was 0.86 with 0.81–0.91 95% confidence intervals.
    • In the reference standard subgroup analysis, algorithm accuracy was similar with pooled AUCs of 0.85 (0.79–0.91 95% confidence intervals) and 0.88 (0.76–0.99 95% confidence intervals) for studies employing biopsies and radical prostatectomy, respectively.
    • Deep learning pipelines performed worse (AUC = 0.78, 0.69–0.86 95% confidence intervals) than other approaches (AUC = 0.90, 0.85–0.94 95% confidence intervals).
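
    The pooling step described in the Methods (a random-effects model with heterogeneity reported as I²) can be illustrated with a minimal sketch. The abstract does not say which estimator was used, so the DerSimonian-Laird method below is an assumption, as are the function name `random_effects_pool` and the per-study AUCs and standard errors, which are hypothetical values and not data from the review.

```python
import math


def random_effects_pool(effects, std_errors, z=1.96):
    """Pool per-study effects (e.g. AUCs) with a DerSimonian-Laird
    random-effects model. The estimator choice is an assumption; the
    abstract only says "random-effects model".

    Returns (pooled, ci_low, ci_high, i_squared_percent, tau_squared).
    """
    if len(effects) != len(std_errors) or len(effects) < 2:
        raise ValueError("need at least two studies with matching lengths")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q measures dispersion of studies around the fixed estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1

    # Between-study variance tau^2 (truncated at zero).
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # I^2: share of total variability attributed to heterogeneity, in percent.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects weights add tau^2 to each study's own variance.
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    sum_w_re = sum(w_re)
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum_w_re
    se_pooled = math.sqrt(1.0 / sum_w_re)

    return pooled, pooled - z * se_pooled, pooled + z * se_pooled, i2, tau2


# Hypothetical per-study AUCs and standard errors (NOT from the review).
example_aucs = [0.80, 0.90, 0.85, 0.92]
example_ses = [0.03, 0.02, 0.04, 0.025]

pooled, lo, hi, i2, tau2 = random_effects_pool(example_aucs, example_ses)
print(f"pooled AUC = {pooled:.3f} (95% CI {lo:.3f}-{hi:.3f}), I^2 = {i2:.1f}%")
```

    A subgroup analysis such as the one reported (biopsy versus radical prostatectomy, deep versus non-deep learning) would amount to calling the same function separately on each subset of studies. In practice the AUC is often transformed (for example with a logit) before pooling; this sketch pools on the raw scale for simplicity.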