Peer reviewed Open access
  • Reliability assessment of t...
    Dadar, Mahsa; Duchesne, Simon

    NeuroImage, 08/2020, Volume: 217
    Journal Article

    Gray and white matter volume difference and change are important imaging markers of pathology and disease progression in neurology and psychiatry. Such measures are usually estimated from tissue segmentation maps produced by publicly available image processing pipelines. However, the reliability of the produced segmentations when using multi-center and multi-scanner data remains understudied. Here, we assess the robustness of six publicly available tissue classification pipelines across images acquired from different MR scanners and sites. We used 90 T1-weighted images of a single individual, scanned in 73 sessions across 27 different sites to assess the robustness of the tissue classification tools. Variability in Dice similarity index values and tissue volumes was assessed for Atropos, BISON, Classify_Clean, FAST, FreeSurfer, and SPM12. BISON had the highest overall Dice coefficient for GM, followed by SPM12 and Atropos; while Atropos had the highest overall Dice coefficient for WM, followed by BISON and SPM12. BISON had the lowest overall variability in its volumetric estimates, followed by FreeSurfer, and SPM12. All methods also had significant differences between some of their estimates across different scanner manufacturers (e.g. BISON had significantly higher GM estimates and correspondingly lower WM estimates for GE scans compared to Philips and Siemens), and different signal-to-noise ratio (SNR) levels (e.g. FAST and FreeSurfer had significantly higher WM volume estimates for high versus medium and low SNR tertiles as well as correspondingly lower GM volume estimates). Our comparisons provide a benchmark on the reliability of the publicly used tissue classification techniques and the amount of variability that can be expected when using large multi-center and multi-scanner databases. 
• Reliability comparison of six publicly available tissue classification pipelines.
• 90 T1-weighted images of a single individual across 27 sites used for evaluation.
• Compared estimated volume differences across scanner manufacturers.
• Assessed the impact of signal-to-noise ratio.
• Our comparisons provide a benchmark on the reliability of each technique.
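
The Dice similarity index named in the abstract measures the overlap of one tissue class between two segmentations: twice the number of voxels both maps assign to that class, divided by the total number of voxels each map assigns to it. The sketch below is a minimal illustration on toy data, not the authors' pipeline. The label codes (1 for GM, 2 for WM), the flattened-list representation, and the helper names `dice` and `coefficient_of_variation` are assumptions made for this example. The coefficient of variation is just one possible way to summarise volume variability; the record does not state which statistic the study used.

```python
from statistics import mean, stdev


def dice(seg_a, seg_b, label):
    """Dice similarity index for one tissue label between two flattened label maps."""
    if len(seg_a) != len(seg_b):
        raise ValueError("segmentations must have the same number of voxels")
    in_a = sum(1 for v in seg_a if v == label)
    in_b = sum(1 for v in seg_b if v == label)
    both = sum(1 for a, b in zip(seg_a, seg_b) if a == label and b == label)
    if in_a + in_b == 0:
        # Label absent from both maps: treated here as perfect agreement (a convention).
        return 1.0
    return 2.0 * both / (in_a + in_b)


def coefficient_of_variation(volumes):
    """Sample standard deviation divided by the mean (assumed variability summary)."""
    return stdev(volumes) / mean(volumes)


# Hypothetical label codes and toy 8-voxel "scans" of the same subject.
GM, WM = 1, 2
scan_1 = [0, 1, 1, 2, 2, 2, 1, 0]
scan_2 = [0, 1, 2, 2, 2, 2, 1, 1]

dice_gm = dice(scan_1, scan_2, GM)
dice_wm = dice(scan_1, scan_2, WM)

# Hypothetical GM volumes (arbitrary units) from repeated scans of one individual.
gm_volumes = [650.0, 662.0, 641.0, 655.0]
gm_cv = coefficient_of_variation(gm_volumes)
```

With real data, the flattened lists would come from the voxel arrays of two co-registered segmentation maps, and a lower coefficient of variation across sessions would indicate more consistent volume estimates for that tool.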