Cluster failure
Eklund, Anders; Nichols, Thomas E.; Knutsson, Hans
Proceedings of the National Academy of Sciences - PNAS, 07/2016, Volume 113, Issue 28
Journal Article
Peer reviewed
Open access
The most widely used task functional magnetic resonance imaging (fMRI) analyses use parametric statistical methods that depend on a variety of assumptions. In this work, we use real resting-state data and a total of 3 million random task group analyses to compute empirical familywise error rates for the fMRI software packages SPM, FSL, and AFNI, as well as a nonparametric permutation method. For a nominal familywise error rate of 5%, the parametric statistical methods are shown to be conservative for voxelwise inference and invalid for clusterwise inference. Our results suggest that the principal cause of the invalid cluster inferences is spatial autocorrelation functions that do not follow the assumed Gaussian shape. By comparison, the nonparametric permutation test is found to produce nominal results for voxelwise as well as clusterwise inference. These findings speak to the need of validating the statistical methods being used in the field of neuroimaging.
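To make the quantity being estimated concrete, a familywise error rate can be measured empirically as the proportion of null analyses that produce any detection at all. The minimal sketch below does this with synthetic white-noise data and a Bonferroni voxelwise threshold; the sample sizes, voxel count, and thresholding rule are illustrative assumptions, not the resting-state pipeline or cluster inference procedures evaluated in the paper.

```python
# Minimal sketch: empirical familywise error rate under the null, estimated
# as the fraction of simulated group analyses with at least one detection.
# All sizes and the Bonferroni rule are illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_analyses, n_subjects, n_voxels = 1000, 20, 500
alpha = 0.05
# Bonferroni-corrected voxelwise threshold for a one-sample t test
t_crit = stats.t.ppf(1 - alpha / n_voxels, df=n_subjects - 1)

fw_errors = 0
for _ in range(n_analyses):
    data = rng.standard_normal((n_subjects, n_voxels))  # null data: no true effect
    t = data.mean(0) / (data.std(0, ddof=1) / np.sqrt(n_subjects))
    fw_errors += np.any(t > t_crit)  # any detection anywhere = one familywise error

print(f"empirical FWE: {fw_errors / n_analyses:.3f} (nominal {alpha})")
```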
Smith and Nichols discuss “big data” human neuroimaging studies, with very large subject numbers and amounts of data. These studies provide great opportunities for making new discoveries about the brain but raise many new analytical challenges and interpretational risks.
I provide a selective review of the literature on the multiple testing problem in fMRI. By drawing connections with the older modalities, PET in particular, and how software implementations have tracked (or lagged behind) theoretical developments, my narrative aims to give the methodological researcher a historical perspective on this important aspect of fMRI data analysis.
A wealth of analysis tools are available to fMRI researchers in order to extract patterns of task variation and, ultimately, understand cognitive function. However, this “methodological plurality” comes with a drawback. While conceptually similar, two different analysis pipelines applied on the same dataset may not produce the same scientific results. Differences in methods, implementations across software, and even operating systems or software versions all contribute to this variability. Consequently, attention in the field has recently been directed to reproducibility and data sharing. In this work, our goal is to understand how choice of software package impacts on analysis results. We use publicly shared data from three published task fMRI neuroimaging studies, reanalyzing each study using the three main neuroimaging software packages, AFNI, FSL, and SPM, using parametric and nonparametric inference. We obtain all information on how to process, analyse, and model each dataset from the publications. We make quantitative and qualitative comparisons between our replications to gauge the scale of variability in our results and assess the fundamental differences between each software package. Qualitatively we find similarities between packages, backed up by Neurosynth association analyses that correlate similar words and phrases to all three software packages' unthresholded results for each of the studies we reanalyse. However, we also discover marked differences, such as Dice similarity coefficients ranging from 0.000 to 0.684 in comparisons of thresholded statistic maps between software. We discuss the challenges involved in trying to reanalyse the published studies, and highlight our efforts to make this research reproducible.
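The Dice similarity coefficient reported above is 2|A∩B| / (|A| + |B|) for two thresholded masks A and B: 1.0 for identical suprathreshold sets, 0.0 for disjoint ones. A minimal sketch, using synthetic stand-ins rather than actual AFNI, FSL, or SPM outputs:

```python
import numpy as np

def dice(a, b):
    """Dice coefficient 2|A∩B| / (|A| + |B|) between two boolean masks."""
    a, b = np.asarray(a, bool), np.asarray(b, bool)
    denom = a.sum() + b.sum()
    return 2.0 * (a & b).sum() / denom if denom else 1.0

rng = np.random.default_rng(0)
stat_a = rng.standard_normal((16, 16, 16))                 # "package A" statistic map
stat_b = stat_a + 0.5 * rng.standard_normal((16, 16, 16))  # "package B", correlated with A
print(f"Dice at z > 1.5: {dice(stat_a > 1.5, stat_b > 1.5):.3f}")
```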
The dependence between pairs of time series is commonly quantified by Pearson's correlation. However, if the time series are themselves dependent (i.e. exhibit temporal autocorrelation), the effective degrees of freedom (EDF) are reduced, the standard error of the sample correlation coefficient is biased, and Fisher's transformation fails to stabilise the variance. Since fMRI time series are notoriously autocorrelated, the issue of biased standard errors – before or after Fisher's transformation – becomes vital in individual-level analysis of resting-state functional connectivity (rsFC) and must be addressed anytime a standardised Z-score is computed. We find that the severity of autocorrelation is highly dependent on spatial characteristics of brain regions, such as the size of regions of interest and the spatial location of those regions. We further show that the available EDF estimators make restrictive assumptions that are not supported by the data, resulting in biased rsFC inferences that lead to distorted topological descriptions of the connectome on the individual level. We propose a practical “xDF” method that accounts not only for distinct autocorrelation in each time series, but also for instantaneous and lagged cross-correlations. We find the xDF correction varies substantially over node pairs, indicating the limitations of global EDF corrections used previously. In addition to extensive synthetic and real data validations, we investigate the impact of this correction on rsFC measures in data from the Young Adult Human Connectome Project, showing that accounting for autocorrelation dramatically changes fundamental graph theoretical measures relative to no correction.
•Autocorrelation is a problem for sample correlation, breaking the variance-stabilising property of Fisher's transformation.
•We show that fMRI autocorrelation varies systematically with region of interest size, and is heterogeneous over subjects.
•Existing adjustment methods are themselves biased when the true correlation is non-zero, due to a confounding effect.
•Our “xDF” method provides accurate Z-scores based on either Pearson's or Fisher's transformed correlations.
•Resting state fMRI autocorrelation considerably alters the graph theoretical description of the human connectome.
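The core problem named in these highlights can be demonstrated in a few lines: for two independent AR(1) series, the empirical variance of Fisher's z exceeds the nominal 1/(n-3) that holds for independent samples, so independence-based Z-scores are too confident. This sketch only illustrates the bias; it is not the xDF estimator, and the AR(1) coefficient and series length are assumptions.

```python
# Sketch: autocorrelation inflates the variance of Fisher's z beyond the
# nominal 1/(n-3). AR(1) with phi = 0.7 and n = 200 are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
n, phi, n_sim = 200, 0.7, 2000

def ar1(n, phi, rng):
    x = np.empty(n)
    x[0] = rng.standard_normal()
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.standard_normal()
    return x

z = [np.arctanh(np.corrcoef(ar1(n, phi, rng), ar1(n, phi, rng))[0, 1])
     for _ in range(n_sim)]  # independent series, so the true correlation is 0

print(f"empirical var(z): {np.var(z):.4f}   nominal 1/(n-3): {1 / (n - 3):.4f}")
```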
Given concerns about the reproducibility of scientific findings, neuroimaging must define best practices for data analysis, results reporting, and algorithm and data sharing to promote transparency, reliability and collaboration. We describe insights from developing a set of recommendations on behalf of the Organization for Human Brain Mapping and identify barriers that impede these practices, including how the discipline must change to fully exploit the potential of the world's neuroimaging data.
•This work presents BLMM, a Python tool for analysing mass-univariate LMMs.
•BLMM utilizes vectorization speed-ups when working with multiple voxels.
•BLMM accounts for the “voxel-wise missingness” ubiquitous in large-n fMRI analyses.
•The correctness and performance of BLMM are assessed via extensive simulation.
•The paper concludes by providing a real data example based on the UK Biobank.
Within neuroimaging, large-scale shared datasets are becoming increasingly commonplace, challenging existing tools both in terms of overall scale and complexity of the study designs. As sample sizes grow, researchers are presented with new opportunities to detect and account for grouping factors and covariance structure present in large experimental designs. In particular, standard linear model methods cannot account for the covariance and grouping structures present in large datasets, and the existing linear mixed models (LMM) tools are neither scalable nor exploit the computational speed-ups afforded by vectorisation of computations over voxels. Further, nearly all existing tools for imaging (fixed or mixed effect) do not account for variability in the patterns of missing data near cortical boundaries and the edge of the brain, and instead omit any voxels with any missing data. Yet in the large-n setting, such a voxel-wise deletion missing data strategy leads to severe shrinkage of the final analysis mask. To counter these issues, we describe the “Big” Linear Mixed Models (BLMM) toolbox, an efficient Python package for large-scale fMRI LMM analyses. BLMM is designed for use on high performance computing clusters and utilizes a Fisher Scoring procedure made possible by derivations for the LMM Fisher information matrix and score vectors derived in our previous work, Maullin-Sapey and Nichols (2021).
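The vectorisation idea referred to above can be illustrated with an ordinary fixed-effects fit: rather than looping over voxels, the same design matrix is solved against all voxel data columns at once. This is a sketch of the speed-up principle only, with synthetic data; it is not BLMM's mixed-model Fisher Scoring machinery.

```python
# Sketch of mass-univariate vectorisation: one matrix solve fits the same
# fixed-effects design at every voxel simultaneously (synthetic data).
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_voxels = 100, 10_000
X = np.column_stack([np.ones(n_subjects), rng.standard_normal(n_subjects)])  # design matrix
Y = rng.standard_normal((n_subjects, n_voxels))                              # one column per voxel

beta = np.linalg.solve(X.T @ X, X.T @ Y)        # (2, n_voxels): all voxels in one solve
resid = Y - X @ beta
sigma2 = (resid ** 2).sum(0) / (n_subjects - X.shape[1])  # voxelwise error variance
print(beta.shape, sigma2.shape)                 # (2, 10000) (10000,)
```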
Is it possible to prevent atrophy of key brain regions related to cognitive decline and Alzheimer’s disease (AD)? One approach is to modify nongenetic risk factors, for instance by lowering elevated plasma homocysteine using B vitamins. In an initial, randomized controlled study on elderly subjects with increased dementia risk (mild cognitive impairment according to 2004 Petersen criteria), we showed that high-dose B-vitamin treatment (folic acid 0.8 mg, vitamin B6 20 mg, vitamin B12 0.5 mg) slowed shrinkage of the whole brain volume over 2 y. Here, we go further by demonstrating that B-vitamin treatment reduces, by as much as sevenfold, the cerebral atrophy in those gray matter (GM) regions specifically vulnerable to the AD process, including the medial temporal lobe. In the placebo group, higher homocysteine levels at baseline are associated with faster GM atrophy, but this deleterious effect is largely prevented by B-vitamin treatment. We additionally show that the beneficial effect of B vitamins is confined to participants with high homocysteine (above the median, 11 µmol/L) and that, in these participants, a causal Bayesian network analysis indicates the following chain of events: B vitamins lower homocysteine, which directly leads to a decrease in GM atrophy, thereby slowing cognitive decline. Our results show that B-vitamin supplementation can slow the atrophy of specific brain regions that are a key component of the AD process and that are associated with cognitive decline. Further B-vitamin supplementation trials focusing on elderly subjects with high homocysteine levels are warranted to see if progression to dementia can be prevented.
We investigated the relationship between individual subjects' functional connectomes and 280 behavioral and demographic measures in a single holistic multivariate analysis relating imaging to non-imaging data from 461 subjects in the Human Connectome Project. We identified one strong mode of population co-variation: subjects were predominantly spread along a single 'positive-negative' axis linking lifestyle, demographic and psychometric measures to each other and to a specific pattern of brain connectivity.
Many image enhancement and thresholding techniques make use of spatial neighbourhood information to boost belief in extended areas of signal. The most common such approach in neuroimaging is cluster-based thresholding, which is often more sensitive than voxel-wise thresholding. However, a limitation is the need to define the initial cluster-forming threshold. This threshold is arbitrary, and yet its exact choice can have a large impact on the results, particularly at the lower (e.g., t, z < 4) cluster-forming thresholds frequently used. Furthermore, the amount of spatial pre-smoothing is also arbitrary (given that the expected signal extent is very rarely known in advance of the analysis). In the light of such problems, we propose a new method which attempts to keep the sensitivity benefits of cluster-based thresholding (and indeed the general concept of “clusters” of signal), while avoiding (or at least minimising) these problems. The method takes a raw statistic image and produces an output image in which the voxel-wise values represent the amount of cluster-like local spatial support. The method is thus referred to as “threshold-free cluster enhancement” (TFCE). We present the TFCE approach and discuss in detail ROC-based optimisation and comparisons with cluster-based and voxel-based thresholding. We find that TFCE gives generally better sensitivity than other methods over a wide range of test signal shapes and SNR values. We also show an example on a real imaging dataset, suggesting that TFCE does indeed provide not just improved sensitivity, but richer and more interpretable output than cluster-based thresholding.
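As a rough illustration of the TFCE score itself: the enhanced value at a point p is the integral over cluster-forming thresholds h of e(h)^E · h^H, where e(h) is the extent of the suprathreshold cluster containing p at height h. A minimal one-dimensional sketch, assuming the exponents E = 0.5 and H = 2 and a coarse step size; real implementations operate on 3-D images with appropriate connectivity.

```python
# 1-D sketch of threshold-free cluster enhancement (TFCE):
# TFCE(p) ≈ sum over thresholds h of extent(h)^E * h^H * dh,
# where extent(h) is the size of the suprathreshold cluster containing p.
import numpy as np
from scipy import ndimage

def tfce_1d(stat, dh=0.1, E=0.5, H=2.0):
    out = np.zeros_like(stat, dtype=float)
    for h in np.arange(dh, stat.max() + dh, dh):
        labels, n = ndimage.label(stat >= h)   # clusters at threshold h
        for k in range(1, n + 1):
            mask = labels == k
            out[mask] += (mask.sum() ** E) * (h ** H) * dh
    return out

stat = np.array([0.5, 2.0, 3.5, 3.0, 1.0, 0.2, 4.0, 0.1])
print(np.round(tfce_1d(stat), 2))  # tall peaks with spatial support score highest
```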