Fully automated classification algorithms have been successfully applied to diagnose a wide range of neurological and psychiatric diseases. They are sufficiently robust to handle data from different scanners for many applications and in specific cases outperform radiologists. This article provides an overview of current applications, taking structural imaging in Alzheimer's disease and schizophrenia as well as functional imaging to diagnose depression as examples. In this context, we also report studies aiming to predict the future course of the disease and the response to treatment for the individual. This has obvious clinical relevance but is also important for the design of treatment studies, which may aim to include a cohort with predicted fast disease progression in order to be more sensitive to treatment effects.
In the second part, we present our own opinions on i) the role these classification methods can play in the clinical setting; ii) where their limitations are at the moment; and iii) how those can be overcome. Specifically, we discuss strategies to deal with disease heterogeneity and diagnostic uncertainties, a probabilistic framework for classification, and multi-class classification approaches.
► Overview of clinical applications of classification methods. ► Description of current limitations. ► Outlook and future directions.
We measured the fast temporal dynamics of face processing simultaneously across the human temporal cortex (TC) using intracranial recordings in eight participants. We found sites with selective responses to faces clustered in the ventral TC, which responded increasingly strongly to marine animal, bird, mammal, and human faces. Both face-selective and face-active but non-selective sites showed a posterior to anterior gradient in response time and selectivity. A sparse model focusing on information from the human face-selective sites performed as well as, or better than, anatomically distributed models when discriminating faces from non-face stimuli. Additionally, we identified the posterior fusiform site (pFUS) as causally the most relevant node for inducing distortion of conscious face processing by direct electrical stimulation. These findings support anatomically discrete but temporally distributed response profiles in the human brain and provide a new common ground for unifying the seemingly contradictory modular and distributed modes of face processing.
Psychiatric prognosis is a difficult problem. Making a prognosis requires looking far into the future, as opposed to making a diagnosis, which is concerned with the current state. During the follow-up period, many factors will influence the course of the disease. Combined with the usually scarcer longitudinal data and the variability in the definition of outcomes/transition, this makes prognostic predictions a challenging endeavor. Employing neuroimaging data in this endeavor introduces the additional hurdle of high dimensionality. Machine learning techniques are especially suited to tackle this challenging problem. This review starts with a brief introduction to machine learning in the context of its application to clinical neuroimaging data. We highlight a few issues that are especially relevant for prediction of outcome and transition using neuroimaging. We then review the literature that discusses the application of machine learning for this purpose. Critical examination of the studies and their results with respect to the relevant issues revealed the following: 1) there is growing evidence for the prognostic capability of machine learning–based models using neuroimaging; and 2) reported accuracies may be too optimistic owing to small sample sizes and the lack of independent test samples. Finally, we discuss options to improve the reliability of (prognostic) prediction models. These include new methodologies and multimodal modeling. Paramount, however, is our conclusion that future work will need to provide properly (cross-)validated accuracy estimates of models trained on sufficiently large datasets. Nevertheless, with the technological advances enabling acquisition of large databases of patients and healthy subjects, machine learning represents a powerful tool in the search for psychiatric biomarkers.
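The concern above about optimistic accuracies can be illustrated with a minimal sketch. It assumes synthetic pure-noise "features" with far more dimensions than subjects and a simple nearest-centroid classifier; neither the data nor the classifier comes from the reviewed studies, and all function names are illustrative. Accuracy measured on the training sample looks excellent, while the cross-validated estimate stays near chance.

```python
import numpy as np

def nearest_centroid_fit(X, y):
    """Return one mean vector (centroid) per class label."""
    return {label: X[y == label].mean(axis=0) for label in np.unique(y)}

def nearest_centroid_predict(centroids, X):
    """Assign each row of X to the class with the closest centroid."""
    labels = sorted(centroids)
    dists = np.stack(
        [np.linalg.norm(X - centroids[label], axis=1) for label in labels], axis=1
    )
    return np.array(labels)[dists.argmin(axis=1)]

def cross_validated_accuracy(X, y, n_folds, rng):
    """Estimate accuracy using held-out folds only."""
    order = rng.permutation(len(y))
    correct = 0
    for test_idx in np.array_split(order, n_folds):
        train_idx = np.setdiff1d(order, test_idx)
        model = nearest_centroid_fit(X[train_idx], y[train_idx])
        predictions = nearest_centroid_predict(model, X[test_idx])
        correct += np.sum(predictions == y[test_idx])
    return correct / len(y)

# Assumption: synthetic data with no real group difference at all.
rng = np.random.default_rng(0)
n_subjects, n_features = 40, 1000
X = rng.standard_normal((n_subjects, n_features))
y = np.repeat([0, 1], n_subjects // 2)

model = nearest_centroid_fit(X, y)
train_acc = np.mean(nearest_centroid_predict(model, X) == y)
cv_acc = cross_validated_accuracy(X, y, n_folds=5, rng=rng)
print(f"training accuracy: {train_acc:.2f}, cross-validated accuracy: {cv_acc:.2f}")
```

With small samples even the cross-validated estimate has a wide spread, which is one reason an independent test sample remains desirable.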
Supervised machine learning (ML) algorithms are increasingly popular tools for fMRI decoding due to their predictive capability and their ability to capture information encoded by spatially correlated voxels. In addition, an important secondary outcome is a multivariate representation of the pattern underlying the prediction. Despite an impressive array of applications, most fMRI applications are framed as classification problems and predictions are limited to categorical class decisions. For many applications, quantitative predictions are desirable that more accurately represent variability within subject groups and that can be correlated with behavioural variables. We evaluate the predictive capability of Gaussian process (GP) models for two types of quantitative prediction (multivariate regression and probabilistic classification) using whole-brain fMRI volumes. As a proof of concept, we apply GP models to an fMRI experiment investigating subjective responses to thermal pain and show that GP models predict subjective pain ratings without requiring anatomical hypotheses about functional localisation of relevant brain processes. Even in the case of pain perception, where strong hypotheses do exist, GP predictions were more accurate than those from any region previously demonstrated to encode pain intensity. We demonstrate two brain mapping methods suitable for GP models and we show that GP regression models outperform state-of-the-art support vector and relevance vector regression. For classification, GP models perform categorical prediction as accurately as a support vector machine classifier and furnish probabilistic class predictions.
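As a rough illustration of the quantitative prediction described above, the following sketch implements textbook GP regression in NumPy. The linear kernel, the fixed noise variance, and the synthetic "voxel" data are assumptions made for the example; the abstract does not state which kernel or hyperparameter scheme the authors used. The point is that each prediction comes with a predictive variance, not only a point estimate.

```python
import numpy as np

def gp_regression_predict(X_train, y_train, X_test, noise_var):
    """GP regression with a linear kernel k(a, b) = a . b (an assumption).

    Returns the predictive mean and predictive variance for each test row.
    """
    K = X_train @ X_train.T + noise_var * np.eye(len(X_train))
    K_star = X_train @ X_test.T                  # train-by-test covariances
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_star.T @ alpha
    v = np.linalg.solve(L, K_star)
    prior_var = np.sum(X_test ** 2, axis=1)      # k(x*, x*) for each test row
    var = prior_var - np.sum(v ** 2, axis=0) + noise_var
    return mean, var

# Assumption: synthetic features and ratings generated from a linear model.
rng = np.random.default_rng(1)
n_train, n_test, n_features = 80, 20, 20
true_w = rng.standard_normal(n_features)
X_train = rng.standard_normal((n_train, n_features))
X_test = rng.standard_normal((n_test, n_features))
y_train = X_train @ true_w + 0.1 * rng.standard_normal(n_train)
y_test = X_test @ true_w + 0.1 * rng.standard_normal(n_test)

mean, var = gp_regression_predict(X_train, y_train, X_test, noise_var=0.01)
corr = np.corrcoef(mean, y_test)[0, 1]
print(f"correlation between predicted and true ratings: {corr:.3f}")
print(f"average predictive standard deviation: {np.sqrt(var).mean():.3f}")
```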
The 21st century marks the emergence of “big data” with a rapid increase in the availability of datasets with multiple measurements. In neuroscience, brain-imaging datasets are more commonly accompanied by dozens or hundreds of phenotypic subject descriptors on the behavioral, neural, and genomic level. The complexity of such “big data” repositories offers new opportunities and poses new challenges for systems neuroscience. Canonical correlation analysis (CCA) is a prototypical family of methods that is useful in identifying the links between variable sets from different modalities. Importantly, CCA is well suited to describing relationships across multiple sets of data, such as in recently available big biomedical datasets. Our primer discusses the rationale, promises, and pitfalls of CCA.
• Introduction to the features of canonical correlation analysis and its applications in combining two or more domains of data, such as behavioural and neuroimaging measures. • The utility of different variations and the pros/cons of CCA. • Tips on application of CCA on rich phenotype datasets such as UK Biobank and HCP.
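To make the idea of linking two variable sets concrete, here is a minimal sketch of classical two-set CCA. It assumes synthetic "imaging" and "behavioural" matrices that share one latent factor, and it uses the standard QR-plus-SVD route to the canonical correlations; the function name and the data are illustrative, not taken from the primer.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations between two variable sets.

    Each set is centred and orthonormalised by QR decomposition; the singular
    values of Qx.T @ Qy are then the canonical correlations.
    Assumes both matrices have full column rank and more rows than columns.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)   # guard against rounding slightly above 1

# Assumption: one shared latent factor drives both synthetic variable sets.
rng = np.random.default_rng(2)
n_subjects = 500
latent = rng.standard_normal(n_subjects)
imaging = latent[:, None] + 0.5 * rng.standard_normal((n_subjects, 5))
behaviour = latent[:, None] + 0.5 * rng.standard_normal((n_subjects, 4))

rho = canonical_correlations(imaging, behaviour)
print("canonical correlations:", np.round(rho, 3))
```

Only the first canonical correlation should be large here, because a single latent factor links the two sets. With many variables and few subjects, sample canonical correlations become inflated, which is one of the pitfalls a primer on CCA needs to address.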
In the present study, we applied the Support Vector Machine (SVM) algorithm to perform multivariate classification of brain states from whole functional magnetic resonance imaging (fMRI) volumes without prior selection of spatial features. In addition, we did a comparative analysis between the SVM and the Fisher Linear Discriminant (FLD) classifier. We applied the methods to two multisubject attention experiments: a face matching and a location matching task. We demonstrate that SVM outperforms FLD in classification performance as well as in robustness of the spatial maps obtained (i.e. discriminating volumes). In addition, the SVM discrimination maps had greater overlap with the general linear model (GLM) analysis compared to the FLD. The analysis has two phases. During the training phase, the classifier algorithm finds the set of regions by which the two brain states can be best distinguished from each other. In the test phase, given an fMRI volume from a new subject, the classifier predicts the subject's instantaneous brain state.
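The two phases described above can be sketched as follows. This is a simplified linear SVM trained by batch subgradient descent on the hinge loss, applied to synthetic "volumes"; the optimiser, the hyperparameter values, and the data are assumptions for illustration and are not the implementation used in the study. The learned weight vector plays the role of a discrimination map, with one weight per voxel.

```python
import numpy as np

def train_linear_svm(X, y, reg=0.01, lr=0.1, n_epochs=200):
    """Training phase: minimise the regularised hinge loss.

    y must contain the labels -1 and +1. Returns the weight vector and bias.
    """
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_epochs):
        margins = y * (X @ w + b)
        active = margins < 1                     # samples violating the margin
        grad_w = reg * w - (y[active][:, None] * X[active]).sum(axis=0) / len(y)
        grad_b = -y[active].sum() / len(y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict_state(w, b, X):
    """Test phase: predict a brain state (-1 or +1) for each volume."""
    return np.where(X @ w + b >= 0, 1, -1)

def make_volumes(n_volumes, rng, n_voxels=50, n_informative=10):
    """Assumption: synthetic volumes whose first voxels differ between states."""
    y = rng.choice([-1, 1], size=n_volumes)
    X = rng.standard_normal((n_volumes, n_voxels))
    X[:, :n_informative] += y[:, None]
    return X, y

rng = np.random.default_rng(3)
X_train, y_train = make_volumes(100, rng)
X_new, y_new = make_volumes(40, rng)             # stands in for a new subject

w, b = train_linear_svm(X_train, y_train)
test_acc = np.mean(predict_state(w, b, X_new) == y_new)
print(f"accuracy on held-out volumes: {test_acc:.2f}")
```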
Autism spectrum disorder (ASD) is a neurodevelopmental condition with multiple causes, comorbid conditions, and a wide range in the type and severity of symptoms expressed by different individuals. This makes the neuroanatomy of autism inherently difficult to describe. Here, we demonstrate how a multiparameter classification approach can be used to characterize the complex and subtle structural pattern of gray matter anatomy implicated in adults with ASD, and to reveal spatially distributed patterns of discriminating regions for a variety of parameters describing brain anatomy. A set of five morphological parameters including volumetric and geometric features at each spatial location on the cortical surface was used to discriminate between people with ASD and controls using a support vector machine (SVM) analytic approach, and to find a spatially distributed pattern of regions with maximal classification weights. On the basis of these patterns, SVM was able to identify individuals with ASD at a sensitivity and specificity of up to 90% and 80%, respectively. However, the ability of individual cortical features to discriminate between groups was highly variable, and the discriminating patterns of regions varied across parameters. The classification was specific to ASD rather than neurodevelopmental conditions in general (e.g., attention deficit hyperactivity disorder). Our results confirm the hypothesis that the neuroanatomy of autism is truly multidimensional, and affects multiple and most likely independent cortical features. The spatial patterns detected using SVM may help further exploration of the specific genetic and neuropathological underpinnings of ASD, and provide new insights into the most likely multifactorial etiology of the condition.
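For readers less familiar with the two measures quoted above, the following sketch shows how sensitivity and specificity are computed from a classifier's predictions. The group sizes and the counts of correct predictions are hypothetical, chosen only so that the arithmetic reproduces 90% and 80%; they are not data from the study.

```python
def sensitivity_specificity(true_labels, predicted_labels, positive=1):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical example: 20 patients (label 1) and 20 controls (label 0).
true_labels = [1] * 20 + [0] * 20
# Hypothetical predictions: 18 of 20 patients and 16 of 20 controls correct.
predicted_labels = [1] * 18 + [0] * 2 + [0] * 16 + [1] * 4

sensitivity, specificity = sensitivity_specificity(true_labels, predicted_labels)
print(f"sensitivity: {sensitivity:.0%}, specificity: {specificity:.0%}")
```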
Structured sparse methods have received significant attention in neuroimaging. These methods allow the incorporation of domain knowledge through additional spatial and temporal constraints in the predictive model and carry the promise of being more interpretable than non-structured sparse methods, such as LASSO or Elastic Net methods. However, although sparsity has often been advocated as leading to more interpretable models, it can also lead to unstable models under subsampling or slight changes of the experimental conditions. In the present work we investigate the impact of using stability/reproducibility as an additional model selection criterion on several different sparse (and structured sparse) methods that have been recently applied for fMRI brain decoding. We compare three different model selection criteria: (i) classification accuracy alone; (ii) classification accuracy and overlap between the solutions; (iii) classification accuracy and correlation between the solutions. The methods we consider include LASSO, Elastic Net, Total Variation, sparse Total Variation, Laplacian and Graph Laplacian Elastic Net (GraphNET). Our results show that explicitly accounting for stability/reproducibility during the model optimization can mitigate some of the instability inherent in sparse methods. In particular, using accuracy and overlap between the solutions as a joint optimization criterion can lead to solutions that are more similar in terms of accuracy, sparsity levels and coefficient maps even when different sparsity methods are considered.
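The selection criteria listed above can be sketched in a few lines. The abstract does not define the overlap measure or how accuracy and stability are combined, so this example assumes a Jaccard overlap of the non-zero supports, a Pearson correlation between weight maps, and an unweighted average as the joint score. The weight vectors, accuracies, and setting names are hypothetical illustrations, not results.

```python
import numpy as np
from itertools import combinations

def support_overlap(w_a, w_b):
    """Jaccard overlap between the sets of non-zero coefficients (assumed measure)."""
    a, b = w_a != 0, w_b != 0
    union = np.sum(a | b)
    return np.sum(a & b) / union if union else 1.0

def map_correlation(w_a, w_b):
    """Pearson correlation between two coefficient maps."""
    return np.corrcoef(w_a, w_b)[0, 1]

def mean_pairwise(weight_maps, measure):
    """Average a similarity measure over all pairs of cross-validation folds."""
    return float(np.mean([measure(a, b) for a, b in combinations(weight_maps, 2)]))

# Hypothetical sparse weight maps from three folds of one hyperparameter setting.
fold_weights = [
    np.array([0.5, 0.0, 0.2, 0.0, 0.0]),
    np.array([0.4, 0.0, 0.3, 0.1, 0.0]),
    np.array([0.6, 0.0, 0.0, 0.0, 0.0]),
]
mean_overlap = mean_pairwise(fold_weights, support_overlap)
mean_corr = mean_pairwise(fold_weights, map_correlation)
print(f"mean overlap: {mean_overlap:.2f}, mean correlation: {mean_corr:.2f}")

# Hypothetical (accuracy, overlap) pairs for two candidate settings.
candidates = {"setting_a": (0.80, 0.20), "setting_b": (0.78, 0.60)}
best_by_accuracy = max(candidates, key=lambda name: candidates[name][0])
best_by_joint = max(candidates, key=lambda name: np.mean(candidates[name]))
print("accuracy alone selects:", best_by_accuracy)
print("accuracy and overlap select:", best_by_joint)
```

In this toy case the joint criterion gives up a little accuracy for a much more reproducible solution, which is the trade-off the abstract describes.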
Clustering is usually the first exploratory analysis step in empirical data. When the data set comprises graphs, the most common approaches focus on clustering its vertices. In this work, we are interested in grouping networks with similar connectivity structures together instead of grouping vertices of the graph. We could apply this approach to functional brain networks (FBNs) for identifying subgroups of people presenting similar functional connectivity, for example when studying a mental disorder. The main problem is that real-world networks present natural fluctuations, which we should consider.
In this context, spectral density is an exciting feature because graphs generated by different models present distinct spectral densities, thus presenting different connectivity structures. We introduce two clustering methods: k-means for graphs of the same size and gCEM, a model-based approach for graphs of different sizes. We evaluated their performance in toy models. Finally, we applied them to FBNs of monkeys under anesthesia and a dataset of chemical compounds.
We show that our methods work well in both toy models and real-world data. They cluster graphs with different connectivity structures well, even when the graphs have the same number of edges, vertices, and degree of centrality.
We recommend using k-means-based clustering for graphs when graphs present the same number of vertices and the gCEM method when graphs present a different number of vertices.
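The k-means variant for graphs of the same size can be illustrated with a simplified sketch. It assumes that each graph is represented by a Gaussian-kernel estimate of its adjacency spectral density on a fixed grid, that densities are compared with the Euclidean distance, and that k-means uses a farthest-point initialisation. These choices, the bandwidth, and the Erdős–Rényi toy graphs are assumptions for the example, not the published method, and gCEM is not covered.

```python
import numpy as np

def erdos_renyi_adjacency(n, p, rng):
    """Symmetric 0/1 adjacency matrix of an Erdős–Rényi random graph."""
    upper = np.triu((rng.random((n, n)) < p).astype(int), k=1)
    return (upper + upper.T).astype(float)

def spectral_density(adj, grid, bandwidth=0.5):
    """Gaussian-kernel estimate of the eigenvalue density on a fixed grid."""
    eigvals = np.linalg.eigvalsh(adj)
    diffs = (grid[:, None] - eigvals[None, :]) / bandwidth
    dens = np.exp(-0.5 * diffs ** 2).sum(axis=1)
    return dens / (dens.sum() * (grid[1] - grid[0]))   # integrates to 1 on the grid

def kmeans(points, k, n_iter=50):
    """Plain k-means with a deterministic farthest-point initialisation."""
    centers = [points[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Toy data: five sparse and five dense random graphs, all with 40 vertices.
rng = np.random.default_rng(4)
graphs = [erdos_renyi_adjacency(40, 0.1, rng) for _ in range(5)]
graphs += [erdos_renyi_adjacency(40, 0.5, rng) for _ in range(5)]

grid = np.linspace(-10.0, 30.0, 400)
densities = np.array([spectral_density(g, grid) for g in graphs])
labels = kmeans(densities, k=2)
print("cluster labels:", labels)
```

Graphs that differ in edge density are the easiest toy case. Separating graphs that share the same number of edges, as the text reports, relies on the spectral density capturing finer structural differences between generating models.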