Today, despite decades of developments in medicine and the growing interest in precision healthcare, the vast majority of diagnoses happen once patients begin to show noticeable signs of illness. Early indication and detection of diseases, however, can provide patients and carers with the chance of early intervention, better disease management, and more efficient allocation of healthcare resources. The latest developments in machine learning (including deep learning) provide a great opportunity to address this unmet need. In this study, we introduce BEHRT: a deep neural sequence transduction model for electronic health records (EHR), capable of simultaneously predicting the likelihood of 301 conditions in one's future visits. When trained and evaluated on data from nearly 1.6 million individuals, BEHRT shows a striking improvement of 8.0-13.2% (in terms of average precision scores for different tasks) over the existing state-of-the-art deep EHR models. In addition to its scalability and superior accuracy, BEHRT enables personalised interpretation of its predictions; its flexible architecture enables it to incorporate multiple heterogeneous concepts (e.g., diagnosis, medication, measurements, and more) to further improve the accuracy of its predictions; and its (pre-)training yields disease and patient representations that can be useful for future studies (i.e., transfer learning).
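To make the "sequence transduction over EHR" idea concrete, the sketch below serialises a patient's visit history into a BERT-style token sequence with a parallel age sequence, the kind of input such models consume. The token names, layout, and helper are illustrative assumptions, not BEHRT's actual preprocessing.

```python
# Illustrative sketch (not BEHRT's real pipeline): turn a list of visits,
# each a (age_at_visit, [diagnosis codes]) pair, into parallel token and
# age sequences, with SEP tokens marking visit boundaries.

def encode_history(visits):
    """visits: list of (age_at_visit, [diagnosis codes]), in visit order."""
    tokens, ages = ["CLS"], [visits[0][0] if visits else 0]
    for age, codes in visits:
        for code in codes:
            tokens.append(code)
            ages.append(age)
        tokens.append("SEP")   # visit boundary, analogous to BERT's sentence separator
        ages.append(age)
    return tokens, ages

tokens, ages = encode_history([(50, ["I10"]), (52, ["E11", "I10"])])
# tokens -> ['CLS', 'I10', 'SEP', 'E11', 'I10', 'SEP']
# ages   -> [50, 50, 50, 52, 52, 52]
```

Pairing each diagnosis token with the patient's age at that visit is one simple way to let a transformer see the timing of events, not just their order.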
One major impediment to the wider use of deep learning for clinical decision making is the difficulty of assigning a level of confidence to model predictions. Currently, deep Bayesian neural networks and sparse Gaussian processes are the two main scalable uncertainty estimation methods. However, deep Bayesian neural networks suffer from a lack of expressiveness, and more expressive models such as deep kernel learning, an extension of the sparse Gaussian process, capture only the uncertainty from the higher-level latent space; the deep learning model beneath therefore lacks interpretability and ignores uncertainty from the raw data. In this paper, we merge features of the deep Bayesian learning framework with deep kernel learning to leverage the strengths of both methods for a more comprehensive uncertainty estimation. Through a series of experiments on predicting the first incidence of heart failure, diabetes, and depression applied to large-scale electronic medical records, we demonstrate that our method is better at capturing uncertainty than both Gaussian processes and deep Bayesian neural networks in terms of indicating data insufficiency and identifying misclassifications, with comparable generalization performance. Furthermore, by assessing the accuracy and the area under the receiver operating characteristic curve over the predictive probability, we show that our method is less susceptible to making overconfident predictions, especially for the minority class in imbalanced datasets. Finally, we demonstrate how uncertainty information derived by the model can inform risk factor analysis towards model interpretability.
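A minimal sketch of the kind of uncertainty summary such methods produce: given probability samples drawn from some posterior (e.g. Monte Carlo samples from a Bayesian model), compute the predictive mean, its entropy (total predictive uncertainty), and the variance across samples (a rough proxy for model uncertainty). The function name and this decomposition are generic illustrations, not the paper's method.

```python
import math

# Illustrative only: summarise a binary prediction from posterior samples.
def predictive_summary(prob_samples):
    """prob_samples: per-posterior-sample probabilities of the positive class."""
    p = sum(prob_samples) / len(prob_samples)           # predictive mean
    # entropy of the mean prediction: total predictive uncertainty
    entropy = 0.0 if p in (0.0, 1.0) else -(p * math.log(p) + (1 - p) * math.log(1 - p))
    # spread of the samples: disagreement of the posterior about this input
    variance = sum((s - p) ** 2 for s in prob_samples) / len(prob_samples)
    return p, entropy, variance

p, h, v = predictive_summary([0.9, 0.8, 0.95, 0.85])
# confident prediction (p = 0.875) with small posterior disagreement
```

High entropy or high sample variance on a test point is exactly the signal used above to flag data insufficiency or likely misclassifications.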
Many sources of fluctuation contribute to the fMRI signal, and this makes identifying the effects that are truly related to the underlying neuronal activity difficult. Independent component analysis (ICA) – one of the most widely used techniques for the exploratory analysis of fMRI data – has been shown to be a powerful technique for identifying various sources of neuronally-related and artefactual fluctuation in fMRI data (both with the application of external stimuli and with the subject “at rest”). ICA decomposes fMRI data into patterns of activity (a set of spatial maps and their corresponding time series) that are statistically independent and add linearly to explain voxel-wise time series. Given the set of ICA components, if the components representing “signal” (brain activity) can be distinguished from the “noise” components (effects of motion, non-neuronal physiology, scanner artefacts and other nuisance sources), the latter can then be removed from the data, providing an effective cleanup of structured noise. Manual classification of components is labour intensive and requires expertise; hence, a fully automatic noise detection algorithm that can reliably detect various types of noise sources (in both task and resting fMRI) is desirable. In this paper, we introduce FIX (“FMRIB's ICA-based X-noiseifier”), which provides an automatic solution for denoising fMRI data via accurate classification of ICA components. For each ICA component, FIX generates a large number of distinct spatial and temporal features, each describing a different aspect of the data (e.g., what proportion of temporal fluctuations are at high frequencies). The set of features is then fed into a multi-level classifier (built around several different classifiers). Once trained through the hand-classification of a sufficient number of training datasets, the classifier can then automatically classify new datasets.
The noise components can then be subtracted from (or regressed out of) the original data, to provide automated cleanup. On conventional resting-state fMRI (rfMRI) single-run datasets, FIX achieves about 95% overall accuracy. On high-quality rfMRI data from the Human Connectome Project (HCP), FIX achieves over 99% classification accuracy, and as a result is used in the default rfMRI processing pipeline for generating HCP connectomes. FIX is publicly available as a plugin for FSL.
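The "regressed out of" cleanup step can be sketched in a few lines: fit the time series of the components labelled as noise to each voxel's time series by least squares, then subtract the fitted part. This is a generic regression illustration on synthetic data, not FIX's exact implementation (FIX offers variants of this cleanup).

```python
import numpy as np

# Synthetic stand-in for fMRI data: T timepoints x V voxels, contaminated
# by 2 known "noise component" time series.
rng = np.random.default_rng(0)
T, V = 200, 50
noise_ts = rng.standard_normal((T, 2))                    # noise component time series
data = rng.standard_normal((T, V)) + noise_ts @ rng.standard_normal((2, V))

# Least-squares fit of the noise time series to every voxel, then removal.
beta, *_ = np.linalg.lstsq(noise_ts, data, rcond=None)    # (2, V) fitted weights
cleaned = data - noise_ts @ beta                          # residual = denoised data

# By construction, the residuals are orthogonal to the noise regressors.
residual_corr = np.abs(noise_ts.T @ cleaned).max()
```

The orthogonality check at the end is a quick sanity test that no linear trace of the identified noise components remains in the cleaned data.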
The prevalence, age of onset, and symptomatology of many neuropsychiatric conditions differ between males and females. To understand the causes and consequences of sex differences, it is important to establish where they occur in the human brain. We report the first meta-analysis of typical sex differences in global brain volume, a descriptive account of the breakdown of studies of each compartmental volume by six age categories, and whole-brain voxel-wise meta-analyses of brain volume and density. Gaussian-process regression coordinate-based meta-analysis was used to examine sex differences in voxel-based regional volume and density. On average, males have larger total brain volumes than females. Examination of the breakdown of studies providing total volumes by age categories indicated a bias towards the 18-59 year-old category. Regional sex differences in volume and tissue density include the amygdala, hippocampus and insula, areas known to be implicated in sex-biased neuropsychiatric conditions. Together, these results suggest candidate regions for investigating the asymmetric effect that sex has on the developing brain, and for understanding sex-biased neurological and psychiatric conditions.
Background
How measures of long-term exposure to elevated blood pressure might add to the performance of "current" blood pressure in predicting future cardiovascular disease is unclear. We compared incident cardiovascular disease risk prediction using past, current, and usual systolic blood pressure, alone or in combination.
Methods and Results
Using data from UK primary care linked electronic health records, we applied a landmark cohort study design and identified 80 964 people, aged 50 years (derivation cohort = 64 772; validation cohort = 16 192), who, at study entry, had recorded blood pressure, no prior cardiovascular disease, and no previous antihypertensive or lipid-lowering prescriptions. We used systolic blood pressure recorded up to 10 years before baseline to estimate past systolic blood pressure (mean, time-weighted mean, and variability) and usual systolic blood pressure (correcting current values for past time-dependent blood pressure fluctuations), and examined their prospective relation with incident cardiovascular disease (first hospitalization for, or death from, coronary heart disease or stroke/transient ischemic attack). We used Cox regression to estimate hazard ratios and applied Bayesian analysis within a machine learning framework for model development and validation. Predictive performance of models was assessed using discrimination (area under the receiver operating characteristic curve) and calibration metrics. We found that elevated past, current, and usual systolic blood pressure values were separately and independently associated with increased incident cardiovascular disease risk. When used alone, the hazard ratio (95% credible interval) per 20-mm Hg increase in current systolic blood pressure was 1.22 (1.18-1.30), but associations were stronger for past systolic blood pressure (mean and time-weighted mean) and usual systolic blood pressure (hazard ratios ranging from 1.39 to 1.45).
The area under the receiver operating characteristic curve for a model that included current systolic blood pressure, sex, smoking, deprivation, diabetes mellitus, and lipid profile was 0.747 (95% credible interval, 0.722-0.811). The addition of past systolic blood pressure mean, time-weighted mean, or variability to this model increased the area under the receiver operating characteristic curve (95% credible interval) to 0.750 (0.727-0.811), 0.750 (0.726-0.811), and 0.748 (0.723-0.811), respectively, with all models showing good calibration. Similar small improvements in the area under the curve were observed when testing models on the validation cohort, in sex-stratified analyses, or when using different landmark ages (40 or 60 years).
Conclusions
Using multiple blood pressure recordings from patients' electronic health records showed stronger associations with incident cardiovascular disease than a single blood pressure measurement, but their addition to multivariate risk prediction models had negligible effects on model performance.
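A time-weighted mean gives each blood pressure reading influence proportional to how long it was "current", so clusters of closely spaced readings do not dominate a plain average. A simple trapezoidal version is sketched below; the study's exact weighting scheme may differ, and the numbers are made up for illustration.

```python
# Illustrative trapezoidal time-weighted mean of irregularly spaced readings.
def time_weighted_mean(times, values):
    """times: years since the first reading (increasing); values: mm Hg."""
    if len(values) == 1:
        return values[0]
    # area under the piecewise-linear interpolation, divided by total span
    area = sum((t1 - t0) * (v0 + v1) / 2
               for t0, t1, v0, v1 in zip(times, times[1:], values, values[1:]))
    return area / (times[-1] - times[0])

# Two early readings at 150 mm Hg, then a long 9-year gap ending at 140:
# the long interval pulls the weighted mean below the plain mean (~146.7).
twm = time_weighted_mean([0, 1, 10], [150, 150, 140])   # 145.5
```

Variability (e.g. the standard deviation of readings) can be computed over the same history; both summarise long-term exposure rather than a single snapshot.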
Emergency admissions are a major source of healthcare spending. We aimed to derive, validate, and compare conventional and machine learning models for prediction of the first emergency admission. Machine learning methods are capable of capturing complex interactions that are likely to be present when predicting less specific outcomes, such as this one.
We used longitudinal data from linked electronic health records of 4.6 million patients aged 18-100 years from 389 practices across England between 1985 and 2015. The population was divided into a derivation cohort (80%, 3.75 million patients from 300 general practices) and a validation cohort (20%, 0.88 million patients from 89 general practices) from geographically distinct regions with different risk levels. We first replicated a previously reported Cox proportional hazards (CPH) model for prediction of the risk of the first emergency admission up to 24 months after baseline. This reference model was then compared with 2 machine learning models, random forest (RF) and gradient boosting classifier (GBC). The initial set of predictors for all models included 43 variables, including patient demographics, lifestyle factors, laboratory tests, currently prescribed medications, selected morbidities, and previous emergency admissions. We then added 13 more variables (marital status, prior general practice visits, and 11 additional morbidities), and also enriched all variables by incorporating temporal information whenever possible (e.g., time since first diagnosis). We also varied the prediction windows to 12, 36, 48, and 60 months after baseline and compared model performances. For internal validation, we used 5-fold cross-validation. When the initial set of variables was used, GBC outperformed RF and CPH, with an area under the receiver operating characteristic curve (AUC) of 0.779 (95% CI 0.777, 0.781), compared to 0.752 (95% CI 0.751, 0.753) and 0.740 (95% CI 0.739, 0.741), respectively. In external validation, we observed an AUC of 0.796, 0.736, and 0.736 for GBC, RF, and CPH, respectively. The addition of temporal information improved AUC across all models.
In internal validation, the AUC rose to 0.848 (95% CI 0.847, 0.849), 0.825 (95% CI 0.824, 0.826), and 0.805 (95% CI 0.804, 0.806) for GBC, RF, and CPH, respectively, while the AUC in external validation rose to 0.826, 0.810, and 0.788, respectively. This enhancement also resulted in robust predictions for longer time horizons, with AUC values remaining at similar levels across all models. Overall, compared to the baseline reference CPH model, the final GBC model showed a 10.8% higher AUC (0.848 compared to 0.740) for prediction of risk of emergency admission within 24 months. GBC also showed the best calibration throughout the risk spectrum. Despite the wide range of variables included in models, our study was still limited by the number of variables included; inclusion of more variables could have further improved model performances.
The use of machine learning and addition of temporal information led to substantially improved discrimination and calibration for predicting the risk of emergency admission. Model performance remained stable across a range of prediction time windows and when externally validated. These findings support the potential of incorporating machine learning models into electronic health records to inform care and service planning.
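The temporal enrichment described above, "time since first diagnosis", turns a static binary flag (e.g. "has diabetes") into a duration at the prediction baseline. A minimal sketch, with made-up field names and condition codes:

```python
from datetime import date

# Illustrative feature construction: years from the earliest record of a
# condition to the prediction baseline (None if no prior record exists).
def years_since_first(events, condition, baseline):
    """events: list of (record date, condition code) pairs."""
    dates = [d for d, c in events if c == condition and d <= baseline]
    if not dates:
        return None
    return (baseline - min(dates)).days / 365.25

events = [(date(2005, 3, 1), "diabetes"), (date(2012, 6, 1), "diabetes")]
yrs = years_since_first(events, "diabetes", date(2015, 3, 1))   # ~10 years
```

Features like this let tree-based models such as GBC and RF distinguish a recent diagnosis from a decade-old one, which a yes/no morbidity flag cannot express.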
Functional connectomics from resting-state fMRI
Smith, Stephen M; Vidaurre, Diego; Beckmann, Christian F; et al.
Trends in Cognitive Sciences, 12/2013, Volume 17, Issue 12. Journal article, peer-reviewed, open access.
Highlights
• Spontaneous fluctuations in brain activity reflect functional brain networks.
• We review rfMRI for mapping the functional connectome.
• We review methods for functional connectomics network analysis.
• We describe the WU–Minn Human Connectome Project.
• We present exciting new analyses using the latest-released HCP data.
• Deep Learning (DL) is becoming the main way to study electronic health records (EHR).
• The first comparative review of the key DL architectures used for EHR is carried out.
• One of the largest EHR databases, containing data from 4 M people, is introduced.
• A set of best practices to work with EHR using DL has been shared.
• Recurrent DL architectures showed superior flexibility and predictive power.
Despite the recent developments in deep learning models, their applications in clinical decision-support systems have been very limited. Recent digitalisation of health records, however, has provided a great platform for the assessment of the usability of such techniques in healthcare. As a result, the field is starting to see a growing number of research papers that employ deep learning on electronic health records (EHR) for personalised prediction of risks and health trajectories. While this can be a promising trend, vast paper-to-paper variability (from the data sources and models used to the clinical questions addressed) has hampered the field’s ability to compare and contrast such models for a given application of interest. Thus, in this paper, we aim to provide a comparative review of the key deep learning architectures that have been applied to EHR data. Furthermore, we also aim to: (1) introduce and use one of the world’s largest and most complex linked primary care EHR datasets (i.e., Clinical Practice Research Datalink, or CPRD) as a new asset for training such data-hungry models; (2) provide a guideline for working with EHR data for deep learning; (3) share some of the best practices for assessing the “goodness” of deep-learning models in clinical risk prediction; and (4) propose future research ideas for making deep learning models more suitable for EHR data. Our results highlight the difficulties of working with highly imbalanced datasets, and show that sequential deep learning architectures such as recurrent neural networks (RNNs) may be more suitable for dealing with the temporal nature of EHR data.
Background
Major Depressive Disorder (MDD) is a leading cause of disease burden worldwide. With the rapid growth of neuroimaging research on relatively small samples, meta-analytic techniques are becoming increasingly important. Here, we aim to clarify the support in the fMRI literature for three leading neurobiological models of MDD: limbic–cortical, cortico–striatal, and the default mode network.
Methods
Searches of PubMed and Web of Knowledge, and manual searches, were undertaken in early 2011. Data from 34 case-control comparisons (n = 1165) and 6 treatment studies (n = 105) were analysed separately with two meta-analytic methods for imaging data: Activation Likelihood Estimation and Gaussian-Process Regression (GPR).
Results
There was broad support for the limbic–cortical and cortico–striatal models in the case-control data. Evidence for the role of the default mode network was weaker. Treatment-sensitive regions were primarily in lateral frontal areas.
Limitations
In any meta-analysis, the increase in the statistical power of the inference comes with the risk of aggregating heterogeneous study pools. While we believe that this wide range of paradigms allows identification of key regions of dysfunction in MDD (regardless of task), we attempted to minimise such risks by employing GPR, which models such heterogeneity.
Conclusions
The focus of treatment effects in frontal areas indicates that dysregulation here may represent a biomarker of treatment response. Since the dysregulation in many subcortical regions in the case-control comparisons appeared insensitive to treatment, we propose that these regions act as trait vulnerability markers, or perhaps markers of treatment insensitivity. Our findings allow these models of MDD to be applied to the fMRI literature with some confidence.