The electrocardiogram (ECG) is a widely used medical test, consisting of voltage versus time traces collected from surface recordings over the heart. Here we hypothesized that a deep neural network (DNN) can predict an important future clinical event, 1-year all-cause mortality, from ECG voltage-time traces. By using ECGs collected over a 34-year period in a large regional health system, we trained a DNN with 1,169,662 12-lead resting ECGs obtained from 253,397 patients, in which 99,371 events occurred. The model achieved an area under the curve (AUC) of 0.88 on a held-out test set of 168,914 patients, in which 14,207 events occurred. Even within the large subset of patients (n = 45,285) with ECGs interpreted as 'normal' by a physician, the performance of the model in predicting 1-year mortality remained high (AUC = 0.85). A blinded survey of cardiologists demonstrated that many of the discriminating features of these normal ECGs were not apparent to expert reviewers. Finally, a Cox proportional-hazard model revealed a hazard ratio of 9.5 (P < 0.005) for the two predicted groups (dead versus alive 1 year after ECG) over a 25-year follow-up period. These results show that deep learning can add substantial prognostic information to the interpretation of 12-lead resting ECGs, even in cases that are interpreted as normal by physicians.
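As a reading aid for the AUC values reported above, the following minimal sketch computes the area under the ROC curve as the fraction of (event, non-event) pairs in which the event case receives the higher score. The function name `roc_auc` and the toy labels and scores are hypothetical illustrations, not data or code from the study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random
    negative; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = died within 1 year, 0 = alive at 1 year.
labels = [0, 0, 1, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.70, 0.90]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise form is quadratic in the number of cases, so it suits small illustrations only; rank-based implementations give the same value for large cohorts.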
Atrial fibrillation (AF) is associated with substantial morbidity, especially when it goes undetected. If new-onset AF could be predicted, targeted screening could be used to find it early. We hypothesized that a deep neural network could predict new-onset AF from the resting 12-lead ECG and that this prediction may help identify those at risk of AF-related stroke.
We used 1.6 M resting 12-lead digital ECG traces from 430 000 patients collected from 1984 to 2019. Deep neural networks were trained to predict new-onset AF (within 1 year) in patients without a history of AF. Performance was evaluated using areas under the receiver operating characteristic curve and precision-recall curve. We performed an incidence-free survival analysis for a period of 30 years following the ECG stratified by model predictions. To simulate real-world deployment, we trained a separate model using all ECGs before 2010 and evaluated model performance on a test set of ECGs from 2010 through 2014 that were linked to our stroke registry. We identified the patients at risk for AF-related stroke among those predicted to be high risk for AF by the model at different prediction thresholds.
The area under the receiver operating characteristic curve and area under the precision-recall curve were 0.85 and 0.22, respectively, for predicting new-onset AF within 1 year of an ECG. The hazard ratio for the predicted high- versus low-risk groups over a 30-year span was 7.2 (95% CI, 6.9-7.6). In a simulated deployment scenario, the model predicted new-onset AF at 1 year with a sensitivity of 69% and specificity of 81%. The number needed to screen to find 1 new case of AF was 9. This model predicted patients at high risk for new-onset AF in 62% of all patients who experienced an AF-related stroke within 3 years of the index ECG.
Deep learning can predict new-onset AF from the 12-lead ECG in patients with no previous history of AF. This prediction may help identify patients at risk for AF-related strokes.
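The relationship between sensitivity, specificity, and the number needed to screen can be sketched as follows. This assumes that the number needed to screen is the reciprocal of the positive predictive value among model-flagged patients, which the abstract does not state explicitly. The 3% one-year AF incidence is a hypothetical value chosen for illustration and is not reported in the source.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Fraction of model-flagged patients who truly develop the outcome."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def number_needed_to_screen(sensitivity, specificity, prevalence):
    """Flagged patients screened per true case found (assumed to be 1 / PPV)."""
    return 1.0 / positive_predictive_value(sensitivity, specificity, prevalence)


# Sensitivity and specificity are taken from the abstract; the prevalence
# below is a hypothetical assumption, not a figure reported in the study.
ASSUMED_PREVALENCE = 0.03
nns = number_needed_to_screen(0.69, 0.81, ASSUMED_PREVALENCE)
print(f"NNS at an assumed {ASSUMED_PREVALENCE:.0%} incidence: {nns:.1f}")
```

Because false positives scale with the outcome-free majority, the result is sensitive to the assumed incidence: halving it roughly doubles the number needed to screen at the same operating point.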
Heart failure is a prevalent, costly disease for which new value-based payment models demand optimized population management strategies.
This study sought to generate a strategy for managing populations of patients with heart failure by leveraging large clinical datasets and machine learning.
Geisinger electronic health record data were used to train machine learning models to predict 1-year all-cause mortality in 26,971 patients with heart failure who underwent 276,819 clinical episodes. Model inputs comprised 26 clinical variables (demographics, laboratory test results, medications), 90 diagnostic codes, 41 electrocardiogram measurements and patterns, 44 echocardiographic measurements, and 8 evidence-based “care gaps”: flu vaccine, blood pressure of <130/80 mm Hg, A1c of <8%, cardiac resynchronization therapy, and active medications (angiotensin-converting enzyme inhibitor/angiotensin II receptor blocker/angiotensin receptor-neprilysin inhibitor, aldosterone receptor antagonist, hydralazine, and evidence-based beta-blocker). Care gaps represented actionable variables for which associations with all-cause mortality were modeled from retrospective data and then used to predict the benefit of prospective interventions in 13,238 currently living patients.
Machine learning models achieved areas under the receiver-operating characteristic curve (AUCs) of 0.74 to 0.77 in a split-by-year training/test scheme, with the nonlinear XGBoost model (AUC: 0.77) outperforming linear logistic regression (AUC: 0.74). Out of 13,238 currently living patients, 2,844 were predicted to die within a year, and closing all care gaps was predicted to save 231 of these lives. Prioritizing patients for intervention by using the predicted reduction in 1-year mortality risk outperformed all other priority rankings (e.g., random selection or Seattle Heart Failure risk score).
Machine learning can be used to priority-rank patients most likely to benefit from interventions to optimize evidence-based therapies. This approach may prove useful for optimizing heart failure population health management teams within value-based payment models.
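The priority ranking described above can be sketched as follows: score each patient twice, once with their current care gaps and once with all gaps closed, then sort by the predicted drop in risk. The study used XGBoost and logistic regression models trained on health record data; the toy logistic risk function, the `GAP_WEIGHTS` values, and the patient tuples below are hypothetical stand-ins for illustration only.

```python
import math

# Hypothetical log-odds penalty per open care gap; NOT values from the study.
GAP_WEIGHTS = {"flu_vaccine": 0.2, "beta_blocker": 0.5, "bp_control": 0.3}


def predicted_risk(baseline_logit, open_gaps):
    """Toy 1-year mortality risk: logistic function of baseline plus open gaps."""
    logit = baseline_logit + sum(GAP_WEIGHTS[g] for g in open_gaps)
    return 1.0 / (1.0 + math.exp(-logit))


def rank_by_benefit(patients):
    """Sort patients by predicted risk reduction if every care gap were closed."""
    ranked = []
    for patient_id, baseline_logit, open_gaps in patients:
        current = predicted_risk(baseline_logit, open_gaps)
        all_closed = predicted_risk(baseline_logit, [])
        ranked.append((patient_id, current - all_closed))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Hypothetical patients: (id, baseline log-odds of death, open care gaps).
patients = [
    ("A", -2.0, ["flu_vaccine"]),
    ("B", 0.0, ["beta_blocker", "bp_control"]),
    ("C", 1.5, []),
]
ranking = rank_by_benefit(patients)
for patient_id, benefit in ranking:
    print(f"{patient_id}: predicted absolute risk reduction {benefit:.3f}")
```

Note that patient C has the highest baseline risk but no open care gaps, so ranking by predicted benefit places C last, whereas ranking by risk alone would place C first.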
Background:
Several large trials have employed age or clinical features to select patients for atrial fibrillation (AF) screening to reduce strokes. We hypothesized that a machine learning (ML) model trained to predict AF risk from 12‑lead electrocardiogram (ECG) would be more efficient than criteria based on clinical variables in indicating a population for AF screening to potentially prevent AF-related stroke.
Methods:
We retrospectively included all patients with clinical encounters in Geisinger without a prior history of AF. Incidence of AF within 1 year and AF-related strokes within 3 years of the encounter were identified. AF-related stroke was defined as a stroke where AF was diagnosed at the time of stroke or within a year after the stroke. The efficiency of five methods was evaluated for selecting a cohort for AF screening. The methods were selected from four clinical trials (mSToPS, GUARD-AF, SCREEN-AF and STROKESTOP) and the ECG-based ML model. We simulated patient selection for the five methods between the years 2011 and 2014 and evaluated outcomes for 1 year intervals between 2012 and 2015, resulting in a total of twenty 1-year periods. Patients were considered eligible if they met the criteria before the start of the given 1-year period or within that period. The primary outcomes were numbers needed to screen (NNS) for AF and AF-associated stroke.
Results:
The clinical trial models indicated large proportions of the population with a prior ECG for AF screening (up to 31%), coinciding with NNS ranging from 14 to 18 for AF and 249–359 for AF-associated stroke. At comparable sensitivity, the ECG ML model indicated a modest number of patients for screening (14%) and had the highest efficiency in NNS for AF (7.3; up to 60% reduction) and AF-associated stroke (223; up to 38% reduction).
Conclusions:
An ECG-based ML risk prediction model is more efficient than contemporary AF-screening criteria based on age alone or age and clinical features at indicating a population for AF screening to potentially prevent AF-related strokes.
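The "up to" reductions quoted in the Results can be checked against the reported NNS ranges with a short calculation. This sketch assumes the reductions are measured relative to the least efficient trial criterion (the largest NNS in each range); the variable names are illustrative.

```python
def relative_reduction(reference_nns, new_nns):
    """Fractional reduction in number needed to screen versus a reference."""
    return 1.0 - new_nns / reference_nns


# Values as reported in the Results paragraph.
ml_nns_af, ml_nns_stroke = 7.3, 223
trial_nns_af = (14, 18)        # range across the clinical trial criteria
trial_nns_stroke = (249, 359)  # range across the clinical trial criteria

af_max = relative_reduction(max(trial_nns_af), ml_nns_af)
stroke_max = relative_reduction(max(trial_nns_stroke), ml_nns_stroke)
print(f"AF: up to {af_max:.1%} reduction")              # AF: up to 59.4% reduction
print(f"AF stroke: up to {stroke_max:.1%} reduction")   # AF stroke: up to 37.9% reduction
```

Both values round to the 60% and 38% figures quoted in the Results.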
Machine learning promises to assist physicians with predictions of mortality and of other future clinical events by learning complex patterns from historical data, such as longitudinal electronic health records. Here we show that a convolutional neural network trained on raw pixel data in 812,278 echocardiographic videos from 34,362 individuals provides superior predictions of one-year all-cause mortality. The model's predictions outperformed the widely used pooled cohort equations, the Seattle Heart Failure score (measured in an independent dataset of 2,404 patients with heart failure who underwent 3,384 echocardiograms), and a machine learning model involving 58 human-derived variables from echocardiograms and 100 clinical variables derived from electronic health records. We also show that cardiologists assisted by the model substantially improved the sensitivity of their predictions of one-year all-cause mortality by 13% while maintaining prediction specificity. Large unstructured datasets may enable deep learning to improve a wide range of clinical prediction models.
Use of machine learning (ML) for automated annotation of heart structures from echocardiographic videos is an active research area, but understanding of comparative, generalizable performance among models is lacking. This study aimed to (1) assess the generalizability of five state-of-the-art ML-based echocardiography segmentation models within a large Geisinger clinical dataset, and (2) test the hypothesis that a quality control (QC) method based on segmentation uncertainty can further improve segmentation results. Five models were applied to 47,431 echocardiography studies that were independent from any training samples. Chamber volume and mass from model segmentations were compared to clinically reported values. The median absolute errors (MAE) in left ventricular (LV) volumes and ejection fraction exhibited by all five models were comparable to reported inter-observer errors (IOE). MAE for left atrial volume and LV mass were similarly favorable to respective IOE for models trained for those tasks. A single model consistently exhibited the lowest MAE in all five clinically reported measures. We leveraged the tenfold cross-validation training scheme of this best-performing model to quantify segmentation uncertainty. We observed that removing segmentations with high uncertainty from 14 to 71% of studies reduced volume/mass MAE by 6–10%. The addition of convexity filters improved specificity, efficiently removing <10% of studies with large MAE (16–40%). In conclusion, five previously published echocardiography segmentation models generalized to a large, independent clinical dataset, segmenting one or multiple cardiac structures with overall accuracy comparable to manual analyses, though with variable performance. Convexity-reinforced uncertainty QC efficiently improved segmentation performance and may further facilitate the translation of such models.
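One way an uncertainty-based QC step of this kind could work is sketched below. It assumes that uncertainty is measured as the disagreement among the ten cross-validation fold models on a derived chamber volume (coefficient of variation); the abstract does not specify the actual metric, so this is a stand-in. The study identifiers, volumes, and the 10% threshold are hypothetical, and the convexity filter is not modeled.

```python
from statistics import mean, pstdev


def fold_disagreement(fold_volumes):
    """Coefficient of variation of one study's volume across the fold models."""
    centre = mean(fold_volumes)
    return pstdev(fold_volumes) / centre if centre > 0 else float("inf")


def filter_by_uncertainty(studies, max_cv):
    """Split studies into kept and removed by fold disagreement.

    Each returned dict maps study id to the mean volume across folds.
    """
    kept, removed = {}, {}
    for study_id, volumes in studies.items():
        target = kept if fold_disagreement(volumes) <= max_cv else removed
        target[study_id] = mean(volumes)
    return kept, removed


# Hypothetical LV volumes (mL) predicted by ten fold models for two studies.
studies = {
    "study_1": [100, 102, 98, 101, 99, 100, 103, 97, 100, 100],
    "study_2": [60, 140, 90, 30, 120, 80, 150, 40, 110, 70],
}
kept, removed = filter_by_uncertainty(studies, max_cv=0.10)
print("kept:", kept)
print("removed:", removed)
```

Raising `max_cv` keeps more studies at the cost of admitting less reliable segmentations, which mirrors the trade-off between the fraction of studies removed and the MAE reduction described above.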
Background: Given that the 12-lead electrocardiogram (ECG) is a widely used medical diagnostic test, an accurate and automated method to predict clinically relevant future events using ECGs can significantly impact clinical care. In this study, we propose a deep learning model to predict one-year mortality and future development of atrial fibrillation (AF) from the voltage-time signals of 12-lead ECGs. Method: We extracted all 12-lead ECGs from the electronic records of a large regional US health system (Geisinger) to evaluate one-year mortality and incident AF within 5 years. One-year mortality was evaluated using 1,309,304 ECGs (132,340 events) from 226,783 patients with at least one year of follow-up after ECG. Incident AF was evaluated using 67,767 ECGs from 19,458 patients with new-onset AF diagnosed between 30 days and 5 years after the ECG (cases) and 379,764 ECGs from 113,809 patients who either developed AF after 5 years or never developed AF with at least 5 years of follow-up (controls). To predict the above endpoints, we trained a deep neural network using convolutional and Long Short-Term Memory layers to aggregate spatial and temporal features of the voltage-time signals. Classes were weighted during training to account for imbalance in cases vs controls. The models were evaluated in a 5-fold cross-validation without ECGs from the same patient in both train and test sets. Model performance was assessed using area under the receiver operating characteristic curve (AUC). Results: The mean AUC and F1 score for predicting one-year mortality were 0.81. Even within the subset of 190,666 ECGs (4,422 events) that were interpreted as normal by a cardiologist, the model predicted one-year mortality with an AUC of 0.78. For predicting incident AF, the mean AUC was 0.78.
Conclusions: A deep neural network can predict one-year mortality and incident atrial fibrillation with high accuracy using only raw voltage data from 12-lead ECG, even in studies interpreted as normal by a physician. Deep learning therefore has potential to add significant prognostic information to the clinical interpretation of one of the most widely-utilized medical tests.
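Two steps from the Method, keeping all ECGs from a patient in the same cross-validation fold and weighting classes to offset imbalance, can be sketched as follows. The round-robin fold assignment and the inverse-frequency weighting formula are assumptions made for illustration, since the abstract specifies neither; the patient identifiers and labels are hypothetical.

```python
from collections import Counter


def patient_folds(patient_ids, n_folds=5):
    """Assign a fold index to each ECG so that all ECGs from one patient
    share a fold (deterministic round-robin over sorted patient ids)."""
    unique_patients = sorted(set(patient_ids))
    fold_of = {pid: i % n_folds for i, pid in enumerate(unique_patients)}
    return [fold_of[pid] for pid in patient_ids]


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical toy data: one entry per ECG.
ecg_patient = ["p1", "p1", "p2", "p3", "p3", "p3", "p4", "p5", "p6", "p6"]
ecg_label = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0]

folds = patient_folds(ecg_patient, n_folds=3)
weights = balanced_class_weights(ecg_label)
print("fold per ECG:", folds)
print("class weights:", weights)
```

Splitting by patient rather than by ECG prevents repeat recordings from the same individual from appearing on both sides of the split, which would otherwise inflate test performance. In practice the patient order would usually be shuffled with a fixed seed before assignment.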
The paper provides a survey of the development of machine-learning techniques for video analysis. The survey provides a summary of the most popular deep learning methods used for human activity recognition. We discuss how popular architectures perform on standard datasets and highlight the differences from real-life datasets dominated by multiple activities performed by multiple participants over long periods. For real-life datasets, we describe the use of low-parameter models (with 200X or 1,000X fewer parameters) that are trained to detect a single activity after the relevant objects have been successfully detected. Our survey then turns to a summary of machine learning methods that are specifically developed for working with a small number of labeled video samples. Our goal here is to describe modern techniques that are specifically designed so as to minimize the amount of ground truth that is needed for training and testing video analysis systems. We provide summaries of the development of self-supervised learning, semi-supervised learning, active learning, and zero-shot learning for applications in video analysis. For each method, we provide representative examples.