Coronary artery disease (CAD), the most common manifestation of cardiovascular disease, remains the most common cause of mortality in the United States. Risk assessment is key for primary prevention of coronary events, and coronary artery calcium (CAC) scoring using computed tomography (CT) is one such non-invasive tool. Despite the proven clinical value of CAC, its current clinical implementation has limitations such as the lack of insurance coverage for the test and the need for capital-intensive CT machines, specialized imaging protocols, and accredited 3D imaging labs for analysis (including personnel and software). Perhaps the greatest gap is the millions of patients who undergo routine chest CT exams and demonstrate coronary artery calcification, but its presence is often not reported or quantitation is not feasible. We present two deep learning models that automate CAC scoring, one for dedicated gated coronary CT exams and one for routine non-gated chest CTs performed for other reasons, to allow opportunistic screening. First, we trained a gated coronary CT model for CAC scoring that showed near perfect agreement (mean difference in scores = -2.86; Cohen's Kappa = 0.89, P < 0.0001) with current conventional manual scoring on a retrospective dataset of 79 patients and was found to perform the task faster (average time for automated CAC scoring using a graphics processing unit (GPU) was 3.5 ± 2.1 s vs. 261 s for manual scoring) in a prospective trial of 55 patients with little difference in scores compared to three technologists (mean difference in scores = 3.24, 5.12, and 5.48, respectively).
Then, using CAC scores from paired gated coronary CT as a reference standard, we trained a deep learning model on our internal data and a cohort from the Multi-Ethnic Study of Atherosclerosis (MESA) (total training n = 341, Stanford test n = 42, MESA test n = 46) to perform CAC scoring on routine non-gated chest CT exams, with validation on external datasets (total n = 303) obtained from four geographically disparate health systems. For identifying patients with any CAC (i.e., CAC ≥ 1), sensitivity and positive predictive value (PPV) were high across all datasets (ranges: 80-100% and 87-100%, respectively). For CAC ≥ 100 on routine non-gated chest CTs, which is the latest recommended threshold to initiate statin therapy, our model showed sensitivities of 71-94% and PPVs of 88-100% across all sites. Adoption of this model could allow more patients to be screened with CAC scoring, potentially allowing opportunistic early preventive interventions.
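The threshold-based metrics reported above (sensitivity and PPV for CAC ≥ 1 and CAC ≥ 100) can be illustrated with a minimal sketch. The function name `sensitivity_and_ppv` and the score lists are hypothetical and invented for illustration; they are not the study's code or data. The sketch treats the paired gated score as the reference standard, as the abstract describes.

```python
from typing import Sequence, Tuple


def sensitivity_and_ppv(
    reference_scores: Sequence[float],
    predicted_scores: Sequence[float],
    threshold: float = 100.0,
) -> Tuple[float, float]:
    """Return (sensitivity, PPV) for detecting CAC >= threshold.

    reference_scores: CAC scores from the reference standard (gated CT).
    predicted_scores: CAC scores from the model (non-gated CT).
    """
    if len(reference_scores) != len(predicted_scores):
        raise ValueError("score lists must have the same length")

    tp = fp = fn = 0
    for ref, pred in zip(reference_scores, predicted_scores):
        ref_pos = ref >= threshold
        pred_pos = pred >= threshold
        if ref_pos and pred_pos:
            tp += 1  # model and reference both above threshold
        elif pred_pos:
            fp += 1  # model above threshold, reference below
        elif ref_pos:
            fn += 1  # reference above threshold, model below

    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return sensitivity, ppv


# Made-up paired scores for illustration only (not study data).
gated_reference = [0, 12, 150, 420, 95, 310, 0, 101]
nongated_model = [0, 8, 130, 390, 110, 80, 0, 140]

sens_100, ppv_100 = sensitivity_and_ppv(gated_reference, nongated_model, 100)
sens_any, ppv_any = sensitivity_and_ppv(gated_reference, nongated_model, 1)
print(f"CAC >= 100: sensitivity={sens_100:.2f}, PPV={ppv_100:.2f}")
print(f"CAC >= 1:   sensitivity={sens_any:.2f}, PPV={ppv_any:.2f}")
```

Evaluating the same paired scores at two thresholds mirrors how the abstract reports separate ranges for any CAC and for the statin-initiation threshold.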
Deep learning (DL) models can harness electronic health records (EHRs) to predict diseases and extract radiologic findings for diagnosis. Because ambulatory chest radiographs (CXRs) are frequently ordered, we investigated detecting type 2 diabetes (T2D) by combining radiographic and EHR data using a DL model. Our model, developed from 271,065 CXRs and 160,244 patients, was tested on a prospective dataset of 9,943 CXRs. Here we show the model effectively detected T2D with a ROC AUC of 0.84 at a T2D prevalence of 16%. The algorithm flagged 1,381 cases (14%) as suspicious for T2D. External validation at a distinct institution yielded a ROC AUC of 0.77, with 5% of patients subsequently diagnosed with T2D. Explainable AI techniques revealed correlations between specific adiposity measures and high predictivity, suggesting CXRs' potential for enhanced T2D screening.
A major bottleneck in developing clinically impactful machine learning models is a lack of labeled training data for model supervision. Thus, medical researchers increasingly turn to weaker, noisier sources of supervision, such as leveraging extractions from unstructured text reports to supervise image classification. A key challenge in weak supervision is combining sources of information that may differ in quality and have correlated errors. Recently, a statistical theory of weak supervision called data programming has shown promise in addressing this challenge. Data programming now underpins many deployed machine-learning systems in the technology industry, even for critical applications. We propose a new technique for applying data programming to the problem of cross-modal weak supervision in medicine, wherein weak labels derived from an auxiliary modality (e.g., text) are used to train models over a different target modality (e.g., images). We evaluate our approach on diverse clinical tasks via direct comparison to institution-scale, hand-labeled datasets. We find that our supervision technique increases model performance by up to 6 points of area under the receiver operating characteristic curve (ROC-AUC) over baseline methods by improving both coverage and quality of the weak labels. Our approach yields models that on average perform within 1.75 points ROC-AUC of those supervised with physician-years of hand labeling and outperform those supervised with physician-months of hand labeling by 10.25 points ROC-AUC, while using only person-days of developer time and clinician work, a time saving of 96%. Our results suggest that modern weak supervision techniques such as data programming may enable more rapid development and deployment of clinically useful machine-learning models.
• We propose cross-modal data programming (XMDP) for machine learning (ML) in medicine
• XMDP reduces labeling costs by 96% versus hand labeling on 4 diverse medical ML tasks
• Days of XMDP and years of hand labeling often yield similarly performing models
• XMDP performance continually improves as more unlabeled data becomes available
Machine learning can achieve record-breaking performance on many tasks, but machine learning development is often hindered by insufficient hand-labeled data for model training. This issue is particularly prohibitive in areas such as medical diagnostic analysis, where data are private and require expensive labeling by clinicians.
A promising approach to handle this bottleneck is weak supervision, where machine learning models are trained using cheaper, noisier labels. We extend a recent, theoretically grounded weak supervision paradigm—data programming—wherein subject matter expert users write labeling functions to label training data imprecisely rather than hand-labeling data points. We show that our approach allows us to train machine learning models using person-days of effort that previously required person-years of hand labeling. Our methods could enable researchers and practitioners to leverage machine learning models over high-dimensional data (e.g., images, time series) even when labeled training sets are unavailable.
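To make the idea of labeling functions concrete, here is a minimal sketch in which a few heuristics vote on a radiology report and the combined vote becomes a weak label for the paired image. All function names, keywords, and example reports are hypothetical. The majority vote is a deliberate simplification: data programming instead fits a label model that estimates each labeling function's accuracy and correlations, which is not reproduced here.

```python
from collections import Counter
from typing import Callable, List, Sequence

ABNORMAL, NORMAL, ABSTAIN = 1, 0, -1

LabelingFunction = Callable[[str], int]


def lf_no_acute(report: str) -> int:
    # Heuristic: "no acute ..." phrasing suggests a normal study.
    return NORMAL if "no acute" in report.lower() else ABSTAIN


def lf_abnormal_terms(report: str) -> int:
    # Heuristic: finding keywords suggest an abnormal study.
    # Deliberately noisy: it ignores negation ("no fracture").
    terms = ("fracture", "effusion", "consolidation")
    text = report.lower()
    return ABNORMAL if any(term in text for term in terms) else ABSTAIN


def lf_unremarkable(report: str) -> int:
    return NORMAL if "unremarkable" in report.lower() else ABSTAIN


def majority_vote(report: str, lfs: Sequence[LabelingFunction]) -> int:
    """Combine non-abstaining votes; abstain on ties or when nothing fires.

    Simplified stand-in for the learned label model of data programming.
    """
    votes = [v for v in (lf(report) for lf in lfs) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]


LFS: List[LabelingFunction] = [lf_no_acute, lf_abnormal_terms, lf_unremarkable]

# Invented report snippets for illustration only.
reports = [
    "No acute cardiopulmonary process. Lungs are unremarkable.",
    "Large right pleural effusion with adjacent consolidation.",
    "Comparison to prior study.",
    "No acute fracture.",
]

# Each weak label would supervise the image paired with that report.
weak_labels = [majority_vote(r, LFS) for r in reports]
print(weak_labels)
```

The last report shows why combining sources matters: two heuristics disagree because one ignores negation, so the simple vote abstains, whereas a learned label model could weight the more reliable source.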
Machine learning (ML) models have achieved record-breaking performance on many tasks, but development is often blocked by a lack of large, hand-labeled training datasets for model supervision. We extend data programming—a theoretically grounded technique for supervision using cheaper, noisier labels—to train medical ML models using person-days of effort that previously required person-years of hand labeling. We find that our weakly supervised models perform similarly to their hand-labeled counterparts and that their performance improves as additional unlabeled data becomes available.
Coronary artery calcium (CAC) can be identified on nongated chest computed tomography (CT) scans, but this finding is not consistently incorporated into care. A deep learning algorithm enables opportunistic CAC screening of nongated chest CT scans. Our objective was to evaluate the effect of notifying clinicians and patients of incidental CAC on statin initiation.
NOTIFY-1 (Incidental Coronary Calcification Quality Improvement Project) was a randomized quality improvement project in the Stanford Health Care System. Patients without known atherosclerotic cardiovascular disease or a previous statin prescription were screened for CAC on a previous nongated chest CT scan from 2014 to 2019 using a validated deep learning algorithm with radiologist confirmation. Patients with incidental CAC were randomly assigned to notification of the primary care clinician and patient versus usual care. Notification included a patient-specific image of CAC and guideline recommendations regarding statin use. The primary outcome was statin prescription within 6 months.
Among 2113 patients who met initial clinical inclusion criteria, CAC was identified by the algorithm in 424 patients. After chart review and additional exclusions were made, a radiologist confirmed CAC among 173 of 194 patients (89.2%) who were randomly assigned to notification or usual care. At 6 months, the statin prescription rate was 51.2% (44/86) in the notification arm versus 6.9% (6/87) with usual care (P<0.001). There was also more coronary artery disease testing in the notification arm (15.1% [13/86] versus 2.3% [2/87]; P=0.008).
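The reported percentages follow directly from the counts given above. The short sketch below recomputes them and derives the absolute difference and the ratio between arms. The absolute difference and ratio are not stated in the abstract; they are shown only as arithmetic on the reported counts, and the variable names are hypothetical.

```python
def rate(events: int, total: int) -> float:
    """Proportion of patients with the event."""
    return events / total


# Counts as reported: statin prescription within 6 months.
statin_notify = rate(44, 86)
statin_usual = rate(6, 87)

# Counts as reported: coronary artery disease testing.
testing_notify = rate(13, 86)
testing_usual = rate(2, 87)

# Derived quantities (not reported in the abstract).
absolute_difference = statin_notify - statin_usual
rate_ratio = statin_notify / statin_usual

print(f"statin: {statin_notify:.1%} vs {statin_usual:.1%}")
print(f"testing: {testing_notify:.1%} vs {testing_usual:.1%}")
print(f"absolute difference: {absolute_difference * 100:.1f} percentage points")
print(f"rate ratio: {rate_ratio:.2f}")
```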
Opportunistic CAC screening of previous nongated chest CT scans followed by clinician and patient notification led to a significant increase in statin prescriptions. Further research is needed to determine whether this approach can reduce atherosclerotic cardiovascular disease events.
URL: https://www.
gov; Unique identifier: NCT04789278.
Coronary artery calcium (CAC) is a strong predictor of cardiovascular events across all racial and ethnic groups. CAC can be quantified on nonelectrocardiography (ECG)-gated computed tomography (CT) performed for other reasons, allowing for opportunistic screening for subclinical atherosclerosis.
The authors investigated whether incidental CAC quantified on routine non–ECG-gated CTs using a deep-learning (DL) algorithm provided cardiovascular risk stratification beyond traditional risk prediction methods.
Incidental CAC was quantified using a DL algorithm (DL-CAC) on non–ECG-gated chest CTs performed for routine care in all settings at a large academic medical center from 2014 to 2019. We measured the association of DL-CAC (0, 1-99, or ≥100) with all-cause death (primary outcome) and with the secondary composite outcomes of death/myocardial infarction (MI)/stroke and death/MI/stroke/revascularization using Cox regression. We adjusted for age, sex, race, ethnicity, comorbidities, systolic blood pressure, lipid levels, smoking status, and antihypertensive use. Ten-year atherosclerotic cardiovascular disease (ASCVD) risk was calculated using the pooled cohort equations.
Of 5,678 adults without ASCVD (51% women, 18% Asian, 13% Hispanic/Latinx), 52% had DL-CAC >0. Those with DL-CAC ≥100 had an average 10-year ASCVD risk of 24%; yet, only 26% were on statins. After adjustment, patients with DL-CAC ≥100 had increased risk of death (HR: 1.51; 95% CI: 1.28-1.79), death/MI/stroke (HR: 1.57; 95% CI: 1.33-1.84), and death/MI/stroke/revascularization (HR: 1.69; 95% CI: 1.45-1.98) compared with DL-CAC = 0.
Incidental CAC ≥100 was associated with an increased risk of all-cause death and adverse cardiovascular outcomes, beyond traditional risk factors. DL-CAC from routine non–ECG-gated CTs identifies patients at increased cardiovascular risk and holds promise as a tool for opportunistic screening to facilitate earlier intervention.
Patients with pneumonia often present to the emergency department (ED) and require prompt diagnosis and treatment. Clinical decision support systems for the diagnosis and management of pneumonia are commonly utilized in EDs to improve patient care. The purpose of this study is to investigate whether a deep learning model for detecting radiographic pneumonia and pleural effusions can improve functionality of a clinical decision support system (CDSS) for pneumonia management (ePNa) operating in 20 EDs.
In this retrospective cohort study, a dataset of 7434 prior chest radiographic studies from 6551 ED patients was used to develop and validate a deep learning model to identify radiographic pneumonia, pleural effusions, and evidence of multilobar pneumonia. Model performance was evaluated against 3 radiologists' adjudicated interpretation and compared with performance of the natural language processing of radiology reports used by ePNa.
The deep learning model achieved an area under the receiver operating characteristic curve of 0.833 (95% confidence interval [CI]: 0.795, 0.868) for detecting radiographic pneumonia, 0.939 (95% CI: 0.911, 0.962) for detecting pleural effusions, and 0.847 (95% CI: 0.800, 0.890) for identifying multilobar pneumonia. On all 3 tasks, the model achieved higher agreement with the adjudicated radiologist interpretation compared with ePNa.
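For readers unfamiliar with the metric, the area under the receiver operating characteristic curve equals the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. The sketch below computes it that way on invented labels and scores. The function name `roc_auc` and the data are hypothetical, and the confidence intervals reported above are not reproduced here.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")

    correctly_ranked = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                correctly_ranked += 1.0
            elif pos_score == neg_score:
                correctly_ranked += 0.5
    return correctly_ranked / (len(positives) * len(negatives))


# Invented example: 1 = finding present per adjudicated read, 0 = absent.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2, 0.1]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise form is quadratic in the number of cases, which is fine for illustration; rank-based implementations are preferable for large datasets.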
A deep learning model demonstrated higher agreement with radiologists than the ePNa CDSS in detecting radiographic pneumonia and related findings. Incorporating deep learning models into pneumonia CDSS could enhance diagnostic performance and improve pneumonia management.
Cardiac MRI allows for a comprehensive assessment of myocardial structure, function, and tissue characteristics. Here we describe a foundational vision system for cardiac MRI, capable of representing the breadth of human cardiovascular disease and health. Our deep learning model is trained via self-supervised contrastive learning, by which visual concepts in cine-sequence cardiac MRI scans are learned from the raw text of the accompanying radiology reports. We train and evaluate our model on data from four large academic clinical institutions in the United States. We additionally showcase the performance of our models on the UK BioBank and two additional publicly available external datasets. We explore emergent zero-shot capabilities of our system and demonstrate remarkable performance across a range of tasks, including left ventricular ejection fraction regression and the diagnosis of 35 different conditions such as cardiac amyloidosis and hypertrophic cardiomyopathy. We show that our deep learning system is not only capable of understanding the staggering complexity of human cardiovascular disease but can also be directed towards clinical problems of interest, yielding impressive, clinical-grade diagnostic accuracy with a fraction of the training data typically required for such tasks.
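The abstract does not specify the training objective beyond "self-supervised contrastive learning" between scans and report text. As an illustration of that general idea only, the sketch below implements a symmetric contrastive loss of the kind popularized by CLIP: matched image and text embeddings are pulled together, mismatched pairs within the batch are pushed apart. The function names, the temperature value, and the toy embeddings are all assumptions and are not taken from the paper.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def normalize(v: Vector) -> List[float]:
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def diagonal_cross_entropy(logits: List[List[float]]) -> float:
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def contrastive_loss(
    image_embeddings: Sequence[Vector],
    text_embeddings: Sequence[Vector],
    temperature: float = 0.07,  # assumed value, for illustration
) -> float:
    """Symmetric image-to-text and text-to-image contrastive loss."""
    images = [normalize(v) for v in image_embeddings]
    texts = [normalize(v) for v in text_embeddings]
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in texts]
        for img in images
    ]
    logits_transposed = [list(column) for column in zip(*logits)]
    return 0.5 * (
        diagonal_cross_entropy(logits)
        + diagonal_cross_entropy(logits_transposed)
    )


# Toy 2-D embeddings: scan i is paired with report i.
scan_embeddings = [[1.0, 0.0], [0.0, 1.0]]
report_embeddings = [[0.9, 0.1], [0.1, 0.9]]

aligned_loss = contrastive_loss(scan_embeddings, report_embeddings)
shuffled_loss = contrastive_loss(scan_embeddings, report_embeddings[::-1])
print(f"aligned: {aligned_loss:.4f}, shuffled: {shuffled_loss:.4f}")
```

The loss is low when each scan embedding is closest to its own report and high when the pairing is scrambled, which is the signal that lets report text supervise the image encoder without manual labels.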