Coronary artery calcium (CAC) score is a clinically validated marker of cardiovascular disease risk. We developed and validated a novel cardiovascular risk stratification system based on deep-learning-predicted CAC from retinal photographs.
We used 216 152 retinal photographs from five datasets from South Korea, Singapore, and the UK to train and validate the algorithms. First, using one dataset from a South Korean health-screening centre, we trained a deep-learning algorithm to predict the probability of the presence of CAC (ie, deep-learning retinal CAC score, RetiCAC). We stratified RetiCAC scores into tertiles and used Cox proportional hazards models to evaluate the ability of RetiCAC to predict cardiovascular events based on external test sets from South Korea, Singapore, and the UK Biobank. We evaluated the incremental values of RetiCAC when added to the Pooled Cohort Equation (PCE) for participants in the UK Biobank.
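The tertile stratification described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `tertile_cutoffs` and `assign_stratum` are hypothetical, and deriving the cut points from the empirical distribution of the scores (via the standard library's `statistics.quantiles`) is an assumption, since the abstract does not say how the tertile boundaries were computed.

```python
from statistics import quantiles


def tertile_cutoffs(scores):
    """Return the two cut points that split the scores into three groups.

    Assumption: cut points are the empirical 1/3 and 2/3 quantiles.
    """
    low_cut, high_cut = quantiles(scores, n=3, method="inclusive")
    return low_cut, high_cut


def assign_stratum(score, cutoffs):
    """Map a score to stratum 0 (low), 1 (middle), or 2 (high)."""
    low_cut, high_cut = cutoffs
    if score <= low_cut:
        return 0
    if score <= high_cut:
        return 1
    return 2


# Hypothetical predicted probabilities, for illustration only.
example_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
example_cutoffs = tertile_cutoffs(example_scores)
example_strata = [assign_stratum(s, example_cutoffs) for s in example_scores]
```

The resulting stratum labels would then enter a survival model as a categorical or trend variable.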
RetiCAC outperformed all single clinical parameter models in predicting the presence of CAC (area under the receiver operating characteristic curve of 0·742, 95% CI 0·732–0·753). Among the 527 participants in the South Korean clinical cohort, 33 (6·3%) had cardiovascular events during the 5-year follow-up. When compared with the current CAC risk stratification (0, >0–100, and >100), the three-strata RetiCAC showed comparable prognostic performance with a concordance index of 0·71. In the Singapore population-based cohort (n=8551), 310 (3·6%) participants had fatal cardiovascular events over 10 years, and the three-strata RetiCAC was significantly associated with increased risk of fatal cardiovascular events (hazard ratio [HR] trend 1·33, 95% CI 1·04–1·71). In the UK Biobank (n=47 679), 337 (0·7%) participants had fatal cardiovascular events over 10 years. When added to the PCE, the three-strata RetiCAC improved cardiovascular risk stratification in the intermediate-risk group (HR trend 1·28, 95% CI 1·07–1·54) and borderline-risk group (1·62, 1·04–2·54), and the continuous net reclassification index was 0·261 (95% CI 0·124–0·364).
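For readers unfamiliar with the concordance index quoted above, the following sketch shows one common way to compute it for right-censored data (Harrell's C). The abstract does not state which estimator was used, so the choice of Harrell's definition is an assumption, and the function name `concordance_index` and the toy data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C for right-censored survival data.

    A pair (i, j) is comparable when subject i had an event and a
    strictly shorter follow-up time than subject j. The pair is
    concordant when the subject with the earlier event also has the
    higher risk score; tied risk scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: follow-up time in years, event indicator (1 = event,
# 0 = censored), and a hypothetical risk score per participant.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.5, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking of event times by risk score.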
A deep learning and retinal photograph-derived CAC score is comparable to CT scan-measured CAC in predicting cardiovascular events, and improves on current risk stratification approaches for cardiovascular disease events. These data suggest retinal photograph-based deep learning has the potential to be used as an alternative measure of CAC, especially in low-resource settings.
Yonsei University College of Medicine; Ministry of Health and Welfare, Korea Institute for Advancement of Technology, South Korea; Agency for Science, Technology, and Research; and National Medical Research Council, Singapore.
Abstract
Background
Ageing is an important risk factor for a variety of human pathologies. Biological age (BA) may better capture ageing-related physiological changes compared with chronological age (CA).
Objective
We developed a deep learning (DL) algorithm to predict BA based on retinal photographs and evaluated the performance of our new ageing marker in the risk stratification of mortality and major morbidity in general populations.
Methods
We first trained a DL algorithm using 129,236 retinal photographs from 40,480 participants in the Korean Health Screening study to predict the probability of age being ≥65 years (‘RetiAGE’) and then evaluated the ability of RetiAGE to stratify the risk of mortality and major morbidity among 56,301 participants in the UK Biobank. Cox proportional hazards models were used to estimate the hazard ratios (HRs).
Results
In the UK Biobank, over a 10-year follow-up, 2,236 (4.0%) participants died; of these deaths, 636 (28.4%) were due to cardiovascular diseases (CVDs) and 1,276 (57.1%) to cancers. Compared with the participants in the RetiAGE first quartile, those in the RetiAGE fourth quartile had a 67% higher risk of 10-year all-cause mortality (HR = 1.67 [1.42–1.95]), a 142% higher risk of CVD mortality (HR = 2.42 [1.69–3.48]) and a 60% higher risk of cancer mortality (HR = 1.60 [1.31–1.96]), independent of CA and established ageing phenotypic biomarkers. Likewise, compared with the first quartile group, the risk of CVD and cancer events in the fourth quartile group increased by 39% (HR = 1.39 [1.14–1.69]) and 18% (HR = 1.18 [1.10–1.26]), respectively. The best discrimination ability for RetiAGE alone was found for CVD mortality (c-index = 0.70, sensitivity = 0.76, specificity = 0.55). Furthermore, adding RetiAGE increased the discrimination ability of the model beyond CA and phenotypic biomarkers (increment in c-index between 1 and 2%).
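The percentage increases quoted above follow directly from the hazard ratios. As a worked check using only the figures reported in these results:

```latex
\[
\text{relative increase in hazard} = (\mathrm{HR} - 1) \times 100\%
\]
\[
(1.67 - 1) \times 100\% = 67\%, \qquad
(2.42 - 1) \times 100\% = 142\%, \qquad
(1.60 - 1) \times 100\% = 60\%
\]
\[
(1.39 - 1) \times 100\% = 39\%, \qquad
(1.18 - 1) \times 100\% = 18\%
\]
```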
Conclusions
The DL-derived RetiAGE provides a novel, alternative approach to measure ageing.
Currently in the United Kingdom, cardiovascular disease (CVD) risk assessment is based on the QRISK3 score, in which a 10-year CVD risk of 10% indicates clinical intervention. However, this benchmark has limited efficacy in clinical practice, and a simpler, non-invasive risk stratification tool is needed. Retinal photography is becoming increasingly acceptable as a non-invasive imaging tool for CVD. Previously, we developed a novel CVD risk stratification system based on retinal photographs predicting future CVD risk. This study aims to further validate our biomarker, Reti-CVD, (1) to detect the risk group with ≥ 10% 10-year CVD risk and (2) to enhance risk assessment in individuals with QRISK3 of 7.5-10% (termed the borderline-QRISK3 group) using the UK Biobank.
Reti-CVD scores were calculated and stratified into three risk groups based on optimized cut-off values from the UK Biobank. We used Cox proportional-hazards models to evaluate the ability of Reti-CVD to predict CVD events in the general population. C-statistics were used to assess the prognostic value of adding Reti-CVD to QRISK3 in the borderline-QRISK3 group and three vulnerable subgroups.
Among 48,260 participants with no history of CVD, 6.3% had CVD events during the 11-year follow-up. Reti-CVD was associated with an increased risk of CVD (adjusted hazard ratio [HR] 1.41; 95% confidence interval [CI], 1.30-1.52) with a 13.1% (95% CI, 11.7-14.6%) 10-year CVD risk in the Reti-CVD-high-risk group. The 10-year CVD risk of the borderline-QRISK3 group was greater than 10% in the Reti-CVD-high-risk group (11.5% in the non-statin cohort [n = 45,473], 11.5% in the stage 1 hypertension cohort [n = 11,966], and 14.2% in the middle-aged cohort [n = 38,941]). C-statistics increased by 0.014 (0.010-0.017) in the non-statin cohort, 0.013 (0.007-0.019) in the stage 1 hypertension cohort, and 0.023 (0.018-0.029) in the middle-aged cohort for CVD event prediction after adding Reti-CVD to QRISK3.
Reti-CVD has the potential to identify individuals with ≥ 10% 10-year CVD risk who are likely to benefit from earlier preventative CVD interventions. For borderline-QRISK3 individuals with 10-year CVD risk between 7.5 and 10%, Reti-CVD could be used as a risk enhancer tool to help improve discernment accuracy, especially in adult groups that may be pre-disposed to CVD.
The application of deep learning to retinal photographs has yielded promising results in predicting age, sex, blood pressure, and haematological parameters. However, the broader applicability of retinal photograph-based deep learning for predicting other systemic biomarkers and the generalisability of this approach to various populations remain unexplored.
With use of 236 257 retinal photographs from seven diverse Asian and European cohorts (two health screening centres in South Korea, the Beijing Eye Study, three cohorts in the Singapore Epidemiology of Eye Diseases study, and the UK Biobank), we evaluated the capacities of 47 deep-learning algorithms to predict 47 systemic biomarkers as outcome variables, including demographic factors (age and sex); body composition measurements; blood pressure; haematological parameters; lipid profiles; biochemical measures; biomarkers related to liver function, thyroid function, kidney function, and inflammation; and diabetes. The standard neural network architecture of VGG16 was adopted for model development.
In addition to previously reported systemic biomarkers, we showed quantification of body composition indices (muscle mass, height, and bodyweight) and creatinine from retinal photographs. Body muscle mass could be predicted with an R² of 0·52 (95% CI 0·51–0·53) in the internal test set, and of 0·33 (0·30–0·35) in one external test set with muscle mass measurement available. The R² value for the prediction of height was 0·42 (0·40–0·43), of bodyweight was 0·36 (0·34–0·37), and of creatinine was 0·38 (0·37–0·40) in the internal test set. However, the performances were poorer in external test sets (with the lowest performance in the European cohort), with R² values ranging between 0·08 and 0·28 for height, 0·04 and 0·19 for bodyweight, and 0·01 and 0·26 for creatinine. Of the 47 systemic biomarkers, 37 could not be predicted well from retinal photographs via deep learning (R² ≤ 0·14 across all external test sets).
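To make the R² figures above concrete, here is a minimal sketch of the coefficient of determination for a continuous biomarker. The abstract does not state which R² definition was used, so the standard form 1 − SS_res / SS_tot is an assumption, and the function name `r_squared` and the sample values are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes at least two observations that are not all identical,
    otherwise SS_tot is zero and R^2 is undefined.
    """
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1 - ss_res / ss_tot


# Hypothetical measured and predicted values, for illustration only.
measured = [1.0, 2.0, 3.0, 4.0]
perfect_fit = r_squared(measured, [1.0, 2.0, 3.0, 4.0])
mean_only_fit = r_squared(measured, [2.5, 2.5, 2.5, 2.5])
```

Under this definition, a model that always predicts the mean of the observed values scores 0, which is why external-test values near 0·01 indicate almost no explanatory power.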
Our work provides new insights into the potential use of retinal photographs to predict systemic biomarkers, including body composition indices and serum creatinine, using deep learning in populations with a similar ethnic background. Further evaluations are warranted to validate these findings and evaluate the clinical utility of these algorithms.
Agency for Science, Technology, and Research and National Medical Research Council, Singapore; Korea Institute for Advancement of Technology.
Background
To develop computer-aided detection (CADe) of outer retinal layer (ORL) abnormalities in the retinal pigmented epithelium, interdigitation zone and ellipsoid zone via optical coherence tomography (OCT).
Methods
In this retrospective study, healthy participants with normal ORL, and patients with abnormality of ORL including choroidal neovascularisation (CNV) or retinitis pigmentosa (RP) were included. First, an automatic segmentation deep learning (DL) algorithm, CADe, was developed for the three outer retinal layers using 120 handcrafted masks of ORL. This automatic segmentation algorithm generated 4000 segmentations, which included 2000 images with normal ORL and 2000 (1000 CNV and 1000 RP) images with focal or wide defects in ORL. Second, based on the automatically generated segmentation images, a binary classifier (normal vs abnormal) was developed. Results were evaluated by area under the receiver operating characteristic curve (AUC).
Results
The DL algorithm achieved an AUC of 0.984 (95% CI 0.976 to 0.993) for individual image evaluation in the internal test set of 797 images. In addition, performance analysis of a publicly available external test set (n=968) had an AUC of 0.957 (95% CI 0.944 to 0.970) and a second clinical external test set (n=1124) had an AUC of 0.978 (95% CI 0.970 to 0.986). Moreover, the CADe clearly highlighted normal parts of the ORL and omitted highlights in the abnormal ORL of CNV and RP.
Conclusion
The CADe can use OCT images to segment ORL and differentiate between normal ORL and abnormal ORL. The CADe classifier also performs visualisation and may aid future physician diagnosis and clinical applications.
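The AUC used to evaluate the binary classifier above can be illustrated with a small rank-based sketch. It relies on the standard equivalence between the AUC and the probability that a randomly chosen abnormal image receives a higher score than a randomly chosen normal one. The function name `roc_auc`, the label coding (1 = abnormal, 0 = normal), and the toy scores are assumptions for illustration and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: two normal (0) and two abnormal (1) images.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)  # 3 of 4 pairs ranked correctly
```

This pairwise form is quadratic in the number of images, which is acceptable for test sets of around a thousand images.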
Recently, laser refractive surgery options, including laser epithelial keratomileusis, laser in situ keratomileusis, and small incision lenticule extraction, have successfully improved patients' quality of life. Evidence-based recommendation of an optimal surgery technique is valuable in increasing patient satisfaction. We developed an interpretable multiclass machine learning model that selects the laser surgery option at an expert level.
A multiclass XGBoost model was constructed to classify patients into four categories including laser epithelial keratomileusis, laser in situ keratomileusis, small incision lenticule extraction, and contraindication groups. The analysis included 18,480 subjects who intended to undergo refractive surgery at the B&VIIT Eye center. Training (n = 10,561) and internal validation (n = 2640) were performed using subjects who visited between 2016 and 2017. The model was trained based on clinical decisions of highly experienced experts and ophthalmic measurements. External validation (n = 5279) was conducted using subjects who visited in 2018. The SHapley Additive exPlanations technique was adopted to explain the output of the XGBoost model.
The multiclass XGBoost model exhibited an accuracy of 81.0% and 78.9% when tested on the internal and external validation datasets, respectively. The SHapley Additive exPlanations explanations for the results were consistent with prior knowledge from ophthalmologists. The explanations from one-versus-one and one-versus-rest XGBoost classifiers made the multicategorical classification problem easy for users to understand.
This study suggests an expert-level multiclass machine learning model for selecting the refractive surgery for patients. It also provided a clinical understanding in a multiclass problem based on an explainable artificial intelligence technique.
Explainable machine learning exhibits a promising future for increasing the practical use of artificial intelligence in ophthalmic clinics.
Purpose: To analyze the efficacy of a deep learning (DL)-based artificial intelligence (AI) algorithm in detecting the presence of diabetic retinopathy (DR) and glaucoma suspect compared with diagnosis by specialists and, secondarily, to explore whether the use of this algorithm can reduce cross-referral in three clinical settings: a diabetologist clinic, retina clinic, and glaucoma clinic. Methods: This is a prospective observational study. Patients between 35 and 65 years of age were recruited from glaucoma and retina clinics at a tertiary eye care hospital and a physician's clinic. Non-mydriatic fundus photography was performed according to the disease-specific protocols. These images were graded by the AI system and specialist graders and comparatively analyzed. Results: Out of 1085 patients, 362 were seen at glaucoma clinics, 341 were seen at retina clinics, and 382 were seen at physician clinics. The kappa agreement between AI and the glaucoma grader was 85% (95% confidence interval [CI]: 77.55-92.45%), and retina grading had 91.90% (95% CI: 87.78-96.02%). The retina grader from the glaucoma clinic had 85% agreement, and the glaucoma grader from the retina clinic had 73% agreement. The sensitivity and specificity of AI glaucoma grading were 79.37% (95% CI: 67.30-88.53%) and 99.45% (95% CI: 98.03-99.93%), respectively; DR grading had 83.33% (95% CI: 51.59-97.91%) and 98.86% (95% CI: 97.35-99.63%). The cross-referral accuracy of DR and glaucoma was 89.57% and 95.43%, respectively. Conclusion: DL-based AI systems showed high sensitivity and specificity in both patients with DR and glaucoma; also, there was a good agreement between the specialist graders and the AI system.
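The agreement and accuracy measures reported above can all be derived from a 2×2 table of AI grades against specialist grades. The sketch below is a minimal illustration under stated assumptions: the function name `diagnostic_summary` is hypothetical, the specialist grade is treated as the reference standard, the kappa shown is the unweighted Cohen's kappa (the abstract does not say which variant was used), and the counts are invented for the example rather than taken from the study.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Sensitivity, specificity and Cohen's kappa from a 2x2 table.

    tp/fn: reference-positive cases graded positive/negative by the AI.
    fp/tn: reference-negative cases graded positive/negative by the AI.
    """
    total = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    observed_agreement = (tp + tn) / total
    # Agreement expected by chance, from the marginal totals.
    expected_agreement = (
        (tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)
    ) / total ** 2
    kappa = (observed_agreement - expected_agreement) / (1 - expected_agreement)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


# Invented counts, for illustration only.
summary = diagnostic_summary(tp=40, fp=10, fn=10, tn=40)
```

Kappa discounts the agreement that two graders would reach by chance, so it is typically lower than raw percentage agreement on the same table.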
Recently, it has become more important to screen candidates who undergo corneal refractive surgery to prevent complications. To date, there is no definitive screening method to address the possibility of a misdiagnosis. We evaluated the potential of machine learning as clinical decision support for determining suitability for corneal refractive surgery. A machine learning architecture was built with the aim of identifying candidates by combining large multi-instrument data from patients with the clinical decisions of highly experienced experts. Five heterogeneous algorithms were used to predict candidates for surgery. Subsequently, an ensemble classifier was developed to improve the performance. Training (10,561 subjects) and internal validation (2640 subjects) were conducted using subjects who had visited between 2016 and 2017. External validation (5279 subjects) was performed using subjects who had visited in 2018. The best model, i.e., the ensemble classifier, had a high prediction performance, with areas under the receiver operating characteristic curve of 0.983 (95% CI, 0.977-0.987) and 0.972 (95% CI, 0.967-0.976) when tested in the internal and external validation sets, respectively. The machine learning models were statistically superior to classic methods including the percentage of tissue ablated and the Randleman ectatic score. Our model was able to correctly reclassify a patient with postoperative ectasia as belonging to an ectasia-risk group. Machine learning algorithms using a wide range of preoperative information achieved a comparable performance to screen candidates for corneal refractive surgery. An automated machine learning analysis of preoperative data can provide a safe and reliable clinical decision for refractive surgery.
Despite the importance of preventing chronic kidney disease (CKD), predicting high-risk patients who require active intervention is challenging, especially in people with preserved kidney function. In this study, a predictive risk score for CKD (Reti-CKD score) was derived from a deep learning algorithm using retinal photographs. The performance of the Reti-CKD score was verified using two longitudinal cohorts of the UK Biobank and Korean Diabetic Cohort. Validation was done in people with preserved kidney function, excluding individuals with eGFR <90 mL/min/1.73 m² or proteinuria at baseline. In the UK Biobank, 720/30,477 (2.4%) participants had CKD events during the 10.8-year follow-up period. In the Korean Diabetic Cohort, 206/5014 (4.1%) had CKD events during the 6.1-year follow-up period. When the validation cohorts were divided into quartiles of Reti-CKD score, the hazard ratios for CKD development were 3.68 (95% confidence interval [CI], 2.88-4.41) in the UK Biobank and 9.36 (5.26-16.67) in the Korean Diabetic Cohort in the highest quartile compared to the lowest. The Reti-CKD score, compared to eGFR-based methods, showed a superior concordance index for predicting CKD incidence, with a delta of 0.020 (95% CI, 0.011-0.029) in the UK Biobank and 0.024 (95% CI, 0.002-0.046) in the Korean Diabetic Cohort. In people with preserved kidney function, the Reti-CKD score effectively stratifies future CKD risk with greater performance than conventional eGFR-based methods.