Abstract
Aims
The aim of this study was to develop, validate, and illustrate an updated prediction model (SCORE2) to estimate 10-year fatal and non-fatal cardiovascular disease (CVD) risk in individuals without previous CVD or diabetes aged 40–69 years in Europe.
Methods and results
We derived risk prediction models using individual-participant data from 45 cohorts in 13 countries (677 684 individuals, 30 121 CVD events). We used sex-specific and competing risk-adjusted models, including age, smoking status, systolic blood pressure, and total- and HDL-cholesterol. We defined four risk regions in Europe according to country-specific CVD mortality, recalibrating models to each region using expected incidences and risk factor distributions. Region-specific incidence was estimated using CVD mortality and incidence data on 10 776 466 individuals. For external validation, we analysed data from 25 additional cohorts in 15 European countries (1 133 181 individuals, 43 492 CVD events). After applying the derived risk prediction models to external validation cohorts, C-indices ranged from 0.67 (0.65–0.68) to 0.81 (0.76–0.86). Predicted CVD risk varied several-fold across European regions. For example, the estimated 10-year CVD risk for a 50-year-old smoker, with a systolic blood pressure of 140 mmHg, total cholesterol of 5.5 mmol/L, and HDL-cholesterol of 1.3 mmol/L, ranged from 5.9% for men in low-risk countries to 14.0% for men in very high-risk countries, and from 4.2% for women in low-risk countries to 13.7% for women in very high-risk countries.
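The C-indices reported above summarise discrimination: how often the model ranks the participant who has the earlier event as the higher-risk one. The following is a minimal sketch of Harrell's C-index for right-censored data. It is not the SCORE2 authors' code, it ignores the competing-risk adjustment used in the study, and the function name and toy data are hypothetical. Pairs with tied follow-up times are skipped for simplicity.

```python
from itertools import combinations

def harrell_c_index(times, events, risk_scores):
    """Share of comparable pairs in which the higher predicted risk
    belongs to the participant whose event occurred first."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter follow-up ends in an event;
        # tied times are skipped (a simplifying assumption).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # ties in predicted risk count as half
    return concordant / comparable

# Hypothetical toy data: follow-up years, event indicator, predicted risk.
times = [2, 4, 6, 8, 10]
events = [1, 1, 0, 1, 0]
risks = [0.9, 0.4, 0.5, 0.3, 0.1]
print(harrell_c_index(times, events, risks))  # 7 of 8 comparable pairs -> 0.875
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which puts the reported range of 0.67 to 0.81 in context.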
Conclusion
SCORE2—a new algorithm derived, calibrated, and validated to predict 10-year risk of first-onset CVD in European populations—enhances the identification of individuals at higher risk of developing CVD across Europe.
Graphical Abstract
Development process, key features and illustrative example of the SCORE2 risk prediction algorithms for European populations.
Difficult and expensive financing has always been a problem for domestic and foreign enterprises, and how to improve financing efficiency and the financing environment is a key issue to be studied. LightGBM is an advanced machine learning algorithm that uses a histogram-based algorithm and a leaf-wise growth strategy with a depth limit to improve model accuracy. However, there are almost no cases of applying this method to corporate financing risk prediction. Therefore, the paper establishes a LightGBM model to predict the financing risk profile of 186 enterprises. To assess the prediction performance of LightGBM for enterprise financing risk, the paper ran comparison experiments using the k-nearest-neighbors, decision tree, and random forest algorithms on the same data set. The experiments show that LightGBM outperforms the other three algorithms on several metrics in corporate financing risk prediction. Therefore, we believe that the LightGBM algorithm can be used as an effective tool to predict the financing risk of enterprises.
Clinical research and medical practice can be advanced through the prediction of an individual's health state, trajectory, and responses to treatments. However, the majority of current clinical risk prediction models are based on regression approaches or machine learning algorithms that are static, rather than dynamic. To benefit from the increasing emergence of large, heterogeneous data sets, such as electronic health records (EHRs), novel tools to support improved clinical decision making through methods for individual-level risk prediction that can handle multiple variables, their interactions, and time-varying values are necessary.
We introduce a novel dynamic approach to clinical risk prediction for survival, longitudinal, and multivariate (SLAM) outcomes, called random forest for SLAM data analysis (RF-SLAM). RF-SLAM is a continuous-time, random forest method for survival analysis that combines the strengths of existing statistical and machine learning methods to produce individualized Bayes estimates of piecewise-constant hazard rates. We also present a method-agnostic approach for time-varying evaluation of model performance.
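To make "Bayes estimates of piecewise-constant hazard rates" concrete, the sketch below uses a conjugate gamma prior with a Poisson likelihood, so the posterior-mean rate in an interval is (prior shape + events) / (prior rate + exposure time). This is an illustrative assumption and not necessarily the estimator used inside RF-SLAM. The function names and prior values are hypothetical.

```python
import math

def bayes_hazard(events, exposure, prior_shape=1.0, prior_rate=10.0):
    """Posterior-mean event rate under a Gamma(prior_shape, prior_rate) prior
    and a Poisson likelihood for the event count given the exposure time.
    The default prior values are hypothetical."""
    return (prior_shape + events) / (prior_rate + exposure)

def survival_probability(hazards, interval_lengths):
    """S(t) = exp(-sum_k hazard_k * length_k) for a piecewise-constant hazard."""
    cumulative_hazard = sum(h * d for h, d in zip(hazards, interval_lengths))
    return math.exp(-cumulative_hazard)

# Two consecutive one-year intervals with hypothetical counts and person-years.
hazards = [bayes_hazard(2, 90.0), bayes_hazard(5, 40.0)]  # 0.03 and 0.12 per year
print(survival_probability(hazards, [1.0, 1.0]))          # exp(-0.15)
```

The prior pulls estimates from sparsely populated intervals toward the prior mean, which keeps rates stable when few events are observed.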
We derive and illustrate the method by predicting sudden cardiac arrest (SCA) in the Left Ventricular (LV) Structural Predictors of Sudden Cardiac Death (SCD) Registry. We demonstrate superior performance relative to standard random forest methods for survival data. We illustrate the importance of the number of preceding heart failure hospitalizations as a time-dependent predictor in SCA risk assessment.
RF-SLAM is a novel statistical and machine learning method that improves risk prediction by incorporating time-varying information and accommodating a large number of predictors, their interactions, and missing values. RF-SLAM is designed to easily extend to simultaneous predictions of multiple, possibly competing, events and/or repeated measurements of discrete or continuous variables over time.
LV Structural Predictors of SCD Registry (clinicaltrials.gov, NCT01076660), retrospectively registered 25 February 2010.
Risk-based lung cancer screening: A systematic review
Toumazis, Iakovos; Bastani, Mehrad; Han, Summer S. ...
Lung Cancer (Amsterdam, Netherlands), September 2020, Volume 147
Journal article, peer reviewed
•Risk-based lung cancer screening guidelines are actively being pursued.
•Several risk prediction models have been developed for lung cancer screening, for specific sub-populations.
•This review summarizes existing lung cancer risk prediction models and their application in lung cancer screening.
•Screening programs that incorporate risk prediction models could enhance the health benefit accrued from screening.
•More research is warranted to optimize risk-based lung cancer screening.
Lung cancer remains the leading cause of cancer-related deaths worldwide. Lung cancer screening using low-dose computed tomography (LDCT) has been shown to reduce lung cancer-specific mortality. In 2013, the United States Preventive Services Task Force (USPSTF) recommended annual lung cancer screening with LDCT for smokers aged 55 to 80 years with at least 30 pack-years of smoking exposure who currently smoke or have quit smoking within the past 15 years. Risk-based lung cancer screening is an alternative approach that defines screening eligibility based on the personal risk of individuals. Selection of individuals for lung cancer screening based on their personal lung cancer risk has been shown to improve the sensitivity and specificity associated with the eligibility criteria of the screening program as compared to the 2013 USPSTF criteria. Numerous risk prediction models have been developed to estimate the lung cancer risk of individuals incorporating sociodemographic, smoking, and clinical risk factors associated with lung cancer, including age, smoking history, sex, race/ethnicity, personal and family history of cancer, and history of emphysema and chronic obstructive pulmonary disease (COPD), among others. Some risk prediction models include biomarker information, such as germline mutations or protein-based biomarkers, as independent risk predictors, in addition to clinical, smoking, and sociodemographic risk factors. While the majority of lung cancer risk prediction models are suitable for selecting high-risk individuals for lung cancer screening, some risk models have been developed to predict the probability of malignancy of screen-detected solitary pulmonary nodules or to optimize the screening frequency of eligible individuals by incorporating past screening findings. In this systematic review, we provide an overview of existing risk prediction models and their applications to lung cancer screening.
We discuss potential strengths and limitations of lung cancer screening using risk prediction models and future research directions.
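The contrast between the two eligibility approaches can be sketched as two small rules. The first encodes the 2013 USPSTF criteria as stated in this abstract. The second is a generic risk-threshold rule in which both the predicted risk and the threshold are hypothetical inputs, since the review does not commit to a single model or cut-off. Function names and the treatment of the 15-year boundary as inclusive are assumptions.

```python
from typing import Optional

def eligible_uspstf_2013(age: int, pack_years: float,
                         years_since_quit: Optional[float]) -> bool:
    """Categorical rule: age 55-80, at least 30 pack-years, and either a
    current smoker (years_since_quit is None) or quit within 15 years.
    Treating exactly 15 years as eligible is an assumption."""
    if not 55 <= age <= 80:
        return False
    if pack_years < 30:
        return False
    return years_since_quit is None or years_since_quit <= 15

def eligible_risk_based(predicted_risk: float, threshold: float) -> bool:
    """Risk-based rule: eligible when a model's predicted risk reaches a
    chosen threshold (both values are hypothetical inputs here)."""
    return predicted_risk >= threshold

# A 60-year-old current smoker with 35 pack-years meets the categorical rule.
print(eligible_uspstf_2013(60, 35, None))
# A 54-year-old with 40 pack-years does not, whatever the individual risk.
print(eligible_uspstf_2013(54, 40, None))
```

The second example shows the limitation that motivates risk-based selection: a hard age cut-off excludes a heavy smoker whose individual risk may still be high.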
Subsyndromal delirium is a dynamic, recognizable condition commonly observed in intensive care unit (ICU) patients that can lead to poor patient prognosis, and its prompt recognition and management can prevent disease progression. However, no evidence-based predictive tool has been developed specifically to assess the occurrence of subsyndromal delirium in the ICU.
To develop and validate a novel, simple and effective tool for estimating the risk of subsyndromal delirium among ICU patients.
A prospective, nested case-control study.
A total of 731 patients were recruited from the central ICU of a tertiary hospital in southwestern China from August 2021 to November 2022.
The least absolute shrinkage and selection operator was applied to screen potential features for univariate and multivariate logistic regression. A nomogram was constructed using the selected variables. The performance of the nomogram was evaluated by combining the area under the receiver operating characteristic curve (AUC), calibration curves and decision curve analysis (DCA).
The prevalence of subsyndromal delirium among ICU patients was 23.06 %. Multiple logistic regression analysis revealed that the independent predictive factors for subsyndromal delirium among ICU patients were vision impairment, a history of falls, the use of restraint, blood transfusion, the use of antibiotics, surgery, the Caprini score, and the Braden score, all of which were used to construct the nomogram. The AUCs for the model were 0.710 (95 % CI, 0.654–0.766, P < 0.001) and 0.825 (95 % CI, 0.732–0.917, P < 0.001) in the training and validation cohorts, respectively, indicating that the model had high accuracy in distinguishing patients with and without subsyndromal delirium. The calibration curve of the nomogram showed good consistency between the predicted and actual probabilities. The DCA indicated that the nomogram has clinical application for patients in the ICU.
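Decision curve analysis compares the net benefit of acting on a model's predictions with the default of treating every patient. At a threshold probability p_t, net benefit is TP/n - (FP/n) * p_t / (1 - p_t). The sketch below applies that standard formula to hypothetical labels and predicted probabilities. It is not the study's data or code.

```python
def net_benefit(labels, probabilities, threshold):
    """Net benefit of flagging patients whose predicted probability is at or
    above `threshold`: TP/n - FP/n * threshold / (1 - threshold)."""
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probabilities) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probabilities) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)

def net_benefit_treat_all(labels, threshold):
    """Net benefit of flagging every patient, the usual reference strategy."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Hypothetical outcomes (1 = subsyndromal delirium) and predicted probabilities.
labels = [1, 0, 1, 0, 0, 0, 1, 0, 0, 0]
probabilities = [0.7, 0.2, 0.6, 0.4, 0.1, 0.3, 0.2, 0.1, 0.5, 0.1]
print(net_benefit(labels, probabilities, 0.25))  # model-guided strategy
print(net_benefit_treat_all(labels, 0.25))       # treat-all reference
```

A model is considered useful at a given threshold when its net benefit exceeds both the treat-all value and zero, which is the net benefit of treating no one.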
We developed an easy-to-use nomogram for identifying subsyndromal delirium in ICU patients with satisfactory predictive ability based on simple and easily accessible clinical features. The nomogram can identify ICU patients at high risk of subsyndromal delirium and may be a useful screening tool for ICU physicians.
•Short-term forecasting and warning were achieved for shallow-buried station structures.
•The impact of the preceding time-series width on model accuracy was explored.
•The applicability of different optimization strategies to the models was compared.
•A method for the surface risk analysis of subway station structures was proposed.
•The accuracy and practical convenience of the prediction method were demonstrated.
In the context of rapid urbanization, ensuring the safety of subway station construction is vital for the stability of urban infrastructure. Conventional intelligent construction risk prediction methods typically utilize large volumes of monitoring data for training to enhance model accuracy, often neglecting the relationship between the time-series width of the data and the prediction results. To address this issue, and to better serve the construction of shallow-buried subway stations at an earlier stage, this study proposed a bagging algorithm with an improved base learner combination strategy. This algorithm forms the basis for the Bayesian optimization-based random forest model (BA-RF) and the marine predators’ algorithm-optimized random forest model (MPA-RF). By examining trends in real-time data, such as surface and building settlements above the main structure, displacement at key points of the vertical shafts, and crown settlement, the short-term maximum values of key displacements were predicted. This study emphasized the impact of the time width of the input data on the accuracy of the predictive models. Through empirical analysis, the optimal time-series width was determined, allowing for effective short-term structural risk prediction and early warning using a smaller time series. The findings indicate that the BA-RF model, utilizing an improved base learner strategy, achieves higher prediction accuracy than the more complex MPA-RF model, effectively mitigating overfitting. Specifically, when the preceding measured data time widths were 5, 15, and 25 d, the BA-RF model’s mean absolute error was 0.168, 0.160, and 0.349, respectively, whereas the root mean square error was 0.853, 0.463, and 0.509, respectively. Combined with short-term future prediction applications at construction sites, it was demonstrated that appropriately selecting the time-series width can significantly enhance prediction accuracy even with relatively small data volumes. 
This study provides a method for selecting training data for intelligent risk management during subway station construction and offers practical data selection strategies for risk assessment in other large-scale construction projects. Thus, this method has significant scientific and practical applications.
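The role of the time-series width can be illustrated with a small sketch: each training sample is the preceding `width` readings, and predictions are scored with the mean absolute error (MAE) and root mean square error (RMSE) reported above. This is a simplification under stated assumptions. The target here is the next reading rather than the short-term maximum the study predicts, the naive last-value predictor stands in for the random forest, and the settlement readings are hypothetical.

```python
import math

def make_windows(series, width):
    """Pair each run of `width` consecutive readings with the reading that follows."""
    return [(series[i:i + width], series[i + width])
            for i in range(len(series) - width)]

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error; never smaller than the MAE on the same data."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Hypothetical daily settlement readings (mm).
series = [1.0, 1.5, 2.5, 3.0, 5.0]
windows = make_windows(series, width=2)
actual = [target for _, target in windows]
predicted = [history[-1] for history, _ in windows]  # naive stand-in predictor
print(mae(actual, predicted), rmse(actual, predicted))
```

Repeating the evaluation for several widths (for example 5, 15, and 25 days, as in the study) is what allows an optimal width to be chosen empirically.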
Managing supply chain risks has received increased attention in recent years, aiming to shield supply chains from disruptions by predicting their occurrence and mitigating their adverse effects. At the same time, the resurgence of Artificial Intelligence (AI) has led to the investigation of machine learning techniques and their applicability in supply chain risk management. However, most works focus on prediction performance and neglect the importance of interpretability so that results can be understood by supply chain practitioners, helping them make decisions that can mitigate or prevent risks from occurring. In this work, we first propose a supply chain risk prediction framework using data-driven AI techniques and relying on the synergy between AI and supply chain experts. We then explore the trade-off between prediction performance and interpretability by implementing and applying the framework on the case of predicting delivery delays in a real-world multi-tier manufacturing supply chain. Experiment results show that prioritising interpretability over performance may require a level of compromise, especially with regard to average precision scores.
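Average precision, the metric singled out above, rewards models that rank true delays near the top of the list. A minimal sketch of one common definition follows: the mean of the precision values at each rank where a true positive appears. The labels and scores are hypothetical, tied scores are not treated specially, and at least one positive label is assumed.

```python
def average_precision(labels, scores):
    """Mean of the precision at each rank where a positive (delayed) order
    is retrieved, after sorting by descending score.
    Assumes at least one positive label."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits

# Hypothetical orders: 1 = delivered late, with a model's delay scores.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
print(average_precision(labels, scores))  # (1/1 + 2/3) / 2
```

Because the metric depends on the full ranking rather than a single cut-off, a coarser, more interpretable model that outputs only a few distinct scores can lose ground on it even when its accuracy is similar.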
•Machine learning models can be leveraged to accurately predict supply chain risks.
•Choosing more interpretable models may require a compromise in performance.
•Performance is slightly more affected in the case of prediction-related metrics.
•Decision tree models can reveal correlations that influence SCRM decision-making.
Severe obesity is a rapidly growing global health threat. Although often attributed to unhealthy lifestyle choices or environmental factors, obesity is known to be heritable and highly polygenic; the majority of inherited susceptibility is related to the cumulative effect of many common DNA variants. Here we derive and validate a new polygenic predictor comprising 2.1 million common variants to quantify this susceptibility and test this predictor in more than 300,000 individuals ranging from middle age to birth. Among middle-aged adults, we observe a 13-kg gradient in weight and a 25-fold gradient in risk of severe obesity across polygenic score deciles. In a longitudinal birth cohort, we note minimal differences in birthweight across score deciles, but a significant gradient emerged in early childhood and reached 12 kg by 18 years of age. This new approach to quantify inherited susceptibility to obesity affords new opportunities for clinical prevention and mechanistic assessment.
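A polygenic score of this kind is, at its core, a weighted sum of risk-allele dosages, and the decile comparisons above come from ranking each person's score within a reference cohort. The sketch below shows both steps on hypothetical data with three variants instead of 2.1 million. The weights, dosages, function names, and the decile convention are assumptions, not the study's values or method.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1, or 2 copies per variant)."""
    return sum(w * d for d, w in zip(dosages, weights))

def decile(score, cohort_scores):
    """Decile (1 to 10) of `score` relative to a reference cohort, based on
    the share of cohort scores strictly below it."""
    below = sum(1 for s in cohort_scores if s < score)
    return min(10, below * 10 // len(cohort_scores) + 1)

# Hypothetical per-variant weights and one person's dosages.
weights = [0.5, 0.2, 0.1]
dosages = [0, 1, 2]
print(polygenic_score(dosages, weights))

# Hypothetical reference cohort of ten scores.
cohort = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(decile(5.5, cohort))  # six cohort scores lie below 5.5 -> decile 7
```

Each variant contributes little on its own, so the spread between the bottom and top deciles arises only from summing over very many variants.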
•A genome-wide polygenic score can quantify inherited susceptibility to obesity
•Polygenic score effect on weight emerges early in life and increases into adulthood
•Effect of polygenic score can be similar to a rare, monogenic obesity mutation
•High polygenic score is a strong risk factor for severe obesity and associated diseases
A genome-wide polygenic score quantifies inherited susceptibility to obesity, integrating information from 2.1 million common genetic variants to identify adults at risk of severe obesity.
The present study aims to develop an effective risk-prediction score (RPS) to improve screening efficiency and contribute to secondary prevention of colorectal cancer (CRC).
Screening for colorectal lesions.
A total of 14,398 high-risk individuals aged 50–65 years were included. The baseline characteristics of participants with and without colorectal lesions (CL) were compared using a Chi-squared test. The overall population was randomly split into a training set and a test set in an 80:20 ratio. Univariate and multivariate logistic regression analyses were performed in the training set to construct the RPS (scores of 0–9.62). The area under the curve (AUC) of the receiver operating characteristic (ROC) curve in the test set was calculated as an estimate of predictive performance.
In the study population, being male, advanced age, current or previous smoking, weekly alcohol consumption, high body mass index (BMI ≥24 kg/m²), and a previously detected colonic polyp were associated with higher risk of CL. Compared to the low-risk group (0–2.31 points), the odds ratios (ORs) and 95% confidence intervals (CIs) for the moderate-risk group (2.31–3.85 points) and high-risk group (3.85–8.42 points) were 1.58 (1.44, 1.73) and 2.52 (2.30, 2.76), respectively. For every 1-point increase in score, participants had a 27% increased risk of CL (OR: 1.27, 95% CI: 1.24, 1.30). For predicting CL with the RPS, the area under the ROC curve was 0.61 (P < 0.001).
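The reported cut-offs and the per-point odds ratio can be turned into a short worked sketch. The group boundaries (2.31 and 3.85) and the OR of 1.27 per point are taken from the results above. Assigning a score that falls exactly on a boundary to the higher group is an assumption, as the abstract does not say. Under a logistic model, a k-point increase multiplies the odds by 1.27 to the power k.

```python
def risk_group(score):
    """Map an RPS value to the reported risk groups. A score exactly on a
    boundary goes to the higher group (an assumption)."""
    if score < 2.31:
        return "low"
    if score < 3.85:
        return "moderate"
    return "high"

def odds_ratio_for_increase(points, or_per_point=1.27):
    """Odds ratio implied by a `points`-point increase, assuming the
    per-point OR acts multiplicatively as in a logistic model."""
    return or_per_point ** points

print(risk_group(3.0))             # moderate
print(odds_ratio_for_increase(2))  # 1.27 squared
```

Note that the OR applies to odds, not probabilities, so "27% increased risk" is an approximation that holds best when CL is uncommon in the group being compared.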
Our RPS can quickly and efficiently identify multiple lesions of the colorectum. Combining RPS with existing screening strategies facilitates the identification of very high-risk individuals and may help to prevent CRC.