Supervised machine learning for cardiovascular risk prediction and clinical event forecasting: a systematic review of multi-modal models, cross-validation strategies, and fairness-aware approaches

Dr. Mohammed Hasan Ali

doi:10.55529/jaimlnn.52.138.152

Authors

Dr. Mohammed Hasan Ali, Associate Professor, College of Technical Engineering, Imam Ja’afar Al-Sadiq University, Al-Muthanna 66002, Iraq.

Keywords:

Supervised Machine Learning, Cardiovascular Risk, XGBoost, Random Forest, Fairness-Aware ML, Cross-Validation.

Abstract

Background and Objectives: Cardiovascular disease (CVD) remains the leading global cause of mortality, accounting for approximately 17.9 million deaths annually. Although conventional cardiovascular risk prediction tools such as the Framingham Risk Score, SCORE2, and Pooled Cohort Equations (PCE) are widely used, supervised machine learning (SML) approaches have shown considerable potential to improve predictive accuracy and clinical decision-making. Despite the rapid advancement of gradient boosting, ensemble learning, and fairness-aware machine learning techniques, a comprehensive systematic review evaluating SML models for cardiovascular risk prediction, including their predictive performance, validation strategies, multi-modal data integration, and demographic fairness, has been lacking.

Methods: A systematic search of PubMed/MEDLINE, Embase, IEEE Xplore, Web of Science, and the ACM Digital Library was conducted for studies published between January 2017 and January 2025, following PRISMA 2020 guidelines. The review protocol was registered in PROSPERO (CRD42025421673). Eligible studies included those developing or externally validating SML models for cardiovascular risk prediction or clinical event forecasting. Study quality and bias were assessed using the PROBAST-AI framework, including its fairness and calibration domain.

Results: Thirty-nine studies involving 37 cardiovascular datasets and more than 7.4 million patient records were included. Random Forest (67%) and XGBoost (56%) were the most frequently used algorithms. The median best-reported AUC was 0.906 (IQR: 0.889–0.929). SML models outperformed traditional cardiovascular scoring systems in 35 of 39 studies (90%). Multi-modal models integrating clinical, imaging, and genomic data achieved the highest predictive performance, with a median AUC of 0.929 compared to 0.893 for clinical-only models. Fairness-regularised approaches reduced the mean maximum inter-demographic AUC gap from 0.091 to 0.039.

Conclusions: SML models, particularly ensemble and multi-modal approaches, substantially improve cardiovascular risk prediction beyond conventional scoring systems, while fairness-aware training enhances demographic equity and supports broader clinical adoption.
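The abstract reports a "maximum inter-demographic AUC gap" without stating how it is computed. A minimal sketch of one plausible reading follows, assuming the gap within a single study is the difference between the highest and lowest subgroup AUC (the reported mean would then be taken across studies). The function names `auc` and `max_auc_gap` and the toy data are hypothetical and are not taken from the reviewed studies.

```python
from collections import defaultdict
from itertools import product


def auc(labels, scores):
    """AUC as the probability that a random positive case is scored above
    a random negative case; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))


def max_auc_gap(labels, scores, groups):
    """Return (gap, per_group_auc), where gap is the highest subgroup AUC
    minus the lowest subgroup AUC (an assumed definition)."""
    by_group = defaultdict(lambda: ([], []))
    for y, s, g in zip(labels, scores, groups):
        by_group[g][0].append(y)
        by_group[g][1].append(s)
    per_group = {g: auc(ys, ss) for g, (ys, ss) in by_group.items()}
    return max(per_group.values()) - min(per_group.values()), per_group


# Invented toy data: outcome labels, predicted risks, and a subgroup tag.
toy_labels = [1, 1, 0, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.3, 0.2, 0.7, 0.4, 0.5, 0.1]
toy_groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

gap, per_group = max_auc_gap(toy_labels, toy_scores, toy_groups)
print(per_group, gap)
```

On the toy data, subgroup A is ranked perfectly while subgroup B has one misordered pair out of four, so the gap reflects discrimination that is uneven across groups even when pooled performance looks acceptable.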
