Development and validation of an artificial intelligence-based model for cardiovascular disease prediction using longitudinal data – BMC Medical Informatics and Decision Making

Background This study uses machine learning (ML) models to identify significant predictors of cardiovascular disease (CVD) events in Iran, a country with high CVD mortality. It compares deep learning models with a mixed-effects logistic model for predicting 10-year CVD incidence using longitudinal data from the Tehran Lipid and Glucose Study (TLGS).

Methods A total of 4,872 adults (1,942 men and 2,930 women) aged ≥ 30 years without a history of CVD at baseline (2006–2008) were followed until 2020; this was the final analytic sample after exclusions for prevalent CVD or insufficient follow-up data. Baseline demographic, behavioral, and biochemical characteristics served as input features. Incident CVD events, such as stroke and coronary heart disease (CHD), were identified and confirmed through clinical history examination. Deep learning models based on Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) architectures were applied to longitudinal data collected over a 10-year study period to capture changes in risk factors over time. Model performance was assessed with the area under the receiver operating characteristic curve (AUC).

Results During follow-up, 545 participants (11.2%) experienced CVD events. The GRU model outperformed the LSTM model in both sexes, achieving an AUC of 0.739 in women and 0.738 in men. Compared with the traditional mixed-effects model, GRU discrimination was comparable in women (0.739 vs. 0.74) and higher in men (0.738 vs. 0.70). Although the models were restricted to 21 commonly used clinical variables, their performance was similar to that of larger-scale studies that included hundreds of variables.

Conclusions Deep learning models such as GRU can effectively leverage longitudinal medical data to predict future CVD events, achieving performance comparable to studies that used far more variables despite relying on a limited feature set.
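
Because every comparison above is reported as an AUC, it may help to see what that number measures. The sketch below is illustrative only and is not the authors' evaluation code: it computes the AUC as the probability that a randomly chosen participant with an event receives a higher predicted risk than a randomly chosen participant without one, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are assumptions made for this example.

```python
def roc_auc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: iterable of 0/1 outcomes (1 = event occurred).
    scores: iterable of predicted risks, same length as labels.
    """
    labels = list(labels)
    scores = list(scores)
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")

    # Split predicted risks by observed outcome.
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")

    # Count event/non-event pairs ranked correctly; ties count as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data, not values from the study.
    toy_labels = [0, 0, 1, 1]
    toy_scores = [0.10, 0.40, 0.35, 0.80]
    print(roc_auc(toy_labels, toy_scores))
```

Under this reading, an AUC of 0.739 means that in roughly 74% of event/non-event pairs the model assigns the higher risk to the participant who went on to have an event. The pairwise loop is quadratic in the sample size, so a rank-based implementation would be preferable for a cohort of several thousand participants.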