Comparison of Machine Learning Techniques in Prediction of Mortality following Cardiac Surgery: Analysis of over 220,000 patients from a Large National Database

Shubhra Sinha, Tim Dong, Arnaldo Dimagli, Hunaid A Vohra, Chris Holmes, Umberto Benedetto, Gianni D Angelini*

*Corresponding author for this work

Research output: Contribution to journal › Article (Academic Journal) › peer-review

10 Citations (Scopus)

Abstract

Objectives
To perform a systematic comparison of in-hospital mortality risk prediction after cardiac surgery between the predominant scoring system, the European System for Cardiac Operative Risk Evaluation (EuroSCOREII), logistic regression (LR) retrained on the same variables, and alternative machine learning (ML) techniques: random forest (RF), neural networks (NN), XGBoost and weighted support vector machine (SVM).

Methods
Retrospective analyses of prospectively and routinely collected data on adult patients undergoing cardiac surgery in the UK from January 2012 to March 2019. The data were split temporally 70:30 into training and validation subsets. Mortality prediction models were created using the 18 variables of EuroSCOREII. Comparisons of discrimination, calibration and clinical utility were then conducted. Changes in model performance and variable importance over time, as well as hospital- and operation-based model performance, were also reviewed.
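The temporal 70:30 split described in the Methods can be illustrated with a minimal sketch. The abstract does not state how the split was implemented, so the function name, the `operation_date` field and the record layout below are hypothetical; the sketch only shows the general idea of training on the earliest records and validating on the most recent ones.

```python
from datetime import date

def temporal_split(records, train_fraction=0.7, date_key="operation_date"):
    """Split records chronologically rather than at random.

    Assumption: each record is a dict with a date under `date_key`
    (a hypothetical field name, not taken from the study).
    """
    # Order by date so the training set strictly precedes the validation set.
    ordered = sorted(records, key=lambda r: r[date_key])
    cut = round(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]

# Usage with ten made-up records, deliberately out of date order.
records = [{"operation_date": date(2012 + (i * 3) % 8, 1, 1), "id": i} for i in range(10)]
train, validation = temporal_split(records)
```

Unlike a random split, this ordering means the validation subset reflects the most recent practice, which is what allows changes in performance over time to be examined.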

Results
Among the 227,087 adults who underwent cardiac surgery during the study period, there were 6,258 deaths (2.76%). In the testing cohort, discrimination improved with XGBoost (95% confidence interval (CI) for area under the curve (AUC): 0.834-0.834; F1 score: 0.276-0.280) and RF (95% CI AUC: 0.833-0.834; F1: 0.277-0.281) compared with EuroSCOREII (95% CI AUC: 0.817-0.818; F1: 0.243-0.245). There was no significant improvement in calibration with ML and retrained-LR compared with EuroSCOREII. However, EuroSCOREII overestimated risk across all deciles of risk and over time. The calibration drift was lowest in NN, XGBoost and RF compared with EuroSCOREII. Decision curve analysis showed XGBoost and RF to have greater net benefit than EuroSCOREII.
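For readers unfamiliar with decision curve analysis, the net benefit at a given risk threshold is conventionally defined as the true-positive rate per patient minus the false-positive rate per patient weighted by the odds of the threshold. The sketch below uses that standard definition; the abstract does not report the thresholds or the implementation used in the study, so the function name and the example data are purely illustrative.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating every patient whose predicted risk >= threshold.

    Standard decision-curve definition: TP/n - (FP/n) * pt / (1 - pt).
    The inputs are illustrative, not data from the study.
    """
    n = len(y_true)
    flagged = [(y, p) for y, p in zip(y_true, y_prob) if p >= threshold]
    true_pos = sum(1 for y, _ in flagged if y == 1)
    false_pos = sum(1 for y, _ in flagged if y == 0)
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)

# Made-up outcomes (1 = death) and predicted risks for four patients.
y_true = [1, 0, 0, 1]
y_prob = [0.9, 0.8, 0.1, 0.2]
nb_half = net_benefit(y_true, y_prob, 0.5)
nb_tenth = net_benefit(y_true, y_prob, 0.1)
```

A decision curve plots this quantity over a range of thresholds for each model, so one model has "greater net benefit" than another when its curve lies above the other's across the thresholds of interest.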

Conclusions
ML techniques showed some statistical improvements over retrained-LR and EuroSCOREII. The clinical impact of this improvement is modest at present. However, the incorporation of additional risk factors in future studies may improve upon these findings and warrants further study.
Original language: English
Article number: ezad183
Journal: European Journal of Cardio-Thoracic Surgery
Volume: 63
Issue number: 6
Early online date: 8 May 2023
Publication status: Published - 16 Jun 2023

Bibliographical note

Publisher Copyright:
© The Author(s) 2023. Published by Oxford University Press on behalf of the European Association for Cardio-Thoracic Surgery.
