Abstract
BACKGROUND
Interest in the clinical usefulness of machine learning for risk prediction has bloomed recently. Cardiac surgery patients are at high risk of complications and therefore presurgical risk assessment is of crucial relevance. We aimed to compare the performance of machine learning algorithms over traditional logistic regression (LR) model to predict in-hospital mortality following cardiac surgery.
METHODS
A single-centre data set of prospectively collected information from patients undergoing adult cardiac surgery from 1996 to 2017 was split into 70% training set and 30% testing set. Prediction models were developed using neural network, random forest, naive Bayes and retrained LR based on features included in the EuroSCORE. Discrimination was assessed using area under the receiver operating characteristic curve, and calibration analysis was undertaken using the calibration belt method. Model calibration drift was assessed by comparing Goodness of fit χ2 statistics observed in 2 equal bins from the testing sample ordered by procedure date.
RESULTS
A total of 28 761 cardiac procedures were performed during the study period. The in-hospital mortality rate was 2.7%. Retrained LR [area under the receiver operating characteristic curve 0.80; 95% confidence interval (CI) 0.77–0.83] and random forest model (0.80; 95% CI 0.76–0.83) showed the best discrimination. All models showed significant miscalibration. Retrained LR proved to have the weakest calibration drift.
CONCLUSIONS
Our findings do not support the hypothesis that machine learning methods provide advantage over LR model in predicting operative mortality after cardiac surgery.
Interest in the clinical usefulness of machine learning for risk prediction has bloomed recently. Cardiac surgery patients are at high risk of complications and therefore presurgical risk assessment is of crucial relevance. We aimed to compare the performance of machine learning algorithms over traditional logistic regression (LR) model to predict in-hospital mortality following cardiac surgery.
METHODS
A single-centre data set of prospectively collected information from patients undergoing adult cardiac surgery from 1996 to 2017 was split into 70% training set and 30% testing set. Prediction models were developed using neural network, random forest, naive Bayes and retrained LR based on features included in the EuroSCORE. Discrimination was assessed using area under the receiver operating characteristic curve, and calibration analysis was undertaken using the calibration belt method. Model calibration drift was assessed by comparing Goodness of fit χ2 statistics observed in 2 equal bins from the testing sample ordered by procedure date.
RESULTS
A total of 28 761 cardiac procedures were performed during the study period. The in-hospital mortality rate was 2.7%. Retrained LR [area under the receiver operating characteristic curve 0.80; 95% confidence interval (CI) 0.77–0.83] and random forest model (0.80; 95% CI 0.76–0.83) showed the best discrimination. All models showed significant miscalibration. Retrained LR proved to have the weakest calibration drift.
CONCLUSIONS
Our findings do not support the hypothesis that machine learning methods provide advantage over LR model in predicting operative mortality after cardiac surgery.
Original language | English |
---|---|
Article number | ezaa229 |
Number of pages | 7 |
Journal | European Journal of Cardio-Thoracic Surgery |
Early online date | 18 Aug 2020 |
DOIs | |
Publication status | E-pub ahead of print - 18 Aug 2020 |
Research Groups and Themes
- Bristol Heart Institute