The performance of automated case-mix adjustment regression model building methods in a health outcome prediction setting

Min-Hua Jen*, Alex Bottle, Graham Kirkwood, Ron Johnston, Paul Aylin

*Corresponding author for this work

Research output: Contribution to journal, Article (Academic Journal)

8 Citations (Scopus)

Abstract

We have previously described a system for monitoring a number of healthcare outcomes using case-mix adjustment models. It is desirable to automate the model fitting process in such a system if monitoring covers a large number of outcome measures or subgroup analyses. Our aim was to compare the performance of three variable selection strategies: "manual" selection; "automated" backward elimination with re-categorisation; and inclusion of all variables at once, irrespective of their apparent importance, with automated re-categorisation. Logistic regression models for predicting in-hospital mortality and emergency readmission within 28 days were fitted to an administrative database covering 78 diagnosis groups and 126 procedures from 1996 to 2006 for National Health Service (NHS) hospital trusts in England. Model performance was assessed with Receiver Operating Characteristic (ROC) c statistics (measuring discrimination) and Brier scores (measuring average predictive accuracy). Overall, discrimination was similar for diagnoses and procedures and consistently better for mortality than for emergency readmission. Brier scores were generally low (indicating higher accuracy) and were lower for procedures than for diagnoses, with a few exceptions for emergency readmission within 28 days. Among the three variable selection strategies, the automated procedure performed similarly to the manual method in almost all cases; the exceptions were low-risk groups with few outcome events. For the rapid generation of multiple case-mix models, we suggest applying automated modelling to reduce the time required, in particular when examining different outcomes for large numbers of procedures and diseases in routinely collected administrative health data.
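
The two performance measures named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names and the toy data are assumptions made for illustration, and the c statistic is computed by the standard pairwise-concordance definition (equivalent to the area under the ROC curve), with ties counted as one half.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], p_pred: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and the 0/1 outcome.

    Lower values indicate more accurate predictions.
    """
    if len(y_true) != len(p_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def c_statistic(y_true: Sequence[int], p_pred: Sequence[float]) -> float:
    """Proportion of (event, non-event) pairs in which the event case
    received the higher predicted risk; tied pairs count as 0.5.
    """
    events = [p for y, p in zip(y_true, p_pred) if y == 1]
    non_events = [p for y, p in zip(y_true, p_pred) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p_event in events:
        for p_non in non_events:
            if p_event > p_non:
                concordant += 1.0
            elif p_event == p_non:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Hypothetical toy data: 1 = outcome occurred (e.g. death), 0 = did not.
outcomes = [0, 0, 1, 1, 0, 1]
predicted_risk = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]

print(f"Brier score: {brier_score(outcomes, predicted_risk):.4f}")
print(f"c statistic: {c_statistic(outcomes, predicted_risk):.4f}")
```

The pairwise loop is quadratic in the number of cases, so for data on the scale of a national administrative database a rank-based computation would be the more practical choice; the version above is meant only to make the definitions concrete.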

Original language: English
Pages (from-to): 267-278
Number of pages: 12
Journal: Health Care Management Science
Volume: 14
Issue number: 3
Publication status: Published - Sep 2011

Keywords

  • Automated modelling
  • Brier scores
  • Hospital administrative database
  • Receiver operating characteristic (ROC)
  • LOGISTIC-REGRESSION
  • VARIABLE SELECTION
  • BOOTSTRAP METHODS
  • ROC CURVE
  • RISK
  • AREA
