We meta-analyzed the diagnostic accuracy of the Patient Health Questionnaire-9 (PHQ-9) depression screening tool. We compared two statistical methods that account for missing cut-offs, one proposed by Steinhauser and one by Jones, with a series of bivariate random effects models (BRM) estimated separately at each cut-off. We applied each method to two datasets: a published dataset containing only the cut-offs reported in the primary publications, and the full individual participant data (IPD) dataset containing all cut-offs for every study. For each method, we estimated pooled sensitivity and specificity with 95% confidence intervals at each cut-off, as well as the area under the curve (AUC).
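As a minimal illustration of the quantities being pooled, the Python sketch below computes sensitivity and specificity at each cut-off for a single study and a trapezoidal AUC from the resulting ROC points. It is not the BRM, nor the Steinhauser or Jones method. The function names and the toy scores are hypothetical, and it assumes that a score at or above the cut-off counts as screening positive.

```python
from typing import List, Sequence, Tuple

def accuracy_by_cutoff(
    scores: Sequence[int], is_case: Sequence[bool], cutoffs: Sequence[int]
) -> List[Tuple[int, float, float]]:
    """Return (cutoff, sensitivity, specificity) for each cut-off.

    Assumes a participant screens positive when score >= cutoff, and that
    the study has at least one case and one non-case.
    """
    n_cases = sum(is_case)
    n_noncases = len(is_case) - n_cases
    results = []
    for c in cutoffs:
        true_pos = sum(1 for s, d in zip(scores, is_case) if d and s >= c)
        true_neg = sum(1 for s, d in zip(scores, is_case) if not d and s < c)
        results.append((c, true_pos / n_cases, true_neg / n_noncases))
    return results

def trapezoid_auc(results: Sequence[Tuple[int, float, float]]) -> float:
    """Trapezoidal area under the ROC points (1 - specificity, sensitivity)."""
    # Anchor the curve at (0, 0) and (1, 1) so sparse cut-off sets still integrate.
    points = sorted({(1 - sp, se) for _, se, sp in results} | {(0.0, 0.0), (1.0, 1.0)})
    return sum((x2 - x1) * (y1 + y2) / 2 for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Hypothetical toy study: PHQ-9 total scores (possible range 0 to 27) and reference diagnoses.
toy_scores = [2, 5, 8, 11, 14, 17, 20, 23]
toy_is_case = [False, False, False, True, False, True, True, True]

all_cutoffs = accuracy_by_cutoff(toy_scores, toy_is_case, range(0, 28))
published_only = accuracy_by_cutoff(toy_scores, toy_is_case, [10])

auc_full = trapezoid_auc(all_cutoffs)          # every cut-off available
auc_published = trapezoid_auc(published_only)  # a single reported cut-off
```

In this toy study the two AUC values differ, because a single reported cut-off gives only one interior point on the ROC curve. That is the kind of gap the missing cut-off methods are meant to address.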
The full IPD dataset comprised data from 45 studies, 15,020 subjects and 1,972 cases of major depression, and included information on every possible cut-off.
When only the data available in publications were used, the Steinhauser and Jones methods outperformed the BRM applied to the same data.
With the full IPD dataset, AUC was similar across all approaches, although the pooled sensitivity and specificity estimates differed slightly.
Overall, when only the published subset was available, statistical methods that fill in missing cut-off data closely recovered the receiver operating characteristic (ROC) curve obtained from the full IPD dataset. All methods performed similarly when applied to the full IPD dataset.
- individual participant data
- diagnostic accuracy
- bivariate random effects model