Precision-Recall-Gain Curves: PR Analysis Done Right

Peter Flach, Meelis Kull

Research output: Chapter in Book/Report/Conference proceeding › Conference Contribution (Conference Proceeding)

Abstract

Precision-Recall analysis abounds in applications of binary classification where true negatives do not add value and hence should not affect assessment of the classifier's performance. Perhaps inspired by the many advantages of receiver operating characteristic (ROC) curves and the area under such curves for accuracy-based performance assessment, many researchers have taken to reporting Precision-Recall (PR) curves and associated areas as performance metrics. We demonstrate in this paper that this practice is fraught with difficulties, mainly because of incoherent scale assumptions: for example, the area under a PR curve takes the arithmetic mean of precision values whereas the Fβ score applies the harmonic mean. We show how to fix this by plotting PR curves in a different coordinate system, and demonstrate that the new Precision-Recall-Gain curves inherit all key advantages of ROC curves. In particular, the area under Precision-Recall-Gain curves conveys an expected F1 score on a harmonic scale, and the convex hull of a Precision-Recall-Gain curve allows us to calibrate the classifier's scores so as to determine, for each operating point on the convex hull, the interval of β values for which the point optimises Fβ. We demonstrate experimentally that the area under traditional PR curves can easily favour models with lower expected F1 score than others, and so the use of Precision-Recall-Gain curves will result in better model selection.
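
The "different coordinate system" mentioned in the abstract can be illustrated with a small sketch. It assumes the gain transform as defined in the full paper (the abstract itself does not state it): a value x is mapped to (x - π) / ((1 - π) x), where π is the proportion of positives. The function names and the confusion-matrix counts below are hypothetical, chosen only for illustration. Because the transform is linear in 1/x, the gain of F1 comes out as the plain arithmetic mean of precision gain and recall gain, which is the sense in which the new coordinates put the harmonic mean on a linear scale.

```python
def gain(value: float, pi: float) -> float:
    """Map precision, recall or an F-score onto the gain scale.

    Assumed transform (from the full paper, not this abstract):
    (value - pi) / ((1 - pi) * value), with pi the positive proportion.
    """
    return (value - pi) / ((1.0 - pi) * value)


def pr_gain_point(tp: int, fp: int, fn: int, tn: int):
    """Return (precision gain, recall gain, F1 gain) for one operating point."""
    pi = (tp + fn) / (tp + fp + fn + tn)   # proportion of positives
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return gain(precision, pi), gain(recall, pi), gain(f1, pi)


# Hypothetical confusion-matrix counts, for illustration only.
prec_gain, rec_gain, f1_gain = pr_gain_point(tp=30, fp=10, fn=20, tn=140)

print(f"precision gain = {prec_gain:.4f}")
print(f"recall gain    = {rec_gain:.4f}")
print(f"F1 gain        = {f1_gain:.4f}")
print(f"mean of gains  = {(prec_gain + rec_gain) / 2:.4f}")
```

In this sketch a classifier whose precision or recall equals π gets a gain of 0, and a perfect value of 1 gets a gain of 1, so points below the always-positive baseline fall outside the unit square.
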
Original language: English
Title of host publication: Advances in Neural Information Processing Systems 28
Editors: C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, R. Garnett
Publisher: Curran Associates, Inc
Pages: 838-846
Number of pages: 9
Publication status: Published - 12 Dec 2015

Structured keywords

  • Jean Golding

Cite this

Flach, P., & Kull, M. (2015). Precision-Recall-Gain Curves: PR Analysis Done Right. In C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, & R. Garnett (Eds.), Advances in Neural Information Processing Systems 28 (pp. 838-846). Curran Associates, Inc. https://papers.nips.cc/paper/5867-precision-recall-gain-curves-pr-analysis-done-right