An Improved Model Selection Heuristic for AUC

Shaomin Wu, Peter Flach, Cesar Ferri
Editors: Joost N. Kok, Jacek Koronacki, Ramon Lopez de Mantaras, Stan Matwin, Dunja Mladenic, Andrzej Skowron

Research output: Chapter in Book/Report/Conference proceeding › Conference Contribution (Conference Proceeding)

30 Citations (Scopus)

Abstract

The area under the ROC curve (AUC) has been widely used to measure ranking performance for binary classification tasks. AUC only employs the classifier’s scores to rank the test instances; thus, it ignores other valuable information conveyed by the scores, such as sensitivity to small differences in the score values. However, as such differences are inevitable across samples, ignoring them may lead to overfitting the validation set when selecting models with high AUC. This problem is tackled in this paper. On the basis of ranks as well as scores, we introduce a new metric called scored AUC (sAUC), which is the area under the sROC curve. The latter measures how quickly AUC deteriorates if positive scores are decreased. We study the interpretation and statistical properties of sAUC. Experimental results on UCI data sets convincingly demonstrate the effectiveness of the new metric for classifier evaluation and selection in the case of limited validation data.
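As a reading aid, the sketch below illustrates the idea described in the abstract: AUC is computed from ranks alone, while an sROC-style curve is obtained by decreasing all positive scores by a margin t and re-evaluating AUC, and the area under that curve approximates sAUC. This is a minimal illustration assuming scores in [0, 1]; the function names (auc, sroc_point, sauc_sketch) and the trapezoidal integration are assumptions made for this sketch, not the paper's exact estimator.

```python
import numpy as np

def auc(pos_scores, neg_scores):
    """Mann-Whitney estimate of AUC: the fraction of (positive, negative)
    pairs ranked correctly, counting ties as one half."""
    pos = np.asarray(pos_scores, dtype=float)[:, None]
    neg = np.asarray(neg_scores, dtype=float)[None, :]
    return (pos > neg).mean() + 0.5 * (pos == neg).mean()

def sroc_point(pos_scores, neg_scores, t):
    """AUC after all positive scores are decreased by a margin t
    (scores assumed to lie in [0, 1])."""
    return auc(np.asarray(pos_scores, dtype=float) - t, neg_scores)

def sauc_sketch(pos_scores, neg_scores, num_points=101):
    """Approximate the area under the sROC-style curve by the
    trapezoidal rule over margins t in [0, 1]."""
    ts = np.linspace(0.0, 1.0, num_points)
    curve = [sroc_point(pos_scores, neg_scores, t) for t in ts]
    return float(np.trapz(curve, ts))

# Two toy models with identical AUC but different score margins:
narrow = ([0.51, 0.52, 0.53], [0.48, 0.49, 0.50])  # positives barely above negatives
wide   = ([0.90, 0.92, 0.95], [0.05, 0.08, 0.10])  # positives far above negatives
print(auc(*narrow), auc(*wide))                    # both 1.0
print(sauc_sketch(*narrow), sauc_sketch(*wide))    # narrow margin deteriorates faster
```

In this toy example both models rank the validation instances perfectly and get the same AUC, yet the wide-margin model keeps its AUC much longer as the positive scores are pushed down, so its area is larger; this is the kind of score information the abstract says plain AUC ignores.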
Original language: English
Title of host publication: 18th European Conference on Machine Learning
Pages: 478-489
Publication status: Published - 2007

Bibliographical note

ISBN: 9783540749752
Publisher: Springer
Name and Venue of Conference: 18th European Conference on Machine Learning
Other identifier: 2000765
