Subgroup Discovery with Proper Scoring Rules

Hao Song, Meelis Kull, Peter Flach, Georgios Kalogridis

Research output: Chapter in Book/Report/Conference proceeding › Conference Contribution (Conference Proceeding)

2 Citations (Scopus)
286 Downloads (Pure)

Abstract

Subgroup Discovery is the process of finding and describing sufficiently large subsets of a given population that have unusual distributional characteristics with regard to some target attribute. Such subgroups can be used as a statistical summary that improves on the default summary of stating the overall distribution in the population. A natural way to evaluate such summaries is to quantify the difference between the predicted and empirical distributions of the target. In this paper we propose to use proper scoring rules, a well-known family of evaluation measures for assessing the goodness of probability estimators, to obtain theoretically well-founded evaluation measures for subgroup discovery. From this perspective, one subgroup is better than another if, on average, its target probability estimates diverge less from the actual labels. We demonstrate empirically on both synthetic and real-world data that this leads to higher-quality statistical summaries than existing methods based on measures such as Weighted Relative Accuracy.
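The comparison described in the abstract can be illustrated with a minimal sketch. It is not the paper's evaluation measure. It assumes a binary target, uses the Brier score as one example of a proper scoring rule, and assumes the summary predicts the subgroup's empirical positive rate inside the subgroup and the complement's rate outside it. It also scores the summary on the same data it was estimated from, which ignores generalisation. The function names (`brier_score`, `summary_predictions`, `wracc`) and the toy data are hypothetical.

```python
import numpy as np


def brier_score(y, p):
    """Mean squared difference between binary labels y and probabilities p."""
    y = np.asarray(y, dtype=float)
    p = np.asarray(p, dtype=float)
    return float(np.mean((y - p) ** 2))


def summary_predictions(y, in_subgroup):
    """Per-instance probability estimates implied by a one-subgroup summary.

    Assumption (not taken from the paper): instances inside the subgroup get
    the subgroup's empirical positive rate, all others get the complement's
    rate. The mask must select a non-empty proper subset of the data.
    """
    y = np.asarray(y, dtype=float)
    mask = np.asarray(in_subgroup, dtype=bool)
    p = np.empty_like(y)
    p[mask] = y[mask].mean()
    p[~mask] = y[~mask].mean()
    return p


def wracc(y, in_subgroup):
    """Weighted Relative Accuracy: coverage times (subgroup rate - overall rate)."""
    y = np.asarray(y, dtype=float)
    mask = np.asarray(in_subgroup, dtype=bool)
    return float(mask.mean() * (y[mask].mean() - y.mean()))


# Toy population: 8 instances, the candidate subgroup covers the first 4.
y = np.array([1, 1, 1, 0, 0, 0, 0, 0])
mask = np.array([True, True, True, True, False, False, False, False])

# Default summary: predict the overall positive rate for every instance.
default_score = brier_score(y, np.full(len(y), y.mean()))
# Subgroup summary: predict separate rates inside and outside the subgroup.
subgroup_score = brier_score(y, summary_predictions(y, mask))

print("default summary Brier score: ", default_score)
print("subgroup summary Brier score:", subgroup_score)
print("WRAcc of the subgroup:       ", wracc(y, mask))
```

On this toy data the subgroup summary obtains a lower Brier score than the default summary, which is the sense in which a subgroup "improves on" stating the overall distribution. Under the assumptions above, candidate subgroups could be ranked by this score instead of by Weighted Relative Accuracy.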
Original language: English
Title of host publication: Machine Learning and Knowledge Discovery in Databases
Subtitle of host publication: European Conference, ECML PKDD 2016, Riva del Garda, Italy, September 19-23, 2016, Proceedings, Part II
Publisher: Springer
Pages: 492-510
Number of pages: 19
ISBN (Electronic): 9783319462271
ISBN (Print): 9783319462264
DOIs: https://doi.org/10.1007/978-3-319-46227-1_31
Publication status: Published - 2016

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer
Volume: 9852
ISSN (Print): 0302-9743

Structured keywords

  • Jean Golding

  • Projects

    SPHERE (EPSRC IRC)

    Craddock, I. J., Coyle, D. T., Flach, P. A., Kaleshi, D., Mirmehdi, M., Piechocki, R. J., Stark, B. H., Ascione, R., Ashburn, A. M., Burnett, M. E., Aldamen, D., Gooberman-Hill, R. J. S., Harwin, W. S., Hilton, G., Holderbaum, W., Holley, A. P., Manchester, V. A., Meller, B. J., Stack, E. & Gilchrist, I. D.

    1/10/13 → 30/09/18

    Project: Research, Parent

Cite this

Song, H., Kull, M., Flach, P., & Kalogridis, G. (2016). Subgroup Discovery with Proper Scoring Rules. In Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2016, Riva del Garda, Italy, September 19-23, 2016, Proceedings, Part II (pp. 492-510). (Lecture Notes in Computer Science; Vol. 9852). Springer. https://doi.org/10.1007/978-3-319-46227-1_31