Rule induction for subgroup discovery with CN2-SD

N Lavrac, PA Flach, B Kasek, L Todorovski

Research output: Chapter in Book/Report/Conference proceeding; Conference Contribution (Conference Proceeding)

Abstract

Rule learning is typically used in solving classification and prediction tasks. However, learning of classification rules can be adapted also to subgroup discovery. This paper shows how this can be achieved by modifying the CN2 rule learning algorithm. Modifications include a new covering algorithm (weighted covering algorithm), a new search heuristic (weighted relative accuracy), probabilistic classification of instances, and a new measure for evaluating the results of subgroup discovery (area under ROC curve). The main advantage of the proposed approach is that each rule with high weighted accuracy represents a "chunk" of knowledge about the problem, due to the appropriate tradeoff between accuracy and coverage, achieved through the use of the weighted relative accuracy heuristic. Moreover, unlike the classical covering algorithm, in which only the first few induced rules may be of interest as subgroup descriptors with sufficient coverage (since subsequently induced rules are induced from biased example subsets), the subsequent rules induced by the weighted covering algorithm allow for discovering interesting subgroup properties of the entire population. Experimental results on 17 UCI datasets are very promising, demonstrating big improvements in number of induced rules, rule coverage and rule significance, as well as smaller improvements in rule accuracy and area under ROC curve.
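
The weighted relative accuracy heuristic and the weighted covering step named in the abstract can be illustrated with a minimal sketch. It assumes the standard definition WRAcc(Cond -> Class) = p(Cond) * (p(Class | Cond) - p(Class)), with probabilities estimated from example weights rather than raw counts. The function names, the toy data, and the multiplicative decay factor `gamma` are hypothetical choices made for this illustration; the abstract does not specify the weighting scheme.

```python
from typing import List, Sequence


def weighted_wracc(covered: Sequence[bool],
                   positive: Sequence[bool],
                   weights: Sequence[float]) -> float:
    """WRAcc = p(Cond) * (p(Class | Cond) - p(Class)), using example weights."""
    total = sum(weights)
    cov = sum(w for w, c in zip(weights, covered) if c)
    if total == 0 or cov == 0:
        return 0.0  # a rule covering no weight carries no information
    pos = sum(w for w, y in zip(weights, positive) if y)
    cov_pos = sum(w for w, c, y in zip(weights, covered, positive) if c and y)
    return (cov / total) * (cov_pos / cov - pos / total)


def decay_weights(weights: Sequence[float],
                  covered: Sequence[bool],
                  positive: Sequence[bool],
                  gamma: float = 0.5) -> List[float]:
    """Down-weight covered positive examples instead of removing them.

    Assumption of this sketch: a multiplicative decay by `gamma`.
    """
    return [w * gamma if c and y else w
            for w, c, y in zip(weights, covered, positive)]


# Toy data: 8 examples, 4 of the target class; the rule covers 4, of which 3 are positive.
covered = [True, True, True, True, False, False, False, False]
positive = [True, True, True, False, True, False, False, False]
weights = [1.0] * 8

first_score = weighted_wracc(covered, positive, weights)
new_weights = decay_weights(weights, covered, positive)
second_score = weighted_wracc(covered, positive, new_weights)
```

With uniform weights the rule scores 0.5 * (0.75 - 0.5) = 0.125. After its covered positive examples are down-weighted, the same rule scores lower, so a search guided by this heuristic would tend to favour rules describing other parts of the population while no example is discarded outright.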
Original language: English
Title of host publication: Unknown
Editors: M. Bohanec, B. Kasek, N. Lavrac, D. Mladenic
Publisher: University of Helsinki
Pages: 77 - 87
Number of pages: 10
Publication status: Published - Aug 2002

Bibliographical note

Conference Proceedings/Title of Journal: ECML/PKDD'02 workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning
