Formalizing complex prior information to quantify subjective interestingness of frequent pattern sets

K-N Kontonasios, Tijl De Bie

Research output: Working paper

8 Citations (Scopus)

Abstract

In this paper, we are concerned with the problem of modeling prior information of a data miner about the data, with the purpose of quantifying subjective interestingness of patterns. Recent results have achieved this for the specific case of prior expectations on the row and column marginals, based on the Maximum Entropy principle [2, 12]. In the current paper, we extend these ideas to make them applicable to more general prior information, such as knowledge of frequencies of itemsets, a cluster structure in the data, or the presence of dense areas in the database. As in [2, 12], we show how information theory can be used to quantify subjective interestingness against this model as a background. Our method presents an efficient, flexible, and rigorous alternative to the randomization approach presented in [6]. This randomization method was developed for very similar purposes, but suffers from convergence issues and computational limitations. Furthermore, randomization techniques can only be used for empirical hypothesis testing as a way to quantify interestingness, severely limiting their applicability. We demonstrate our method by searching for interesting patterns in real-life data with respect to various realistic types of prior information, and we note that, like the approach from [6], our work can be used for iterative data mining.
Original language: English
Publisher: University of Bristol
Number of pages: 16
Publication status: Published - 2011


  • Projects

    FROM FREQUENT ITEMSETS TO INFORMATIVE PATTERNS

    De Bie, T. E. P.

    1/10/09 to 1/04/13

    Project: Research
