Subjectively Interesting Component Analysis: Data Projections that Contrast with Prior Expectations

Bo Kang, Jefrey Lijffijt, Raul Santos-Rodriguez, Tijl De Bie

Research output: Chapter in Book/Report/Conference proceeding › Conference Contribution (Conference Proceeding)

5 Citations (Scopus)
330 Downloads (Pure)

Abstract

Methods that find insightful low-dimensional projections are essential to effectively explore high-dimensional data. Principal Component Analysis is used pervasively to find low-dimensional projections, not only because it is straightforward to use, but also because it is often effective, as the variance in data is often dominated by relevant structure. However, even if the projections highlight real structure in the data, not all structure is interesting to every user. If a user is already aware of, or not interested in, the dominant structure, Principal Component Analysis is less effective for finding interesting components. We introduce a new method called Subjectively Interesting Component Analysis (SICA), designed to find data projections that are subjectively interesting, i.e., projections that truly surprise the end-user. It is rooted in information theory and employs an explicit model of a user's prior expectations about the data. The corresponding optimization problem is a simple eigenvalue problem, and the result is a trade-off between explained variance and novelty. We present five case studies on synthetic data, images, time-series, and spatial data, to illustrate how SICA enables users to find (subjectively) interesting projections.
Original language: English
Title of host publication: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD '16
Place of Publication: New York, NY, USA
Publisher: Association for Computing Machinery (ACM)
Pages: 1615-1624
Number of pages: 10
Volume: August 2016
ISBN (Print): 9781450342322
DOIs
Publication status: Published - 13 Aug 2016
Event: ACM KDD 2016 - San Francisco, United States
Duration: 13 Aug 2016 to 17 Aug 2016

Conference

Conference: ACM KDD 2016
Country: United States
City: San Francisco
Period: 13/08/16 to 17/08/16

Keywords

  • Exploratory Data Mining
  • Dimensionality Reduction
  • Information Theory
  • Subjective Interestingness

