Versatile Decision Trees for Learning Over Multiple Contexts

Reem Al-Otaibi, Ricardo B.C. Prudencio, Meelis Kull, Peter Flach

Research output: Chapter in Book/Report/Conference proceeding › Conference Contribution (Conference Proceeding)

7 Citations (Scopus)
350 Downloads (Pure)

Abstract

Discriminative models for classification assume that training and deployment data are drawn from the same distribution. The performance of these models can vary significantly when they are learned and deployed in different contexts with different data distributions. In the literature, this phenomenon is called dataset shift. In this paper, we address several important issues in the dataset shift problem. First, how can we automatically detect a significant difference between training and deployment data, so that we can take action or adjust the model appropriately? Secondly, different kinds of shift can occur in real applications (e.g., linear and non-linear), which require diverse solutions. Thirdly, how should we combine the original model of the training data with other models to achieve better performance? This work offers two main contributions towards these issues. We propose a Versatile Model that is rich enough to handle different kinds of shift without making strong assumptions such as linearity, and that furthermore does not require labelled data to identify the data shift at deployment. Empirical results on both synthetic and real dataset shift show strong performance gains achieved by the proposed model.
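
The abstract does not spell out how shift is detected, but the keywords mention the Kolmogorov-Smirnov test. The following is a minimal sketch of label-free shift detection on a single numeric feature, assuming a standard two-sample Kolmogorov-Smirnov test with the large-sample critical value. It is not the authors' procedure; the function names `ks_statistic` and `shift_detected`, the significance level, and the synthetic data are all illustrative assumptions.

```python
import math
import random
from bisect import bisect_right


def ks_statistic(train, deploy):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    a, b = sorted(train), sorted(deploy)
    n, m = len(a), len(b)
    # Both empirical CDFs are step functions, so the largest gap is
    # attained at one of the observed values.
    return max(abs(bisect_right(a, x) / n - bisect_right(b, x) / m) for x in a + b)


def shift_detected(train, deploy, alpha=0.05):
    """Flag a shift when the statistic exceeds the large-sample critical value.

    Hypothetical helper: alpha and the asymptotic approximation are assumptions.
    """
    n, m = len(train), len(deploy)
    critical = math.sqrt(-0.5 * math.log(alpha / 2)) * math.sqrt((n + m) / (n * m))
    return ks_statistic(train, deploy) > critical


# Synthetic, unlabelled feature values (illustrative only).
rng = random.Random(0)
train = [rng.gauss(0.0, 1.0) for _ in range(500)]
deploy = [2.0 * rng.gauss(0.0, 1.0) + 3.0 for _ in range(500)]  # linear shift

print("KS statistic:", ks_statistic(train, deploy))
print("shift detected:", shift_detected(train, deploy))
```

Only feature values are compared here, so no deployment labels are needed, which matches the requirement stated in the abstract. How a detected shift is then used to adapt the model is outside the scope of this sketch.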
Original language: English
Title of host publication: Machine Learning and Knowledge Discovery in Databases
Publisher: Springer Verlag
Pages: 184-199
Number of pages: 16
Volume: 9284
ISBN (Electronic): 978-3-319-23528-8
ISBN (Print): 978-3-319-23527-1
Publication status: Published - 29 Aug 2015
Event: European Conference on Machine Learning and Knowledge Discovery (ECML PKDD) 2015 - Porto, Portugal
Duration: 7 Sept 2015 to 11 Sept 2015

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer
ISSN (Print): 0302-9743

Conference

Conference: European Conference on Machine Learning and Knowledge Discovery (ECML PKDD) 2015
Country/Territory: Portugal
City: Porto
Period: 7/09/15 to 11/09/15

Research Groups and Themes

  • Jean Golding

Keywords

  • Versatile model
  • Decision Trees
  • Dataset shift
  • Percentile
  • Kolmogorov-Smirnov test
