To Tune or Not to Tune? In Search of Optimal Configurations for Data Analytics

Ayat Fekry, Lucian Carata, Thomas Pasquier, Andrew Rice, Andy Hopper

Research output: Chapter in Book/Report/Conference proceeding › Conference Contribution (Conference Proceeding)

32 Citations (Scopus)
330 Downloads (Pure)

Abstract

This experimental study presents a number of issues that pose a challenge for practical configuration tuning and its deployment in data analytics frameworks. These issues include: (1) the assumption of a static workload or environment, which ignores the dynamic characteristics of the analytics environment (e.g., an increase in input data size or changes in the allocation of resources); (2) the amortization of tuning costs and how this influences which workloads can be tuned in practice in a cost-effective manner; and (3) the need for a comprehensive incremental tuning solution for a diverse set of workloads. We adapt different ML techniques in order to obtain efficient incremental tuning in our problem domain, and propose Tuneful, a configuration tuning framework. We show how it is designed to overcome the above issues and illustrate its applicability by running a wide array of experiments in cloud environments provided by two different service providers.
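
The cost-amortization question raised in the abstract can be illustrated with a simple break-even calculation. The sketch below is not taken from the paper or from Tuneful; the function name, its parameters, and the figures are hypothetical, and it assumes a fixed one-off tuning cost and a constant per-run saving.

```python
import math


def break_even_runs(tuning_cost, default_run_cost, tuned_run_cost):
    """Number of tuned runs needed to recoup a one-off tuning cost.

    Hypothetical helper for illustration only. Assumes every run saves the
    same amount; returns None when tuning never pays off.
    """
    saving_per_run = default_run_cost - tuned_run_cost
    if saving_per_run <= 0:
        # The tuned configuration is no cheaper, so the cost is never amortized.
        return None
    return math.ceil(tuning_cost / saving_per_run)


# Made-up figures: tuning costs 120 units, each run drops from 10 to 7 units.
print(break_even_runs(tuning_cost=120.0, default_run_cost=10.0, tuned_run_cost=7.0))  # 40
```

Under these assumptions, a workload that recurs fewer times than the break-even count would not be cost-effective to tune.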
Original language: English
Title of host publication: Knowledge Discovery and Data Mining (KDD)
Publisher: Association for Computing Machinery (ACM)
Pages: 2494–2504
Number of pages: 11
ISBN (Print): 978-1-4503-7998-4/20/08
Publication status: Published - 1 Aug 2020

Keywords

  • Data analytics
  • Configuration tuning
  • Bayesian Optimization
  • Cost amortization
