To Tune or Not to Tune? In Search of Optimal Configurations for Data Analytics

Ayat Fekry, Lucian Carata, Thomas Pasquier, Andrew Rice, Andy Hopper

    Research output: Chapter in Book/Report/Conference proceeding › Conference Contribution (Conference Proceeding)

    47 Citations (Scopus)
    372 Downloads (Pure)

    Abstract

    This experimental study presents several issues that challenge practical configuration tuning and its deployment in data analytics frameworks. These issues include: 1) the assumption of a static workload or environment, which ignores the dynamic characteristics of the analytics environment (e.g., increases in input data size, changes in the allocation of resources); 2) the amortization of tuning costs and how this influences which workloads can be tuned cost-effectively in practice; and 3) the need for a comprehensive incremental tuning solution for a diverse set of workloads. We adapt different ML techniques to obtain efficient incremental tuning in our problem domain, and propose Tuneful, a configuration tuning framework. We show how it is designed to overcome the above issues and illustrate its applicability by running a wide array of experiments in cloud environments provided by two different service providers.
    Original language: English
    Title of host publication: Knowledge Discovery and Data Mining (KDD)
    Publisher: Association for Computing Machinery
    Pages: 2494–2504
    Number of pages: 11
    ISBN (Print): 978-1-4503-7998-4/20/08
    Publication status: Published - 1 Aug 2020

    Keywords

    • Data analytics
    • Configuration tuning
    • Bayesian Optimization
    • Cost amortization
