Abstract
This experimental study presents a number of issues that pose a challenge for practical configuration tuning and its deployment in data analytics frameworks. These issues include: (1) the assumption of a static workload or environment, which ignores the dynamic characteristics of the analytics environment (e.g., growth in input data size, changes in resource allocation); (2) the amortization of tuning costs, and how this influences which workloads can be tuned cost-effectively in practice; and (3) the need for a comprehensive incremental tuning solution for a diverse set of workloads. We adapt different ML techniques to obtain efficient incremental tuning in our problem domain, and propose Tuneful, a configuration tuning framework. We show how it is designed to overcome the above issues and illustrate its applicability by running a wide array of experiments in cloud environments provided by two different service providers.
Original language | English |
---|---|
Title of host publication | Knowledge Discovery and Data Mining (KDD) |
Publisher | Association for Computing Machinery (ACM) |
Pages | 2494–2504 |
Number of pages | 11 |
ISBN (Print) | 978-1-4503-7998-4 |
DOIs | |
Publication status | Published - 1 Aug 2020 |
Keywords
- Data analytics
- Configuration tuning
- Bayesian Optimization
- Cost amortization