Towards Seamless Configuration Tuning of Big Data Analytics

Ayat Fekry, Lucian Carata, Thomas Pasquier, Andrew Rice, Andy Hopper

Research output: Chapter in Book/Report/Conference proceeding › Conference Contribution (Conference Proceeding)

Abstract

The execution of distributed data processing workloads (such as those running on top of Hadoop or Spark) in cloud environments presents a unique opportunity to explore multiple trade-offs between elasticity (and the types of resources being allocated), overall runtime and total cost. However, beyond setting high-level constraints and objectives, it is the cloud providers, not the end-users, who should be mainly concerned with these optimizations. Providers have the vantage point to collect actionable information, the economies of scale, and the position to adjust parameters when dynamic conditions change, in order to fulfil SLOs that go beyond classic measures of latency and throughput.

This is at odds with the existing approach of making software (including the interfaces to the cloud and the processing frameworks) as configurable as possible. We propose that rather than configurability, self-tunability (or the illusion of it as far as the end-user is concerned) is a better long-term goal.

Original language: English
Title of host publication: 2019 IEEE International Conference on Distributed Computing Systems (ICDCS 2019)
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Publication status: Published - 2019

Structured keywords

  • Cyber Security

Keywords

  • Tuning
  • Sparks
  • Cloud computing
  • Runtime
  • Optimisation
  • Pipelines
  • Measurement
  • Big data
  • data analysis
  • data handling
  • parallel processing
  • big data analytics
  • distributed data workloads
  • cloud environments
  • multiple trade-offs
  • elasticity
  • high-level constraints
  • cloud providers
  • vantage point
  • actionable information
  • dynamic conditions change
  • classic measures
  • processing frameworks
  • end-user
  • seamless configuration tuning
  • SLO
  • Configuration Tuning
  • Data intensive computing
