Abstract
The execution of distributed data processing workloads (such as those running on top of Hadoop or Spark) in cloud environments presents a unique opportunity to explore multiple trade-offs between elasticity (and the types of resources being allocated), overall runtime, and total cost. However, beyond setting high-level constraints and objectives, it is the cloud providers, not the end-users, who should be mainly concerned with those optimizations. Providers have the vantage point to collect actionable information, the economies of scale, and the position to adjust parameters when dynamic conditions change, in order to fulfil SLOs that go beyond classic measures of latency and throughput.
This is at odds with the existing approach of making software (including the interfaces to the cloud and the processing frameworks) as configurable as possible. We propose that rather than configurability, self-tunability (or the illusion of it as far as the end-user is concerned) is a better long-term goal.
Original language | English |
---|---|
Title of host publication | 2019 IEEE International Conference on Distributed Computing Systems (ICDCS 2019) |
Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
DOIs | |
Publication status | Published - 2019 |
Structured keywords
- Cyber Security
Keywords
- Tuning
- Sparks
- Cloud computing
- Runtime
- Optimisation
- Pipelines
- Measurement
- Big data
- data analysis
- data handling
- parallel processing
- big data analytics
- distributed data workloads
- cloud environments
- multiple trade-offs
- elasticity
- high-level constraints
- cloud providers
- vantage point
- actionable information
- dynamic conditions change
- classic measures
- processing frameworks
- end-user
- seamless configuration tuning
- SLO
- Configuration Tuning
- Data intensive computing