Accurate Performance Modeling to Enable Efficient Cooperative Transition Decision

In modern cloud environments, Distributed Stream Processing (DSP) systems are critical for enabling real-time data analysis in large-scale applications. Parallelism is often a desired property of DSP workloads to meet the timeliness and scaling requirements of current applications, necessitating the use of distributed and multi-core cloud resources. However, understanding and predicting the performance of DSP workloads is challenging due to the heterogeneous nature of cloud resources. Accurate performance modeling is essential for optimizing decisions to meet mission-critical requirements, such as timeliness, in heterogeneous environments. PDSP-Bench and ZeroTune address these challenges by providing a benchmarking system and zero-shot performance prediction models, respectively. Their results can be leveraged to perform mechanism transitions or optimize joint operator placement and parallelism to fulfill mission-critical requirements.

This demo presents PDSP-Bench, a novel benchmarking system that evaluates parallel DSP applications on heterogeneous cloud infrastructure. It addresses key limitations of existing DSP benchmarking systems by enumerating a range of hardware configurations and workloads. PDSP-Bench benchmarks both synthetic and real-world workloads, providing deep insight into DSP system performance, e.g., the impact of parallelism across diverse streaming applications and heterogeneous hardware.
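
The enumeration idea can be illustrated with a minimal sketch. It is not PDSP-Bench's actual interface: the `BenchmarkRun` record, the `enumerate_runs` helper, and the hardware and workload names are hypothetical placeholders chosen for illustration only.

```python
from dataclasses import dataclass
from itertools import product
from typing import Iterator, Sequence


@dataclass(frozen=True)
class BenchmarkRun:
    """One benchmark configuration (hypothetical record, not a PDSP-Bench type)."""
    hardware: str
    workload: str
    parallelism: int


def enumerate_runs(
    hardware: Sequence[str],
    workloads: Sequence[str],
    degrees: Sequence[int],
) -> Iterator[BenchmarkRun]:
    """Yield every combination of hardware type, workload, and parallelism degree."""
    for hw, wl, degree in product(hardware, workloads, degrees):
        yield BenchmarkRun(hw, wl, degree)


if __name__ == "__main__":
    # Placeholder names; a real setup would use the testbed's node types.
    for run in enumerate_runs(["node_small", "node_large"], ["word_count"], [1, 2, 4]):
        print(run)
```

Enumerating the full cross product keeps the benchmark space explicit, at the cost of a run count that grows multiplicatively with each added dimension.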

The demo also features ZeroTune, a novel learned cost model and optimizer for tuning the parallelism of DSP workloads. Using transfer learning techniques similar to those in Large Language Models, ZeroTune provides highly accurate performance predictions that generalize to unseen workloads and hardware configurations. It then uses the predicted performance to optimize parallelism for unseen DSP workloads across varying cloud hardware, which significantly reduces workload execution times and improves resource efficiency.
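
The general pattern of using a cost model to pick parallelism degrees can be sketched as follows. This is an illustration under stated assumptions, not ZeroTune's method: `predicted_latency_ms` is a hand-written toy formula standing in for a learned model, and the operator names, work values, and `tune_parallelism` helper are all hypothetical.

```python
from itertools import product
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical per-operator work in milliseconds (illustrative values only).
OPERATOR_WORK_MS = {"source": 1.0, "filter": 6.0, "window_join": 32.0}


def predicted_latency_ms(parallelism: Dict[str, int], cores: int) -> float:
    """Toy stand-in for a learned cost model (an assumption, not ZeroTune's model)."""
    latency = 0.0
    for op, degree in parallelism.items():
        effective = min(degree, cores)  # no speed-up beyond the available cores
        # Work shrinks with effective parallelism; each instance adds overhead.
        latency += OPERATOR_WORK_MS[op] / effective + 1.0 * degree
    return latency


def tune_parallelism(
    operators: Sequence[str],
    degrees: Sequence[int],
    cores: int,
    cost_model: Callable[[Dict[str, int], int], float] = predicted_latency_ms,
) -> Tuple[Dict[str, int], float]:
    """Exhaustively search degree assignments and keep the lowest predicted cost."""
    best_config: Dict[str, int] = {}
    best_cost = float("inf")
    for assignment in product(degrees, repeat=len(operators)):
        config = dict(zip(operators, assignment))
        cost = cost_model(config, cores)
        if cost < best_cost:
            best_config, best_cost = config, cost
    return best_config, best_cost


if __name__ == "__main__":
    config, cost = tune_parallelism(list(OPERATOR_WORK_MS), (1, 2, 4, 8), cores=4)
    print(config, cost)
```

Because the search only queries the model, no candidate configuration has to be deployed to be evaluated. Exhaustive search is used here for clarity and would need to be replaced by a smarter strategy for queries with many operators.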

Using the heterogeneous resources of the CloudLab testbed, we showcase the scalability, accuracy, and performance benefits of PDSP-Bench and ZeroTune, demonstrating their impact on optimizing DSP systems in heterogeneous cloud environments.

Participating subprojects and awards

2 participating subprojects (C2, D2)
2 software projects

Demo

Further Information

Publications related to demo:

Paper 1: ZeroTune: Learned Zero-Shot Cost Models for Parallelism Tuning in Stream Processing (ICDE 2024)
Paper 2: PDSP-Bench: A Benchmarking System for Parallel and Distributed Stream Processing
Paper 3: Zero-shot parallelism tuning (aiDM@SIGMOD 2023)
Paper 4: Benchmark Event Sources Systems and Hardware (Middleware 2022)
Paper 5: Zero-shot cost model for parallelism problem (DEBS 2022)
Paper 6: Autonomous Resource Management (Middleware 2021)

Other related publications:

Paper 1: Costream: Learned Cost Models for Operator Placement in Edge-Cloud Environments (ICDE 2024)
Paper 2: Zero-shot cost model for stream processing queries (DEBS 2022)
Paper 3: Zero-shot cost model for database queries (VLDB 2022)
Paper 4: Multi-task zero-shot cost model for database queries (VLDB 2022)
Paper 5: One model to rule them all (CIDR 2021)