pratyushagnihotri/ZeroTune

Welcome to ZeroTune - Code and Documentation

ZeroTune is a novel zero-shot learned approach for determining cost-effective parallelism degrees in distributed stream processing systems.

Citation

Please cite our papers if you find this work useful or use it as a baseline in your paper.

 @inproceedings {agnihotri24icde,
  author = {Agnihotri, Pratyush and Koldehofe, Boris and Stiegele, Paul and Heinrich, Roman and Binnig, Carsten and Luthra, Manisha},
  title = {ZeroTune: Learned Zero-Shot Cost Model for Parallelism Tuning in Stream Processing},
  year = {2024},
  booktitle={40th IEEE International Conference on Data Engineering (ICDE)},
  pages = {1–14},
  numpages = {14}
  }

 @inproceedings {agnihotri23aidm,
  author = {Agnihotri, Pratyush and Koldehofe, Boris and Binnig, Carsten and Luthra, Manisha},
  title = {Zero-Shot Cost Models for Parallel Stream Processing},
  year = {2023},
  isbn = {9798400701931},
  url = {https://doi.org/10.1145/3593078.3593934},
  doi = {10.1145/3593078.3593934},
  booktitle = {Proceedings of the Sixth International Workshop on Exploiting Artificial Intelligence Techniques for Data Management (aiDM@SIGMOD)},
  articleno = {5},
  numpages = {5},
  series = {aiDM '23}
  }

 @inproceedings {agnihotri22debs,
  author = {Agnihotri, Pratyush and Koldehofe, Boris and Binnig, Carsten and Luthra, Manisha},
  title = {PANDA: performance prediction for parallel and dynamic stream processing},
  year = {2022},
  isbn = {9781450393089},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3524860.3543281},
  doi = {10.1145/3524860.3543281},
  booktitle = {Proceedings of the 16th ACM International Conference on Distributed and Event-Based Systems},
  pages = {180–181},
  numpages = {2},
  series = {DEBS '22}
  }

Dedicated Repository for Paper Submission:

This repository supports our paper submission titled "ZeroTune" and showcases the capabilities of the zero-shot model.

Exploring ZeroTune's Key Components:

  • zerotune-management: The main setup instructions are in zerotune-management. It consists of a collection of scripts that set up both local and remote clusters. These clusters provide the environment for the parallel query plan generator and for training and testing the zero-shot model.

  • zerotune-plan-generator: An Apache Flink client application that generates synthetic and benchmark parallel query plans. These plans provide the training and test data for our zero-shot learning model.

  • zerotune-learning: The zero-shot model, which specializes in providing accurate cost predictions for distributed parallel stream processing.

  • Flink-Observation: A modified fork of Apache Flink that adds custom logging of observed workload characteristics and stores them in a MongoDB database.

  • zerotune-VM_image: A VM image that includes all the code and dependencies required to generate training data and train the model.