JOSS: References

perdelt committed Dec 28, 2024
1 parent b9399f4 commit 66b575c

Showing 2 changed files with 21 additions and 12 deletions.
Binary file added docs/Experiment-Setup-Microservices.png
33 changes: 21 additions & 12 deletions paper.md
@@ -26,7 +26,7 @@ Bexhoma (Benchmark Experiment Host Manager) is a Python tool that helps with man

The basic workflow is [@10.1007/978-3-030-84924-5_6;@10.1007/978-3-030-94437-7_6]: start a containerized version of the DBMS, install monitoring software, import data, run benchmarks and shut down everything with a single command. A more advanced workflow is: plan a sequence of such experiments, run the plan as a batch and join the results for comparison. Drivers for generating and loading data and for benchmarking can be scaled out to simulate cloud-native environments. The included benchmarks are YCSB, TPC-H, TPC-DS and TPC-C (HammerDB and Benchbase versions).
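Conceptually, that single command stands for a short sequence of Kubernetes operations. A minimal sketch of the automated steps, assuming hypothetical manifest files (this is an illustration, not bexhoma's actual implementation):

```python
# Illustration only: the manifest names are assumptions; bexhoma automates
# these steps itself rather than requiring hand-written manifests.
import subprocess

def apply(manifest: str) -> None:
    """Apply a Kubernetes manifest and fail loudly on error."""
    subprocess.run(["kubectl", "apply", "-f", manifest], check=True)

apply("dbms-deployment.yaml")   # start a containerized version of the DBMS
apply("monitoring.yaml")        # install monitoring software
apply("loading-job.yaml")       # import data
apply("benchmark-job.yaml")     # run benchmarks
# shut down everything
subprocess.run(["kubectl", "delete", "-f", "dbms-deployment.yaml"], check=True)
```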

![components of a benchmark.\label{fig:workflow}](docs/workflow-sketch-simple.png){ width=1440}
![workflow in bexhoma.\label{fig:workflow}](docs/workflow-sketch-simple.png){ width=1440}

Bexhoma serves as the orchestrator [@10.1007/978-3-030-94437-7_6] for distributed parallel benchmarking experiments in a Kubernetes cloud. It starts a Prometheus monitoring container and cAdvisor metrics collector containers.
For analytical use cases, the Python package dbmsbenchmarker [@Erdelt2022DBMSBenchmarker] is used as the query executor and evaluator, as in [@10.1007/978-3-030-84924-5_6;@10.1007/978-3-030-94437-7_6]. For transactional use cases, HammerDB's TPC-C, Benchbase's TPC-C and YCSB serve as drivers for generating and loading data and for running the workload, as in [@10.1007/978-3-031-68031-1_9].
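As an illustration of what this monitoring stack exposes, a minimal sketch of reading a cAdvisor CPU metric through Prometheus' HTTP API (the endpoint URL and label filter are assumptions; the metric name is a standard cAdvisor metric):

```python
# Sketch: query the CPU usage rate of the DBMS container from Prometheus.
import requests

PROMETHEUS = "http://localhost:9090"  # assumed endpoint of the Prometheus service
query = 'sum(rate(container_cpu_usage_seconds_total{container="dbms"}[1m]))'
response = requests.get(f"{PROMETHEUS}/api/v1/query", params={"query": query})
response.raise_for_status()
for result in response.json()["data"]["result"]:
    print(result["metric"], result["value"])
```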
@@ -37,27 +37,36 @@ See the [homepage](https://github.com/Beuth-Erdelt/Benchmark-Experiment-Host-Man

# Statement of Need

In [@10.1007/978-3-030-84924-5_6] we introduced the package.

The first motivation for Bexhoma is the need for a framework that supports all aspects of a benchmarking experiment.
In [@10.1007/978-3-319-67162-8_12] the authors present a cloud-centric analysis of eight evaluation frameworks.
In [@10.1007/978-3-030-12079-5_4] the authors inspect several frameworks and collect requirements for a DBMS benchmarking framework using an interview-based method, per interest group.
In [@10.1007/978-3-319-15350-6_6] the authors list important components for benchmarking, such as a Benchmark Coordinator, a Measurement Manager and a Workload Executor. They plead for a benchmarking middleware to support the process, to "*take care of the hassle of distributed benchmarking and managing the measurement infrastructure*". This is supposed to let the benchmark designer concentrate on the core competences: specifying workload profiles and analyzing the obtained measurements. In [@10.1007/978-3-030-84924-5_6] we extract the following requirements and introduce the package:

* Help with time-consuming initial setup and configuration
* Metadata collection
* Generality and versatility
* Extensibility and abstraction
* Usability and configurability
* Track everything
* Repeatability/ reproducibility
* Metadata collection / Track everything
* Generality / Versatility
* Extensibility / Abstraction
* Usability / Configurability
* Repeatability / Reproducibility

In [@10.1007/978-3-319-15350-6_6] the authors list important components for benchmarking, like Benchmark Coordinator, Measurement Manager, Workload Executor. They plead for a benchmarking middleware to support the process, to "*take care of the hassle of distributed benchmarking and managing the measurement infrastructure*". This is supposed to help the benchmark designer to concentrate on the core competences: specifying workload profiles and analyzing obtained measurements

## Summary of Solution

* Virtualization with Docker containers
* Orchestration with Kubernetes
* Monitoring with cAdvisor / Prometheus, since this is a common practice in cluster management
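
A minimal sketch, using the official `kubernetes` Python client, of how an orchestrator can create a SUT *deployment* like the one listed below (names and image are illustrative assumptions, not bexhoma's templates):

```python
# Sketch: programmatically create a single-replica DBMS deployment.
from kubernetes import client, config

config.load_kube_config()        # assumes a configured kubectl context
labels = {"app": "bexhoma-sut"}  # assumed label
deployment = client.V1Deployment(
    api_version="apps/v1",
    kind="Deployment",
    metadata=client.V1ObjectMeta(name="bexhoma-sut"),
    spec=client.V1DeploymentSpec(
        replicas=1,
        selector=client.V1LabelSelector(match_labels=labels),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels=labels),
            spec=client.V1PodSpec(
                # the image is an assumed example, not a fixed choice
                containers=[client.V1Container(name="dbms", image="postgres:16")]
            ),
        ),
    ),
)
client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)
```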


![components of bexhoma.\label{fig:components}](docs/Experiment-Setup-Microservices.png){ width=1440}

* **SUT (DBMS)**: *deployment*, container `dbms`, a cAdvisor container for sidecar monitoring, a PVC for persistent storage, a service for connection, port 9091
* **Multi-host DBMS**: *statefulset* for the workers, a job for initialization
* **Monitoring**: *deployment* of Prometheus
* **Metrics collectors**: either a sidecar of the single-host DBMS or a *daemonset* covering all nodes of the cluster
* **Loader (schema and index creation)**: fire-and-forget thread in the orchestrator
* **Ingestion**: a job of pods for data generation and for ingestion of data into the DBMS, synchronized using a Redis queue
* **Benchmarking**: a job of pods for running the driver, synchronized using a Redis queue (see the sketch below)
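
A minimal sketch of how such a Redis queue can synchronize the orchestrator with parallel ingestion or benchmarking pods (the service name and queue key are assumptions):

```python
# Sketch: the orchestrator enqueues one work item per client pod; each pod
# blocks until an item is available, then starts its stream.
import redis

r = redis.Redis(host="bexhoma-messagequeue", port=6379, decode_responses=True)  # assumed service

# Orchestrator side: announce two benchmarking pods.
for pod_number in (1, 2):
    r.rpush("bexhoma-queue", str(pod_number))  # assumed queue key

# Pod side: wait for a work item, then run the workload.
_, item = r.blpop("bexhoma-queue")
print(f"received work item {item}, starting the benchmark stream")
```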

# A Basic Example

The [documentation](https://bexhoma.readthedocs.io/en/latest/) contains many examples.
@@ -86,8 +95,8 @@ This
* using 16 (`-nlt`) threads
* runs streams of TPC-C queries (per DBMS)
* running for 5 (`-sd`) minutes
* each stream (pod) having 16 threads to simulate 16 users (`-nbt`)
* `-nbp`: first stream 1 pos, second stream 2 pods (8 threads each)
* each stream having 16 threads to simulate 16 users (`-nbt`)
* `-nbp`: first stream 1 pod, second stream 2 pods (8 threads each)
* with a maximum of 1 DBMS at a time (`-ms`)
* tests if the results match the workflow (`-tr`)
* shows a summary
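
Put together, the invocation could look like the following sketch; the experiment script name is an assumption, and the flags follow the list above (consult the documentation for the exact command line):

```python
# Sketch only: reconstructed from the flags described above.
import subprocess

subprocess.run([
    "python", "tpcc.py",  # assumed TPC-C experiment script
    "-nlt", "16",         # 16 loading threads
    "-sd", "5",           # stream duration of 5 minutes
    "-nbt", "16",         # 16 benchmarking threads (users) per stream
    "-nbp", "1,2",        # first stream 1 pod, second stream 2 pods
    "-ms", "1",           # at most 1 DBMS at a time
    "-tr",                # test if the results match the workflow
], check=True)
```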
