Performance benchmark workflow #4282

Draft · wants to merge 1 commit into base: main

Conversation

abraakie

Description

This adds a benchmark workflow that runs for each push and pull request against a base branch (main, 1., 2.).
For pull requests, the benchmark results are compared to the results of previous runs on the target branch. If the threshold (currently 150%) is exceeded, the benchmark workflow fails.

This workflow helps identify the performance impact of code changes at an early stage.
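
To illustrate the comparison step, here is a minimal sketch of the kind of threshold check described above. Metric names, file layout, and the standalone script form are assumptions for illustration; the actual workflow may delegate this to an existing benchmark comparison action or its own script.

import json

THRESHOLD = 1.5  # 150%: fail if a metric regresses to more than 1.5x the baseline

def load_metrics(path):
    # Assumed shape (hypothetical): {"metric_name": value, ...}
    with open(path) as f:
        return json.load(f)

def compare(current_path="results.json", baseline_path="previous_results.json"):
    current = load_metrics(current_path)
    baseline = load_metrics(baseline_path)
    regressions = []
    for name, base_value in baseline.items():
        new_value = current.get(name)
        if new_value is not None and base_value > 0 and new_value / base_value > THRESHOLD:
            regressions.append((name, base_value, new_value))
    return regressions

if __name__ == "__main__":
    failed = compare()
    for name, old, new in failed:
        print(f"Regression in {name}: {old} -> {new}")
    # Non-zero exit fails the workflow step when a regression is detected
    raise SystemExit(1 if failed else 0)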

Issues Resolved

Testing

The workflow was tested manually. The fluctuation of results between runs is within a reasonable range.

Check List

  • New functionality includes testing
  • New functionality has been documented
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Member

@peternied peternied left a comment

Thanks for creating the PR, @abraakie. Maybe you could help me understand more of the vision for these changes by extracting out the parts of the RFC [1] that you are addressing, vs. those you aren't planning to address, vs. things you are still implementing for this PR?

- name: Prepare Security Plugin
  run: mv ./build/distributions/opensearch-security-*.zip ./benchmark/docker/
- name: Build Docker Image
  run: docker build --platform linux/amd64 --build-arg VERSION=${{ env.OPENSEARCH_VERSION }} -f benchmark/docker/benchmark.dockerfile -t opensearchproject/security-benchmark:latest benchmark/docker/
Member

Why containerize with Docker? We already have a supported model for starting and executing tests against a cluster; see this workflow for an example:
https://github.com/opensearch-project/security/blob/main/.github/workflows/plugin_install.yml

  run: mv ./build/distributions/opensearch-security-*.zip ./benchmark/docker/
- name: Build Docker Image
  run: docker build --platform linux/amd64 --build-arg VERSION=${{ env.OPENSEARCH_VERSION }} -f benchmark/docker/benchmark.dockerfile -t opensearchproject/security-benchmark:latest benchmark/docker/
- name: Run Benchmark Cluster
Member

I don't think tests will produce consistent metrics when run in a cluster. If we are trying to detect 'large' issues, a cluster adds more variables. What results have you seen in your testing? Can you help build my confidence in these choices?

- name: Install Benchmark
  run: pip install opensearch-benchmark
- name: Execute Benchmarks
  run: opensearch-benchmark execute-test --pipeline=benchmark-only --results-format=csv --results-file=./results.csv --on-error=abort --workload=percolator --target-host=https://localhost:9200 --client-options=basic_auth_user:admin,basic_auth_password:${{ env.OPENSEARCH_ADMIN_PASSWORD }},verify_certs:false,timeout:60
Member

This workflow uses the percolator workload, which doesn't exercise any of the security plugin's functionality. Until we run tests with multiple users and different feature configurations, this isn't any more useful than the nightly benchmarks that are already published and managed via https://opensearch.org/benchmarks/

  with:
    path: ./cache
    key: ${{ runner.os }}-benchmark
- name: Store Benchmark Results
Member

Can you link to a couple of workflows that show no-regression vs regression so we can better understand how this looks?

Member

+1

@@ -0,0 +1,53 @@
import json
Member

Can we see about making a PR to benchmarks to support another output format natively rather than baking it into the security repo?
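
For context, a rough sketch of what a CSV-to-JSON rewriting step like result_rewriter.py might do: convert opensearch-benchmark's CSV output into name/unit/value entries for comparison. The column names and output shape here are assumptions for illustration, not necessarily what the script in this PR implements.

import csv
import json
import sys

def rewrite(csv_path, json_path):
    # Assumed CSV columns: Metric, Task, Value, Unit (the real output may differ).
    entries = []
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            entries.append({
                "name": f"{row.get('Task', '')} {row.get('Metric', '')}".strip(),
                "value": float(row["Value"]),
                "unit": row.get("Unit", ""),
            })
    with open(json_path, "w") as f:
        json.dump(entries, f, indent=2)

if __name__ == "__main__":
    # Mirrors the workflow step: python benchmark/result_rewriter.py ./results.csv ./results.json
    rewrite(sys.argv[1], sys.argv[2])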


codecov bot commented Apr 25, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 66.05%. Comparing base (c09fad5) to head (06015f0).
Report is 1 commit behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #4282      +/-   ##
==========================================
- Coverage   66.06%   66.05%   -0.01%     
==========================================
  Files         301      301              
  Lines       21711    21711              
  Branches     3506     3506              
==========================================
- Hits        14343    14341       -2     
- Misses       5607     5610       +3     
+ Partials     1761     1760       -1     

see 3 files with indirect coverage changes

  run: opensearch-benchmark execute-test --pipeline=benchmark-only --results-format=csv --results-file=./results.csv --on-error=abort --workload=percolator --target-host=https://localhost:9200 --client-options=basic_auth_user:admin,basic_auth_password:${{ env.OPENSEARCH_ADMIN_PASSWORD }},verify_certs:false,timeout:60
- name: Prepare Benchmark Results
  run: python benchmark/result_rewriter.py ./results.csv ./results.json
- name: Download previous benchmark data
Member

Can this eventually be extended to the last n runs, i.e., take the average or median of the last 10 runs instead of only the previous run?
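
A small sketch of what that extension could look like, assuming the cached benchmark data were kept as a list of per-run result sets (the file path and layout here are hypothetical; the cache format in this PR may differ):

import json
from statistics import median

def baseline_from_history(history_path="cache/benchmark-history.json", last_n=10):
    # Assumed layout: [{"metric_a": 12.3, ...}, ...], one dict per past run, newest last.
    with open(history_path) as f:
        runs = json.load(f)[-last_n:]
    metrics = {}
    for run in runs:
        for name, value in run.items():
            metrics.setdefault(name, []).append(value)
    # Median is less sensitive to a single noisy run than a previous-run-only baseline.
    return {name: median(values) for name, values in metrics.items()}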

Development

Successfully merging this pull request may close these issues.

[RFC] Security Performance Test Suite
3 participants