Add command "tuning" (#508) #515
Conversation
To help community members understand the intentions better, and since this is a newly suggested subcommand, we recommend that contributors add an example output from this feature.
osbenchmark/tuning/optimal_finder.py
def run_benchmark(params):
    commands = ["opensearch-benchmark", "execute-test"]
    for k, v in params.items():
        commands.append(k)
        if v:
            commands.append(v)

    proc = None
    try:
        proc = subprocess.Popen(
            commands,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE)

        _, stderr = proc.communicate()
        return proc.returncode == 0, stderr.decode('ascii')
    except KeyboardInterrupt as e:
        proc.terminate()
        print("Process is terminated!")
        raise e
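For context, this is how a `params` dict maps onto the CLI invocation that `run_benchmark` builds: keys are flags, truthy values are appended after their flag, and falsy values turn the key into a bare switch. The flag names below are illustrative assumptions, not necessarily the ones the PR uses.

```python
# Illustrative params dict; the flag names here are assumptions for the example.
params = {
    "--workload": "geonames",
    "--test-mode": None,  # bare switch: appended without a value
}

# Same command-list construction as in run_benchmark above.
commands = ["opensearch-benchmark", "execute-test"]
for k, v in params.items():
    commands.append(k)
    if v:
        commands.append(v)

print(commands)
```

This yields `['opensearch-benchmark', 'execute-test', '--workload', 'geonames', '--test-mode']`, which is then handed to `subprocess.Popen`.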
I understand the convenience of this but at the same time I'm not sure if this is best practice. This makes it feel like more of a wrapper utilizing OSB rather than a built-in feature for OSB.
It's similar to the compare command, which also utilizes OSB and could potentially reside outside OSB, but it is also strongly coupled with OSB since it always has to work together with it. I feel it would just bring unnecessary churn for users if the command lived in its own package that they would have to install separately to make it work.
The compare subcommand is a lightweight operation that pulls from the test execution store (whether that is local or an external metrics datastore) and does some diffs between them. Essentially, this is still a recommendation tool that wraps OSB.
@IanHoang this is an example output from the console:

[OpenSearch Benchmark ASCII-art banner]
[INFO] There will be 4 tests to run with 2 bulk sizes, 2 batch sizes, 1 client numbers.
[INFO] Running benchmark with: bulk size: 100, number of clients: 1, batch size: 50
[INFO] Running benchmark with: bulk size: 200, number of clients: 1, batch size: 50
[INFO] Running benchmark with: bulk size: 100, number of clients: 1, batch size: 100
[INFO] Running benchmark with: bulk size: 200, number of clients: 1, batch size: 100
[INFO] The optimal variable combination is: bulk size: 200, batch size: 50, number of clients: 1
---------------------------------
[INFO] SUCCESS (took 235 seconds)
---------------------------------

Also, I added support for outputting aggregated intermediate benchmark results to a file; if the file is in markdown format, the data in the file would look like the table below:
Signed-off-by: Liyun Xiu <[email protected]>
@IanHoang any more comments on this PR?
@chishui I provided some input on the issue here: #508 (comment). Regarding this PR, I'm still concerned about how this subcommand invokes OSB's command line.
Left some comments yesterday regarding this.
@chishui please feel free to reopen this. Closing for now due to inactivity. We can work together and collaborate to find a suitable solution for your needs!
Description
When ingesting data into OpenSearch using the bulk API, different variables can result in different ingestion performance: for example, the number of documents per bulk request, how many OpenSearch clients are used to send requests, the batch size (a variable for batch ingestion, opensearch-project/OpenSearch#12457), etc. It's not easy for users to experiment with all combinations of these variables to find the one that leads to optimal ingestion performance.
This tool addresses the pain of tuning these performance-impacting variables by automatically finding their optimal combination. It utilizes OpenSearch-Benchmark to run benchmarks with different variable combinations, collects their outputs, and analyzes and visualizes the results.
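The sweep over variable combinations described above can be sketched as a Cartesian product of the candidate values; the example console output in this thread (2 bulk sizes × 2 batch sizes × 1 client count = 4 tests) follows the same pattern. The variable names and candidate values below are illustrative, not the PR's actual configuration.

```python
import itertools

# Candidate values for each tuning variable (illustrative assumptions).
bulk_sizes = [100, 200]
batch_sizes = [50, 100]
client_counts = [1]

# Every combination becomes one benchmark run.
combinations = list(itertools.product(bulk_sizes, batch_sizes, client_counts))
print(f"There will be {len(combinations)} tests to run.")

for bulk, batch, clients in combinations:
    print(f"Running benchmark with: bulk size: {bulk}, "
          f"batch size: {batch}, number of clients: {clients}")
```

Each combination would then be translated into CLI flags and passed to a helper like `run_benchmark`, with the best-performing combination reported at the end.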
Issues Resolved
#508
Testing
The first version of this PR is only to demonstrate ideas, tests will be added later.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.