-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds a command-line argument to specify more latency percentiles at the end of a workload. #441
Conversation
…at the end of a workload Signed-off-by: Peter Alfonsi <[email protected]>
osbenchmark/metrics.py
Outdated
@@ -1337,6 +1343,10 @@ def __init__(self, benchmark_version, benchmark_revision, environment_name, | |||
self.revision = revision | |||
self.results = results | |||
self.meta_data = meta_data | |||
self.latency_percentiles = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to what is done for meta_data
parameters and results
, we can move the if statement on line 1347 to line 1325, and just have it reassigned to latency_percentiles
. This way, we will be keeping the section that modifies the parameters together, leveraging the None
assigned to latency_percentiles
in the init parameter, and keeping line 1346 identical to the other parameter formats self.latency_percentiles = latency_percentiles
.
osbenchmark/metrics.py
Outdated
|
||
def percentiles_for_sample_size(sample_size, latency_percentiles=None): | ||
# If latency_percentiles is present, as a list, also display those values (assuming there are enough samples) | ||
percentiles = [50, 90, 99, 99.9, 99.99, 100] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be better to make this a constant that's referenced and instead of initializing it here. That would make it easier to find and update if needed
osbenchmark/benchmark.py
Outdated
@@ -554,6 +554,11 @@ def add_workload_source(subparser): | |||
default=False, | |||
help="If any processes is running, it is going to kill them and allow Benchmark to continue to run." | |||
) | |||
test_execution_parser.add_argument( | |||
"--latency-percentiles", | |||
help="A comma-separated list of percentiles to report for latency.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similar to other argparse arguments, could you add the default in parenthesis? Here's a good example and can be dynamically done if we store this as a constant somewhere in metrics.py. That way if we ever do change the default, we won't have to do a replace and find.:
help=f"Comma-separated list of client options to use. (default: {opts.ClientOptions.DEFAULT_CLIENT_OPTIONS})")
osbenchmark/metrics.py
Outdated
effective_sample_size = 10 ** (int(math.log10(sample_size))) # round down to nearest power of ten | ||
delta = 0.000001 # If (p / 100) * effective_sample_size is within this value of a whole number, | ||
# assume the discrepancy is due to floating point and allow it | ||
filtered_percentiles = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this re-initialization needed if it's already initialized on line 1685?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch - I'd changed the structure of this fn and forgot to remove this initialization
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this! Left some comments. Also, when you get a chance, could you perform a run with --test-mode
and this new parameter and include the output in PR description?
Signed-off-by: Peter Alfonsi <[email protected]>
238fe32
to
f710dec
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for these changes!
Description
Adds a new optional argument,
--latency-percentiles
. The user specifies additional percentiles in a comma-separated list to be reported at the end of the workload, assuming the sample size is large enough for them to make sense. The existing percentiles (50, 90, 99, 99.9, 99.99, 100) are the default, but providing a value for--latency-percentiles
overrides this.Example usage:
opensearch-benchmark execute-test --pipeline=benchmark-only --workload-path=/home/ec2-user/osb/opensearch-benchmark-workloads/modified_nyc_taxis --target-host=http://localhost:9200 --latency-percentiles=0,10,20,25,30,40,60,70,75,80,80.1,90.1,99.99
Result:
Result with --test-mode on (only 100 shows, because of the small sample size):
Issues Resolved
Resolves #435
Testing
Added UT for the logic to decide which percentiles to report based on sample size. Manually tested workloads using the new argument with various values, or no value, and confirmed the printout at the end was as expected. I couldn't find a good way to non-manually test it end-to-end in the way other ITs are done - please let me know how I should do this.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.