
Adds a command-line argument to specify more latency percentiles at the end of a workload. #441

Merged
2 commits merged into opensearch-project:main on Jan 24, 2024

Conversation

@peteralfonsi (Contributor) commented on Jan 19, 2024

Description

Adds a new optional argument, --latency-percentiles. The user supplies additional percentiles as a comma-separated list; these are reported at the end of the workload, provided the sample size is large enough for each of them to make sense. The existing percentiles (50, 90, 99, 99.9, 99.99, 100) are the default, and providing a value for --latency-percentiles overrides them.
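For illustration, parsing the value could look roughly like this (a minimal sketch; parse_percentiles and DEFAULT_LATENCY_PERCENTILES are illustrative names, not the exact code in this PR):

DEFAULT_LATENCY_PERCENTILES = [50, 90, 99, 99.9, 99.99, 100]

def parse_percentiles(raw_value):
    # No value supplied on the command line -> fall back to the existing defaults.
    if raw_value is None:
        return list(DEFAULT_LATENCY_PERCENTILES)
    # Otherwise turn the comma-separated string into a sorted list of floats.
    return sorted(float(p) for p in raw_value.split(","))

# e.g. parse_percentiles("0,10,25,99.99") -> [0.0, 10.0, 25.0, 99.99]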

Example usage:
opensearch-benchmark execute-test --pipeline=benchmark-only --workload-path=/home/ec2-user/osb/opensearch-benchmark-workloads/modified_nyc_taxis --target-host=http://localhost:9200 --latency-percentiles=0,10,20,25,30,40,60,70,75,80,80.1,90.1,99.99

Result:

| Mean Throughput | cheap-dropoff | 85.01 | ops/s |
| Median Throughput | cheap-dropoff | 85.01 | ops/s |
| Max Throughput | cheap-dropoff | 85.07 | ops/s |
| 0th percentile latency | cheap-dropoff | 1.04427 | ms |
| 10th percentile latency | cheap-dropoff | 1.23004 | ms |
| 20th percentile latency | cheap-dropoff | 1.35193 | ms |
| 25th percentile latency | cheap-dropoff | 1.40766 | ms |
| 30th percentile latency | cheap-dropoff | 1.46785 | ms |
| 40th percentile latency | cheap-dropoff | 1.57731 | ms |
| 60th percentile latency | cheap-dropoff | 1.78795 | ms |
| 70th percentile latency | cheap-dropoff | 1.89244 | ms |
| 75th percentile latency | cheap-dropoff | 1.94228 | ms |
| 80th percentile latency | cheap-dropoff | 1.99657 | ms |
| 80.1th percentile latency | cheap-dropoff | 1.9981 | ms |
| 90.1th percentile latency | cheap-dropoff | 2.11699 | ms |
| 99.99th percentile latency | cheap-dropoff | 3.86577 | ms |
| 0th percentile service time | cheap-dropoff | 0.814199 | ms |
| 10th percentile service time | cheap-dropoff | 0.858729 | ms |
| 20th percentile service time | cheap-dropoff | 0.870694 | ms |
| 25th percentile service time | cheap-dropoff | 0.876321 | ms |
| 30th percentile service time | cheap-dropoff | 0.882662 | ms |
| 40th percentile service time | cheap-dropoff | 0.902266 | ms |
| 60th percentile service time | cheap-dropoff | 0.928254 | ms |
| 70th percentile service time | cheap-dropoff | 0.94132 | ms |
| 75th percentile service time | cheap-dropoff | 0.954198 | ms |
| 80th percentile service time | cheap-dropoff | 1.00435 | ms |
| 80.1th percentile service time | cheap-dropoff | 1.0062 | ms |
| 90.1th percentile service time | cheap-dropoff | 1.2032 | ms |
| 99.99th percentile service time | cheap-dropoff | 2.88157 | ms |
| error rate | cheap-dropoff | 0 | % |

Result with --test-mode on (only the 100th percentile shows, because of the small sample size):

|                                                Mean Throughput | cheap-dropoff |      179.69 |  ops/s |
|                                              Median Throughput | cheap-dropoff |      179.69 |  ops/s |
|                                                 Max Throughput | cheap-dropoff |      179.69 |  ops/s |
|                                       100th percentile latency | cheap-dropoff |     7.01146 |     ms |
|                                  100th percentile service time | cheap-dropoff |     1.11934 |     ms |
|                                                     error rate | cheap-dropoff |           0 |      % |

Issues Resolved

Resolves #435

Testing

  • New functionality includes testing

Added unit tests for the logic that decides which percentiles to report based on sample size. Manually tested workloads using the new argument with various values, or no value, and confirmed the printout at the end was as expected. I couldn't find a good way to test this end-to-end non-manually in the way other ITs are done - please let me know how I should approach this.


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Commit: …at the end of a workload

Signed-off-by: Peter Alfonsi <[email protected]>
@@ -1337,6 +1343,10 @@ def __init__(self, benchmark_version, benchmark_revision, environment_name,
self.revision = revision
self.results = results
self.meta_data = meta_data
self.latency_percentiles = None
Collaborator comment:
Similar to what is done for the meta_data and results parameters, we can move the if statement on line 1347 up to line 1325 and have it reassign latency_percentiles there. This way we keep the section that modifies the parameters together, leverage the None default assigned to latency_percentiles in the init signature, and keep line 1346 identical in format to the other parameter assignments: self.latency_percentiles = latency_percentiles.
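A simplified, hypothetical sketch of the ordering being suggested (not the actual constructor, which takes many more parameters):

class ExampleResultsStore:
    def __init__(self, results, meta_data, latency_percentiles=None):
        # Parameter-modifying section: reassign latency_percentiles up here,
        # alongside however meta_data and results are handled.
        if latency_percentiles is not None:
            latency_percentiles = sorted(float(p) for p in latency_percentiles)
        # Plain assignment section: every field keeps the same self.x = x shape.
        self.results = results
        self.meta_data = meta_data
        self.latency_percentiles = latency_percentiles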


def percentiles_for_sample_size(sample_size, latency_percentiles=None):
# If latency_percentiles is present, as a list, also display those values (assuming there are enough samples)
percentiles = [50, 90, 99, 99.9, 99.99, 100]
@IanHoang (Collaborator) commented on Jan 20, 2024:
Might be better to make this a constant that's referenced here instead of initializing the list inline. That would make it easier to find and update if needed.
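A minimal sketch of that suggestion (the constant name DEFAULT_LATENCY_PERCENTILES and its placement in metrics.py are assumptions, not the merged code):

# Hypothetical module-level constant, e.g. near the top of metrics.py:
DEFAULT_LATENCY_PERCENTILES = [50, 90, 99, 99.9, 99.99, 100]

def percentiles_for_sample_size(sample_size, latency_percentiles=None):
    # Reference the shared constant instead of re-declaring the list inline,
    # so a future change to the defaults happens in one place.
    percentiles = list(latency_percentiles) if latency_percentiles else list(DEFAULT_LATENCY_PERCENTILES)
    ...  # filtering by sample size continues as in the PR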

@@ -554,6 +554,11 @@ def add_workload_source(subparser):
default=False,
help="If any processes is running, it is going to kill them and allow Benchmark to continue to run."
)
test_execution_parser.add_argument(
"--latency-percentiles",
help="A comma-separated list of percentiles to report for latency.",
@IanHoang (Collaborator) commented on Jan 20, 2024:
Similar to the other argparse arguments, could you add the default in parentheses? Here's a good example; this can be done dynamically if we store the default as a constant somewhere in metrics.py. That way, if we ever change the default, we won't have to do a find and replace:

help=f"Comma-separated list of client options to use. (default: {opts.ClientOptions.DEFAULT_CLIENT_OPTIONS})")

https://github.com/opensearch-project/opensearch-benchmark/blob/9ffbec01149b3a75f58c3d49bb2f9ea39ca1fbd8/osbenchmark/benchmark.py#L174C7-L175C41

effective_sample_size = 10 ** (int(math.log10(sample_size))) # round down to nearest power of ten
delta = 0.000001 # If (p / 100) * effective_sample_size is within this value of a whole number,
# assume the discrepancy is due to floating point and allow it
filtered_percentiles = []
Collaborator comment:
Is this re-initialization needed if it's already initialized on line 1685?

Author (Contributor) reply:
Good catch - I'd changed the structure of this fn and forgot to remove this initialization
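For context, a minimal sketch of the filtering that the snippet above appears to build toward, reconstructed from its comments (an assumption, not the merged implementation):

import math

def percentiles_for_sample_size(sample_size, latency_percentiles=None):
    percentiles = latency_percentiles or [50, 90, 99, 99.9, 99.99, 100]
    effective_sample_size = 10 ** int(math.log10(sample_size))  # round down to nearest power of ten
    delta = 0.000001  # tolerance for floating-point error in the rank computation
    filtered_percentiles = []
    for p in percentiles:
        rank = (p / 100) * effective_sample_size
        # Keep p only if its rank lands on (nearly) a whole number of samples,
        # i.e. the sample is large enough for that percentile to be meaningful.
        if abs(rank - round(rank)) < delta:
            filtered_percentiles.append(p)
    return filtered_percentiles

With a test-mode run of only a handful of samples, this would leave just the 100th percentile, which matches the --test-mode output in the description.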

@IanHoang (Collaborator) left a comment:
Thanks for doing this! Left some comments. Also, when you get a chance, could you perform a run with --test-mode and this new parameter and include the output in the PR description?

Signed-off-by: Peter Alfonsi <[email protected]>
@IanHoang (Collaborator) left a comment:
Thank you for these changes!

@IanHoang merged commit d81aaf8 into opensearch-project:main on Jan 24, 2024
8 checks passed
Linked issue (may be closed by this PR): Allow user to provide desired percentiles for latency/other metrics