Adds a command-line argument to specify more latency percentiles at the end of a workload. #441
Conversation
…at the end of a workload Signed-off-by: Peter Alfonsi <petealft@amazon.com>
osbenchmark/metrics.py
Outdated
```python
self.revision = revision
self.results = results
self.meta_data = meta_data
self.latency_percentiles = None
```
Similar to what is done for the meta_data and results parameters, we can move the if statement on line 1347 up to line 1325 and have it reassign latency_percentiles. This way, we keep the section that normalizes the parameters together, leverage the None assigned to latency_percentiles in the init parameter, and keep line 1346 identical to the other parameter assignments: `self.latency_percentiles = latency_percentiles`.
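A minimal sketch of the suggested pattern (the class name and parameter handling here are assumed for illustration, not the actual osbenchmark/metrics.py code):

```python
# Sketch only: class name and input format are assumptions for illustration.
class TestExecution:
    def __init__(self, revision, results, meta_data, latency_percentiles=None):
        # Normalize the optional parameter up front, alongside the other
        # parameter handling, instead of branching later in the class.
        if latency_percentiles:
            latency_percentiles = [float(p) for p in latency_percentiles.split(",")]
        self.revision = revision
        self.results = results
        self.meta_data = meta_data
        # Now shaped exactly like the other assignments above
        self.latency_percentiles = latency_percentiles
```

With this arrangement the constructor body stays uniform, and callers that omit the argument still get the `None` default from the signature.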
osbenchmark/metrics.py
Outdated
```python
def percentiles_for_sample_size(sample_size, latency_percentiles=None):
    # If latency_percentiles is present, as a list, also display those values (assuming there are enough samples)
    percentiles = [50, 90, 99, 99.9, 99.99, 100]
```
Might be better to make this a constant that's referenced here instead of initializing the list inline. That would make it easier to find and update if needed.
osbenchmark/benchmark.py
Outdated
```python
)
test_execution_parser.add_argument(
    "--latency-percentiles",
    help="A comma-separated list of percentiles to report for latency.",
```
Similar to other argparse arguments, could you add the default in parentheses? Here's a good example, and it can be done dynamically if we store the default as a constant somewhere in metrics.py. That way, if we ever change the default, we won't have to find-and-replace:

```python
help=f"Comma-separated list of client options to use. (default: {opts.ClientOptions.DEFAULT_CLIENT_OPTIONS})")
```
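Applied to this PR's flag, the suggestion might look like the following sketch (the constant name and its home in metrics.py are assumptions, not the merged code):

```python
import argparse

# Assumed constant; in the suggestion it would live in metrics.py so the
# parser and the percentile logic share one source of truth.
DEFAULT_LATENCY_PERCENTILES = "50,90,99,99.9,99.99,100"

parser = argparse.ArgumentParser()
parser.add_argument(
    "--latency-percentiles",
    default=DEFAULT_LATENCY_PERCENTILES,
    # The default is interpolated into the help text, so changing the
    # constant updates both the behavior and the documentation.
    help=f"A comma-separated list of percentiles to report for latency "
         f"(default: {DEFAULT_LATENCY_PERCENTILES}).",
)
```

Running with no flag then yields the default string, and `--help` always shows the current default without a separate edit.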
osbenchmark/metrics.py
Outdated
```python
effective_sample_size = 10 ** (int(math.log10(sample_size)))  # round down to nearest power of ten
delta = 0.000001  # If (p / 100) * effective_sample_size is within this value of a whole number,
                  # assume the discrepancy is due to floating point and allow it
filtered_percentiles = []
```
Is this re-initialization needed if it's already initialized on line 1685?
Good catch - I'd changed the structure of this function and forgot to remove this initialization.
IanHoang
left a comment
Thanks for doing this! Left some comments. Also, when you get a chance, could you perform a run with `--test-mode` and this new parameter, and include the output in the PR description?
Signed-off-by: Peter Alfonsi <petealft@amazon.com>
Force-pushed 238fe32 to f710dec
IanHoang
left a comment
Thank you for these changes!
Description
Adds a new optional argument, `--latency-percentiles`. The user specifies additional percentiles in a comma-separated list to be reported at the end of the workload, assuming the sample size is large enough for them to make sense. The existing percentiles (50, 90, 99, 99.9, 99.99, 100) are the default, but providing a value for `--latency-percentiles` overrides this.

Example usage:

```
opensearch-benchmark execute-test --pipeline=benchmark-only --workload-path=/home/ec2-user/osb/opensearch-benchmark-workloads/modified_nyc_taxis --target-host=http://localhost:9200 --latency-percentiles=0,10,20,25,30,40,60,70,75,80,80.1,90.1,99.99
```

Result:

Result with --test-mode on (only the 100th percentile shows, because of the small sample size):
Issues Resolved
Resolves #435
Testing
Added a UT for the logic that decides which percentiles to report based on sample size. Manually tested workloads using the new argument with various values, or no value, and confirmed the printout at the end was as expected. I couldn't find a good way to test it end-to-end non-manually in the way other ITs are done - please let me know how I should do this.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.