Make output metrics extendable #199
Comments
This is going to require flexibility in how the results metrics are defined, computed, processed, and reported. It will take some consideration.
Removes recall calculation from benchmarking logic as this is delayed until opensearch-project/opensearch-benchmark#199 can be implemented. Signed-off-by: John Mazanec <[email protected]>
Right, I guess there are a few other applications I can think of that may require similar functionality: Anomaly Detection, Learning to Rank. For these, recall/accuracy are KPIs.
+1 on this. Extendable metrics would help Anomaly Detection as well; we are starting to define how we benchmark AD in various ways, such as our own execution time to get an anomaly result, recall/precision, and other KPIs on our own specific workloads and while a detector is running. I also want to add that this will greatly benefit ML-Commons as well.
@IanHoang as discussed offline, taking a look at this issue |
Added a new issue, #435, which would allow the user to specify the percentiles they want to see; that would be a subset of this issue.
Is your feature request related to a problem? Please describe.
For the k-NN plugin, I am working on adding a custom runner that will execute queries from a numeric data set and calculate the recall. The k-NN plugin offers an assortment of approximate nearest neighbor algorithms. Generally, users need to trade off the accuracy of their system (recall) against latency/throughput, so they need to see both kinds of metrics when benchmarking.
In the custom query runner, I return the recall alongside the latency, but this only gets stored as request meta-data, not as a reported result (see the sketch below).
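For context, here is a minimal sketch of what such a runner could look like, assuming the OpenSearch Benchmark custom-runner convention that a runner returns a dict whose standard keys ("weight", "unit") are understood by the framework and whose extra keys are kept only as request meta-data. The function name, params keys, and registered runner name are illustrative assumptions, not the actual k-NN workload code:

```python
# A minimal sketch, not the actual k-NN workload implementation. Names such as
# knn_search_with_recall and the params keys ("index", "body", "neighbors") are
# illustrative assumptions.
import time


async def knn_search_with_recall(opensearch, params):
    """Run a k-NN query and compute recall against ground-truth neighbors."""
    index = params["index"]
    body = params["body"]                       # k-NN query body (including size/k)
    true_neighbors = set(params["neighbors"])   # ground-truth ids for this query

    start = time.perf_counter()
    response = await opensearch.search(index=index, body=body)
    took_s = time.perf_counter() - start

    returned_ids = {hit["_id"] for hit in response["hits"]["hits"]}
    recall = len(returned_ids & true_neighbors) / max(len(true_neighbors), 1)

    # "weight" and "unit" are standard result keys; extra keys such as "recall"
    # are currently only kept as request meta-data, which is the limitation
    # described in this issue.
    return {
        "weight": 1,
        "unit": "ops",
        "recall": recall,
        "took_s": took_s,
    }


def register(registry):
    # Registration hook used by workload plugins; the runner name is illustrative.
    registry.register_runner("knn-search-with-recall", knn_search_with_recall, async_runner=True)
```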
Describe the solution you'd like
I would like to be able to specify that "recall" should be reported as a metric in the results and to define its aggregation (for example, the mean).
More generally, I would like to be able to define custom metrics for runners, specify how they are aggregated, and have them show up in the results; a rough illustration of the intent follows below.
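As a rough illustration (plain Python, not an existing opensearch-benchmark API), the sketch below shows the kind of behavior being requested: per-request values such as recall, captured today only as meta-data, reduced by a user-chosen aggregation into a single value in the final results. The metric_defs shape and the sample records are hypothetical:

```python
# Plain-Python illustration of the requested behavior; metric_defs and the
# sample records are hypothetical, not part of opensearch-benchmark.
import statistics

# Per-request values as they might appear in request meta-data today.
request_records = [
    {"recall": 0.92, "took_s": 0.004},
    {"recall": 0.88, "took_s": 0.005},
    {"recall": 0.95, "took_s": 0.003},
]

# What this issue asks for: declare a custom metric plus its aggregation and
# have it reported in the results alongside latency/throughput.
metric_defs = {
    "recall": statistics.mean,  # aggregate recall as the mean, as requested above
    "took_s": max,              # e.g. worst-case latency
}

results = {
    name: agg([record[name] for record in request_records if name in record])
    for name, agg in metric_defs.items()
}
print(results)  # {'recall': 0.9166..., 'took_s': 0.005}
```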
Describe alternatives you've considered
Additional context