
Make output metrics extendable #199

Open · jmazanec15 opened this issue May 31, 2022 · 5 comments

Labels: enhancement (New feature or request), good first issue (Good for newcomers), Medium Priority

@jmazanec15 (Member)

Is your feature request related to a problem? Please describe.

For the k-NN plugin, I am working on adding a custom runner that executes queries from a numeric data set and calculates the recall. The k-NN plugin offers an assortment of Approximate Nearest Neighbor algorithms. Generally, users need to trade off the accuracy of the approximation against latency/throughput, so they need to see both kinds of metrics when benchmarking.

In the custom query runner, I return the recall alongside the latency, but this only gets stored as request metadata, not as a reported result.
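
For illustration, here is a minimal sketch of the kind of custom runner described above, assuming the convention that a runner returns a dict and that any extra keys (here, `recall`) are kept only as request metadata. The class name, parameter names, and ground-truth handling are hypothetical, not the plugin's actual code.

```python
# Sketch only: parameter names and the ground-truth lookup are hypothetical.
class KnnVectorQueryRunner:
    async def __call__(self, opensearch, params):
        k = params.get("k", 10)
        # IDs of the true k nearest neighbors, precomputed for the query vector.
        expected_ids = set(params["ground_truth_neighbors"])

        response = await opensearch.search(index=params["index"], body=params["body"])
        retrieved_ids = {hit["_id"] for hit in response["hits"]["hits"]}

        recall = len(retrieved_ids & expected_ids) / max(k, 1)
        # "weight" and "unit" are understood by the benchmark; "recall" is only
        # carried along as request metadata today, which is what this issue is about.
        return {"weight": 1, "unit": "ops", "recall": recall}

    def __repr__(self):
        return "knn-vector-query"
```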

Describe the solution you'd like

I would like the ability to specify that "recall" should be reported as a metric in the results and to define its aggregation as taking the mean.

More generally, I would like to be able to define custom metrics for runners, specify how they are aggregated, and have them show up in the results.
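
To make the request concrete, a purely hypothetical extension point could look something like the sketch below. None of these names exist in OpenSearch Benchmark; the point is only to illustrate "declare a metric plus its aggregation and have the results publisher report it".

```python
# Hypothetical API sketch; CustomMetric and CUSTOM_METRICS do not exist in
# OpenSearch Benchmark and are shown only to illustrate the feature request.
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, Iterable


@dataclass
class CustomMetric:
    name: str                                        # key the runner returns, e.g. "recall"
    unit: str                                        # label to use in the results table
    aggregation: Callable[[Iterable[float]], float]  # e.g. mean over all requests


# A results publisher aware of this registry could aggregate the per-request
# values and print "Mean recall" alongside latency and throughput.
CUSTOM_METRICS: Dict[str, CustomMetric] = {
    "recall": CustomMetric(name="recall", unit="ratio", aggregation=mean),
}
```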

Describe alternatives you've considered

  1. Find the request metadata and collect my metric from there -- this requires a lot of manual effort (a rough sketch of the workaround is shown below).
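
For reference, a rough sketch of that manual workaround, assuming an external OpenSearch metrics store using the `benchmark-metrics-*` index pattern and that the runner's extra `recall` key is stored under `meta.recall`; the host, port, index pattern, and field paths are assumptions that may differ per setup.

```python
# Manual workaround sketch: query the metrics store for the per-request
# "recall" metadata and aggregate it by hand. Index pattern, connection
# details, and field paths are assumptions, not guaranteed by OSB.
from statistics import mean

from opensearchpy import OpenSearch

metrics_store = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

response = metrics_store.search(
    index="benchmark-metrics-*",
    body={
        "size": 10000,
        "_source": ["meta.recall"],
        "query": {"exists": {"field": "meta.recall"}},
    },
)

recalls = [hit["_source"]["meta"]["recall"] for hit in response["hits"]["hits"]]
print(f"mean recall over {len(recalls)} requests: {mean(recalls):.4f}")
```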

Additional context

  1. Add Querying Functionality to OSB k-NN#409
  2. Support Benchmarking K-NN Plugin and Vectorsearch Workload #103
jmazanec15 added the enhancement (New feature or request) label on May 31, 2022
@gkamat (Collaborator) commented Jun 6, 2022

This is going to require flexibility in how the results metrics are defined, computed, processed and reported. It will take some consideration.

jmazanec15 added a commit to jmazanec15/k-NN-1 that referenced this issue on Jun 7, 2022:

Removes recall calculation from benchmarking logic, as this is delayed until opensearch-project/opensearch-benchmark#199 can be implemented.

Signed-off-by: John Mazanec <[email protected]>
@jmazanec15 (Member, Author)

Right, I guess there are a few other applications I can think of that may require similar functionality: Anomaly Detection, Learning to Rank. For these, recall/accuracy are KPIs.

@amitgalitz (Member)

+1 on this. Extendable metrics would help Anomaly Detection as well. We are starting to define how we benchmark AD in various ways, such as our own execution time to get an anomaly result, recall/precision, and other KPIs, both on our own specific workloads and while a detector is running. I also want to add that this will greatly benefit ML-Commons as well.

@cgchinmay (Collaborator)

@IanHoang As discussed offline, I am taking a look at this issue.

@peteralfonsi (Contributor)

Added a new issue, #435, which would allow the user to specify which percentiles they want to see; that would be a subset of this issue.
