[FEATURE] Additional components in profile breakdown for KNN query to increase visibility #2286

Open
shatejas opened this issue Nov 25, 2024 · 1 comment

Comments

@shatejas
Collaborator

shatejas commented Nov 25, 2024

Is your feature request related to a problem?

KNNQuery execution has many moving parts. It selects between exact k-NN search and approximate (ANN) search, rescores when an oversampling factor is present, and executes the filter query, if any. Across this variety of use cases, it is difficult to determine the time taken by each component, individually or in combination, when optimizing a query or debugging latency. The request is to report the time taken by each of these components when profile = true, which increases visibility.
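
For context, a minimal sketch of the kind of request this concerns: a filtered k-NN search with profiling enabled. The vector field name `location` and the query vector are assumptions made for illustration; the `rating` range mirrors the filter that appears in the sample response in this issue.

```python
import json


def build_profiled_knn_search(vector, k, min_rating, max_rating, field="location"):
    """Return a search body for a filtered k-NN query with profiling enabled."""
    return {
        "profile": True,  # ask for the per-shard profile tree in the response
        "size": k,
        "query": {
            "knn": {
                field: {  # hypothetical knn_vector field name
                    "vector": vector,
                    "k": k,
                    # Filter on rating, as in the rating:[8 TO 10] child query
                    "filter": {
                        "bool": {
                            "must": [
                                {"range": {"rating": {"gte": min_rating, "lte": max_rating}}}
                            ]
                        }
                    },
                }
            }
        },
    }


body = build_profiled_knn_search([5.0, 4.0], k=3, min_rating=8, max_rating=10)
print(json.dumps(body, indent=2))
```

The body would be sent to the `_search` endpoint of the index. Whether the exact or approximate path runs for such a query is decided by the plugin, which is why the breakdown is needed to tell them apart.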

What solution would you like?
I propose two new components in the query breakdown, exact_knn_search and ann_search, along with visibility into the k-NN filter query in the current breakdown tree. Here is a sample response:

"profile": {
		"shards": [
			{
				"id": "[lChSC1BdRFWt9dfBJoO3UA][hotels-hnsw][0]",
				"inbound_network_time_in_millis": 0,
				"outbound_network_time_in_millis": 0,
				"searches": [
					{
						"query": [
							{
								"type": "KNNQuery",
								"description": "",
								"time_in_nanos": 20503209,
								"breakdown": {
									"set_min_competitive_score_count": 0,
									"exact_knn_search": 5436542,
									"match_count": 0,
									"shallow_advance_count": 0,
									"next_doc": 7584,
									"score_count": 3,
									"compute_max_score_count": 0,
									"advance": 0,
									"advance_count": 0,
									"score": 6083,
									"shallow_advance": 0,
									"create_weight_count": 1,
									"build_scorer": 11771833,
									"ann_search": 0,
									"set_min_competitive_score": 0,
									"match": 0,
									"next_doc_count": 4,
									"compute_max_score": 0,
									"build_scorer_count": 2,
									"create_weight": 3281167,
									"ann_search_count": 0,
									"exact_knn_search_count": 1
								},
								"children": [
									{
										"type": "BooleanQuery",
										"description": "+IndexOrDocValuesQuery(indexQuery=rating:[8 TO 10], dvQuery=rating:[8 TO 10])",
										"time_in_nanos": 4047005,
										"breakdown": {
											"set_min_competitive_score_count": 0,
											"exact_knn_search": 0,
											"match_count": 0,
											"shallow_advance_count": 0,
											"next_doc": 28545,
											"score_count": 3,
											"compute_max_score_count": 0,
											"advance": 0,
											"advance_count": 0,
											"score": 3876,
											"shallow_advance": 0,
											"create_weight_count": 0,
											"build_scorer": 4014584,
											"ann_search": 0,
											"set_min_competitive_score": 0,
											"match": 0,
											"next_doc_count": 15,
											"compute_max_score": 0,
											"build_scorer_count": 2,
											"create_weight": 0,
											"ann_search_count": 0,
											"exact_knn_search_count": 0
										},
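
As an illustration of how a client could consume the proposed fields, here is a small sketch that walks a profile tree and reports the time spent in `exact_knn_search` and `ann_search` as a share of each node's `time_in_nanos`. It assumes the response shape shown above; these two breakdown keys are the ones proposed in this issue, and the helper name `knn_component_times` is made up for the example. The sample data is trimmed to a few values copied from the response above.

```python
KNN_COMPONENTS = ("exact_knn_search", "ann_search")  # keys proposed in this issue


def knn_component_times(profile):
    """Collect (shard_id, query_type, component, nanos, share) for non-zero k-NN timers."""
    rows = []

    def visit(shard_id, node):
        breakdown = node.get("breakdown", {})
        total = node.get("time_in_nanos", 0)
        for component in KNN_COMPONENTS:
            nanos = breakdown.get(component, 0)
            if nanos:
                share = nanos / total if total else 0.0
                rows.append((shard_id, node.get("type"), component, nanos, share))
        # Filter queries show up as children of the KNNQuery node
        for child in node.get("children", []):
            visit(shard_id, child)

    for shard in profile.get("shards", []):
        for search in shard.get("searches", []):
            for query in search.get("query", []):
                visit(shard.get("id"), query)
    return rows


# Trimmed copy of the sample response (only the fields the helper reads)
sample_profile = {
    "shards": [{
        "id": "[lChSC1BdRFWt9dfBJoO3UA][hotels-hnsw][0]",
        "searches": [{
            "query": [{
                "type": "KNNQuery",
                "time_in_nanos": 20503209,
                "breakdown": {"exact_knn_search": 5436542, "ann_search": 0},
                "children": [{
                    "type": "BooleanQuery",
                    "time_in_nanos": 4047005,
                    "breakdown": {"exact_knn_search": 0, "ann_search": 0},
                }],
            }]
        }],
    }]
}

rows = knn_component_times(sample_profile)
for shard_id, query_type, component, nanos, share in rows:
    print(f"{shard_id} {query_type} {component}: {nanos} ns ({share:.1%} of node time)")
```

In the sample, only the exact search path ran (`exact_knn_search_count` is 1 and `ann_search_count` is 0), so the helper reports a single row for the KNNQuery node.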

Here is the POC code that adds this ability. It requires changes in both OpenSearch and the k-NN plugin:

OpenSearch: https://github.com/shatejas/OpenSearch/tree/knnProfiler
k-NN plugin: main...shatejas:k-NN:knnprofilerquery
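
To make the shape of the new breakdown entries concrete, here is a conceptual sketch (in Python, not the POC's Java code) of a breakdown that pairs an accumulated time with an invocation count for each component, matching the `exact_knn_search` / `exact_knn_search_count` pairs in the sample response. The `Breakdown` class and its `timed` method are hypothetical names used only for illustration.

```python
import time
from contextlib import contextmanager


class Breakdown:
    """Accumulates nanoseconds and call counts per named component."""

    def __init__(self, names):
        # Pre-populate so components that never run still report 0, as in the sample
        self.values = {}
        for name in names:
            self.values[name] = 0
            self.values[name + "_count"] = 0

    @contextmanager
    def timed(self, name):
        start = time.perf_counter_ns()
        try:
            yield
        finally:
            # Record elapsed time even if the timed block raises
            self.values[name] += time.perf_counter_ns() - start
            self.values[name + "_count"] += 1


breakdown = Breakdown(["exact_knn_search", "ann_search"])

# Stand-in for the exact search path; the approximate path is never taken here
with breakdown.timed("exact_knn_search"):
    sum(i * i for i in range(1000))

print(breakdown.values)
```

Because every component is pre-registered, a query that takes only one path still reports zeros for the other, which makes it easy to see which path was chosen.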

What's missing in the POC

  • Support for rescoring is missing
  • NativeEngineKNNQuery is not validated with profile = true

What alternatives have you considered?

Another approach would be to use KNNStats to expose metrics for these components. This solution has not yet been explored.

@navneet1v
Collaborator

@shatejas Thanks, Tejas, for creating the GH issue and providing a POC. A similar GH issue was created here: #1985

Projects
Status: Backlog (Hot)