PoC DO NOT MERGE - Semantic Query #11
@@ -71,6 +71,8 @@ protected UpdateByQueryRequest buildRequest(RestRequest request, NamedWriteableR
consumers.put("script", o -> internal.setScript(Script.parse(o)));
consumers.put("max_docs", s -> setMaxDocsValidateIdentical(internal, ((Number) s).intValue()));

// TODO There surely must be a better way of doing this
request.params().put("_source_includes", "*");
Included source exclusion from the demo. Hacky, but it gets the job done for now.
* sort them according to the provided order. This can be useful for instance to ensure that shards that contain recent
* data are executed first when sorting by descending timestamp.
*/
final class CoordinatorQueryRewriteSearchPhase extends SearchPhase {
New search phase for rewriting queries in the coordinator node
ThreadPool threadPool,
SearchResponse.Clusters clusters
) {
if (preFilter) {
if (runCoordinatorPhase) {
Added as a pre-step for search phases
this.fieldNamesToInferenceModel = Map.of();
}

public CoordinatorRewriteContext(
A new constructor for the rewriting in the coordinator search phase
@@ -63,4 +66,9 @@ public CoordinatorRewriteContext getCoordinatorRewriteContext(Index index) {

return new CoordinatorRewriteContext(parserConfig, client, nowInMillis, timestampRange, dateFieldType);
}

@Nullable
public CoordinatorRewriteContext getCoordinatorRewriteContextForModels(Map<String, Set<String>> fieldToModelIds) {
New method for obtaining the rewrite context for the new search phase
if (source == null) {
return false;
}
return source.subSearches().stream().anyMatch(sqwb -> sqwb.getQueryBuilder() instanceof CoordinatorRewriteableQueryBuilder);
I've used a marker interface for this - we can change that later
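As an illustration of the marker-interface check described here, the following is a minimal, self-contained sketch. All types in it (`QueryBuilder`, `CoordinatorRewriteable`, `TermQuery`, `SemanticQuery`, `CoordinatorRewriteCheck`) are hypothetical stand-ins invented for this example, not the Elasticsearch classes touched by the PR.

```java
import java.util.List;

// Hypothetical stand-in for a query builder type; not the real Elasticsearch interface.
interface QueryBuilder {}

// Marker interface: it declares no methods and only tags queries
// that need to be rewritten on the coordinator before the search runs.
interface CoordinatorRewriteable {}

// A query that does not need coordinator-side rewriting (illustrative only).
final class TermQuery implements QueryBuilder {}

// A query that carries the marker (illustrative only).
final class SemanticQuery implements QueryBuilder, CoordinatorRewriteable {}

final class CoordinatorRewriteCheck {
    private CoordinatorRewriteCheck() {}

    // Returns true if at least one query carries the marker.
    // A null list is treated like a missing search source and yields false.
    static boolean requiresCoordinatorRewrite(List<QueryBuilder> queries) {
        if (queries == null) {
            return false;
        }
        return queries.stream().anyMatch(q -> q instanceof CoordinatorRewriteable);
    }
}
```

The trade-off of a marker interface is that it is cheap to add but carries no behavior; if the check later needs per-query logic, a method on the interface would be the natural replacement.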
@@ -124,7 +125,8 @@ public static FetchSourceContext parseFromRestRequest(RestRequest request) {
if (fetchSource != null || sourceIncludes != null || sourceExcludes != null) {
return FetchSourceContext.of(fetchSource == null || fetchSource, sourceIncludes, sourceExcludes);
}
return null;

return FetchSourceContext.of(true, null, new String[]{"*." + SemanticTextFieldMapper.SPARSE_VECTOR_SUBFIELD_NAME});
Hacky source exclusion
return inferenceResultsToQuery(fieldName, inferenceResultsSupplier.get());
}

Set<String> modelNames = coordinatorRewriteContext.inferenceModelsForFieldName(fieldName);
Retrieves model ids for the field and performs some validations.
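The lookup-and-validate step could look roughly like the sketch below. The PR does not spell out which validations are performed, so the rule used here (exactly one model per field) is an assumption, and `ModelLookupContext` and `ModelResolver` are hypothetical, simplified stand-ins rather than the classes in the diff.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified stand-in for a rewrite context that knows
// which inference models are configured for each field.
final class ModelLookupContext {
    private final Map<String, Set<String>> fieldNamesToInferenceModel;

    ModelLookupContext(Map<String, Set<String>> fieldNamesToInferenceModel) {
        this.fieldNamesToInferenceModel = Map.copyOf(fieldNamesToInferenceModel);
    }

    // Returns the model IDs for a field, or an empty set if the field is unknown.
    Set<String> inferenceModelsForFieldName(String fieldName) {
        return fieldNamesToInferenceModel.getOrDefault(fieldName, Set.of());
    }
}

final class ModelResolver {
    private ModelResolver() {}

    // Assumed validation rules (not confirmed by the PR):
    // a field must map to exactly one model, otherwise the query is rejected.
    static String resolveSingleModel(ModelLookupContext context, String fieldName) {
        Set<String> modelIds = context.inferenceModelsForFieldName(fieldName);
        if (modelIds.isEmpty()) {
            throw new IllegalArgumentException("Field [" + fieldName + "] has no inference model");
        }
        if (modelIds.size() > 1) {
            throw new IllegalArgumentException("Field [" + fieldName + "] maps to multiple models: " + modelIds);
        }
        return modelIds.iterator().next();
    }
}
```

Failing fast at rewrite time, before any inference call is made, keeps an ambiguous or unmapped field from triggering a request to the wrong model.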
PoC for a semantic_query query builder:
- Uses a new search phase to perform coordinator-level query rewriting.
- Adds a mapping from field names to model IDs to the coordinator rewrite context, so the query can perform the actual inference when it is rewritten.
You can follow this gist to test.