Adds various Query overrides to Keyword Field to enable doc_values #10425
Gradle Check (Jenkins) Run Completed with:
Compatibility status: Checks if related components are compatible with change a80478b
Incompatible components: [https://github.com/opensearch-project/performance-analyzer.git]
Skipped components:
Compatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/neural-search.git]
REF https://build.ci.opensearch.org/job/gradle-check/27203/console
REPRODUCE WITH: ./gradlew ':qa:mixed-cluster:v2.11.0#mixedClusterTest' --tests "org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT" -Dtests.method="test {p0=search.highlight/40_keyword_ignore/Plain Highligher should skip highlighting ignored keyword values}" -Dtests.seed=66EE8CD404E0EC5B -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=bg -Dtests.timezone=America/Guatemala -Druntime.java=20
org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT > test {p0=search.highlight/40_keyword_ignore/Plain Highligher should skip highlighting ignored keyword values} FAILED
The issue seems to be that my change does not take the highlight object of a query into account. With the change applied, such a query fails to return the highlight field.
@msfroh, I need some help understanding the highlighter subphase here. The use of
Met with @harshavamsi today to discuss the failing highlighting. IMO, we should open a Lucene issue to unwrap the For now, we can handle it in Line 99 in 1447d75
@harshavamsi, can you please add YAML integ tests to cover the docvalue-only cases? I'm sure that we don't have tests for those because they wouldn't have worked until this commit. (Yay! We're making searches work that wouldn't have previously been possible.)
b2d0d7f to db3568c
db3568c to 29f79fe
Codecov Report
@@ Coverage Diff @@
## main #10425 +/- ##
============================================
- Coverage 71.26% 71.23% -0.04%
+ Complexity 58722 58429 -293
============================================
Files 4870 4845 -25
Lines 276608 275430 -1178
Branches 40207 40117 -90
============================================
- Hits 197137 196202 -935
+ Misses 63060 62816 -244
- Partials 16411 16412 +1
29062ab to dacaa7d
dacaa7d to 986b741
986b741 to c819524
5907130 to 7e1de58
The Gradle failure is unrelated to my change.
87556c0 to 9ec68e0
@harshavamsi -- code looks good to me. I'm retrying Gradle check. If it still fails, you might need to rebase. Have you raised a documentation issue? We probably need to update the documentation for keyword fields at least to cover the fact that queries can still run on a non-indexed keyword field if it has doc values.
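To illustrate why a non-indexed keyword field can still be queried when it has doc values, here is a toy Java model. It is only a sketch under simplifying assumptions: the class `DocValuesTermSketch` and its methods are hypothetical names, each document holds a single keyword value, and plain Java collections stand in for Lucene's inverted index and doc values. A term lookup in the inverted index and a linear scan over the per-document values return the same hits; the scan simply has to visit every document.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: these structures are illustrative stand-ins, not Lucene's.
class DocValuesTermSketch {

    // "Indexed" view: maps each term to the IDs of the documents containing it.
    static Map<String, List<Integer>> buildPostings(String[] valuesByDoc) {
        Map<String, List<Integer>> postings = new HashMap<>();
        for (int docId = 0; docId < valuesByDoc.length; docId++) {
            postings.computeIfAbsent(valuesByDoc[docId], k -> new ArrayList<>()).add(docId);
        }
        return postings;
    }

    // Term lookup against the inverted index: one map access.
    static List<Integer> searchPostings(Map<String, List<Integer>> postings, String term) {
        return postings.getOrDefault(term, new ArrayList<>());
    }

    // "Doc values" view: the array itself, one value per document.
    // Matching a term means scanning every document's value.
    static List<Integer> searchDocValues(String[] valuesByDoc, String term) {
        List<Integer> hits = new ArrayList<>();
        for (int docId = 0; docId < valuesByDoc.length; docId++) {
            if (term.equals(valuesByDoc[docId])) {
                hits.add(docId);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String[] status = {"open", "closed", "open", "merged"};
        // Both paths find documents 0 and 2.
        System.out.println(searchPostings(buildPostings(status), "open")); // [0, 2]
        System.out.println(searchDocValues(status, "open"));               // [0, 2]
    }
}
```

The scan is slower than the index lookup on a large segment, which is why documentation for this behavior would likely want to mention the performance trade-off alongside the new capability.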
The keywordfield mapper provides access to various query types, e.g. the termsQuery, fuzzyQuery. These are inherited as is from the StringType. But we do not take into account the fact that keyword fields can have doc_values enabled. This PR adds the ability for various queries to first check if doc_values are enabled and if so out-source the work to lucene to decide if it's better to use index values or doc_values when running queries.
Signed-off-by: Harsha Vamsi Kalluri <[email protected]>
9ec68e0 to a80478b
The failing test is flaky and unrelated to my change. @reta @nknize @rishabhmaurya @andrross can you take a look?
An "unstable" result is acceptable -- it means that a test failed on some runs but passed on others.
rest-api-spec/src/main/resources/rest-api-spec/test/search/340_keyword_doc_values.yml
The backport to
To backport manually, run these commands in your terminal:
# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-10425-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 63aff16c0d08bac44e1d1a6158ee3f838a043074
# Push it to GitHub
git push --set-upstream origin backport/backport-10425-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x
Then, create a pull request where the
The keywordfield mapper provides access to various query types, e.g. the termsQuery, fuzzyQuery. These are inherited as is from the StringType. But we do not take into account the fact that keyword fields can have doc_values enabled. This PR adds the ability for various queries to first check if doc_values are enabled and if so out-source the work to lucene to decide if it's better to use index values or doc_values when running queries.
Signed-off-by: Harsha Vamsi Kalluri <[email protected]>
(cherry picked from commit 63aff16)

…#11031)
* Adds various Query overrides to Keyword Field (#10425)
The keywordfield mapper provides access to various query types, e.g. the termsQuery, fuzzyQuery. These are inherited as is from the StringType. But we do not take into account the fact that keyword fields can have doc_values enabled. This PR adds the ability for various queries to first check if doc_values are enabled and if so out-source the work to lucene to decide if it's better to use index values or doc_values when running queries.
Signed-off-by: Harsha Vamsi Kalluri <[email protected]>
(cherry picked from commit 63aff16)
* Update CHANGELOG.md
---------
Signed-off-by: Harsha Vamsi Kalluri <[email protected]>
Co-authored-by: Michael Froh <[email protected]>

The keywordfield mapper provides access to various query types, e.g. the termsQuery, fuzzyQuery. These are inherited as is from the StringType. But we do not take into account the fact that keyword fields can have doc_values enabled. This PR adds the ability for various queries to first check if doc_values are enabled and if so out-source the work to lucene to decide if it's better to use index values or doc_values when running queries.
Signed-off-by: Harsha Vamsi Kalluri <[email protected]>
Signed-off-by: Shivansh Arora <[email protected]>
Description
The keyword field mapper provides access to various query types, e.g. termsQuery and fuzzyQuery. These are inherited as is from StringFieldType, but the inherited implementations do not take into account that keyword fields can have doc_values enabled. This PR makes these queries first check whether doc_values are enabled and, if so, delegate to Lucene the decision of whether index values or doc_values are the better choice for running the query.
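The check described above can be sketched roughly as follows. This is a minimal, self-contained Java model and not the PR's actual code: the class `KeywordQueryPlanSketch`, the enum `QueryPlan`, and the method `choosePlan` are hypothetical names introduced for illustration, and no Lucene or OpenSearch classes are used. Rejecting a field that is neither indexed nor has doc_values is an assumption of this sketch.

```java
// Hypothetical sketch; names are illustrative and not taken from OpenSearch.
class KeywordQueryPlanSketch {

    enum QueryPlan {
        INDEX_ONLY,           // only the inverted index is available
        DOC_VALUES_ONLY,      // field is not indexed, but doc_values exist
        INDEX_OR_DOC_VALUES   // both exist; the engine may pick the cheaper one
    }

    // Decide which structures a keyword query can run against.
    static QueryPlan choosePlan(boolean indexed, boolean hasDocValues) {
        if (indexed && hasDocValues) {
            // Hand over both options and let the engine decide at execution time.
            return QueryPlan.INDEX_OR_DOC_VALUES;
        }
        if (hasDocValues) {
            return QueryPlan.DOC_VALUES_ONLY;
        }
        if (indexed) {
            return QueryPlan.INDEX_ONLY;
        }
        // Assumption: with neither structure there is nothing to search.
        throw new IllegalArgumentException("field is neither indexed nor has doc_values");
    }

    public static void main(String[] args) {
        System.out.println(choosePlan(true, true));   // INDEX_OR_DOC_VALUES
        System.out.println(choosePlan(false, true));  // DOC_VALUES_ONLY
        System.out.println(choosePlan(true, false));  // INDEX_ONLY
    }
}
```

The second branch is the case that did not work before this change: a keyword field mapped without an index but with doc_values.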
Related Issues
Resolves #7057
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.