[DerivedFields] PR1 - DerivedFieldScript and query execution logic #12746

Merged
merged 8 commits into from
Mar 25, 2024

rishabhmaurya
Contributor

@rishabhmaurya rishabhmaurya commented Mar 19, 2024

Description

Derived Fields PR1:

Contains the building-block logic for derived fields:

  1. DerivedFieldScript factory: The script factory used to execute scripts defined against derived fields of any type.
  2. DerivedFieldValueFetcher: The value fetcher executes the script and returns the values as a List<Object>. It takes a DerivedFieldScript.LeafFactory as input and requires the consumer to call setNextReader() whenever the segment changes.
  3. DerivedFieldQuery: The query used by any derived field. It takes an input query and a DerivedFieldValueFetcher. It uses a two-phase iterator whose approximation iterator matches all docs. On each approximate match, it creates a Lucene MemoryIndex for the doc, fetches the value of the derived field from _source using DerivedFieldValueFetcher, and executes the input query against that index.
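
The flow in the three items above can be modeled in plain Java. This is a minimal sketch, not the code in this PR: it uses no Lucene or OpenSearch classes, and the names `DerivedFieldMatchSketch`, `Segment`, `ValueFetcher`, and `matchingDocs` are hypothetical. The script is modeled as a function over a document's _source map, and the input query as a predicate over derived values, standing in for the MemoryIndex search.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

/** Plain-Java model of the matching flow; all names here are hypothetical. */
final class DerivedFieldMatchSketch {

    /** Stand-in for one segment: element i is the _source of doc i. */
    static final class Segment {
        final List<Map<String, Object>> sources;

        Segment(List<Map<String, Object>> sources) {
            this.sources = sources;
        }
    }

    /** Stand-in for the value fetcher: runs a script per doc and returns its values. */
    static final class ValueFetcher {
        private final Function<Map<String, Object>, List<Object>> script;
        private Segment current;

        ValueFetcher(Function<Map<String, Object>, List<Object>> script) {
            this.script = script;
        }

        /** Contract from the description: must be called whenever the segment changes. */
        void setNextReader(Segment segment) {
            this.current = segment;
        }

        /** Executes the script against the _source of the given doc in the current segment. */
        List<Object> fetchValues(int docId) {
            if (current == null) {
                throw new IllegalStateException("setNextReader() must be called before fetching");
            }
            return script.apply(current.sources.get(docId));
        }
    }

    /**
     * Phase 1 (approximation) visits every doc in the segment.
     * Phase 2 derives the values for the doc and applies the input query to them.
     */
    static List<Integer> matchingDocs(Segment segment, ValueFetcher fetcher, Predicate<Object> query) {
        fetcher.setNextReader(segment);
        List<Integer> hits = new ArrayList<>();
        for (int docId = 0; docId < segment.sources.size(); docId++) {
            // Values computed at query time; the PR indexes them into a MemoryIndex instead.
            List<Object> derived = fetcher.fetchValues(docId);
            // Confirmation step: the doc matches if any derived value satisfies the query.
            if (derived.stream().anyMatch(query)) {
                hits.add(docId);
            }
        }
        return hits;
    }
}
```

For example, with a script that derives the domain from an `email` field in _source and a predicate that tests for `example.org`, `matchingDocs` returns the ids of the docs whose address ends in that domain. Because the approximation matches every doc, the cost of this pattern grows with the number of docs visited, since the script runs once per candidate.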

This PR does not cover any integration. It is standalone, and its components will be used by subsequent PRs (#12808, #12569, )

Related Issues

Resolves #12508

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Contributor

github-actions bot commented Mar 19, 2024

Compatibility status:

Checks if related components are compatible with change e6c7c63

Incompatible components

Incompatible components: [https://github.com/opensearch-project/security.git]

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/flow-framework.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/sql.git]

Contributor

❌ Gradle check result for 668d1bb: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Contributor

✅ Gradle check result for f71522a: SUCCESS

codecov bot commented Mar 21, 2024

Codecov Report

Attention: Patch coverage is 63.38028%, with 26 lines in your changes missing coverage. Please review.

Project coverage is 71.37%. Comparing base (b15cb0c) to head (e6c7c63).
Report is 81 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| .../org/opensearch/index/query/DerivedFieldQuery.java | 60.97% | 15 Missing and 1 partial ⚠️ |
| ...java/org/opensearch/script/DerivedFieldScript.java | 60.00% | 8 Missing ⚠️ |
| ...nsearch/index/mapper/DerivedFieldValueFetcher.java | 80.00% | 2 Missing ⚠️ |
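
The per-file rows add up to the headline figure under two assumptions that the report itself does not state: partial lines count as uncovered, and the changed-line totals per file are 41, 20, and 10 (inferred here from the percentages). That gives 16 + 8 + 2 = 26 uncovered lines out of 71. A small hypothetical helper, `PatchCoverageCheck`, shows the arithmetic:

```java
/** Hypothetical helper that recomputes patch coverage from line counts. */
final class PatchCoverageCheck {

    /** Returns the covered share of totalLines as a percentage. */
    static double coveragePercent(int totalLines, int uncoveredLines) {
        return 100.0 * (totalLines - uncoveredLines) / totalLines;
    }

    public static void main(String[] args) {
        // Totals are inferred from the reported percentages, not taken from the report.
        System.out.println(coveragePercent(71, 26)); // whole patch
        System.out.println(coveragePercent(41, 16)); // DerivedFieldQuery, 15 missing + 1 partial
        System.out.println(coveragePercent(20, 8));  // DerivedFieldScript
        System.out.println(coveragePercent(10, 2));  // DerivedFieldValueFetcher
    }
}
```

Under these assumptions, each computed value agrees with the corresponding reported percentage to within 0.01.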
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #12746      +/-   ##
============================================
- Coverage     71.42%   71.37%   -0.05%     
- Complexity    59978    60220     +242     
============================================
  Files          4985     5014      +29     
  Lines        282275   283746    +1471     
  Branches      40946    41126     +180     
============================================
+ Hits         201603   202524     +921     
- Misses        63999    64358     +359     
- Partials      16673    16864     +191     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Contributor

❌ Gradle check result for 2002c2c: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Contributor

❌ Gradle check result for 48b5680: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Contributor

❌ Gradle check result for beec404: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Contributor

❌ Gradle check result for b90f5dd: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Contributor

❕ Gradle check result for e6c7c63: UNSTABLE

  • TEST FAILURES:
      2 org.opensearch.cluster.MinimumClusterManagerNodesIT.testThreeNodesNoClusterManagerBlock

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@msfroh
Collaborator

msfroh commented Mar 25, 2024

The codecov/patch check will get covered on subsequent commits.

@msfroh msfroh merged commit 70711cf into opensearch-project:main Mar 25, 2024
35 of 39 checks passed
@msfroh msfroh added the backport 2.x Backport to 2.x branch label Mar 28, 2024
opensearch-trigger-bot bot pushed a commit that referenced this pull request Mar 28, 2024
First in a series of commits to support derived fields, a form of schema-on-read.

This commit adds:
1. DerivedFieldScript factory: The script factory used to execute scripts defined against
   derived fields of any type.
2. DerivedFieldValueFetcher: The value fetcher executes the script and returns the values
   as a List<Object>. It takes a DerivedFieldScript.LeafFactory as input and requires the
   consumer to call setNextReader() whenever the segment changes.
3. DerivedFieldQuery: The query used by any derived field. It takes an input query and a
   DerivedFieldValueFetcher. It uses a two-phase iterator whose approximation iterator matches
   all docs. On each approximate match, it creates a Lucene MemoryIndex for the doc, fetches the
   value of the derived field from _source using DerivedFieldValueFetcher, and executes the input
   query against that index.

---------

Signed-off-by: Rishabh Maurya <[email protected]>
(cherry picked from commit 70711cf)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
msfroh pushed a commit that referenced this pull request Apr 2, 2024
…#12968)

(cherry picked from commit 70711cf)

Signed-off-by: Rishabh Maurya <[email protected]>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
@rishabhmaurya rishabhmaurya mentioned this pull request Apr 11, 2024
6 tasks
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
…rch-project#12746)

Signed-off-by: Rishabh Maurya <[email protected]>
Signed-off-by: Shivansh Arora <[email protected]>
harshavamsi pushed a commit to harshavamsi/OpenSearch that referenced this pull request Apr 29, 2024
…rch-project#12746)

Signed-off-by: Rishabh Maurya <[email protected]>
Labels
  • backport 2.x: Backport to 2.x branch
  • enhancement: Enhancement or improvement to existing feature or request
  • Search: Search query, autocomplete ...etc
  • skip-changelog
Development

Successfully merging this pull request may close these issues.

[Derived Fields] Add basic query support with derived fields
3 participants