Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Catch task description error #12834

Merged
merged 4 commits into from
Mar 22, 2024

Conversation

dblock
Copy link
Member

@dblock dblock commented Mar 21, 2024

Description

Alternative to (just the first two commits) to #12831.

A user that enables trace-level logging with OpenSearch will run into a serialization error on geo tasks that they otherwise would not run into. It comes from here. This also happens in a test, and #12783 attempts to ignore the failed serialization error. This PR always returns a task description.

Reproduce with

./gradlew ':server:test' --tests "org.opensearch.search.geo.GeoPointShapeQueryTests.testQueryLinearRing" -Dtests.opensearch.logger.level=TRACE

Related Issues

Closes #11688
Closes #12324

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@reta
Copy link
Collaborator

reta commented Mar 21, 2024

@reta Do you see any bad unintended consequences here?

No, I don't, thanks @andrross , I also feel like this is the direction we should be taking, I've suggested a bit more wider fix here #12834 (comment)

@dblock
Copy link
Member Author

dblock commented Mar 21, 2024

@reta implemented your suggested approach + test

@andrross
Copy link
Member

A bit of a side note, given that a linear ring is a line string with some special constraints (start and end point must be the same), I wonder why ShapeType.LINEARRING was introduced? It seems like LinearRing could have just been a child of Line (which it currently is) with a constructor enforcing its special constraint, but otherwise could have delegated all other logic to the parent Line class.

Copy link
Contributor

✅ Gradle check result for 316479d: SUCCESS

Copy link

codecov bot commented Mar 21, 2024

Codecov Report

Attention: Patch coverage is 71.42857% with 2 lines in your changes are missing coverage. Please review.

Project coverage is 71.31%. Comparing base (b15cb0c) to head (71d3f02).
Report is 71 commits behind head on main.

Files Patch % Lines
...va/org/opensearch/action/search/SearchRequest.java 66.66% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #12834      +/-   ##
============================================
- Coverage     71.42%   71.31%   -0.11%     
- Complexity    59978    60170     +192     
============================================
  Files          4985     5011      +26     
  Lines        282275   283664    +1389     
  Branches      40946    41117     +171     
============================================
+ Hits         201603   202294     +691     
- Misses        63999    64595     +596     
- Partials      16673    16775     +102     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@reta
Copy link
Collaborator

reta commented Mar 21, 2024

A bit of a side note, given that a linear ring is a line string with some special constraints (start and end point must be the same), I wonder why ShapeType.LINEARRING was introduced?

Hard to say to be fair, wanna take a look? ;-)

@github-actions github-actions bot added the Other label Mar 21, 2024
Copy link
Contributor

❕ Gradle check result for 71d3f02: UNSTABLE

  • TEST FAILURES:
      3 org.opensearch.cluster.MinimumClusterManagerNodesIT.testThreeNodesNoClusterManagerBlock

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@reta
Copy link
Collaborator

reta commented Mar 21, 2024

It's all yours @dblock !

@andrross andrross added the backport 2.x Backport to 2.x branch label Mar 21, 2024
@reta reta merged commit 0c956e8 into opensearch-project:main Mar 22, 2024
37 of 39 checks passed
@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-12834-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 0c956e81373c9d5878710c1bafe471431000a64f
# Push it to GitHub
git push --set-upstream origin backport/backport-12834-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-12834-to-2.x.

@reta
Copy link
Collaborator

reta commented Mar 22, 2024

@dblock could you please backport to 2.x manually? thank you!

sb.append("source[");
try {
sb.append(source.toString(FORMAT_PARAMS));
} catch (final OpenSearchException ex) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@reta / @dblock / @andrross : This is not right. buildDescription() shouldn't swallow exception. Anything unexpected should fail the request as early as possible. Earlier, IOException was wrapped and thrown and now that also gets swallowed. Why an object which is throwing UnsupportedOperationException not handled properly and reached till here at all? In order to fix an user behavior (which may not be an issue, as actual request may not reach till this point as well, unless someone has tested an actual user rest request and confirmed the behavior?). Tomorrow, somebody would wrap NPE also as OpenSearchException and start swallowing here as apparently buildDescription should succeed for some reason.

Copy link
Collaborator

@reta reta Mar 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why an object which is throwing UnsupportedOperationException not handled properly and reached till here at all?

@shwetathareja There are few levels this exception will be raised and it will be handled properly, it is not swallowed. The issue that the description of the request (serialization of the POJO) - the string representation - triggers the problem first, even before the request is processed (and validated). So the suggested fix here is to provide partial description of the request that will eventually fail for exactly the same cause. This is what tests are verifying - the request fails with UnsupportedOperationException

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specifically in the request path the request is read, then traced (including its description, which throws exception), then validated & al. If we switch the order of things and decide whether the request should be accepted before tracing it, then tracing will fail to show what the request being validated was.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the rest request will come, first the request has to be parsed/ deserialized, shouldn't it fail there, rather fail during serialization flow which comes after that.

I tried indexing and search linearring shape type with trace logging enabled on OpenSearch 2.11 cluster.
trace logging enabled

    "logger": {
      "org": {
        "opensearch": {
          "tasks": "TRACE"
        }
      }
    }

Below are the errors, I got:

During Indexing:

"caused_by": {
        "type": "illegal_argument_exception",
        "reason": "No enum constant org.opensearch.geometry.ShapeType.LINERRING"
      }

During search

    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "unknown geo_shape [linearring]"
    }

The error during search is as expected from the parsing flow:

final GeoShapeType type = GeoShapeType.forName(subParser.text());

Do you have an example of a rest request which will reach the serialization error ( from .getDescription()) which is fixed in the code above or the validation error you mentioned.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not an end-to-end test, The test simply testing a particular flow for GeoShapeQueryBuilder and expecting an exception.

Copy link
Collaborator

@reta reta Mar 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The builder could validate the supported shapes in AbstractGeometryQueryBuilder when creating the POJO itself.

I already replied to that here #12834 (comment)

I think the ideal option is @andrross 's idea to get rid of LinearRing if possible - this is supporting abstraction

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dblock : Are you reverting this change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I am still convinced that the implementation merged here is better than what we had. Today tasks can fail to getDescription() and the better place to guard it is in the code we added it in. If we get rid of LinearRing, then we don't need to protect from failed serialization.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once we remove the Linearring, the only exception, you are taking care in getDescription() is IOException which we shouldn't catch and swallow even for getDescription() to work. I don't think there is need for this exception handling considering it has always been like this, not impacting any other tests/ flow yet.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree re:once we remove the Linearring. I opened #12910, I'll try to get to it.

dblock added a commit to dblock/OpenSearch that referenced this pull request Mar 22, 2024
* Always return a task description even when it cannot be serialized.

Signed-off-by: dblock <[email protected]>

* Expect tasks to fail.

Signed-off-by: dblock <[email protected]>

* Only catch exceptions when getting description.

Signed-off-by: dblock <[email protected]>

* Added <error: ...> to mark error more clearly.

Signed-off-by: dblock <[email protected]>

---------

Signed-off-by: dblock <[email protected]>
(cherry picked from commit 0c956e8)
@dblock dblock deleted the catch-task-description-error branch March 22, 2024 12:22
@dblock
Copy link
Member Author

dblock commented Mar 22, 2024

@dblock could you please backport to 2.x manually? thank you!

#12858

dblock added a commit to dblock/OpenSearch that referenced this pull request Mar 25, 2024
* Always return a task description even when it cannot be serialized.

Signed-off-by: dblock <[email protected]>

* Expect tasks to fail.

Signed-off-by: dblock <[email protected]>

* Only catch exceptions when getting description.

Signed-off-by: dblock <[email protected]>

* Added <error: ...> to mark error more clearly.

Signed-off-by: dblock <[email protected]>

---------

Signed-off-by: dblock <[email protected]>
(cherry picked from commit 0c956e8)
reta pushed a commit that referenced this pull request Mar 25, 2024
* Always return a task description even when it cannot be serialized.

Signed-off-by: dblock <[email protected]>

* Expect tasks to fail.

Signed-off-by: dblock <[email protected]>

* Only catch exceptions when getting description.

Signed-off-by: dblock <[email protected]>

* Added <error: ...> to mark error more clearly.

Signed-off-by: dblock <[email protected]>

---------

Signed-off-by: dblock <[email protected]>
(cherry picked from commit 0c956e8)
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
* Always return a task description even when it cannot be serialized.

Signed-off-by: dblock <[email protected]>

* Expect tasks to fail.

Signed-off-by: dblock <[email protected]>

* Only catch exceptions when getting description.

Signed-off-by: dblock <[email protected]>

* Added <error: ...> to mark error more clearly.

Signed-off-by: dblock <[email protected]>

---------

Signed-off-by: dblock <[email protected]>
Signed-off-by: Shivansh Arora <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch backport-failed bug Something isn't working flaky-test Random test failure that succeeds on second run Geospatial Other Search Search query, autocomplete ...etc skip-changelog
Projects
None yet
4 participants