Fix issue of red index on close for remote enabled clusters #15990
❌ Gradle check result for a1d5a87: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
I have added tests for the edge case mentioned in the referenced issue. I added the tests first and then the main code changes -
❌ Gradle check result for ac864f0: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
ac864f0 to 29cf87f
Changes look great.
server/src/internalClusterTest/java/org/opensearch/remotestore/RemoteStoreIT.java
Signed-off-by: Ashish Singh <[email protected]>
❌ Gradle check result for d32dea3: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Signed-off-by: Ashish Singh <[email protected]>
❕ Gradle check result for 94f998e: UNSTABLE Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.
The backport to
To backport manually, run these commands in your terminal:
# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-15990-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 f1acc7aad7db4c3c9ce2e0ac331b02105ddc85f5
# Push it to GitHub
git push --set-upstream origin backport/backport-15990-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x
Then, create a pull request where the
* Fix red index on close for remote translog Signed-off-by: Ashish Singh <[email protected]> * Add UTs Signed-off-by: Ashish Singh <[email protected]> --------- Signed-off-by: Ashish Singh <[email protected]> (cherry picked from commit f1acc7a) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
…ch-project#15990) * Fix red index on close for remote translog Signed-off-by: Ashish Singh <[email protected]> * Add UTs Signed-off-by: Ashish Singh <[email protected]> --------- Signed-off-by: Ashish Singh <[email protected]>
…16082) * Fix red index on close for remote translog * Add UTs --------- (cherry picked from commit f1acc7a) Signed-off-by: Ashish Singh <[email protected]> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Ashish Singh <[email protected]>
Description
The close index operation involves the following steps -
During a happy-path index close, we upload the translog twice -
However, if a flush happens after the operation has landed in the Lucene buffer but before the buffered sync (for sync translog) or the periodic async sync (for async translog), then steps 3(a) and 3(b) become no-ops, and the global checkpoint (GCP) uploaded in the checkpoint file is the one from the last translog sync. This creates a discrepancy between maxSeqNo and the GCP, which causes an exception while creating the ReadOnlyEngine and leaves the index red.
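To make this failure mode concrete, here is a minimal, self-contained model of it. This is a sketch only and not the OpenSearch implementation: the class `RemoteTranslogSketch` and all of its fields and methods are hypothetical names, and the flush, sync, and upload logic is heavily simplified. The `trackUploadedGcp` flag stands in for the kind of extra check this PR adds to `RemoteFsTranslog.syncNeeded()`, under the assumption that comparing the uploaded GCP with the writer's last synced GCP is enough to force one more upload.

```java
// Hypothetical, simplified model of the scenario; NOT the OpenSearch code.
public class RemoteTranslogSketch {
    private long maxSeqNo = -1;             // highest sequence number indexed
    private long writerLastSyncedGcp = -1;  // GCP known to the local translog writer
    private long uploadedGcp = -1;          // GCP in the last checkpoint file uploaded
    private boolean hasUnsyncedOps = false; // operations written but not yet uploaded
    private final boolean trackUploadedGcp; // true models the behavior after the fix

    public RemoteTranslogSketch(boolean trackUploadedGcp) {
        this.trackUploadedGcp = trackUploadedGcp;
    }

    public void addOperation(long seqNo) {
        maxSeqNo = Math.max(maxSeqNo, seqNo);
        hasUnsyncedOps = true;
    }

    // The GCP advances locally once the operation is acknowledged.
    public void updateGlobalCheckpoint(long gcp) {
        writerLastSyncedGcp = gcp;
    }

    // An upload persists whatever GCP the writer knows at that moment.
    private void upload() {
        uploadedGcp = writerLastSyncedGcp;
        hasUnsyncedOps = false;
    }

    // A flush uploads the translog regardless of the sync schedule.
    public void flush() {
        upload();
    }

    public boolean syncNeeded() {
        boolean gcpBehind = trackUploadedGcp && uploadedGcp != writerLastSyncedGcp;
        return hasUnsyncedOps || gcpBehind;
    }

    // Close-time sync: a no-op when syncNeeded() returns false.
    public void sync() {
        if (syncNeeded()) {
            upload();
        }
    }

    // Stand-in for the check that fails when creating the read-only engine.
    public boolean canOpenReadOnly() {
        return uploadedGcp == maxSeqNo;
    }

    // Replays: index, early flush, GCP advance, close-time sync.
    public static boolean closeAfterEarlyFlush(boolean trackUploadedGcp) {
        RemoteTranslogSketch translog = new RemoteTranslogSketch(trackUploadedGcp);
        translog.addOperation(0);
        translog.flush();                    // uploads a stale GCP
        translog.updateGlobalCheckpoint(0);
        translog.sync();
        return translog.canOpenReadOnly();
    }

    public static void main(String[] args) {
        System.out.println("without tracking: " + closeAfterEarlyFlush(false));
        System.out.println("with tracking:    " + closeAfterEarlyFlush(true));
    }
}
```

In this model the run without tracking ends with a stale uploaded GCP, because the early flush already cleared the pending operations and the close-time sync has nothing to do. With tracking enabled, the mismatch between the uploaded GCP and the writer's GCP makes `syncNeeded()` return true, so one more upload brings the two in line.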
In this PR, changes are made to track the global checkpoint that has been updated as part of the successful translog upload to remote store. The new tracked global checkpoint is now also used in the RemoteFsTranslog.syncNeeded() method and checked against the current (translog writer) last synced global checkpoint.
Related Issues
Resolves #15989
Check List
[ ] API changes companion pull request created, if applicable.
[ ] Public documentation issue/PR created, if applicable.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following the Developer Certificate of Origin and signing off your commits, please check here.