[Bugfix] Remote translog does not honour pinned timestamp for low value of indexSettings().getRemoteTranslogExtraKeep() #16078

Merged


sachinpkale
Member

@sachinpkale sachinpkale commented Sep 25, 2024

Description

  • RemoteFsTimestampAwareTranslog implements trimUnreferencedReaders() differently from RemoteFsTranslog: it takes pinned timestamps into account and makes sure that data referenced by a pinned timestamp is not deleted from the remote translog.
  • RemoteFsTimestampAwareTranslog.trimUnreferencedReaders() calls super.trimUnreferencedReaders() with the intention of cleaning up local translog files only.
  • However, because RemoteFsTimestampAwareTranslog extends RemoteFsTranslog, this super call resolves to RemoteFsTranslog.trimUnreferencedReaders(), which is not aware of pinned timestamps.
  • In this PR, we make sure that RemoteFsTimestampAwareTranslog calls Translog.trimUnreferencedReaders() instead.
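
The dispatch problem described above can be illustrated with a minimal sketch. All class names, the `actions` list, and the `trimLocalOnly()` helper are hypothetical stand-ins and are not taken from the OpenSearch code base. Java has no `super.super`, so the sketch assumes the intermediate class exposes the base behaviour through a helper; the actual change in this PR may be structured differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Translog: only trims local files.
class BaseLog {
    final List<String> actions = new ArrayList<>();

    void trimUnreferencedReaders() {
        actions.add("local-trim");
    }
}

// Hypothetical stand-in for RemoteFsTranslog: trims local files, then
// deletes remote generations without consulting pinned timestamps.
class RemoteLog extends BaseLog {
    @Override
    void trimUnreferencedReaders() {
        super.trimUnreferencedReaders();
        actions.add("remote-delete-ignoring-pins");
    }

    // Assumed helper (not from the PR): gives subclasses access to the
    // base-class behaviour, since a grandchild cannot call it directly.
    final void trimLocalOnly() {
        super.trimUnreferencedReaders();
    }
}

// Buggy variant: "super" resolves to RemoteLog, not BaseLog.
class BuggyTimestampAwareLog extends RemoteLog {
    @Override
    void trimUnreferencedReaders() {
        super.trimUnreferencedReaders();
        actions.add("remote-delete-honouring-pins");
    }
}

// Fixed variant: only the local trim of the base class runs first.
class FixedTimestampAwareLog extends RemoteLog {
    @Override
    void trimUnreferencedReaders() {
        trimLocalOnly();
        actions.add("remote-delete-honouring-pins");
    }
}

class TrimDispatchDemo {
    // Runs one trim cycle and returns the recorded actions in order.
    static List<String> run(BaseLog log) {
        log.trimUnreferencedReaders();
        return log.actions;
    }

    public static void main(String[] args) {
        System.out.println(run(new BuggyTimestampAwareLog()));
        System.out.println(run(new FixedTimestampAwareLog()));
    }
}
```

In the buggy variant the pin-unaware remote deletion runs before the pin-aware one; in the fixed variant it never runs.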

Why does it not always fail?

  • Even though RemoteFsTimestampAwareTranslog inadvertently calls RemoteFsTranslog.trimUnreferencedReaders(), the super method is a no-op in most cases.
  • This is due to the following condition in RemoteFsTranslog.trimUnreferencedReaders():

// cleans up remote translog files not referenced in latest uploaded metadata.
// This enables us to restore translog from the metadata in case of failover or relocation.
Set<Long> generationsToDelete = new HashSet<>();
for (long generation = minRemoteGenReferenced - 1 - indexSettings().getRemoteTranslogExtraKeep(); generation >= 0; generation--) {
    if (fileTransferTracker.uploaded(Translog.getFilename(generation)) == false) {
        break;
    }
    generationsToDelete.add(generation);
}
if (generationsToDelete.isEmpty() == false) {
    deleteRemoteGeneration(generationsToDelete);
    translogTransferManager.deleteStaleTranslogMetadataFilesAsync(remoteGenerationDeletionPermits::release);
    deleteStaleRemotePrimaryTerms();
} else {
    remoteGenerationDeletionPermits.release(REMOTE_DELETION_PERMITS);
}

  • The for loop always keeps the most recent indexSettings().getRemoteTranslogExtraKeep() generations, so the initial calls to trimUnreferencedReaders are a no-op.
  • After calling RemoteFsTranslog.trimUnreferencedReaders, RemoteFsTimestampAwareTranslog removes from fileTracker the entries for translog files that are no longer present locally.
  • Because fileTracker then contains only entries for files that are present locally, RemoteFsTranslog.trimUnreferencedReaders usually remains a no-op even after entering the for loop: it breaks out at the first generation that is no longer tracked.
  • The issue occurs specifically when indexSettings().getRemoteTranslogExtraKeep() is set to a low value, because the loop then starts at a generation that is still tracked.
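
The effect of the extra-keep setting can be sketched by replaying the loop in isolation. This is a simplified model under stated assumptions: the tracker is represented as a plain set of generation numbers instead of file names, and the class name, method name, and all numbers are made up for illustration.

```java
import java.util.Set;
import java.util.TreeSet;

class ExtraKeepDemo {
    // Replays the generation loop quoted above. `tracked` is a hypothetical
    // stand-in for fileTransferTracker, keyed by generation number.
    static Set<Long> generationsToDelete(long minRemoteGenReferenced, int extraKeep, Set<Long> tracked) {
        Set<Long> result = new TreeSet<>();
        for (long generation = minRemoteGenReferenced - 1 - extraKeep; generation >= 0; generation--) {
            if (!tracked.contains(generation)) {
                break; // first untracked generation stops the scan
            }
            result.add(generation);
        }
        return result;
    }

    public static void main(String[] args) {
        // Assume only generations 7..10 are still local, hence still tracked.
        Set<Long> tracked = Set.of(7L, 8L, 9L, 10L);

        // Large extra keep: the start generation is negative, the loop never runs.
        System.out.println(generationsToDelete(10, 100, tracked));
        // Moderate extra keep: the scan starts at generation 4, which is untracked.
        System.out.println(generationsToDelete(10, 5, tracked));
        // Low extra keep: the scan starts at generation 9, which is tracked, so
        // generations are selected for deletion with no pinned-timestamp check.
        System.out.println(generationsToDelete(10, 0, tracked));
    }
}
```

Only the last call selects anything for deletion, which matches the observation that the bug surfaces for low values of the setting.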

Related Issues

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.


✅ Gradle check result for ac4facb: SUCCESS


codecov bot commented Sep 25, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 71.97%. Comparing base (dc4dbce) to head (ac4facb).
Report is 19 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #16078      +/-   ##
============================================
- Coverage     71.98%   71.97%   -0.01%     
+ Complexity    64431    64378      -53     
============================================
  Files          5281     5281              
  Lines        301063   301067       +4     
  Branches      43491    43492       +1     
============================================
- Hits         216715   216705      -10     
- Misses        66526    66531       +5     
- Partials      17822    17831       +9     


@sachinpkale sachinpkale changed the title from "Bugfix in RemoteFsTimestampAwareTranslog.trimUnreferencedReaders" to "[Bugfix] Remote translog does not honour pinned timestamp for low value of indexSettings().getRemoteTranslogExtraKeep()" Sep 25, 2024
@sachinpkale sachinpkale marked this pull request as ready for review September 25, 2024 14:12
@gbbafna gbbafna merged commit 1bddf2f into opensearch-project:main Sep 30, 2024
66 of 69 checks passed
@gbbafna gbbafna added the backport 2.x Backport to 2.x branch label Sep 30, 2024
opensearch-trigger-bot bot pushed a commit that referenced this pull request Sep 30, 2024
)

Signed-off-by: Sachin Kale <[email protected]>
(cherry picked from commit 1bddf2f)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
gbbafna pushed a commit that referenced this pull request Sep 30, 2024
) (#16126)

(cherry picked from commit 1bddf2f)

Signed-off-by: Sachin Kale <[email protected]>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
hainenber pushed a commit to hainenber/OpenSearch that referenced this pull request Oct 1, 2024
ruai0511 pushed a commit to ruai0511/OpenSearch that referenced this pull request Oct 4, 2024
dk2k pushed a commit to dk2k/OpenSearch that referenced this pull request Oct 16, 2024
dk2k pushed a commit to dk2k/OpenSearch that referenced this pull request Oct 17, 2024
dk2k pushed a commit to dk2k/OpenSearch that referenced this pull request Oct 21, 2024
Labels
backport 2.x Backport to 2.x branch skip-changelog

3 participants