[BUG] Stuck segments upload leads to high refresh lag #11020

Closed
ashking94 opened this issue Oct 31, 2023 · 1 comment
Labels: bug, Storage:Durability, Storage:Remote

@ashking94 (Member)

Describe the bug
When segments are being uploaded to the remote store and a flush or force merge happens, the segment tracker's data can be erased while an async refresh retry is in progress. As a result, the latch is never counted down.

for (String src : filteredFiles) {
    // Initializing listener here to ensure that the stats increment operations are thread-safe
    UploadListener statsListener = createUploadListener();
    ActionListener<Void> aggregatedListener = ActionListener.wrap(resp -> {
        statsListener.onSuccess(src);
        batchUploadListener.onResponse(resp);
    }, ex -> {
        logger.warn(() -> new ParameterizedMessage("Exception: [{}] while uploading segment files", ex), ex);
        if (ex instanceof CorruptIndexException) {
            indexShard.failShard(ex.getMessage(), ex);
        }
        statsListener.onFailure(src);
        batchUploadListener.onFailure(ex);
    });
    statsListener.beforeUpload(src);
    remoteDirectory.copyFrom(storeDirectory, src, IOContext.DEFAULT, aggregatedListener);
}
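
To make the failure mode easier to picture, here is a minimal standalone sketch. It is not the OpenSearch code: the class `StuckLatchSketch`, the method `runBatch`, the `tracker` map, and the `eraseTrackerMidway` flag are all hypothetical. It also assumes one plausible mechanism, namely that the completion callback fails when its tracker entry has been erased and therefore never reaches `countDown()`. The actual path in the remote store code may differ.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class StuckLatchSketch {

    /**
     * Simulates one upload batch and returns true if the latch reached zero
     * within the timeout. eraseTrackerMidway stands in for a flush/force merge
     * clearing tracker state while uploads are in flight (an assumption).
     */
    public static boolean runBatch(List<String> files, boolean eraseTrackerMidway) {
        // Hypothetical stand-in for the segment tracker: file name -> bytes in flight.
        Map<String, Long> tracker = new ConcurrentHashMap<>();
        CountDownLatch latch = new CountDownLatch(files.size());

        for (String file : files) {
            tracker.put(file, 1L); // loosely analogous to beforeUpload(src)
        }
        if (eraseTrackerMidway) {
            tracker.clear(); // tracker data erased before the callbacks run
        }

        for (String file : files) {
            Thread callback = new Thread(() -> {
                // Completion callback: once the entry is gone the lookup fails,
                // so the countDown() below is never reached.
                Long size = tracker.remove(file);
                if (size == null) {
                    throw new IllegalStateException("no tracker entry for " + file);
                }
                latch.countDown();
            });
            // Swallow the failure quietly, as a lost async callback error would be.
            callback.setUncaughtExceptionHandler((thread, ex) -> { });
            callback.start();
        }

        try {
            // A bounded wait keeps the demo finite; an unbounded await() would hang.
            return latch.await(500, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> files = List.of("_0.cfs", "_0.si");
        System.out.println("normal batch completed: " + runBatch(files, false));
        System.out.println("batch with erased tracker completed: " + runBatch(files, true));
    }
}
```

In this sketch the first batch completes and the second one times out with the latch still above zero. With an unbounded `await()`, the waiting thread would stay blocked, which matches the stuck refresh described here.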

To Reproduce
Run indexing for a long time.

Expected behavior
Refreshes should never get stuck.

@ashking94 added the bug, untriaged, Storage:Durability, and Storage:Remote labels on Oct 31, 2023
@ashking94 self-assigned this on Nov 1, 2023
@ashking94 (Member, Author)

This is fixed with #11896

@github-project-automation (bot) moved this from 🆕 New to ✅ Done in Storage Project Board on Feb 26, 2024