Too many scroll contexts in Elasticsearch due to scroll requests from ES fetchers #14619

Closed
lahsivjar opened this issue Nov 12, 2024 · 4 comments
Assignees
Labels
bug impact:high Short-term priority; add to current release, or definitely next. urgency:high

Comments

@lahsivjar
Contributor

APM Server version (apm-server version): All APM servers using ES fetchers

Description of the problem including expected versus actual behavior: Too many scroll contexts are being created in Elasticsearch. Looking at the code, we don't seem to clear the scroll context when the refresh encounters an error, which could lead to scroll contexts piling up.

Steps to reproduce:

We can probably observe this locally by simulating an error in the metadata fetcher's refresh call (ref).
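A minimal sketch of the suspected leak and one possible fix, for illustration only. It is not the APM Server code: `scrollClient`, `fakeClient`, `refresh`, `simulate`, and `errSimulated` are hypothetical names, and the fake stands in for Elasticsearch. It shows a refresh loop that always clears the scroll context, including when a page request fails, and a way to simulate that failure locally.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// scrollClient is a hypothetical abstraction over the Elasticsearch scroll
// API. It is an assumption for this sketch, not an APM Server interface.
type scrollClient interface {
	// Search opens a scroll context and returns its ID with the first page.
	Search(ctx context.Context) (scrollID string, hits []string, err error)
	// Scroll fetches the next page of an open scroll context.
	Scroll(ctx context.Context, scrollID string) (hits []string, err error)
	// ClearScroll releases the scroll context on the Elasticsearch side.
	ClearScroll(ctx context.Context, scrollID string) error
}

// refresh pages through all results and always clears the scroll context,
// including when a page request fails part-way through.
func refresh(ctx context.Context, c scrollClient) (all []string, err error) {
	scrollID, hits, err := c.Search(ctx)
	if err != nil {
		return nil, fmt.Errorf("open scroll: %w", err)
	}
	defer func() {
		// Assumption: use a fresh context here, because ctx may already be
		// past its deadline in exactly the failure case being cleaned up.
		if cerr := c.ClearScroll(context.Background(), scrollID); cerr != nil {
			err = errors.Join(err, cerr)
		}
	}()
	for len(hits) > 0 {
		all = append(all, hits...)
		if hits, err = c.Scroll(ctx, scrollID); err != nil {
			return nil, fmt.Errorf("fetch next page: %w", err)
		}
	}
	return all, nil
}

// fakeClient is a test double: it serves one page from Search, then returns
// scrollErr (or an empty page) from Scroll, and counts ClearScroll calls.
type fakeClient struct {
	scrollErr error
	cleared   int
}

func (f *fakeClient) Search(context.Context) (string, []string, error) {
	return "scroll-1", []string{"a"}, nil
}

func (f *fakeClient) Scroll(context.Context, string) ([]string, error) {
	return nil, f.scrollErr
}

func (f *fakeClient) ClearScroll(context.Context, string) error {
	f.cleared++
	return nil
}

// errSimulated mimics the "context deadline exceeded" failure from the logs.
var errSimulated = context.DeadlineExceeded

// simulate runs refresh against a fake whose Scroll returns scrollErr and
// reports how many times the scroll context was cleared.
func simulate(scrollErr error) (cleared int, err error) {
	f := &fakeClient{scrollErr: scrollErr}
	_, err = refresh(context.Background(), f)
	return f.cleared, err
}
```

Calling `simulate(errSimulated)` exercises the error path. Removing the deferred `ClearScroll` call would reproduce the suspected behavior in this sketch: the failing refresh returns while its scroll context is still open.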

Provide logs (if relevant):

In a deployment reporting this error, a lot of "refresh cache error: context deadline exceeded" log messages can be observed.

@lahsivjar
Contributor Author

Marking this as high impact and urgency because, IIANM, APM Server using up the entire scroll context limit in Elasticsearch might cause other search requests that rely on scroll contexts to start failing. This is based on my understanding so far.

@kruskall
Member

Thanks for opening an issue! 🙇

Not to argue impact/urgency, but it's worth adding that according to https://www.elastic.co/guide/en/elasticsearch/reference/current/paginate-search-results.html#clear-scroll search contexts are removed after the scroll timeout (which we are setting), so they shouldn't leak forever.
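For reference, a rough sketch of the two Elasticsearch requests involved: the initial search that opens a scroll context with a keep-alive, and the clear-scroll call that releases it without waiting for the timeout. The helper names, the `match_all` query, and the example index and base URL are assumptions for illustration. This is not how APM Server builds its requests.

```go
package main

import (
	"bytes"
	"encoding/json"
	"net/http"
	"net/url"
)

// newScrollSearchRequest builds the initial search that opens a scroll
// context. keepAlive (for example "1m") is the scroll timeout after which
// Elasticsearch removes the context if it is not used again.
func newScrollSearchRequest(baseURL, index, keepAlive string) (*http.Request, error) {
	u := baseURL + "/" + url.PathEscape(index) + "/_search?scroll=" + url.QueryEscape(keepAlive)
	body := []byte(`{"query":{"match_all":{}}}`) // placeholder query
	req, err := http.NewRequest(http.MethodPost, u, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

// newClearScrollRequest builds the request that releases a scroll context
// immediately instead of waiting for its keep-alive to expire.
func newClearScrollRequest(baseURL, scrollID string) (*http.Request, error) {
	body, err := json.Marshal(map[string]string{"scroll_id": scrollID})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodDelete, baseURL+"/_search/scroll", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}
```

With a keep-alive set, a context that is never cleared should still expire on its own, so a missing clear-scroll call only keeps each context open for up to the keep-alive period.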

@lahsivjar
Contributor Author

> search contexts are removed after the scroll timeout (which we are setting) so they shouldn't leak forever

Fair point. I think there is still a missing link here, as it doesn't make sense for scroll contexts to accumulate if they should time out.

@kruskall
Member

PR merged
