[Paginating ClusterManager Read APIs] Paginate _cat/segments API. #14259
Comments
One option for pagination might involve limiting the number of nodes that get queried per request.
Maybe there's a way to redesign this as a "walk the cluster" operation that proceeds in a predictable sequence, with a cursor that can remember its place even across cluster changes. That would keep pagination stable while making no more than N fan-out requests at any given time to gather more data.
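A minimal sketch of what such a cursor could look like, assuming the walk visits nodes in sorted node-id order; `NodeWalkCursor` and `nextBatch` are hypothetical names, not OpenSearch APIs:

```java
import java.util.List;
import java.util.stream.Collectors;

public class NodeWalkCursor {
    // Node id we last finished querying; null means start from the beginning.
    private final String lastNodeId;

    public NodeWalkCursor(String lastNodeId) {
        this.lastNodeId = lastNodeId;
    }

    /**
     * Given the node ids currently in the cluster (which may have changed since
     * the cursor was issued), return the next batch of at most fanOut nodes.
     * Sorting by node id keeps the walk deterministic: nodes that joined behind
     * the cursor are skipped, and nodes that left are simply absent.
     */
    public List<String> nextBatch(List<String> currentNodeIds, int fanOut) {
        return currentNodeIds.stream()
            .sorted()
            .filter(id -> lastNodeId == null || id.compareTo(lastNodeId) > 0)
            .limit(fanOut)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // node-1 left and node-5 joined after the cursor was handed out;
        // the walk still resumes predictably after node-2.
        NodeWalkCursor cursor = new NodeWalkCursor("node-2");
        System.out.println(cursor.nextBatch(List.of("node-4", "node-3", "node-5"), 2));
        // prints [node-3, node-4]
    }
}
```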
I think we could fan out to all the nodes for the first request, create a checkpoint of the point-in-time state, and from there on serve batched node responses, ensuring that all pages are built from the same point-in-time state. This could be costly for certain APIs though, so we really need to weigh consistency against cost.
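To make the consistency-vs-cost trade-off concrete, here is a hypothetical sketch of that checkpoint idea: the merged result of the first fan-out is frozen under a checkpoint id, and later pages are served as stable slices of it. The names are illustrative, and holding the full snapshot in memory is exactly the cost being discussed:

```java
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

public class SegmentCheckpointStore {
    // checkpoint id -> immutable point-in-time rows gathered from all nodes
    private final Map<String, List<String>> checkpoints = new ConcurrentHashMap<>();

    /** First request: fan out to all nodes (elided here), freeze the merged rows. */
    public String createCheckpoint(List<String> mergedRowsFromAllNodes) {
        String id = UUID.randomUUID().toString();
        checkpoints.put(id, List.copyOf(mergedRowsFromAllNodes));
        return id;
    }

    /** Subsequent requests: every page is a slice of the same frozen snapshot. */
    public List<String> page(String checkpointId, int from, int size) {
        List<String> rows = checkpoints.get(checkpointId);
        if (rows == null) {
            throw new IllegalArgumentException("checkpoint expired: " + checkpointId);
        }
        return rows.subList(Math.min(from, rows.size()), Math.min(from + size, rows.size()));
    }
}
```

A real implementation would also need eviction (a TTL or an explicit close), since abandoned checkpoints would otherwise pin memory on the coordinating node.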
I see the conversation is aligned towards pagination based on checkpoints. Alternatively, can we make it filter-based? The filter could be on index, shard, node, etc. With filters, the server need not retain a checkpoint, and it is also simpler to implement. Thoughts?
I see that index name or index regex is already supported as a target (filter). Can this be extended to shards and nodes?
Cluster state today is point-in-time information. Maintaining checkpoints would be very expensive; the overhead would grow with the rate of state changes in the cluster. These APIs are used for diagnostic purposes, so adding checkpoints of cluster state feels like overkill to me.
The API already supports filtering on indices. But the major problem is that users don't always know what they are looking for, and the default response in a large cluster would be very big for these APIs. We are separately thinking about adding more aggressive filtering support, like node-level filters, in admin APIs, but filtering alone may not be sufficient. One thing I was considering earlier is implicit pagination, for example by node. One challenge is that a node can leave and rejoin the cluster, and then the API responses could get confusing: in the _cat/shards or _cat/segments APIs, a shard can appear multiple times if it was reassigned to a different node while pagination was in progress. Paginating on indices ordered by creation time can provide more stable and deterministic pagination.
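A rough sketch of how a creation-time-based token could stay deterministic, assuming pages are ordered by (creation time, index name); none of these names are existing OpenSearch classes. Creation time is stable because indices created after pagination began sort after the token, and deleted indices simply drop out:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public final class IndexPageToken {
    final long creationTimeMillis; // creation time of the last index on the returned page
    final String indexName;        // tiebreaker for indices created in the same millisecond

    IndexPageToken(long creationTimeMillis, String indexName) {
        this.creationTimeMillis = creationTimeMillis;
        this.indexName = indexName;
    }

    /** Encode as an opaque next_token that the client echoes back. */
    String encode() {
        // '|' is not a legal character in index names, so it is a safe separator.
        String raw = creationTimeMillis + "|" + indexName;
        return Base64.getUrlEncoder().encodeToString(raw.getBytes(StandardCharsets.UTF_8));
    }

    static IndexPageToken decode(String token) {
        String raw = new String(Base64.getUrlDecoder().decode(token), StandardCharsets.UTF_8);
        int sep = raw.indexOf('|');
        return new IndexPageToken(Long.parseLong(raw.substring(0, sep)), raw.substring(sep + 1));
    }

    /** An index belongs to a later page iff it sorts strictly after this token. */
    boolean precedes(long otherCreationTime, String otherName) {
        if (creationTimeMillis != otherCreationTime) {
            return creationTimeMillis < otherCreationTime;
        }
        return indexName.compareTo(otherName) < 0;
    }
}
```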
By the way, _cat/segments is not a widely used API, so pagination may not be P0 for it. To start with, we can enforce aggressive filtering for this API; for _cat/shards, though, pagination would be important.
Is your feature request related to a problem? Please describe
As the number of indices and data/segments grows in an OpenSearch cluster, the response size and latency of the _cat/segments API increase. This not only makes it difficult for the client to consume such large responses, but also stresses the cluster by making it accumulate segment stats across all the indices.
Describe the solution you'd like
Blocking the API for large clusters:
Given that an OpenSearch cluster can be used to store large amounts of data, the number of segments can be very high (e.g., 1.6 million segments for 80 TB of data across 5k indices). If there isn't any strong use case for the API in monitoring activity, it could well be blocked for large clusters to prevent unnecessary usage of the nodes' resources.
Pagination:
If the API is not to be blocked, one proposal for paginating it would depend on the way the _cat/shards API is paginated. A list of shards can be generated for a requested page (as per the finalized approach for the _cat/shards API), and all the segments belonging to those shards can be displayed in a single page.
The pageSize parameter would not be very well defined in that case: instead of directly being the number of segments in a page, it would rather be the number of shards for which all segments are displayed.
This would also be susceptible to skewness in the cluster: if shards holding relatively more data/segments get picked up in the same page, response sizes can vary significantly.
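To make that skew concrete, here is a small hypothetical sketch in which a page is a fixed number of shards and the response carries every segment of those shards, so the number of segments per page floats with the data; all names are illustrative:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SegmentsPage {
    record Segment(String shardId, String name, long sizeBytes) {}

    /** Collect all segments belonging to one page of shards. */
    static Map<String, List<Segment>> buildPage(List<String> shardPage, List<Segment> allSegments) {
        Map<String, List<Segment>> page = new LinkedHashMap<>();
        for (String shardId : shardPage) {
            page.put(shardId, allSegments.stream()
                .filter(s -> s.shardId().equals(shardId))
                .toList());
        }
        return page;
    }

    public static void main(String[] args) {
        List<Segment> segments = List.of(
            new Segment("idx1[0]", "_0", 1_000), // a hot shard with several segments
            new Segment("idx1[0]", "_1", 2_000),
            new Segment("idx1[0]", "_2", 4_000),
            new Segment("idx2[0]", "_0", 500));  // a small shard with one segment
        // pageSize = 2 shards, but this page holds 4 segments; another page of
        // two shards might hold far fewer: that is the skew described above.
        System.out.println(buildPage(List.of("idx1[0]", "idx2[0]"), segments));
    }
}
```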
Related component
Cluster Manager
Describe alternatives you've considered
No response
Additional context
No response