Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Paginating ClusterManager Read APIs] Paginate _cat/segments API. #14259

Open
gargharsh3134 opened this issue Jun 13, 2024 · 8 comments
Open
Labels
Cluster Manager enhancement Enhancement or improvement to existing feature or request v2.16.0 Issues and PRs related to version 2.16.0

Comments

@gargharsh3134
Copy link
Contributor

gargharsh3134 commented Jun 13, 2024

Is your feature request related to a problem? Please describe

As the number of indices and data/segments grow in an opensearch cluster, the response size and the latency of the _cat/segments API increases, which makes it difficult not only for the client to consume such large responses, but also stresses out the cluster by making it accumulate segment stats across all the indices.

Describe the solution you'd like

  1. Blocking the API for large clusters.
    Given that opensearch cluster can be used to store large amount of data, the number of segments can be very high (for eg, 1.6 million segments for 80TB data across 5k indices). If there isn't any strong use-case of the API for any monitoring activity, it can very well be blocked for large clusters to prevent unnecessary usage of nodes' resources.

  2. Pagination:
    If the API is not to be blocked, one of the proposal to paginate will depend on the way _cat/shards API is paginated. A list of shards can be generated for a requested page (as per the finalized approach of _cat/shards API) and all the segments belonging to those shards can be displayed in a single page.
    The pageSize parameter would not be very well defined in that case. Instead of it directly being, number of segments in a page, it would rather be number of shards corresponding to which all the segments will be displayed.
    This would also be susceptible to skewness in the cluster. If shards having relatively more data/segments get picked up in same page, response sizes can significantly vary.

Related component

Cluster Manager

Describe alternatives you've considered

No response

Additional context

No response

@dblock
Copy link
Member

dblock commented Jul 1, 2024

[Catch All Triage - Attendees 1, 2, 3, 4, 5]

@msfroh
Copy link
Collaborator

msfroh commented Jul 1, 2024

Right now, a lot of the /_cat APIs work via a scatter-gather across all nodes. Essentially, the request hits a node, it fans out to all nodes, then stitches the responses together.

One option for pagination might involve limiting the number of nodes that get queried per request.

@dblock
Copy link
Member

dblock commented Jul 2, 2024

Maybe there's a way to redesign this as a "walk the cluster" in a predictable sequence with a cursor that can remember its place even with cluster changes, therefore keeping pagination stable, and making no more than N fan-out requests ahead at any given time to gather more data.

@Bukhtawar
Copy link
Collaborator

I think we could fan out to all the nodes for the first request, create checkpoint of the point in time state and then from there on serve batch node responses ensuring that all nodes are using the same point in time state. This could be costly for certain APIs though so we really need a trade-off of consistency vs cost

@rwali-aws rwali-aws added the v2.16.0 Issues and PRs related to version 2.16.0 label Jul 11, 2024
@backslasht
Copy link
Contributor

I see the conversation is aligned towards pagination based on checkpoints. Alternatively, can we make it filter based? and the filter can be based on index, shard, node, etc. With filters, server need not retain the checkpoint and it is also simpler to implement.

Thoughts?

@backslasht
Copy link
Contributor

I see index name or index regex is already supported as target (filter). Can it be extended to shards and nodes?

@shwetathareja
Copy link
Member

shwetathareja commented Jul 12, 2024

create checkpoint of the point in time state and then from there on serve batch node responses ensuring that all nodes are using the same point in time state.

Cluster state today is point in time information. Maintaining checkpoints would be very expensive, overhead will increase depending on state changes happening in the cluster. These APIs are used for diagnostic purpose. I feel adding checkpoints of cluster state will be overkill.

Alternatively, can we make it filter based?

The below API already support filtering on indices:
_cat/shards
_cat/segments
_cat/indices

But, the major problem is user don't always know what they are looking for and default response in large cluster would be very big for these APIs. We are thinking about adding more aggressive filtering support like node level in admin APIs separately but filtering only may not be sufficient. One thing I was thinking earlier was can we do implicit pagination like based on node. One challenge is node can leave and join the cluster and then the API responses could get confusing. Like in _cat/shards or _cat/segments API, a shard can appear multiple times if it was reassigned to a different node during pagination was in-progress. Paginating on indices with creation time can provide a more stable and deterministic pagination.

@shwetathareja
Copy link
Member

By the way _cat/segments is not widely used API, pagination may not be P0 for this one. For this API, to start with we can enforce aggressive filtering but for _cat/shards pagination would be important.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Cluster Manager enhancement Enhancement or improvement to existing feature or request v2.16.0 Issues and PRs related to version 2.16.0
Projects
Status: 🆕 New
Status: 2.17 (First RC 09/03, Release 09/17)
Development

No branches or pull requests

7 participants