[BUG] Exclude_ip to remove search nodes not working properly #15347

Open
sivatarunp opened this issue Aug 22, 2024 · 1 comment


Describe the bug

We have an OpenSearch 2.13.0 cluster that uses searchable snapshots. When we tried to exclude a few search nodes using the cluster.routing.allocation.exclude._ip setting, the shards got stuck in the relocating state.

The cluster also appeared to have other problems: ISM policies were not being triggered, and restore operations hung. Once the setting was removed, everything returned to normal. Is this expected behaviour for search nodes? If so, what is the recommended way to scale search nodes up and down?
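
For reference, an exclusion like this is normally applied and cleared through the cluster settings API (`PUT /_cluster/settings`). The following is a minimal Python sketch, not the exact request used here: the helper names, the base URL, and the example IPs are hypothetical, and this report does not say whether the setting was applied as `persistent` or `transient`, so `persistent` is only an assumed default.

```python
import json
import urllib.request

SETTING = "cluster.routing.allocation.exclude._ip"


def exclusion_body(ips, scope="persistent"):
    """Build a _cluster/settings body for the exclude._ip setting.

    An empty list produces a null value, which resets the setting.
    `scope` ("persistent" or "transient") is an assumption; the report
    does not say which one was used.
    """
    value = ",".join(ips) if ips else None
    return {scope: {SETTING: value}}


def put_cluster_settings(base_url, body):
    """Send the body to PUT /_cluster/settings and return the parsed reply."""
    req = urllib.request.Request(
        base_url.rstrip("/") + "/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Placeholder IPs; printing only, no request is sent here.
    print(json.dumps(exclusion_body(["10.0.0.1", "10.0.0.2"])))
    print(json.dumps(exclusion_body([])))
```

Sending `exclusion_body([])` writes `null` for the setting, which resets it; that corresponds to the step where the setting was removed and the cluster recovered.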

Related component

Search:Searchable Snapshots

To Reproduce

  1. Boot an OpenSearch 2.13.0 cluster with around 40 search nodes.
  2. Index some data, take a snapshot, and restore it on the search nodes.
  3. Ensure there is enough data and more than 400 shards per node.
  4. Exclude 10 search nodes.
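
Steps 2 and 3 can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not the exact procedure used: the helper names are hypothetical, the restore body relies on the `storage_type: remote_snapshot` option of `POST /_snapshot/<repository>/<snapshot>/_restore`, and the shard count assumes rows as returned by `/_cat/shards?format=json`.

```python
from collections import Counter


def searchable_restore_body(indices):
    """Body for a snapshot restore that creates searchable snapshot indices."""
    return {"indices": ",".join(indices), "storage_type": "remote_snapshot"}


def shards_per_node(cat_shards_rows):
    """Count shards per node from parsed /_cat/shards?format=json rows.

    Unassigned shards have no node and are skipped. A relocating shard
    reports "source -> ..." in the node column, so it is counted against
    the source node.
    """
    counts = Counter()
    for row in cat_shards_rows:
        node = row.get("node")
        if node:
            counts[node.split(" -> ")[0]] += 1
    return dict(counts)


def nodes_over(counts, threshold=400):
    """Names of nodes holding more than `threshold` shards (step 3)."""
    return sorted(name for name, count in counts.items() if count > threshold)


if __name__ == "__main__":
    # Hypothetical rows, only to show the expected shape of the input.
    rows = [{"node": "search-1"}, {"node": "search-1"}, {"node": None}]
    print(shards_per_node(rows))
```

Checking `nodes_over(...)` before step 4 confirms that the cluster matches the shard density described in this report.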

Expected behavior

The nodes should be excluded and their shards relocated without affecting ISM or other cluster activities.


Member

mch2 commented Sep 4, 2024

Thanks for reporting, @sivatarunp. We will try to reproduce this on our end. Could you please provide the output of /_cat/shards and /_cat/recovery?active_only=true from when the relocation is stuck? Also, how many shards per index are you configuring? Thanks.
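
One possible way to capture both outputs while the relocation is stuck is sketched below in Python. The helper names and the base URL are hypothetical, and `format=json` is added here only to make the output easy to filter; the plain-text output of the same endpoints is equally fine.

```python
import json
import urllib.request

# The two endpoints requested above; format=json is an added convenience.
DIAGNOSTIC_PATHS = (
    "/_cat/shards?format=json",
    "/_cat/recovery?active_only=true&format=json",
)


def diagnostic_urls(base_url):
    """Full URLs for the two diagnostic endpoints."""
    return [base_url.rstrip("/") + path for path in DIAGNOSTIC_PATHS]


def relocating(cat_shards_rows):
    """Keep only the shards whose state column is RELOCATING."""
    return [row for row in cat_shards_rows if row.get("state") == "RELOCATING"]


def fetch_json(url):
    """GET a URL and parse the JSON reply (needs a reachable cluster)."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Placeholder address; prints the URLs without contacting a cluster.
    for url in diagnostic_urls("http://localhost:9200"):
        print(url)
```

Filtering the `/_cat/shards` rows with `relocating(...)` narrows the output to the shards that are stuck, which keeps the attachment small on a cluster with several hundred shards per node.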

mch2 removed the untriaged label Sep 4, 2024