[BUG] Scaling down nodePool doesn't reassign all shards #870
Comments
[Triage] Also, when the state is RED, have you tried scaling down to zero and then scaling up (or a fresh restart)? Thank you
I concur with @prudhvigodithi here. This looks like a problem with OpenSearch itself. From your description, OpenSearch is not able to correctly recover some shards if one of the replicas is removed. Since the 1.x version is no longer developed, it does not make sense to implement special logic for this in the operator.
@prudhvigodithi I tried a fresh restart and a scale down & up, and the status stayed the same. Thanks guys for the help and for confirming where the issue is. We are running on 2.x where we can, but on some k8s clusters we need to keep 1.x due to app compatibility. They should provide a new version that works with OpenSearch 2.x next year, so at least now we have another reason to push them to deliver it ASAP. For now I have told the team to be extra careful when scaling down and removing replicas. Should I close the issue with "Close as not planned"?
Thanks for the confirmation @pbagona, I will close this issue for now; please feel free to comment or re-open if required.
What is the bug?
When scaling down a nodePool, the operator logs messages about draining the removed node, but after the drain finishes the cluster health status is RED and some shards remain unassigned.
How can one reproduce the bug?
My current setup has 4 nodePools: master with 3 replicas (role master), nodes with 2 replicas and 300Gi storage each (roles data + ingest), ingests with 3 replicas and 100Gi storage each (role ingest), and data with 5 replicas and 1Ti storage each (role data). Scaling down the nodes nodePool introduces issues with shard allocation and the cluster health status.
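For illustration, the nodePool layout above would look roughly like the following. This is only a sketch, written as a Python structure mirroring the spec.nodePools section of the OpenSearchCluster manifest; the exact field names, role spellings, and component names are assumptions based on the operator's CRD.

```python
# Sketch of the nodePool layout described above, expressed as the
# spec.nodePools fragment of an OpenSearchCluster resource (as it would
# be passed to the Kubernetes API client). Component names, role values,
# and field names are assumptions for illustration.
node_pools = [
    {"component": "master",  "replicas": 3, "roles": ["master"]},
    {"component": "nodes",   "replicas": 2, "diskSize": "300Gi", "roles": ["data", "ingest"]},
    {"component": "ingests", "replicas": 3, "diskSize": "100Gi", "roles": ["ingest"]},
    {"component": "data",    "replicas": 5, "diskSize": "1Ti",   "roles": ["data"]},
]
```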
What is the expected behavior?
Expected behavior is that after the operator drains a node and decommissions it, the cluster health status stays Green.
What is your host/environment?
k8s v1.27.13, OpenSearch k8s operator 2.5.1, OpenSearch cluster 1.3.16
Do you have any screenshots?
Yes, see the screenshots below.
Do you have any additional context?
The nodes and data nodePools existed first, then ingests was added, and now the goal is to remove the old nodes nodePool.
I did this same setup with OpenSearch cluster version 2.x on a different k8s cluster and it worked as expected: when one nodePool was removed, the operator drained the nodes of that nodePool one by one and removed them, there was no interruption to service, and after it finished the cluster health status remained Green.
When doing these same steps on OpenSearch cluster version 1.3.16, it results in cluster health status RED and some shards unable to allocate. Sometimes it is 1 shard that remains unallocated, sometimes more.
I tried removing the nodePool from the manifest specification all at once, and then I tried just scaling it down by one replica, but got the same outcome.
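For reference, here is a minimal sketch of what that one-replica scale-down looks like when done programmatically with the Kubernetes Python client instead of editing and applying the manifest by hand. The CRD group/version/plural, namespace, and cluster name (int2-opensearch) are assumptions.

```python
from kubernetes import client, config

# Talk to the cluster the operator is running in.
config.load_kube_config()
api = client.CustomObjectsApi()

# Assumed coordinates of the OpenSearchCluster custom resource.
GROUP, VERSION, PLURAL = "opensearch.opster.io", "v1", "opensearchclusters"
NAMESPACE, NAME = "opensearch", "int2-opensearch"

# Fetch the current spec, drop the replica count of the "nodes" pool by one,
# and send the modified nodePools list back as a patch (the whole list is
# replaced, mirroring an edited manifest being re-applied).
cr = api.get_namespaced_custom_object(GROUP, VERSION, NAMESPACE, PLURAL, NAME)
for pool in cr["spec"]["nodePools"]:
    if pool["component"] == "nodes":
        pool["replicas"] -= 1  # e.g. 2 -> 1, as in the report

api.patch_namespaced_custom_object(
    GROUP, VERSION, NAMESPACE, PLURAL, NAME,
    body={"spec": {"nodePools": cr["spec"]["nodePools"]}},
)
```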
In the operator logs, I see that it correctly waits for the node to drain and then decommissions it, but at that very moment the cluster goes into a RED state and I see errors about allocations.
When I add the removed nodePool/replica back to the manifest, once the pod is up and running the cluster status is back to Green and everything behaves normally.
I tried this several times and got one of a few allocation errors every time.
Also, as seen in the screenshots below: before scaling down, nodes show 12.3gb used for disk.indices, but when one of the nodes in the nodePool gets removed, the shards appear to be redistributed while the disk.indices number stays the same for all nodes, or changes only minimally, and does not account for the 12.3gb that should have been relocated to the remaining nodes. When the nodePool is scaled back up to its original size and the removed pod mounts its old PV, everything is back to a normal Green state.
Cluster health status
Allocation before change
Example of allocation after change
Example of unallocated shard explanation
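The information in these screenshots can also be pulled straight from the cluster REST API. Below is a minimal sketch using Python and requests; the endpoint, credentials, and disabled TLS verification are assumptions about how the cluster is reached (e.g. via a port-forward to the service).

```python
import requests

# Assumed endpoint and credentials (e.g. via `kubectl port-forward`);
# verify=False assumes a self-signed certificate.
BASE = "https://localhost:9200"
AUTH = ("admin", "admin")

def get(path):
    r = requests.get(f"{BASE}{path}", auth=AUTH, verify=False)
    r.raise_for_status()
    return r

# Overall health: status and number of unassigned shards.
print(get("/_cluster/health").json())

# Per-node shard count and disk.indices, i.e. the view in the allocation screenshots.
print(get("/_cat/allocation?v&h=node,shards,disk.indices,disk.used,disk.avail").text)

# Explanation for an arbitrary unassigned shard; only meaningful while the cluster is RED.
print(get("/_cluster/allocation/explain").json())
```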
EDIT:
I tried it again in order to collect more information and noticed that when I scale down the nodes nodePool from 2 to 1, the operator goes from int2-opensearch-nodes-0, int2-opensearch-nodes-1 to just int2-opensearch-nodes-0 and drains node int2-opensearch-nodes-1. During this process it reallocates some shards to the node that is being drained; then the node's pod is terminated and removed from the cluster, and the operator logs are as posted above.
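To watch that reallocation onto the draining node as it happens, the shard table can be polled during the scale-down. A small sketch follows, again with the endpoint, credentials, and the name of the draining node assumed.

```python
import time
import requests

BASE = "https://localhost:9200"   # assumed endpoint (e.g. via port-forward)
AUTH = ("admin", "admin")         # assumed credentials
DRAINING_NODE = "int2-opensearch-nodes-1"  # node being removed in this scenario

# Poll the shard table while the operator drains the node, and print any
# shards still placed on the draining node plus any UNASSIGNED shards.
while True:
    rows = requests.get(
        f"{BASE}/_cat/shards?h=index,shard,prirep,state,node",
        auth=AUTH, verify=False,
    ).text.splitlines()
    suspicious = [r for r in rows if DRAINING_NODE in r or "UNASSIGNED" in r]
    print("\n".join(suspicious) or "nothing on the draining node, no unassigned shards")
    time.sleep(10)
```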