vmcluster status is failed after minor change to vmstorage #683

Closed
uhthomas opened this issue Jul 2, 2023 · 8 comments
Labels
question Further information is requested

Comments

uhthomas commented Jul 2, 2023

I thought it would be nice to give some more resources to vm in this commit.

It looks like the operator deleted the statefulset and is now stuck?

Status:
  Cluster Status:     failed
  Reason:             cannot handle rolling-update on sts: vmstorage-vm, err: actual pod count: 1 less then needed: 2, possible statefulset misconfiguration
  Update Fail Count:  0
Events:               <none>
❯ k -n vm get sts
NAME           READY   AGE
vmselect-vm    2/2     67m
vmstorage-vm   1/2     67m
❯ k -n vm get po
NAME                          READY   STATUS    RESTARTS   AGE
vminsert-vm-f6488d45b-8lck7   1/1     Running   0          67m
vminsert-vm-f6488d45b-fjs7j   1/1     Running   0          67m
vmselect-vm-0                 1/1     Running   0          67m
vmselect-vm-1                 1/1     Running   0          67m
vmstorage-vm-1                1/1     Running   0          67m

uhthomas (Author) commented Jul 2, 2023

This also happens if the number of replicas is increased. I wanted to go from 2 to 3 replicas and the operator got stuck again. It didn't bother changing the number of replicas on the StatefulSet to 3 and was confused as to why there were only 2 replicas.

Status:
  Cluster Status:     failed
  Reason:             cannot handle rolling-update on sts: vmstorage-vm, err: actual pod count: 2 less then needed: 3, possible statefulset misconfiguration
  Update Fail Count:  0

Haleygo (Contributor) commented Jul 2, 2023

@uhthomas Hello, what version of the operator are you using?
Also, can you check the events on the 'vmstorage-vm' sts using k -n vm describe sts vmstorage-vm? It doesn't seem to have created the wanted number of pods.

Haleygo added the question (Further information is requested) label Jul 2, 2023

uhthomas (Author) commented Jul 3, 2023

I'm using v0.34.1. I can't describe the statefulset because the operator deletes it.

Haleygo (Contributor) commented Jul 3, 2023

The operator shouldn't delete the sts "vmstorage-vm" unless VMStorage is nil or the vmcluster is deleted.
What's the status of the vmcluster and the new vmstorage-vm sts now?

uhthomas (Author) commented Jul 3, 2023

I deleted it and recreated everything to unblock what I needed to do. I can create a new one and get it into a similar state though.

uhthomas (Author) commented Jul 3, 2023

Hmm. It could be the same issue as spotahome/redis-operator#592? I can see the labels are incorrectly propagated.

Name:               vmselect-vm
Namespace:          vm
CreationTimestamp:  Sun, 02 Jul 2023 13:03:12 +0100
Selector:           app.kubernetes.io/component=monitoring,app.kubernetes.io/instance=vm,app.kubernetes.io/name=vmselect,managed-by=vm-operator
Labels:             app.kubernetes.io/component=monitoring
                    app.kubernetes.io/instance=vm
                    app.kubernetes.io/name=vmselect
                    applyset.kubernetes.io/part-of=applyset-xjZyH1FmMYtP-oSkfLUgubxDYIbsrD-IuDRLmezicIo-v1
                    managed-by=vm-operator

Could the operator be changed to not propagate labels? I imagine kubectl apply --prune is deleting the statefulsets.

uhthomas (Author) commented Jul 3, 2023

Yeah...

https://github.com/uhthomas/automata/actions/runs/5436267134/jobs/9885991448#step:8:643

deployment.apps/vminsert-vm pruned
service/vminsert-vm pruned
service/vmselect-vm pruned
service/vmstorage-vm pruned
serviceaccount/vmcluster-vm pruned
statefulset.apps/vmselect-vm pruned
statefulset.apps/vmstorage-vm pruned
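
The prune output above is consistent with the label-propagation theory. As a rough illustration only (a simplified model, not kubectl's actual implementation; the function name, the `vmcluster/vm` key, and the ApplySet ID are hypothetical placeholders), pruning can be thought of as deleting every object that carries the ApplySet's part-of label but is absent from the manifests that were just applied:

```python
# Simplified model of ApplySet-style pruning. All names here are
# hypothetical; this is not kubectl's real implementation.
PART_OF = "applyset.kubernetes.io/part-of"


def objects_to_prune(cluster_objects, applied_names, applyset_id):
    """Return objects labelled as ApplySet members that were not in the applied manifests."""
    return sorted(
        name
        for name, labels in cluster_objects.items()
        if labels.get(PART_OF) == applyset_id and name not in applied_names
    )


applyset_id = "applyset-example-v1"  # placeholder ID for illustration
parent_labels = {PART_OF: applyset_id}

# The parent resource is in the applied manifests. Its children are created
# by the operator, which (in this model) copied the parent's labels to them.
cluster = {
    "vmcluster/vm": dict(parent_labels),
    "statefulset.apps/vmstorage-vm": {**parent_labels, "managed-by": "vm-operator"},
    "statefulset.apps/vmselect-vm": {**parent_labels, "managed-by": "vm-operator"},
}

pruned = objects_to_prune(cluster, {"vmcluster/vm"}, applyset_id)
print(pruned)
```

In this model the operator-created StatefulSets look like ApplySet members that have disappeared from the manifests, so they are selected for pruning even though nothing in the applied configuration removed them.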

Haleygo (Contributor) commented Jul 3, 2023

You can add "VM_FILTERCHILDLABELPREFIXES=[applyset.kubernetes.io]" as an env variable on the operator, like this; it should prevent the label from being propagated.
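
To make the intent of that setting concrete, here is a minimal sketch of prefix-based label filtering. It is an illustration under assumptions, not the operator's code: `filter_child_labels` is a hypothetical helper, and the label values are shortened placeholders.

```python
def filter_child_labels(parent_labels, filtered_prefixes):
    """Copy parent labels, dropping keys that start with any filtered prefix."""
    return {
        key: value
        for key, value in parent_labels.items()
        if not any(key.startswith(prefix) for prefix in filtered_prefixes)
    }


parent_labels = {
    "app.kubernetes.io/instance": "vm",
    "applyset.kubernetes.io/part-of": "applyset-example-v1",  # placeholder value
}

# With the applyset prefix filtered, the child no longer carries the
# part-of label, so a prune pass would not treat it as an ApplySet member.
child_labels = filter_child_labels(parent_labels, ["applyset.kubernetes.io"])
print(child_labels)
```

After changing the setting, the labels on the recreated StatefulSets can be checked with the same describe command used earlier in the thread.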
