liveness probe for etcd may cause database crash #2759
Comments
@ahrtr @serathius The latest probe in kubeadm is here:
Sorry for that. It's a known issue, which was fixed in etcd 3.5.5. FYI: etcd-io/etcd#14419
What is the recommended solution? Should I force the etcd version to 3.5.5?
Kubernetes bumped etcd to 3.5.5 on the master branch in kubernetes/kubernetes#112489. I think etcd 3.5.5 should be backported to the previous stable releases as well. cc @dims @neolit123
etcd-io/etcd#14382 (comment)
From what I've seen, these backports to bump etcd in older k8s releases do not get merged, but if someone wants to try, go ahead.
@wjentner FYI: https://github.com/ahrtr/etcd-issues/blob/d134cb8d07425bf3bf530e6bb509c6e6bc6e7c67/etcd/etcd-db-editor/main.go#L16-L28 Please let me know whether it works or not.
Though in this case it should only be a patch bump and may be fine? We have been on 3.5.x since at least k8s 1.23.
etcd patch backports did not get attention and approval either. +1 from me if someone wants to try.
I have manually recovered our DB by following the etcd disaster recovery procedure. As a current workaround I have set `--quota-backend-bytes` to 8GB using the extraArgs in kubeadm so that the alarm is not raised. I'm quite surprised that this problem has not affected more people yet, since the default DB size at which the alarm is raised is around 2GB.
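For reference, a sketch of that workaround (sizes and paths are examples, not taken from this cluster):

```bash
# Sketch of the quota workaround (sizes and paths are examples).
#
# In the kubeadm ClusterConfiguration the flag goes under etcd.local.extraArgs:
cat <<'EOF' > clusterconfig-etcd-excerpt.yaml
etcd:
  local:
    extraArgs:
      quota-backend-bytes: "8589934592"   # 8 GiB
EOF
#
# On an already-running cluster the same flag can be added to the etcd static
# pod manifest on every control-plane node (/etc/kubernetes/manifests/etcd.yaml):
#   - --quota-backend-bytes=8589934592
# The kubelet restarts the etcd static pod automatically when the manifest changes.
```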
Check the kubeadm --patches functionality.
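A rough sketch of how that could look (the patch file name follows kubeadm's target+patchtype convention; the `exclude=NOSPACE` query parameter is an etcd `/health` option, but verify what the kubeadm and etcd versions in use actually support):

```bash
# Sketch: override the etcd static pod's liveness probe via kubeadm --patches.
mkdir -p /etc/kubernetes/patches
cat <<'EOF' > /etc/kubernetes/patches/etcd+strategic.yaml
apiVersion: v1
kind: Pod
metadata:
  name: etcd
  namespace: kube-system
spec:
  containers:
  - name: etcd
    livenessProbe:
      httpGet:
        # example: keep probing /health but ignore the NOSPACE alarm
        path: /health?exclude=NOSPACE
EOF
# The patch directory is then passed to kubeadm, for example:
#   kubeadm init --patches /etc/kubernetes/patches ...
#   kubeadm upgrade node --patches /etc/kubernetes/patches
```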
Thanks for the feedback. I am curious how you did it.
@ahrtr I basically followed the disaster recovery docs: https://etcd.io/docs/v3.5/op-guide/recovery/
The cluster synced successfully, and all members were healthy afterward.
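For anyone else hitting this, roughly the steps from those docs (endpoints, certificate paths, member names, and IPs below are placeholders):

```bash
# Rough sketch of the etcd v3.5 disaster-recovery flow (placeholder values).
# 1) Save a snapshot from a member that still has a usable db,
#    or reuse an existing backup:
ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  snapshot save /backup/etcd-snapshot.db

# 2) Stop etcd on each member (e.g. move the static pod manifest out of
#    /etc/kubernetes/manifests), then restore into a fresh data dir:
etcdutl snapshot restore /backup/etcd-snapshot.db \
  --name etcd-node-1 \
  --initial-cluster etcd-node-1=https://10.0.0.1:2380,etcd-node-2=https://10.0.0.2:2380,etcd-node-3=https://10.0.0.3:2380 \
  --initial-advertise-peer-urls https://10.0.0.1:2380 \
  --data-dir /var/lib/etcd-restored

# 3) Point each member's manifest at the restored data dir (or move it into
#    place as /var/lib/etcd) and restart the static pods.
```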
Thanks @wjentner for the feedback, which makes sense. I just checked the source code; the key point why your steps work is that etcdutl updates the consistent index using the commitId. FYI: v3_snapshot.go#L272
Is this a BUG REPORT or FEATURE REQUEST?
BUG REPORT
Versions
- kubeadm version (use `kubeadm version`):
- Environment:
  - Kubernetes version (use `kubectl version`):
  - Cloud provider or hardware configuration: bare metal
  - Kernel (e.g. `uname -a`):
  - Container runtime (CRI): containerd 1.6.8
  - Network plugin (CNI): calico
What happened?
This was first reported in the etcd repository: etcd-io/etcd#14497
Kubeadm creates a manifest for etcd that uses the `/health` endpoint of etcd in the liveness probe. When the etcd database exceeds a certain size, the NO SPACE alarm is triggered, which puts etcd into maintenance mode and allows only read and delete actions until the size is reduced and `etcdctl alarm disarm` is sent.

When the alarm is triggered, the `/health` check no longer returns a `200` response, so the etcd member goes into a CrashLoopBackOff. Because this happens almost simultaneously on all members, the continuous crash loops eventually cause a fatal error where etcd is no longer able to start up by itself.

The etcd maintainers mention that the `/health` endpoint should not be used for liveness probes.

In our case, this caused all etcd members to run into an error: as the logs showed, all members are on different indices and are not able to recover from any snapshot, which was likely caused by the continuous restarts.
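For context, this is roughly how a NOSPACE alarm is inspected and cleared manually once space has been freed, following the space-quota section of the etcd docs (TLS flags omitted for brevity; add them as in the kubeadm-generated manifest):

```bash
# Roughly the manual recovery from a NOSPACE alarm (TLS flags omitted).
etcdctl alarm list                      # shows: memberID:... alarm:NOSPACE

# Compact away old revisions, then defragment to actually release space:
rev=$(etcdctl endpoint status --write-out=json \
      | grep -o '"revision":[0-9]*' | grep -o '[0-9].*')
etcdctl compaction "$rev"
etcdctl defrag

# Only after space has been freed, clear the alarm so writes are accepted again:
etcdctl alarm disarm
```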
What you expected to happen?
etcd should not crash-loop and cause the database to end up in an unrecoverable state.
How to reproduce it (as minimally and precisely as possible)?
An easy way to trigger this behavior is to follow the etcd docs: https://etcd.io/docs/v3.5/op-guide/maintenance/#space-quota

1. Start etcd with a small quota, e.g. `--quota-backend-bytes=16777216` (16 MB).
2. Fill the keyspace until the quota is exceeded: `$ while [ 1 ]; do dd if=/dev/urandom bs=1024 count=1024 | ETCDCTL_API=3 etcdctl put key || break; done`
3. Check `etcdctl endpoint --cluster status -w table` or also `etcdctl alarm list`; the alarm (ALARM NO SPACE) should occur.
4. The liveness probe now fails against the `/health` endpoint, which no longer returns `200` (see the sketch below), and the etcd member is restarted.
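A quick way to observe the failing probe directly (a sketch; the certificate paths assume a default kubeadm layout and may differ in other setups):

```bash
# Sketch: query the /health endpoint used by the liveness probe; while the
# NOSPACE alarm is active it returns a non-200 status code.
curl -s --cacert /etc/kubernetes/pki/etcd/ca.crt \
  --cert /etc/kubernetes/pki/etcd/server.crt \
  --key /etc/kubernetes/pki/etcd/server.key \
  -o /dev/null -w '%{http_code}\n' \
  https://127.0.0.1:2379/health
# Expected while the alarm is active: something like 503 instead of 200.
```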
Anything else we need to know?
Without the `--quota-backend-bytes` flag, the alarm is raised at a DB size of around 2GB. We have a moderately small cluster that has been running for almost three years and recently reached this size.
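To see how close a cluster is to that threshold, the current database size can be compared with the configured quota, e.g. (a sketch; TLS flags omitted, and the metrics port 2381 assumes the default kubeadm etcd manifest):

```bash
# Sketch: compare current DB size with the configured quota.
etcdctl endpoint status --cluster -w table    # see the DB SIZE column

# The backend size and quota are also exposed as Prometheus metrics:
curl -s http://127.0.0.1:2381/metrics \
  | grep -E 'etcd_mvcc_db_total_size_in_bytes|etcd_server_quota_backend_bytes'
```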