
CouchDB pods consume too much memory when running on distributions with cgroup v2 #161

Open
cax21 opened this issue Jun 13, 2024 · 2 comments

cax21 commented Jun 13, 2024

Describe the bug
CouchDB pods will not start (mainly because they are OOM-killed).
If we set no resource limits, we can observe the CouchDB pods peaking at almost 3.5 GB of memory.
They then stabilize at 1.6 GB, which is still far too much in any case.
Note that this is not observed at all on nodes running cgroup v1, so we suspect cgroup v2 is the root cause of this resource issue.

Version of Helm and Kubernetes:
Kubernetes: 1.27.6 (rancher)
Helm: 3.15
Nodes running on: Rocky Linux 9.3
Linux kernel: using cgroup v2 (see https://kubernetes.io/docs/concepts/architecture/cgroups/)

What happened:
At startup, CouchDB pods use too much memory (almost 4 GB) and get killed unless their resource limits are raised to accommodate such a high load.

What you expected to happen:
At startup, CouchDB pods should consume only a few MB of memory, as they do on nodes running cgroup v1.

How to reproduce it (as minimally and precisely as possible):
You can easily reproduce this by deploying the latest chart (4.5.6) on a cluster like the one described above.
You will notice the 3 CouchDB pods taking more than 3 GB of memory.
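A reproduction along these lines might look like the following. This is a sketch, not the exact commands used: the repo URL and the `couchdbConfig.couchdb.uuid` requirement are assumed from the Apache couchdb-helm chart's published instructions, and the `app=couchdb` label selector is an assumption about how the chart labels its pods.

```shell
# Sketch of a reproduction on a cgroup v2 node, assuming the standard
# Apache couchdb-helm repo; adjust release name and namespace as needed.
helm repo add couchdb https://apache.github.io/couchdb-helm
helm repo update

# The chart requires a cluster UUID to be set (placeholder value below).
helm install couchdb couchdb/couchdb --version 4.5.6 \
  --set couchdbConfig.couchdb.uuid=decafbaddecafbaddecafbaddecafbad

# Watch the pods' memory climb past 3 GB (label selector assumed):
kubectl top pods -l app=couchdb
```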

Anything else we need to know:
Perhaps the chart could provide some specific tuning for cgroup v2 nodes, if possible.
We haven't found anything about this in the documentation.
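As an interim mitigation until any cgroup v2-specific tuning exists, the startup peak could be accommodated via the chart's standard `resources` value. The field layout below follows common Helm chart conventions and is an assumption about this chart's values schema; the limit reflects the ~3.5 GB peak reported above.

```yaml
# Hypothetical values.yaml override for the couchdb chart (field names
# assumed from common chart conventions, not verified against chart 4.5.6).
resources:
  requests:
    memory: 512Mi
  limits:
    memory: 4Gi   # headroom for the startup peak so the pod is not OOM-killed
```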

@willholley (Member) commented
I'm not aware of any specific tuning for cgroups v2 - you are likely the first to experiment with this. If you have proposals for configuration changes that would be useful in the helm chart, please feel free to submit a PR.

@cax21 (Author) commented Jun 24, 2024

It seems to be an issue in Kubernetes itself, I think (the way it manages memory using cgroup v2).
I've tried launching on an equivalent k8s v1.30 cluster (in fact a Rancher k3s) running on Rocky Linux 9.4 nodes, and I see no problem at all.
All pods start correctly and consume less than 256 MB.
I don't think there is anything wrong in the chart itself.
