Persistent memory leak of k3s control plane instance #10922
Comments
Process view ~10 hours ago: This shows that both k3s & containerd are growing. These views are tracking the systemd cgroup slices, not the workloads, so there should be no contamination of these stats by workload behavior (separately, as noted above, I minimized the number of pods running on this node and checked the stats of those pods -- all were within reason).
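For anyone who wants to reproduce the same slice-level view without below: on a cgroup v2 host, the k3s unit's memory can be sampled directly from the cgroup filesystem. A minimal sketch follows; the k3s.service path and the assumption that the embedded containerd lives in that same slice are mine (based on a default script install), not something stated in this issue.

```python
#!/usr/bin/env python3
"""Minimal sketch: sample memory usage of the k3s systemd cgroup on a cgroup v2 host.

Assumptions (not taken from this issue): the unit is k3s.service, cgroup v2 is
mounted at /sys/fs/cgroup, and the embedded containerd runs inside the same
slice. Adjust CGROUP_DIR for a different layout.
"""
import time
from pathlib import Path

CGROUP_DIR = Path("/sys/fs/cgroup/system.slice/k3s.service")  # assumed path

def memory_stat() -> dict[str, int]:
    """Parse memory.stat into a {counter: bytes} dict."""
    stat = {}
    for line in (CGROUP_DIR / "memory.stat").read_text().splitlines():
        key, value = line.split()
        stat[key] = int(value)
    return stat

if __name__ == "__main__":
    while True:
        current = int((CGROUP_DIR / "memory.current").read_text())
        stat = memory_stat()
        # A leak in k3s/containerd themselves shows up as 'anon' climbing
        # while 'file' (page cache) stays roughly flat.
        print(f"{time.strftime('%H:%M:%S')} "
              f"current={current >> 20} MiB "
              f"anon={stat.get('anon', 0) >> 20} MiB "
              f"file={stat.get('file', 0) >> 20} MiB")
        time.sleep(60)
```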
Environmental Info:
K3s Version: v1.29.8
Node(s) CPU architecture, OS, and Version:
Linux venus-node-3 6.10.6-200.fc40.x86_64 #1 SMP PREEMPT_DYNAMIC Mon Aug 19 14:09:30 UTC 2024 x86_64 GNU/Linux
cmdline:
Cluster Configuration:
Describe the bug:
I have a control plane node that runs out of memory after 1-2 days. I've experimented a bit, and this happens even when the node is cordoned and only minimal pods are running.
Steps To Reproduce:
I have a simple drop-in:
Expected behavior:
Actual behavior:
Memory usage of k3s & containerd grows over a 1-2 day period until it consumes all memory on the host. This happened on v1.29.6 as well; upgrading to v1.29.8 made no change.
Additional context / logs:
I also have logs from below (a resource monitor similar to atop, if you aren't familiar with it) showing the cgroup- and process-level stats over a 24h+ period. You can see the RSS grow uncontrollably over that window:
~10 hours ago:
~15 minutes ago:
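As a rough stand-in for the below process view, for anyone who wants a comparable trace without installing it: sample VmRSS for the k3s and containerd processes from /proc on an interval. The process-name matching and the 5-minute interval in this sketch are assumptions, not something taken from this issue.

```python
#!/usr/bin/env python3
"""Minimal sketch: periodically log RSS of k3s/containerd processes from /proc,
as a lighter-weight stand-in for the below process view.

Assumptions (not taken from this issue): Linux /proc layout, and that the
interesting processes have 'k3s' or 'containerd' in their comm name.
"""
import time
from pathlib import Path

WATCHED = ("k3s", "containerd")  # assumed process-name substrings

def rss_by_name() -> dict[str, int]:
    """Sum VmRSS (KiB) across processes whose comm contains a watched name."""
    totals = dict.fromkeys(WATCHED, 0)
    for proc in Path("/proc").iterdir():
        if not proc.name.isdigit():
            continue
        try:
            comm = (proc / "comm").read_text().strip()
            for line in (proc / "status").read_text().splitlines():
                if line.startswith("VmRSS:"):
                    rss_kib = int(line.split()[1])
                    break
            else:
                continue  # kernel threads have no VmRSS
        except OSError:
            continue  # process exited between listing and reading
        # Note: substring matching also picks up containerd-shim processes;
        # filter more strictly if you only want the daemons themselves.
        for name in WATCHED:
            if name in comm:
                totals[name] += rss_kib
    return totals

if __name__ == "__main__":
    # One sample every 5 minutes is plenty to see a multi-day growth trend.
    while True:
        totals = rss_by_name()
        summary = " ".join(f"{name}={kib // 1024} MiB" for name, kib in totals.items())
        print(time.strftime("%Y-%m-%d %H:%M:%S"), summary, flush=True)
        time.sleep(300)
```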
Grafana stats: