ip addresses leaking when there are too many ip in cooldown pool #2896

Open
abbshr opened this issue Apr 26, 2024 · 5 comments
@abbshr

abbshr commented Apr 26, 2024

What happened:

These are my ipamd warm pool settings:

WARM_IP_TARGET = 2
MINIMUM_IP_TARGET = 23
WARM_ENI_TARGET = 0
IP_COOLDOWN_PERIOD = 30

During a stress test, we create a lot of Pods in a short time and delete them as soon as they become ready.
This puts a lot of IP addresses into the cooldown state, and newly created Pods cannot be assigned an IP until IP_COOLDOWN_PERIOD has elapsed.

Attach logs

Kubelet log:
New Pods cannot be assigned an IP address:
image

ipamd log:
As you can see, more and more IPs are in cooldown, but `short` never rises above 0:
image
image
image

When 4 IPs are in cooldown and 19 are assigned, there is no free IP left in the ipamd pool. But because 23 (MINIMUM_IP_TARGET) - 19 (assigned IPs) = 4 > 2 (WARM_IP_TARGET), ipamd does not allocate more IPs from the VPC:
image
image

After IP_COOLDOWN_PERIOD has elapsed, new Pods can be assigned an IP again:
image

After reading the source code here:

I found that the vpc-cni plugin performs the calculation as follows:

image

AvailableAddress() is implemented here as simply total attached IPs minus assigned IPs:

image

This is fine most of the time.
But suppose enough IPs are in the cooldown state that cooldown count + assigned count = total attached count. Then no free IP can be assigned to a Pod, yet because "total - assigned" is still not below WARM_IP_TARGET, ipamd does not try to allocate more IPs from the VPC.
As a result, a newly created Pod cannot get an IP address even though there are plenty of free IPs in the VPC.
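
To make the arithmetic above concrete, here is a minimal sketch in Go. It is a simplified model of the behaviour described in this issue, not the plugin's actual code: `poolStats`, `shortfall`, and `shortfallExcludingCooldown` are hypothetical names, and the second function is only one possible adjustment, assuming that IPs in cooldown should not count as available.

```go
package main

import "fmt"

// poolStats is a hypothetical, simplified stand-in for the node's IP counters.
type poolStats struct {
	Total    int // IPs attached to the node
	Assigned int // IPs currently assigned to Pods
	Cooldown int // unassigned IPs still inside IP_COOLDOWN_PERIOD
}

func maxInt(a, b int) int {
	if a > b {
		return a
	}
	return b
}

// shortfall models the calculation as described in this issue:
// availability is total minus assigned, so IPs in cooldown count as available.
func shortfall(s poolStats, warmIPTarget, minimumIPTarget int) int {
	available := s.Total - s.Assigned
	short := maxInt(warmIPTarget-available, 0)
	return maxInt(short, minimumIPTarget-s.Total)
}

// shortfallExcludingCooldown is an assumed alternative, not the project's code:
// IPs in cooldown are subtracted before comparing against WARM_IP_TARGET.
func shortfallExcludingCooldown(s poolStats, warmIPTarget, minimumIPTarget int) int {
	assignable := s.Total - s.Assigned - s.Cooldown
	short := maxInt(warmIPTarget-assignable, 0)
	return maxInt(short, minimumIPTarget-s.Total)
}

func main() {
	// Numbers from the logs above: 23 attached, 19 assigned, 4 in cooldown.
	s := poolStats{Total: 23, Assigned: 19, Cooldown: 4}
	fmt.Println("short (as described):", shortfall(s, 2, 23))
	fmt.Println("short (cooldown excluded):", shortfallExcludingCooldown(s, 2, 23))
}
```

With the numbers from the logs, the first function reports no shortfall although no IP is assignable, while the cooldown-aware variant would ask for WARM_IP_TARGET more IPs. The trade-off is more EC2 API calls during bursts of Pod churn.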

What you expected to happen:

A Pod should be assigned an IP whenever the VPC subnet has enough free IPs.

How to reproduce it (as minimally and precisely as possible):

  1. Create a k8s cluster.
  2. Use just one node.
  3. Configure vpc-cni with WARM_IP_TARGET=1, MINIMUM_IP_TARGET=6 (smaller than the number of free IPs in the VPC).
  4. Create Pods one by one until there are MINIMUM_IP_TARGET - WARM_IP_TARGET (5) of them.
  5. Within 30s, delete one Pod and create two new Pods. The last Pod fails because no IP address can be assigned to it.
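
The steps above can be traced with a small in-memory simulation. This is only a sketch under the assumptions stated in this issue (availability is total minus assigned, and an IP in cooldown cannot be handed to a Pod); the `node` type and its methods are hypothetical and do not talk to a cluster or to EC2.

```go
package main

import "fmt"

// node is a hypothetical model of one node's IP pool.
type node struct {
	total, assigned, cooldown     int
	warmIPTarget, minimumIPTarget int
}

// reconcile grows the pool only when the shortfall, computed as
// described in this issue, is positive.
func (n *node) reconcile() {
	available := n.total - n.assigned // IPs in cooldown are counted as available
	short := n.warmIPTarget - available
	if short < 0 {
		short = 0
	}
	if d := n.minimumIPTarget - n.total; d > short {
		short = d
	}
	n.total += short
}

// createPod assigns an IP if one is unassigned and not in cooldown.
func (n *node) createPod() bool {
	n.reconcile()
	if n.total-n.assigned-n.cooldown <= 0 {
		return false
	}
	n.assigned++
	n.reconcile()
	return true
}

// deletePod releases an IP into the cooldown state.
func (n *node) deletePod() {
	n.assigned--
	n.cooldown++
}

// cooldownExpired models IP_COOLDOWN_PERIOD elapsing.
func (n *node) cooldownExpired() { n.cooldown = 0 }

// reproduce follows steps 3 to 5 and then retries after the cooldown.
func reproduce() (sixth, seventh, afterCooldown bool) {
	n := &node{warmIPTarget: 1, minimumIPTarget: 6}
	for i := 0; i < 5; i++ {
		n.createPod()
	}
	n.deletePod()
	sixth = n.createPod()   // takes the one remaining free IP
	seventh = n.createPod() // only the IP in cooldown is left
	n.cooldownExpired()
	afterCooldown = n.createPod()
	return
}

func main() {
	fmt.Println(reproduce())
}
```

In this model the second Pod created in step 5 is refused an IP, and a retry succeeds once the cooldown has expired, which matches the behaviour reported above.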

Anything else we need to know?:

Environment: AWS EKS

  • Kubernetes version (use kubectl version): 1.29
  • CNI Version: 1.17
@abbshr abbshr added the bug label Apr 26, 2024
@orsenthil
Member

do these within 30s: delete one pod and create two new pod, the last pod will fail by can not assign IP address to pod

The default cooldown period is 30 seconds, after which the IPs in cooldown are made available to Pods again. Are you noticing that this isn't happening?

The reason for keeping the WARM_IP_TARGET check is to ensure that VPC CNI doesn't make unnecessary EC2 API calls, which helps avoid running into API throttling.

If tuning the IP cooldown period is preferable, it can be controlled using this setting: https://github.com/aws/amazon-vpc-cni-k8s/?tab=readme-ov-file#ip_cooldown_period-v1150

@abbshr
Author

abbshr commented May 2, 2024 via email


This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days.

@github-actions github-actions bot added the stale Issue or PR is stale label Aug 26, 2024

github-actions bot commented Sep 9, 2024

Issue closed due to inactivity.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Sep 9, 2024
@yash97 yash97 reopened this Sep 12, 2024
@github-actions github-actions bot removed the stale Issue or PR is stale label Sep 13, 2024

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days

@github-actions github-actions bot added the stale Issue or PR is stale label Nov 13, 2024
3 participants