ip addresses leaking when there are too many ip in cooldown pool #2896

abbshr · 2024-04-26T14:13:12Z

What happened:

These are my ipamd warm pool settings:

WARM_IP_TARGET = 2
MINIMUM_IP_TARGET = 23
WARM_ENI_TARGET = 0
IP_COOLDOWN_PERIOD = 30

During a pressure test, we create many Pods in a short time and delete them as soon as they become ready.
This puts many IP addresses into the cooldown state, and newly created Pods cannot be assigned an IP until IP_COOLDOWN_PERIOD has elapsed.

Attach logs

Kubelet log:
A new Pod cannot be assigned an IP address:

ipamd log:
As you can see, more and more IPs are in cooldown, but the shortfall ("short") never rises above 0:

When 4 IPs are in cooldown and 19 are assigned, the ipamd pool has no free IP. But because 23 (MINIMUM_IP_TARGET) - 19 (assigned IPs) = 4 > 2 (WARM_IP_TARGET), ipamd does not allocate more IPs from the VPC:

After IP_COOLDOWN_PERIOD has elapsed, a new Pod can be assigned an IP:

After reading the source code here:

I found that the vpc-cni plugin calculates the shortfall as follows:

AvailableAddress() is implemented here as total attached IPs minus assigned IPs:

This is fine most of the time.
But when enough IPs are in cooldown, for example when cooldown count + assigned count = total attached count, no free IP can be assigned to a Pod. Because "total - assigned" > WARM_IP_TARGET, ipamd does not try to allocate more IPs from the VPC.
As a result, the newly created Pod cannot get an IP address even though the VPC has plenty of free IPs.
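
To make the arithmetic concrete, here is a minimal sketch of the calculation as described above. It is a simplified model, not the plugin's source code: the function names are hypothetical, and the pool total is assumed to be 23, matching the MINIMUM_IP_TARGET figure used in the log description.

```python
def available_addresses(total: int, assigned: int) -> int:
    """Available count as described in this issue: total attached minus assigned.

    IPs in cooldown are not subtracted, so they still count as available.
    """
    return total - assigned


def shortfall(total: int, assigned: int, warm_ip_target: int) -> int:
    """How many more IPs the pool would request (hypothetical simplification)."""
    return max(warm_ip_target - available_addresses(total, assigned), 0)


def assignable_now(total: int, assigned: int, cooling: int) -> int:
    """IPs that a new Pod could actually receive right now."""
    return total - assigned - cooling


# Figures from the issue: 23 attached, 19 assigned, 4 in cooldown, WARM_IP_TARGET=2.
total, assigned, cooling = 23, 19, 4
short = shortfall(total, assigned, warm_ip_target=2)
usable = assignable_now(total, assigned, cooling)
print(f"short={short} usable={usable}")
```

Under these assumptions the shortfall is 0 while the number of assignable IPs is also 0, which is the situation reported here: nothing triggers a new allocation, yet no Pod can get an address.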

What you expected to happen:

A Pod should be assigned an IP whenever the VPC subnet has enough free IPs.

How to reproduce it (as minimally and precisely as possible):

Create a Kubernetes cluster.
Use just one node.
Configure vpc-cni with WARM_IP_TARGET=1 and MINIMUM_IP_TARGET=6 (smaller than the free IP count in the VPC).
Create Pods one by one, up to MINIMUM_IP_TARGET - WARM_IP_TARGET (5).
Within 30s, delete one Pod and create two new Pods. The last Pod fails because no IP address can be assigned to it.
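
The steps above can be modeled with a small simulation. This is only a sketch under stated assumptions: the `Pool` class and its methods are hypothetical and stand in for the behavior described in this issue (cooling IPs count as available for the shortfall, but cannot be handed to Pods). It does not call or reproduce the real plugin.

```python
from dataclasses import dataclass, field

COOLDOWN_SECONDS = 30  # IP_COOLDOWN_PERIOD


@dataclass
class Pool:
    """Hypothetical model of one node's IP pool."""
    total: int
    warm_ip_target: int
    assigned: int = 0
    cooling_until: list = field(default_factory=list)  # cooldown expiry times

    def cooling(self, now: int) -> int:
        return sum(1 for t in self.cooling_until if t > now)

    def reconcile(self) -> None:
        # Shortfall ignores cooldown, as described in the issue.
        short = max(self.warm_ip_target - (self.total - self.assigned), 0)
        self.total += short  # stands in for allocating more IPs from the VPC

    def create_pod(self, now: int) -> bool:
        free = self.total - self.assigned - self.cooling(now)
        if free <= 0:
            return False  # no IP can be assigned to the Pod
        self.assigned += 1
        self.reconcile()
        return True

    def delete_pod(self, now: int) -> None:
        self.assigned -= 1
        self.cooling_until.append(now + COOLDOWN_SECONDS)
        self.reconcile()


pool = Pool(total=6, warm_ip_target=1)            # MINIMUM_IP_TARGET=6
initial = [pool.create_pod(now=0) for _ in range(5)]
pool.delete_pod(now=1)
first = pool.create_pod(now=2)                    # takes the last free IP
second = pool.create_pod(now=3)                   # only the cooling IP is left
total_at_failure = pool.total
after_cooldown = pool.create_pod(now=32)          # cooldown has expired
print(first, second, total_at_failure, after_cooldown)
```

In this model the pool never grows before the failure, because the cooling IP keeps "total - assigned" at or above WARM_IP_TARGET.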

Anything else we need to know?:

Environment: AWS EKS

Kubernetes version (use kubectl version): 1.29
CNI Version: 1.17


orsenthil · 2024-05-01T20:15:55Z

do these within 30s: delete one pod and create two new pod, the last pod will fail by can not assign IP address to pod

The default cooldown period is 30 seconds, after which the IPs in cooldown are made available to Pods again. Are you noticing that this isn't happening?

The reason for keeping the warm_ip_target is to ensure that VPC CNI doesn't make unnecessary EC2 API calls, which avoids running into API throttling.

If you would prefer to tune the IP cooldown period, it can be controlled using this flag: https://github.com/aws/amazon-vpc-cni-k8s/?tab=readme-ov-file#ip_cooldown_period-v1150

abbshr · 2024-05-02T11:30:27Z

I mean that during the cooldown period, a Pod can't get an IP even if there are free IPs in the VPC subnet, which I think is not the expected behavior. This happens because total IP count - assigned IP count > warm_ip_target, while assigned IP count + cooldown IP count = total IP count.


github-actions · 2024-08-26T00:03:51Z

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days

github-actions · 2024-09-09T00:04:21Z

Issue closed due to inactivity.

github-actions · 2024-11-13T00:04:00Z

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days

abbshr added the bug label Apr 26, 2024

orsenthil added the needs investigation label Jun 26, 2024

github-actions bot added the stale label (Issue or PR is stale) Aug 26, 2024

github-actions bot closed this as not planned (Won't fix, can't repro, duplicate, stale) Sep 9, 2024

yash97 reopened this Sep 12, 2024

github-actions bot removed the stale label (Issue or PR is stale) Sep 13, 2024

github-actions bot added the stale label (Issue or PR is stale) Nov 13, 2024
