IP addresses leaking when there are too many IPs in the cooldown pool #2896
Comments
The default cooldown period is 30 seconds, after which the IPs in cooldown are made available to pods again. Are you noticing that this isn't happening? The reason for keeping WARM_IP_TARGET is to ensure that VPC CNI doesn't make unnecessary EC2 API calls, which avoids running into API throttling. If tuning the IP cooldown period is preferable, it can be controlled with this flag - https://github.com/aws/amazon-vpc-cni-k8s/?tab=readme-ov-file#ip_cooldown_period-v1150
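As a rough illustration of what the cooldown period means for a freed address, here is a minimal Go sketch (the helper name `inCooldown` and the field `freedAt` are made up for this sketch, not the plugin's actual identifiers): an IP released at some point stays unusable until the cooldown period (30 s by default, tunable via IP_COOLDOWN_PERIOD) has passed.

```go
package main

import (
	"fmt"
	"time"
)

// inCooldown is a made-up helper for this sketch: an address freed at freedAt
// stays unusable until cooldownPeriod has elapsed.
func inCooldown(freedAt time.Time, cooldownPeriod time.Duration, now time.Time) bool {
	return now.Sub(freedAt) < cooldownPeriod
}

func main() {
	cooldown := 30 * time.Second // default; tunable via the IP_COOLDOWN_PERIOD env var
	freedAt := time.Now()

	fmt.Println(inCooldown(freedAt, cooldown, freedAt.Add(10*time.Second))) // true: still cooling down
	fmt.Println(inCooldown(freedAt, cooldown, freedAt.Add(40*time.Second))) // false: usable again
}
```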
I mean that during the cooldown period a pod can't get an IP even if there are free IPs in the VPC subnet, which I think is not the expected behavior.
This happens because total IP count - assigned IP count is still at least WARM_IP_TARGET (so ipamd sees no shortage), while assigned IP count + cooldown IP count = total IP count (so none of those "free" IPs can actually be handed to a pod).
> … do these within 30 s: delete one pod and create two new pods; the last pod fails because an IP address cannot be assigned to it
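To make that arithmetic concrete, here is a small sketch using the numbers reported in this issue (23 IPs attached, 19 assigned, 4 in cooldown, WARM_IP_TARGET = 2); the variable names are illustrative only.

```go
package main

import "fmt"

func main() {
	// Numbers reported in the logs attached to this issue; purely illustrative.
	totalAttached := 23
	assigned := 19
	coolingDown := 4 // assigned + coolingDown == totalAttached

	freeSeenByIpamd := totalAttached - assigned    // 4, still >= WARM_IP_TARGET (2) -> no new allocation
	usableRightNow := freeSeenByIpamd - coolingDown // 0 -> the next pod gets nothing

	fmt.Println(freeSeenByIpamd, usableRightNow) // prints: 4 0
}
```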
This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days
Issue closed due to inactivity.
What happened:
These are my ipamd warm pool settings:
During a stress test, we create a lot of Pods in a short time and delete them as soon as they become Ready.
This puts a lot of IP addresses into the cooldown state, and a newly created Pod cannot be assigned an IP until IP_COOLDOWN_PERIOD has elapsed.
Attach logs
Kubelet log:
The new pod cannot be assigned an IP address:
ipamd log:
You can see that more and more cooldown IPs appear, but `short` never goes above 0. When cooldown IPs = 4 and assigned IPs = 19, there is no free IP left in the ipamd pool, yet because 23 (MINIMUM_IP_TARGET) - 19 (assigned IPs) > 2 (WARM_IP_TARGET), ipamd will not allocate more free IPs from the VPC:
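For reference, here is a simplified sketch of the deficit ("short") check as described in this thread; the `shortfall` function name and its exact shape are my own simplification of that description, not the plugin's code verbatim.

```go
package main

import "fmt"

// shortfall is a simplified sketch of the deficit ("short") check described in
// this thread: more IPs are requested only when free (total - assigned) drops
// below WARM_IP_TARGET or total drops below MINIMUM_IP_TARGET. Addresses in
// cooldown are still counted as free here.
func shortfall(total, assigned, warmIPTarget, minimumIPTarget int) int {
	short := warmIPTarget - (total - assigned)
	if m := minimumIPTarget - total; m > short {
		short = m
	}
	if short < 0 {
		short = 0
	}
	return short
}

func main() {
	// Numbers from the logs above: total=23, assigned=19, WARM_IP_TARGET=2, MINIMUM_IP_TARGET=23.
	fmt.Println(shortfall(23, 19, 2, 23)) // prints 0
}
```

With these inputs the deficit is zero, so no EC2 call is made, which matches the behavior seen in the logs even though all four "free" IPs are in cooldown.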
After IP_COOLDOWN_PERIOD has passed, the new pod can be assigned an IP:
After reading the source code here:
I found that the vpc-cni plugin calculates it as follows:
and `AvailableAddress()` is implemented here as simply total attached IPs - assigned IPs. This is fine most of the time.
But if enough IPs are in the cooldown state (say cooldown count + assigned count = total attached count), there is still no free IP that can be assigned to a Pod; yet because "total - assigned" is not below WARM_IP_TARGET, ipamd won't try to allocate more IPs from the VPC.
In the end, the newly created pod cannot get an IP address even though there are plenty of free IPs in the VPC subnet.
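A minimal sketch of that distinction (the `pool` type and its methods are illustrative stand-ins, not the real datastore types): a count that only subtracts assigned addresses treats cooling-down addresses as free, while the count that matters for a brand-new pod also has to exclude them.

```go
package main

import "fmt"

// pool is an illustrative stand-in for the ENI/IP datastore, not the real type.
type pool struct {
	total, assigned, coolingDown int
}

// available mirrors the "total attached - assigned" count described above;
// it treats addresses in cooldown as free.
func (p pool) available() int { return p.total - p.assigned }

// assignable additionally excludes addresses still inside their cooldown window.
func (p pool) assignable() int { return p.total - p.assigned - p.coolingDown }

func main() {
	p := pool{total: 23, assigned: 19, coolingDown: 4}
	fmt.Println(p.available(), p.assignable()) // 4 0: looks healthy to the warm-pool check, yet nothing can be handed out
}
```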
What you expected to happen:
Pods should be assigned an IP as long as the VPC subnet has enough free IPs.
How to reproduce it (as minimally and precisely as possible):
MINIMUM_IP_TARGET - WARM_IP_TARGET
Anything else we need to know?:
Environment: AWS EKS
Kubernetes version (use `kubectl version`): 1.29