Getting "Insufficient vpc.amazonaws.com/pod-eni" error with prefix mode enabled with a nitro based instance #3112

Open
uyilmaz opened this issue Nov 13, 2024 · 3 comments

Comments

@uyilmaz

uyilmaz commented Nov 13, 2024

What happened:

I have an EKS cluster with a single worker node of type r6g.medium. I want to run many small pods on it, so I set ENABLE_PREFIX_DELEGATION to true to increase the number of available IP addresses. I'm also using security groups for pods.

In the node events I can see that the trunk interface is attached:

Controller
  Normal   ControllerVersionNotice  9m19s   vpc-resource-controller  The node is managed by VPC resource controller version v1.4.10
  Normal   NodeReady                9m18s   kubelet                  Node ip-x-x-x-x.x.compute.internal status is now: NodeReady
  Normal   NodeTrunkInitiated       9m15s   vpc-resource-controller  The node has trunk interface initialized successfully

In the ipamd.log file I can see these lines:

"msg":"Instance supports Prefix Delegation"}
{"level":"info","ts":"2024-11-13T05:30:43.235Z","caller":"ipamd/ipamd.go:380","msg":"Prefix Delegation enabled true"}
{"level":"debug","ts":"2024-11-13T05:30:43.235Z","caller":"ipamd/ipamd.go:385","msg":"Start node init"}
{"level":"debug","ts":"2024-11-13T05:30:43.235Z","caller":"ipamd/ipamd.go:2270","msg":"max prefix 3 max ips 48"}
{"level":"debug","ts":"2024-11-13T05:30:43.235Z","caller":"ipamd/ipamd.go:400","msg":"Max ip per ENI 48 and max prefixes per ENI 3"
...
{"level":"debug","ts":"2024-11-13T09:35:43.766Z","caller":"ipamd/ipamd.go:1283","msg":"ENI eni-xxx cannot be deleted because it is primary"}
{"level":"debug","ts":"2024-11-13T09:35:43.766Z","caller":"ipamd/ipamd.go:1283","msg":"ENI eni-yyy cannot be deleted because it is a trunk ENI"}
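
The `max prefix 3 max ips 48` line is consistent with simple arithmetic: in prefix mode, each secondary IP slot on an ENI holds a /28 prefix (16 addresses) instead of a single address. A minimal sketch of that calculation, assuming r6g.medium exposes 4 IPv4 slots per ENI and that one of them is taken by the ENI's primary address (the function name is hypothetical and not part of ipamd):

```python
IPS_PER_PREFIX = 16  # a /28 IPv4 prefix holds 2**(32 - 28) addresses


def prefix_capacity_per_eni(ipv4_slots_per_eni: int) -> tuple[int, int]:
    """Return (max prefixes, max pod IPs) for one ENI in prefix mode."""
    # One slot is used by the ENI's own primary IP; the rest can hold prefixes.
    max_prefixes = ipv4_slots_per_eni - 1
    return max_prefixes, max_prefixes * IPS_PER_PREFIX


# Assumption: 4 IPv4 slots per ENI on r6g.medium.
print(prefix_capacity_per_eni(4))  # (3, 48)
```

This only describes IP address capacity; it says nothing about how many branch interfaces the node can host.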

There are currently 11 pods in the Running state, including AWS system pods like aws-node. The 12th pod I deploy gets stuck in the Pending state with these events:

Warning  FailedScheduling        21m (x5 over 41m)    default-scheduler        0/1 nodes are available: 1 Insufficient vpc.amazonaws.com/pod-eni. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.
Warning  FailedCreatePodSandBox  17m                  kubelet                  Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "xxx": plugin type="aws-cni" name="aws-cni" failed (add): add cmd: failed to assign an IP address to container
Normal   SecurityGroupRequested  15m (x4 over 17m)    vpc-resource-controller  Pod will get the following Security Groups [sg-xxx]
Normal   ResourceAllocated       15m                  vpc-resource-controller  Allocated [{"eniId":"eni-x","ifAddress"xx:"0a:59:38:6c:b7:c7","privateIp":"10.9.194.205","ipv6Addr":"","vlanId":4,"subnetCidr":"10.9.194.0/23","subnetV6Cidr":""}] to the pod

The CNI Metrics Helper shows these stats in CloudWatch:

image

Environment:

  • Kubernetes version (use kubectl version): v1.31.2-eks-7f9249a
  • CNI Version: v1.18.6-eksbuild.1 (aws-network-policy-agent:v1.1.4-eksbuild.1)
  • OS (e.g: cat /etc/os-release): Amazon Linux 2
  • Kernel (e.g. uname -a): Linux ip-x-x-xxx-xx.ap-northeast-1.compute.internal x.xx.xxx-xxx.xxx.amzn2.aarch64 #1 SMP Tue Oct 22 16:38:25 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux
@yash97
Contributor

yash97 commented Nov 20, 2024

r6g.medium supports 4 branch interfaces (see https://github.com/aws/amazon-vpc-resource-controller-k8s/blob/master/pkg/aws/vpc/limits.go#L9658), so you can deploy at most 4 pods that use security groups on that node. Let us know if this condition is not satisfied.
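
The `Insufficient vpc.amazonaws.com/pod-eni` event follows from this limit: each pod that uses security groups requests one unit of the `vpc.amazonaws.com/pod-eni` resource, and the node only has as many units as it has branch interfaces. A rough sketch of that accounting, where the function and table names are illustrative (not the scheduler's or the controller's actual code) and the only value taken from this thread is 4 for r6g.medium:

```python
# Illustrative table; the single entry is the value quoted in this thread.
BRANCH_INTERFACE_LIMITS = {"r6g.medium": 4}


def schedule_sg_pods(instance_type: str, pod_names: list[str]) -> dict[str, str]:
    """Give each pod one branch interface until the node runs out."""
    free = BRANCH_INTERFACE_LIMITS[instance_type]
    result = {}
    for name in pod_names:
        if free > 0:
            free -= 1
            result[name] = "Scheduled"
        else:
            result[name] = "Pending: Insufficient vpc.amazonaws.com/pod-eni"
    return result


pods = [f"sg-pod-{i}" for i in range(1, 6)]
for name, status in schedule_sg_pods("r6g.medium", pods).items():
    print(name, status)
```

With five such pods, the first four are placed and the fifth stays Pending, regardless of how many IP addresses are free.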

@uyilmaz
Author

uyilmaz commented Nov 22, 2024

@yash97 Thanks for answering!

Doesn't prefix delegation increase that limit? Does it only help when pods don't use security groups?

@orsenthil
Member

@uyilmaz, prefix delegation increases only the number of IP addresses. Pods that use security groups each take a branch interface, so you are limited by the number of branch interfaces your instance supports.
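
Put differently, the two limits are independent of each other. A small sketch using the figures from this thread (48 IPs from the ipamd log, 4 branch interfaces for r6g.medium), assuming only those 48 addresses are usable by pods without security groups and ignoring other caps such as the kubelet's max-pods setting; the class and field names are hypothetical:

```python
from dataclasses import dataclass


@dataclass
class NodeCapacity:
    prefix_mode_ips: int    # IPs available to pods without security groups
    branch_interfaces: int  # one is consumed per pod that uses security groups

    def fits(self, plain_pods: int, sg_pods: int) -> bool:
        """True only if each pod group fits within its own separate limit."""
        return (plain_pods <= self.prefix_mode_ips
                and sg_pods <= self.branch_interfaces)


node = NodeCapacity(prefix_mode_ips=48, branch_interfaces=4)
print(node.fits(plain_pods=20, sg_pods=4))  # True
print(node.fits(plain_pods=0, sg_pods=5))   # False
```

Raising `prefix_mode_ips` never changes the second result, which is the point made above.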
