Karpenter overestimates memory capacity of certain node types #7060

Open
JacobHenner opened this issue Sep 24, 2024 · 0 comments
Labels
bug (Something isn't working) · needs-triage (Issues that need to be triaged)

Description

Observed Behavior:

Karpenter overestimates the memory capacity of certain node types. When this happens, a pod whose memory request falls in a particular range can trigger Karpenter to scale up a node that has too little memory for the pending pod to be scheduled. Observing that the pod still isn't scheduled on the newly started node, Karpenter repeatedly launches similar nodes with the same result.

In addition to preventing pods from scheduling, this issue has caused us to incur additional costs from third-party integrations that charge by node count, since the repeated erroneous scale-ups inflate the node-count metrics used for billing.

In our case, we noticed this with c6g.medium instances running Bottlerocket (using the unmodified AWS-provided AMI). It's possible that Karpenter overestimates the capacity of other instance types and distributions as well, but we have not confirmed this independently. We also have not yet compared the capacity values of c6g nodes running AL2 vs. Bottlerocket.

Expected Behavior:

  • Karpenter should never overestimate the capacity/allocatable of a node (using the default value of VM_MEMORY_OVERHEAD_PERCENT, at least across all unmodified AWS-provided non-custom AMI families).
  • If this type of situation does occur, Karpenter should not continuously provision new nodes.

We are aware that this risk is called out in the troubleshooting guide:

A VM_MEMORY_OVERHEAD_PERCENT which results in Karpenter overestimating the memory available on a node can result in Karpenter launching nodes which are too small for your workload.
In the worst case, this can result in an instance launch loop and your workload remaining unschedulable indefinitely.

But I think the default should be suitable for all of the AWS-supported non-custom AMI families across instance types and sizes. If this isn't feasible, then perhaps this value should not be a global setting, and should vary by AMI family and instance type/size.
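
For context, this is currently a single global knob. For example, via the Helm chart it is set once for the whole cluster (the key name below is from the aws/karpenter v1.x chart; verify it against your chart version):

settings:
  vmMemoryOverheadPercent: 0.075  # one value applied to every instance type and AMI family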

Reproduction Steps (Please include YAML):

The following steps do not reproduce the scale-up loop end to end, but they do demonstrate the overestimation:

  • Create an EC2NodeClass and NodePool for c6g.medium Bottlerocket instances (a minimal sketch follows this list)
  • Trigger a scale-up of this NodePool
  • Compare the capacity and allocatable values of the NodeClaim vs. the resulting Node, noting that the NodeClaim reports larger memory capacity/allocatable values than the Node object
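
A minimal sketch of the kind of EC2NodeClass/NodePool involved (the resource names, role, and discovery tags below are placeholders, not our actual configuration):

apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: bottlerocket-c6g            # placeholder name
spec:
  amiSelectorTerms:
    - alias: bottlerocket@latest    # unmodified AWS-provided Bottlerocket AMI
  role: KarpenterNodeRole-example   # placeholder node IAM role
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: example-cluster   # placeholder discovery tag
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: example-cluster   # placeholder discovery tag
---
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: bottlerocket-c6g            # placeholder name
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: bottlerocket-c6g
      requirements:
        - key: node.kubernetes.io/instance-type
          operator: In
          values: ["c6g.medium"]
        - key: kubernetes.io/arch
          operator: In
          values: ["arm64"]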

Example from our case:

NodeClaim:

status:
  allocatable:
    cpu: 940m
    ephemeral-storage: 89Gi
    memory: 1392Mi
    pods: "8"
    vpc.amazonaws.com/pod-eni: "4"
  capacity:
    cpu: "1"
    ephemeral-storage: 100Gi
    memory: 1835Mi
    pods: "8"
    vpc.amazonaws.com/pod-eni: "4"

Node:

  allocatable:
    cpu: 940m
    ephemeral-storage: "95500736762"
    hugepages-1Gi: "0"
    hugepages-2Mi: "0"
    hugepages-32Mi: "0"
    hugepages-64Ki: "0"
    memory: 1419032Ki
    pods: "8"
  capacity:
    cpu: "1"
    ephemeral-storage: 102334Mi
    hugepages-1Gi: "0"
    hugepages-2Mi: "0"
    hugepages-32Mi: "0"
    hugepages-64Ki: "0"
    memory: 1872664Ki
    pods: "8"

Note that 1879040Ki [1835Mi] (NodeClaim) > 1872664Ki (Node).

The default value of VM_MEMORY_OVERHEAD_PERCENT (0.075) is in use for this example.
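
To illustrate the size of the gap, working backwards under the assumption (not verified against Karpenter's exact rounding) that the NodeClaim memory capacity is roughly reported-MiB * (1 - VM_MEMORY_OVERHEAD_PERCENT):

  1835Mi / (1 - 0.075)  ≈ 1984Mi  (implied EC2-reported memory for c6g.medium)
  1872664Ki / 1024      ≈ 1829Mi  (actual Node memory capacity)
  1 - 1829 / 1984       ≈ 0.078   (overhead fraction needed to avoid overestimating this node)

In other words, under that assumption the real overhead for this instance type/AMI combination is closer to 7.8%, so the 0.075 default falls just short.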

Versions:

  • Karpenter Version: 1.0.1
  • Kubernetes Version (kubectl version): 1.28, 1.30