Karpenter is always disrupting node via drift after upgrading to 0.37.3 from #7049
Comments
How often are you seeing drift occur? Also, do you have Karpenter logs from when drift occurred? You should see something generated from: https://github.com/kubernetes-sigs/karpenter/blob/v0.37.3/pkg/controllers/nodeclaim/disruption/drift.go#L82
The drift event happens every 24h and affects all nodes, even though I haven't made any changes to the NodePool/EC2NodeClass. I also didn't find any related logs from https://github.com/kubernetes-sigs/karpenter/blob/v0.37.3/pkg/controllers/nodeclaim/disruption/drift.go#L82. Is it possible that the logs come from this code instead? https://github.com/kubernetes-sigs/karpenter/blob/04a921c00ad837c5c82fe190f4b6c39f4dffe6fa/pkg/controllers/disruption/controller.go#L151
Also, on version 0.37.3 drift can be enabled or disabled, but in 1.0.0 that option was dropped and drift cannot be disabled. This is my concern about upgrading Karpenter to 1.0.0.
This problem is also occurring here. Chart Version: 0.37.0
Any updates, @rschalo?
The logs @rschalo was looking for are only available if you had debug logging enabled, and it doesn't look like either of you did. Given that both of your examples happened shortly after an EKS-optimized AMI release, I suspect the nodes drifted because of that. To know for sure, we would need to see debug logs, the status conditions on the drifted NodeClaims, or the reason from the
I would also like to clarify that this isn't accurate. Drift is a separate disruption mechanism from consolidation and expiration, so these fields are not intended to have any effect on it. As you found, you can disable drift globally through the feature gate pre-v1, and post-v1 you can effectively disable it per NodePool via disruption budgets.
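For anyone landing here, a minimal sketch of the per-NodePool approach on v1 might look like the fragment below. It assumes the karpenter.sh/v1 NodePool API with the `reasons` field on disruption budgets; the NodePool name `default` and the 10% budget are placeholders, and the required `spec.template` section is left out for brevity, so this is not a complete manifest.

```yaml
# Sketch only: a NodePool fragment that blocks drift-driven disruption.
# Assumes the karpenter.sh/v1 API; "default" is a placeholder name.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  # spec.template (requirements, nodeClassRef, expireAfter, ...) omitted here.
  disruption:
    budgets:
      # Allow zero nodes to be disrupted at a time for the Drifted reason.
      - nodes: "0"
        reasons:
          - Drifted
      # Placeholder budget with no reasons listed, so it applies to all
      # reasons; adjust or remove to match your own policy.
      - nodes: "10%"
```

When several budgets match a reason, the most restrictive one is expected to apply, so drifted nodes would stay in place until the zero-node budget is relaxed.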
Description
Observed Behavior:
Karpenter is always disrupting nodes via drift, even though the configuration sets expireAfter: Never and consolidateAfter: Never.
This issue started after I upgraded from 0.32.10 to 0.37.3, and it happens to all nodes in the cluster.
Logs
Expected Behavior:
Karpenter should not disrupt nodes when I have defined
Reproduction Steps (Please include YAML):
NodePool
NodeClass
Versions:
kubectl version: v1.28.12-eks-a18cd3a