v1: how to expire nodes only during certain times? #7122

Open
mikkoc opened this issue Sep 30, 2024 · 2 comments
Labels
triage/needs-information: marks that the issue still needs more information to properly triage

Comments

mikkoc commented Sep 30, 2024

In 0.37 this config allows us to expire nodes only during weekdays/working hours:

disruption:
  consolidationPolicy: WhenUnderutilized
  expireAfter: 48h
  # Number of nodes: disrupt 1 node at a time, only during weekdays, from 10 to 18 UTC.
  # Schedule: zero disruptions (0) are allowed starting at 8am UTC on Friday, lasting 3 days + 2hrs.
  budgets:
  - nodes: "1"
    schedule: "0 10 * * mon-thu"
    duration: 8h
  - nodes: "0"
    schedule: "0 8 * * fri"
    duration: 74h

After migrating to v1 the config looks more like this:

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: generic
spec:
  template:
    spec:
      nodeClassRef:
        apiVersion: karpenter.k8s.aws/v1
        kind: EC2NodeClass
        name: generic
      expireAfter: 48h   ### I don't want this to take effect during nights and weekends.

  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 30m
    budgets:
    - nodes: "1"
      schedule: "0 10 * * mon-thu"
      duration: 8h
    - nodes: "0"
      schedule: "0 8 * * fri"
      duration: 74h

With this config, nodes are also expired during weekends. How can I keep the Karpenter 0.37 behaviour in v1? The documentation does not make it clear to me how to achieve this.

Thanks

jmdeal (Contributor) commented Sep 30, 2024

This isn't currently possible to achieve with expiration on v1. Expiration was changed from a graceful to a forceful disruption method in v1 (RFC: kubernetes-sigs/karpenter#1303), which means it can no longer be restricted by disruption budgets. This change makes it possible to enforce strict maximum node lifetimes through expiration and termination grace period. Graceful disruption methods (i.e. drift and consolidation) are still subject to disruption budgets. I'm wondering if your goals can be better met by drift. What are you trying to achieve by rolling nodes roughly every 48 hours?
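
To make the split described above concrete, here is a rough sketch of a v1 NodePool `spec` fragment. It is illustrative only and not taken from the issue: the `terminationGracePeriod` value and the explicit `reasons` list are assumptions added for the example, and required fields such as `nodeClassRef` and `requirements` are left out. It assumes that `expireAfter` and `terminationGracePeriod` sit under `spec.template.spec` in v1 and that budgets accept the reasons `Drifted`, `Underutilized` and `Empty`.

```yaml
# Sketch only: fragment of a karpenter.sh/v1 NodePool spec.
spec:
  template:
    spec:
      # Forceful in v1: not gated by the budgets below.
      expireAfter: 48h
      # Assumed value for illustration: upper bound on how long draining may
      # take once a node starts terminating.
      terminationGracePeriod: 1h
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 30m
    budgets:
    # Budgets only constrain graceful disruption (drift and consolidation).
    # Listing the reasons is optional; they are spelled out here to show
    # that expiration is not among them.
    - nodes: "1"
      reasons: ["Drifted", "Underutilized", "Empty"]
      schedule: "0 10 * * mon-thu"
      duration: 8h
    - nodes: "0"
      reasons: ["Drifted", "Underutilized", "Empty"]
      schedule: "0 8 * * fri"
      duration: 74h
```

Under this reading, the two schedules still limit drift and consolidation as in 0.37, while the 48h expiry fires regardless of the day or hour.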

jmdeal added the triage/needs-information label Sep 30, 2024

mikkoc (Author) commented Oct 1, 2024

From the Karpenter 0.37 NodePool example config: https://karpenter.sh/v0.37/concepts/nodepools/

# Avoiding long-running Nodes helps to reduce security vulnerabilities as well as to reduce the chance of issues that can plague Nodes with long uptimes such as file fragmentation or memory leaks from system processes

Ever since I started working with Kubernetes, I have been taught to avoid long-running nodes. So even after all these years I try to keep my nodes "fresh" by rotating them.

If you have an alternative mechanism to achieve this with Karpenter v1, please share. I would like to avoid having my production nodes reach 2 weeks of life (that's how often we deploy). At the same time, I'd like to retain more control over when nodes rotate.
