This repository has been archived by the owner on Nov 17, 2022. It is now read-only.

k8s-spot-rescheduler doesn't handle pod disruption budgets nicely, leaving nodes underutilized and tainted #54

Open
morganwalker opened this issue Jan 17, 2019 · 1 comment


We're using kops 1.10.0 and k8s 1.10.11. We're using two separate instance groups (IG), nodes (on-demand) and spots (spot), both spread across 3 availability zones. I've applied the appropriate nodeLabels and have defined the following in my k8s-spot-rescheduler deployment manifest:

- --on-demand-node-label=on-demand
- --spot-node-label=spot

The nodes IG has the spot=false:PreferNoSchedule taint so the spots IG is preferred. I'm using the cluster autoscaler to autodiscover both IGs via --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,kubernetes.io/cluster/kubernetes.metis.wtf, and these tags exist on both IGs. I've confirmed that pods on most nodes in the nodes IG can be drained and moved to nodes in the spots IG, with one exception:

  • k8s-spot-rescheduler picks a node and states

    moved. Will drain node.
    

    which isn't true

  • It then figures out it's unable to drain the node due to PDBs

    E0117 14:03:51.801764       1 rescheduler.go:302] Failed to drain node: Failed to drain node /ip-172-20-61-39.ec2.internal, due to following errors: [Failed to evict pod skafos-notebooks/hub-deployment-cf799d494-gp6z4 within allowed timeout (last error: Cannot evict pod as it would violate the pod's disruption budget.)]
    

    and aborts the drain.

Now we're left with an on-demand node that has had all of its pods evicted except those protected by PDBs, leaving the on-demand node underutilized and tainted with ToBeDeletedByClusterAutoscaler. It seems like the rescheduler should first check whether it can drain all pods on the node, taking PDBs into consideration, and if it can't, it should neither evict any pods nor apply the ToBeDeletedByClusterAutoscaler taint.
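
To make the suggestion concrete, here is a rough sketch of the kind of pre-check I have in mind. It is written in Python for brevity (the rescheduler itself is Go) and is not taken from the rescheduler's code. The `Pod` type, `can_drain`, and `plan_drain` are hypothetical names, and `disruptions_allowed` is assumed to be a snapshot of each PDB's remaining allowed disruptions (as I understand it, the same idea as the PDB's `status.disruptionsAllowed`). It also assumes each pod is covered by at most one PDB.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Pod:
    """Hypothetical, simplified view of a pod on the candidate node."""
    name: str
    pdb: Optional[str] = None  # name of the PDB covering this pod, if any


def can_drain(pods: List[Pod], disruptions_allowed: Dict[str, int]) -> bool:
    """Dry run: True only if every pod could be evicted without violating a PDB."""
    remaining = dict(disruptions_allowed)  # copy, so the dry run has no side effects
    for pod in pods:
        if pod.pdb is None:
            continue  # no budget protects this pod
        if remaining.get(pod.pdb, 0) < 1:
            return False  # evicting this pod would violate its budget
        remaining[pod.pdb] -= 1  # several pods may share one budget
    return True


def plan_drain(pods: List[Pod], disruptions_allowed: Dict[str, int]) -> List[str]:
    """Return the pods to evict, or an empty list if the node should be left alone."""
    if not can_drain(pods, disruptions_allowed):
        return []  # all or nothing: no evictions and no taint
    return [pod.name for pod in pods]
```

With a check like this, a node carrying a pod whose budget is already exhausted would be skipped entirely rather than half-drained. The snapshot can go stale between the check and the actual evictions, so a real implementation would still need to handle eviction failures.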
