Auto cache when new node added to cluster #213

Open
maacarbo opened this issue Apr 19, 2023 · 4 comments

Comments

maacarbo commented Apr 19, 2023

On AWS EKS, we make heavy use of auto-scaling clusters. It would be handy if the controller detected when a new node is spun up and immediately started caching the images on it.

@leonidkhelemes

+1

elocke commented May 3, 2023

+1. I kind of expected this to happen already.

@jaihwan104

+1

@djmcgreal-cc

My exact question, the top issue in the list!

This is likely a major use case in machine learning, where a) GPU nodes are more expensive, so clusters typically scale up and down often, and b) images are large.

In this auto-scale-up case, Pods are already waiting to be scheduled, so they will probably not be able to take advantage of the kube-fledged cache refresh loading images onto the new node (which I assume at least works?). Perhaps kube-fledged could be configured to manage a taint on newly provisioned nodes and remove it once the cached images have been loaded. In cluster-autoscaler, taints can be prefixed with ignore-taint.cluster-autoscaler.kubernetes.io/ so that they do not affect the selection of auto scaling groups.
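To make the taint idea concrete, here is a minimal sketch of the decision logic only. It is not kube-fledged code and does not talk to a cluster: `Node`, `reconcile`, and the `images-pending` taint suffix are hypothetical names invented for illustration, and it assumes the new node is registered with the taint already set.

```python
from dataclasses import dataclass, field

# The prefix comes from the comment above; the "images-pending" suffix is a
# hypothetical name chosen for this sketch.
TAINT_KEY = "ignore-taint.cluster-autoscaler.kubernetes.io/images-pending"


@dataclass
class Node:
    """Minimal stand-in for a cluster node (not the real Kubernetes object)."""
    name: str
    taints: set = field(default_factory=set)
    images: set = field(default_factory=set)  # images already pulled


def reconcile(node: Node, required_images: set) -> list:
    """Remove the taint once every required image is present on the node.

    Returns the images that are still missing, sorted for stable output.
    """
    missing = sorted(required_images - node.images)
    if not missing:
        # Cache is warm: let waiting Pods schedule onto the node.
        node.taints.discard(TAINT_KEY)
    return missing


# A freshly provisioned node that has pulled one of two cached images.
node = Node("gpu-1", taints={TAINT_KEY}, images={"registry/a:1"})
print(reconcile(node, {"registry/a:1", "registry/b:1"}), TAINT_KEY in node.taints)

# After the second image arrives, the taint is removed.
node.images.add("registry/b:1")
print(reconcile(node, {"registry/a:1", "registry/b:1"}), TAINT_KEY in node.taints)
```

A real controller would presumably watch for new nodes and run this check each time an image pull finishes; how that would fit into kube-fledged is left open here.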
