Problem Statement
Using KTF outside CI can leak GKE clusters when people forget to delete them after use. To avoid unnecessary cloud spend, we'd like clusters to be ephemeral by default unless marked otherwise.
Proposed Solution
Add a GKE cluster label indicating a deletion date.
Default this label to three days after creation.
Provide a flag allowing users to override the default.
Add a scheduled task to delete clusters past their expiration.
Additional information
Elsewhere we run cleanup jobs as part of CI in projects that use KTF. There isn't really a great CI location for this scheduled deletion, since this is intended for clusters created outside CI. I don't think it makes sense to run this in the KTF repo itself.
GCP provides Cloud Scheduler (https://cloud.google.com/scheduler), which includes 3 free jobs per month ($0.10 per job per month after that). We should use it as a project-agnostic deletion method.
We may want a "never" option, but it's simpler to use only dates as label values, and we should probably discourage clusters with indefinite lifetimes anyway. If you really, really want to avoid the cleanup job, setting the duration to 9999 days (or something similarly ridiculous) or manually removing the label should be sufficient.
Acceptance Criteria
KTF's GKE cluster creation utility provides a CLI flag that takes a number of days as input and sets an expiration date label.
The CLI flag defaults to 3 days.
We provide an example GKE Cloud Scheduler configuration that deletes clusters past expiration.
We run the scheduler job in our own GKE project.
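The first two criteria could be sketched as follows. The flag name `--expire-days` is hypothetical, and the example uses Go's standard `flag` package for brevity, which is not necessarily how KTF's CLI is wired up.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// parseExpireDays reads a hypothetical --expire-days flag from args and
// returns the requested cluster lifetime in days, defaulting to 3.
func parseExpireDays(args []string) (int, error) {
	fs := flag.NewFlagSet("ktf", flag.ContinueOnError)
	days := fs.Int("expire-days", 3, "days until the cluster is eligible for cleanup")
	if err := fs.Parse(args); err != nil {
		return 0, err
	}
	// Reject zero and negative lifetimes, which would expire immediately.
	if *days < 1 {
		return 0, fmt.Errorf("--expire-days must be at least 1, got %d", *days)
	}
	return *days, nil
}

func main() {
	days, err := parseExpireDays(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Println("cluster lifetime in days:", days)
}
```

Run without arguments, this reports a lifetime of 3 days; passing `--expire-days=9999` gives the effectively indefinite lifetime mentioned earlier.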