Describe the bug
As of cdd179b, the Machine controller blocks deletion when named disks are attached to the VM. This prevents Machine deletion in the following cases:
(a) The Cluster is being deleted, and the Node corresponding to the Machine is not drained (by design).
(b) The kubelet running on the Machine's corresponding VM is not responding, and therefore the CSI driver cannot unmount and detach the disks from the VM.
(c) The CSI driver is malfunctioning.
Reproduction steps
Delete the Machine corresponding to the Node where the Pod is scheduled.
...
Expected behavior
The Machine controller should allow deletion when named disk volumes are attached to the corresponding VM.
This means that the Machine controller should detach the named disk volumes on delete. Right now, to unblock the delete, some other API client (e.g. the CSI driver, or a tenant admin) must detach the disks from the VM. Please note that detachment by the Machine controller and detachment by another API client are equivalent with respect to data durability in scenarios (b) and (c).
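To make the requested behavior concrete, here is a minimal Go sketch of a delete path that detaches named disks first and only then deletes the VM. It is not the provider's actual code: the `DiskAPI` interface, `reconcileDelete`, and the in-memory `fakeCloud` are hypothetical names invented for illustration, and the real infrastructure client will look different.

```go
package main

import "fmt"

// DiskAPI is a hypothetical abstraction over the infrastructure API.
// None of these method names come from the real provider client.
type DiskAPI interface {
	AttachedNamedDisks(vmID string) ([]string, error)
	DetachNamedDisk(vmID, diskID string) error
	DeleteVM(vmID string) error
}

// reconcileDelete detaches every named disk that is still attached to the VM
// and then deletes the VM, instead of blocking until another client detaches them.
func reconcileDelete(api DiskAPI, vmID string) error {
	disks, err := api.AttachedNamedDisks(vmID)
	if err != nil {
		return fmt.Errorf("list named disks of VM %s: %w", vmID, err)
	}
	for _, d := range disks {
		if err := api.DetachNamedDisk(vmID, d); err != nil {
			return fmt.Errorf("detach named disk %s from VM %s: %w", d, vmID, err)
		}
	}
	return api.DeleteVM(vmID)
}

// fakeCloud is an in-memory stand-in used only to exercise the sketch.
type fakeCloud struct {
	attached map[string][]string // VM ID -> attached named disk IDs
	deleted  map[string]bool     // VM ID -> whether the VM was deleted
}

func (f *fakeCloud) AttachedNamedDisks(vmID string) ([]string, error) {
	// Return a copy so callers can iterate while disks are being detached.
	return append([]string(nil), f.attached[vmID]...), nil
}

func (f *fakeCloud) DetachNamedDisk(vmID, diskID string) error {
	var kept []string
	found := false
	for _, d := range f.attached[vmID] {
		if d == diskID {
			found = true
			continue
		}
		kept = append(kept, d)
	}
	if !found {
		return fmt.Errorf("disk %s is not attached to VM %s", diskID, vmID)
	}
	f.attached[vmID] = kept
	return nil
}

func (f *fakeCloud) DeleteVM(vmID string) error {
	// Mimic the blocking behavior: a VM with attached disks cannot be deleted.
	if len(f.attached[vmID]) > 0 {
		return fmt.Errorf("VM %s still has named disks attached", vmID)
	}
	delete(f.attached, vmID)
	f.deleted[vmID] = true
	return nil
}

func main() {
	cloud := &fakeCloud{
		attached: map[string][]string{"vm-1": {"disk-a", "disk-b"}},
		deleted:  map[string]bool{},
	}
	if err := reconcileDelete(cloud, "vm-1"); err != nil {
		fmt.Println("delete failed:", err)
		return
	}
	fmt.Println("VM deleted:", cloud.deleted["vm-1"])
}
```

In this sketch a failed detach stops the delete and returns the error, so a controller built this way would presumably retry on the next reconcile rather than wait for another client to act.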
Although we could try to address scenario (a) by changing Cluster API, the change would be limited by the design of drain. Namely, drain does not evict any Pods managed by a DaemonSet. So, if such a Pod uses persistent storage, the named disk would not be detached from the VM. In any case, graceful termination is not a guarantee, and applications cannot expect it.
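The drain limitation can be illustrated with a small Go sketch. The `Pod` struct and both helper functions are simplified, hypothetical stand-ins (they are not Kubernetes or Cluster API types); the only behavior they model is the one stated above, namely that drain leaves DaemonSet-managed Pods in place.

```go
package main

import "fmt"

// Pod is a simplified, hypothetical stand-in for a Kubernetes Pod.
type Pod struct {
	Name                 string
	OwnerKind            string // e.g. "ReplicaSet", "StatefulSet", "DaemonSet"
	UsesPersistentVolume bool
}

// podsLeftAfterDrain returns the Pods that a drain leaves on the Node,
// modeling only the rule that DaemonSet-managed Pods are not evicted.
func podsLeftAfterDrain(pods []Pod) []Pod {
	var left []Pod
	for _, p := range pods {
		if p.OwnerKind == "DaemonSet" {
			left = append(left, p)
		}
	}
	return left
}

// anyUsesPersistentVolume reports whether any of the given Pods still uses
// persistent storage, in which case its disk would stay attached to the VM.
func anyUsesPersistentVolume(pods []Pod) bool {
	for _, p := range pods {
		if p.UsesPersistentVolume {
			return true
		}
	}
	return false
}

func main() {
	pods := []Pod{
		{Name: "web-0", OwnerKind: "StatefulSet", UsesPersistentVolume: true},
		{Name: "log-agent", OwnerKind: "DaemonSet", UsesPersistentVolume: true},
	}
	left := podsLeftAfterDrain(pods)
	fmt.Println("Pods left after drain:", len(left))
	fmt.Println("disk still attached:", anyUsesPersistentVolume(left))
}
```

Here the StatefulSet Pod is evicted, but the DaemonSet Pod remains and keeps its volume in use, so a delete that waits for all disks to be detached would still be blocked after a successful drain.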
For comparison, the AWS, Azure, GCP, and vSphere infrastructure providers do not block Machine delete in this case.
Also, for automated machine management, it would be very helpful if the Machine controller allowed Machine deletion when named disks are attached. This would make it possible to recover automatically from kubelet or CSI driver failures, or from any other scenario where draining the Node is not possible.
/cc @erkanerol
Additional context
Original discussion: https://kubernetes.slack.com/archives/C04JFT7GDGR/p1681728672112079