To roll back, you would need to use Bottlerocket’s settings API to lock a node to a given version. Brupop does some work to prevent errant updates from rolling out across the whole cluster: if the Brupop agent fails to come up on an updated node, that node never re-enters the “Idle” state from Brupop’s perspective. Since only a finite (configurable) number of nodes can be updated at a time, a complete update failure can affect at most that configured number of nodes.
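To make the bound concrete, here is a minimal sketch of that behavior as a toy simulation. It is not Brupop's code: `simulate_rollout`, `max_concurrent`, and `update_breaks_agent` are hypothetical names, and the model assumes only what is described above, namely that a node holds its update slot until it re-enters “Idle”.

```python
from collections import deque


def simulate_rollout(node_count, max_concurrent, update_breaks_agent):
    """Return how many nodes receive the update in this toy model.

    Assumption (hypothetical model, not Brupop's implementation): a node
    occupies one of `max_concurrent` slots from the moment it starts
    updating until its agent reports "Idle" again.
    """
    pending = deque(range(node_count))
    in_flight = 0  # slots held by nodes that have not returned to Idle
    updated = 0

    while pending and in_flight < max_concurrent:
        pending.popleft()
        updated += 1
        in_flight += 1
        if not update_breaks_agent:
            # The agent came back up, so the node is Idle and frees its slot.
            in_flight -= 1

    return updated


# A broken update stalls once every slot is held by a node that never returns.
print(simulate_rollout(node_count=10, max_concurrent=2, update_breaks_agent=True))
# A healthy update proceeds through the whole cluster.
print(simulate_rollout(node_count=10, max_concurrent=2, update_breaks_agent=False))
```

With 10 nodes and 2 concurrent slots, the broken update reaches 2 nodes and stops, while the healthy update reaches all 10.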
Unfortunately, this will not help in cases where an update doesn’t affect the node’s ability to boot or Brupop’s ability to manage the node, but does impair other applications. One helpful feature is that Brupop respects PodDisruptionBudgets, so using PDBs on an application can prevent Brupop from continuing to drain more hosts while already-updated ones are impaired.
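As a rough illustration of why a PDB helps here, the sketch below models the `minAvailable` rule: an eviction is permitted only if the application would still have at least `minAvailable` healthy pods afterwards. This is a simplified model, not the Kubernetes implementation, and `eviction_allowed` is a hypothetical helper name.

```python
def eviction_allowed(healthy_pods, min_available):
    """Simplified PDB check (assumed model of the minAvailable rule).

    Evicting one healthy pod is allowed only if at least `min_available`
    healthy pods would remain afterwards.
    """
    return healthy_pods - 1 >= min_available


# Application with 3 replicas and a PDB of minAvailable=2.
# The pod that landed on the updated node is unhealthy, so only 2 are healthy:
print(eviction_allowed(healthy_pods=2, min_available=2))  # drain of the next node blocks
# The pod on the updated node came back healthy, so all 3 are healthy:
print(eviction_allowed(healthy_pods=3, min_available=2))  # drain may proceed
```

In the first case the drain of the next host cannot evict the pod, so the rollout pauses for as long as the application on the updated node stays unhealthy.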
It may be helpful to consider a feature that allows Brupop to roll back nodes under configurable conditions. In practice, this would only work in cases where the Brupop agent itself comes back up but applications do not. We would need some way for users to “hook” their application health into Brupop.