Current and expected behavior
@xeniape ran into an issue (Stackable employees: see Slack) where pods were left running with expired certificates instead of being evicted by commons-op as expected. Restarting commons-op evicted the pods, as expected.
Our current working hypothesis here is that commons-op's re-reconciliation timer didn't advance while the computer was suspended, causing the eviction to be delayed by the same amount of time.
Possible solution
Either:
- Change the timer to use wall time instead of monotonic/CPU time
- Cap the re-reconciliation timer, causing spurious reconciles but at least limiting the issue
- Make the timer automatically expire when resuming from suspend
Either way, we should probably also communicate upstream with kube-rs and either fix it there or highlight the issue somehow.
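A minimal sketch of how the first two options could be combined, using only the Rust standard library. It is not taken from commons-op or kube-rs: `requeue_delay`, `is_expired`, and `MAX_REQUEUE` are hypothetical names, and the 5 minute cap is an arbitrary assumption. The idea is to decide expiry by comparing wall-clock timestamps (`SystemTime`) rather than relying on an elapsed monotonic timer, and to cap the delay before the next reconcile so that a timer stalled by suspend can only postpone the eviction by a bounded amount after resume.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical upper bound on the time between two reconciles (assumed value).
const MAX_REQUEUE: Duration = Duration::from_secs(300);

/// Wall-clock check: the certificate counts as expired once `now` has reached
/// `expires_at`, regardless of how long any timer actually slept.
fn is_expired(now: SystemTime, expires_at: SystemTime) -> bool {
    now >= expires_at
}

/// Delay until the next reconcile: the wall-clock time remaining until
/// `expires_at`, but never more than `cap`.
fn requeue_delay(now: SystemTime, expires_at: SystemTime, cap: Duration) -> Duration {
    // `duration_since` returns Err if `expires_at` is earlier than `now`,
    // which means the deadline has already passed: reconcile immediately.
    let remaining = expires_at.duration_since(now).unwrap_or(Duration::ZERO);
    remaining.min(cap)
}
```

In a kube-rs reconciler, a delay computed like this would presumably be handed to `Action::requeue`. Each reconcile would then re-check `is_expired` against `SystemTime::now()`, at the cost of the spurious reconciles mentioned above.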
Additional context
No response
Environment
No response
Would you like to work on fixing this bug?
None
Affected Stackable version
dev (24.11 prerelease)