Pod expiration drifts when system is suspended #302

Open
nightkr opened this issue Nov 13, 2024 · 0 comments
nightkr commented Nov 13, 2024

Affected Stackable version

dev (24.11 prerelease)

Current and expected behavior

@xeniape ran into an issue (Stackable employees: see Slack) where pods were left running with expired certificates after a while, instead of being evicted by commons-op as expected. Restarting commons-op evicted the pods.

Our current working hypothesis is that commons-op's re-reconciliation timer did not advance while the computer was suspended, so the eviction was delayed by the length of the suspension.
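To make the hypothesis concrete, here is a toy model of a timer whose clock pauses during suspend. It is only a sketch: `Phase` and `wall_time_until_fired` are hypothetical names made up for illustration, and the pausing behavior is the assumption under discussion, not something verified against kube-rs.

```rust
use std::time::Duration;

/// One stretch of time in the machine's life (hypothetical model type).
enum Phase {
    /// The machine is running; both wall time and the timer's clock advance.
    Awake(Duration),
    /// The machine is suspended; only wall time advances (the assumption).
    Suspended(Duration),
}

/// Returns the wall time that has passed when a timer set for `delay` fires,
/// or `None` if it does not fire within the given schedule.
fn wall_time_until_fired(delay: Duration, schedule: &[Phase]) -> Option<Duration> {
    let mut wall = Duration::ZERO; // real elapsed time
    let mut timer = Duration::ZERO; // time as seen by the timer's clock
    for phase in schedule {
        match phase {
            Phase::Awake(span) => {
                // `timer` is always below `delay` here, so this cannot underflow.
                let needed = delay - timer;
                if *span >= needed {
                    return Some(wall + needed);
                }
                wall += *span;
                timer += *span;
            }
            // Suspend time counts toward wall time but not toward the timer.
            Phase::Suspended(span) => wall += *span,
        }
    }
    None
}
```

In this model, a 60 minute timer that sees 30 minutes awake, an 8 hour suspend, and then more awake time fires 9 hours after it was set, which is 8 hours late.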

Possible solution

One of:

  1. Change the timer to use wall-clock time instead of monotonic time
  2. Cap the re-reconciliation timer, which causes spurious reconciles but at least bounds the delay
  3. Make the timer expire automatically when resuming from suspend
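Options 1 and 2 could be combined roughly as follows. This is a minimal sketch using only the standard library, not the actual commons-op or kube-rs code: `NextStep`, `next_step`, and the 5 minute `MAX_REQUEUE` value are all hypothetical. The idea is to recompute the remaining time from a wall-clock deadline on every reconcile and never wait longer than the cap.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical upper bound on how long to wait before checking again.
const MAX_REQUEUE: Duration = Duration::from_secs(5 * 60);

/// What the reconciler should do next for a pod (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
enum NextStep {
    EvictNow,
    RequeueAfter(Duration),
}

/// Decides the next step from the wall-clock expiry time and the current time.
fn next_step(expires_at: SystemTime, now: SystemTime) -> NextStep {
    match expires_at.duration_since(now) {
        // Still time left: wait for the remainder, but never longer than the cap.
        Ok(remaining) if !remaining.is_zero() => {
            NextStep::RequeueAfter(remaining.min(MAX_REQUEUE))
        }
        // The deadline is now or already in the past.
        _ => NextStep::EvictNow,
    }
}
```

With this shape, a suspend can delay the eviction by at most `MAX_REQUEUE` after resume, because the next reconcile compares against the wall clock again instead of trusting the original delay. The cost is the spurious reconciles mentioned in option 2.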

Whichever option we pick, we should probably also raise this upstream with kube-rs, and either fix it there or at least document the issue.

Additional context

No response

Environment

No response

Would you like to work on fixing this bug?

None
