Merge freezes on peers happening during agent offline periods #387

Merged
cgalibern merged 9 commits into opensvc:main on Aug 3, 2023

Conversation

cvaroqui
Member

@cvaroqui cvaroqui commented Aug 2, 2023

No description provided.

To not use too much horizontal space in the node columns.
Draining is too alarming because of the ambiguity with the "drain"
action, which causes an outage.
Also avoid deferring the stacking over cancelers when not
necessary.
Inherited from the imon original copy.
Instead of creating a new identical pubsub.Label{}.
And emit it from nmon when rejoined, before transitioning nmon
state to idle.
Touch a var/last_shutdown file when the nmon routine is canceled.

Compare this last_shutdown mtime with the peer nodes' FrozenAt, and
freeze the node if a peer was frozen while we were down.

For each object, compare this last_shutdown mtime with the peer
instances' FrozenAt, and freeze the local instance if the peer
instance was frozen while we were down.
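
The last_shutdown comparison described above can be sketched as follows. This is a minimal illustration, not the code merged in this PR: the helper names (touch, mtime, frozenWhileDown, at, notFrozen) are hypothetical, and it assumes that a zero FrozenAt means "not frozen" and that "frozen when we were down" means FrozenAt is later than the last_shutdown mtime.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// notFrozen is the zero time, assumed here to mean "the peer is not frozen".
var notFrozen time.Time

// at is a demo helper building a timestamp from Unix seconds.
func at(sec int64) time.Time { return time.Unix(sec, 0) }

// touch creates path if it is missing and sets its mtime to now.
func touch(path string, now time.Time) error {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_WRONLY, 0o644)
	if err != nil {
		return err
	}
	if err := f.Close(); err != nil {
		return err
	}
	return os.Chtimes(path, now, now)
}

// mtime returns the modification time of path.
func mtime(path string) (time.Time, error) {
	info, err := os.Stat(path)
	if err != nil {
		return time.Time{}, err
	}
	return info.ModTime(), nil
}

// frozenWhileDown reports whether the peer was frozen after our last
// shutdown, in which case the local node or instance should be frozen too.
func frozenWhileDown(lastShutdown, peerFrozenAt time.Time) bool {
	return !peerFrozenAt.IsZero() && peerFrozenAt.After(lastShutdown)
}

func main() {
	dir, err := os.MkdirTemp("", "nmon-demo")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	// Simulate what happens when the nmon routine is canceled.
	marker := filepath.Join(dir, "last_shutdown")
	if err := touch(marker, at(1000)); err != nil {
		panic(err)
	}
	down, err := mtime(marker)
	if err != nil {
		panic(err)
	}

	// Peers that are not frozen, frozen before, and frozen after the shutdown.
	for _, peerFrozenAt := range []time.Time{notFrozen, at(900), at(1100)} {
		fmt.Println(frozenWhileDown(down, peerFrozenAt))
	}
}
```

Only the peer frozen after the shutdown timestamp triggers a local freeze in this sketch; a peer frozen earlier was already visible before the daemon stopped.
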
SIGHUP currently kills the daemon brutally, which is error prone for
admins used to SIGHUP being a magic "sync with reality" trigger.

Better to ignore it, pending implementation of a reconfiguring
handler.
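
For reference, ignoring SIGHUP in a Go program can be done with the standard os/signal package. The sketch below is an assumption about the general approach, not a copy of the daemon code; ignoreSIGHUP and sighupIgnored are hypothetical helper names.

```go
package main

import (
	"fmt"
	"os/signal"
	"syscall"
)

// ignoreSIGHUP makes the process drop SIGHUP instead of terminating,
// which is the default action for that signal.
func ignoreSIGHUP() {
	signal.Ignore(syscall.SIGHUP)
}

// sighupIgnored reports whether SIGHUP is currently ignored.
func sighupIgnored() bool {
	return signal.Ignored(syscall.SIGHUP)
}

func main() {
	ignoreSIGHUP()
	fmt.Println("SIGHUP ignored:", sighupIgnored())
}
```

A later reconfiguring handler would presumably replace signal.Ignore with signal.Notify on a channel read by the daemon.
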
And fix the ordering of daemon/daemon sub routines cancels.

Add a sync.WaitGroup to nmon to be able to synchronously wait for
its cancellation.
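
The WaitGroup pattern mentioned above might look like the following sketch. The monitor type and its Start and Stop methods are hypothetical stand-ins for nmon, and the cleanup step is reduced to setting a flag.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// monitor is a hypothetical stand-in for a daemon sub routine such as nmon.
type monitor struct {
	wg      sync.WaitGroup
	cancel  context.CancelFunc
	stopped bool // set by the worker once its cleanup has run
}

// Start launches the worker goroutine and registers it in the WaitGroup.
func (m *monitor) Start() {
	ctx, cancel := context.WithCancel(context.Background())
	m.cancel = cancel
	m.wg.Add(1)
	go func() {
		defer m.wg.Done()
		<-ctx.Done()
		// Cleanup runs here, for example touching a last_shutdown file.
		m.stopped = true
	}()
}

// Stop cancels the worker and blocks until its cleanup has finished,
// so the caller can safely cancel the next sub routine afterwards.
func (m *monitor) Stop() {
	m.cancel()
	m.wg.Wait()
}

func main() {
	m := &monitor{}
	m.Start()
	m.Stop()
	fmt.Println("cleanup done before Stop returned:", m.stopped)
}
```

Because Stop only returns after wg.Wait, the caller controls the order in which sub routines finish their cleanup, rather than relying on context cancellation alone.
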
@cgalibern cgalibern merged commit 4aa8798 into opensvc:main Aug 3, 2023
1 check passed