There is currently an issue where the elector appears to be failing, in one way or another, to communicate with the Kubernetes API server, which produces a burst of errors in the logs. This issue is to dig into that a bit and see whether the error handling and the log messaging could be improved.

Most of the election machinery is handled by the Kubernetes client package, so extensive error handling may be out of our control, but it still feels worth investigating.

It would also be worth thinking about when the elector should terminate versus simply retry; see the sketch after the logs. That question pairs naturally with the error-handling/logging investigation.
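For context, here is a minimal sketch of how client-go's `tools/leaderelection` package is typically wired. The elector.go wrapper itself isn't shown in this issue, so this is an approximation, not the project's actual code: the lock name and namespace (`default/east-liberty-blackbox-election`) come from the logs, the ConfigMap-based lock is implied by the "Operation cannot be fulfilled on configmaps" error, and the timing values are illustrative assumptions.

```go
package main

import (
	"context"
	"log"
	"os"
	"time"

	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/leaderelection"
	"k8s.io/client-go/tools/leaderelection/resourcelock"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatalf("building in-cluster config: %v", err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// Identity this replica competes under; normally the pod name.
	id, err := os.Hostname()
	if err != nil {
		log.Fatalf("reading hostname: %v", err)
	}

	// Lock name and namespace taken from the logs above; the ConfigMap lock type
	// matches the "Operation cannot be fulfilled on configmaps" error.
	lock, err := resourcelock.New(
		resourcelock.ConfigMapsResourceLock,
		"default",
		"east-liberty-blackbox-election",
		client.CoreV1(),
		client.CoordinationV1(),
		resourcelock.ResourceLockConfig{Identity: id},
	)
	if err != nil {
		log.Fatalf("creating resource lock: %v", err)
	}

	leaderelection.RunOrDie(context.Background(), leaderelection.LeaderElectionConfig{
		Lock: lock,
		// Illustrative timings. "failed to tryAcquireOrRenew context deadline exceeded"
		// means a renewal round trip to the API server did not finish within RenewDeadline.
		LeaseDuration: 15 * time.Second,
		RenewDeadline: 10 * time.Second,
		RetryPeriod:   2 * time.Second,
		Callbacks: leaderelection.LeaderCallbacks{
			OnStartedLeading: func(ctx context.Context) {
				log.Printf("[%s] started leading", id) // leader-only work goes here
			},
			OnStoppedLeading: func() {
				log.Printf("[%s] stepping down as leader", id)
			},
			OnNewLeader: func(identity string) {
				if identity != id {
					log.Printf("new leader elected: %s", identity)
				}
			},
		},
	})

	// RunOrDie blocks until leadership is lost, then returns; the elector in the
	// logs appears to loop around this point and re-run the election.
	log.Printf("re-running election")
}
```

The log excerpt below shows the failure mode: lease renewal repeatedly fails with "context deadline exceeded", transport-level errors follow, and the elector re-runs the election each time.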
I0411 13:29:57.869836 1 elector.go:166] re-running election
I0411 13:29:57.872555 1 leaderelection.go:242] attempting to acquire leader lease default/east-liberty-blackbox-election...
I0411 13:29:57.882848 1 recorder.go:13] lock event LeaderElection (Normal): east-liberty-blackbox-757f9b54b4-q7pjh became leader
I0411 13:29:57.882880 1 leaderelection.go:252] successfully acquired lease default/east-liberty-blackbox-election
I0411 13:29:57.883031 1 elector.go:210] [east-liberty-blackbox-757f9b54b4-q7pjh] started leading
I0411 13:29:59.393775 1 recorder.go:13] lock event LeaderElection (Normal): east-liberty-blackbox-757f9b54b4-q7pjh stopped leading
I0411 13:29:59.393810 1 leaderelection.go:288] failed to renew lease default/east-liberty-blackbox-election: failed to tryAcquireOrRenew context deadline exceeded
I0411 13:29:59.402312 1 elector.go:218] [east-liberty-blackbox-757f9b54b4-q7pjh] stepping down as leader
I0411 13:30:00.418657 1 elector.go:166] re-running election
I0411 13:30:00.419516 1 leaderelection.go:242] attempting to acquire leader lease default/east-liberty-blackbox-election...
I0411 13:30:00.427696 1 recorder.go:13] lock event LeaderElection (Normal): east-liberty-blackbox-757f9b54b4-q7pjh became leader
I0411 13:30:00.427707 1 leaderelection.go:252] successfully acquired lease default/east-liberty-blackbox-election
I0411 13:30:00.427769 1 elector.go:210] [east-liberty-blackbox-757f9b54b4-q7pjh] started leading
I0411 13:30:01.937246 1 recorder.go:13] lock event LeaderElection (Normal): east-liberty-blackbox-757f9b54b4-q7pjh stopped leading
I0411 13:30:01.937284 1 leaderelection.go:288] failed to renew lease default/east-liberty-blackbox-election: failed to tryAcquireOrRenew context deadline exceeded
I0411 13:30:01.945378 1 elector.go:218] [east-liberty-blackbox-757f9b54b4-q7pjh] stepping down as leader
E0411 13:30:02.668770 1 elector.go:214] failed to set leader annotation: the server was unable to return a response in the time allotted, but may still be processing the request (get pods east-liberty-blackbox-757f9b54b4-q7pjh)
E0411 13:30:12.301945 1 leaderelection.go:331] error retrieving resource lock default/east-liberty-blackbox-election: rpc error: code = Unavailable desc = transport is closing
E0411 13:30:12.301985 1 leaderelection.go:331] error retrieving resource lock default/east-liberty-blackbox-election: rpc error: code = Unavailable desc = transport is closing
E0411 13:30:12.302852 1 leaderelection.go:331] error retrieving resource lock default/east-liberty-blackbox-election: rpc error: code = Unavailable desc = transport is closing
E0411 13:30:12.304045 1 leaderelection.go:331] error retrieving resource lock default/east-liberty-blackbox-election: rpc error: code = Unavailable desc = transport is closing
E0411 13:30:12.305622 1 leaderelection.go:367] Failed to update lock: rpc error: code = Unavailable desc = transport is closing
E0411 13:30:15.152415 1 elector.go:222] failed to set standby annotation: rpc error: code = Unavailable desc = transport is closing
E0411 13:30:15.153263 1 elector.go:214] failed to set leader annotation: rpc error: code = Unavailable desc = transport is closing
I0411 13:30:16.152650 1 elector.go:166] re-running election
I0411 13:30:16.155592 1 leaderelection.go:242] attempting to acquire leader lease default/east-liberty-blackbox-election...
I0411 13:30:16.167184 1 recorder.go:13] lock event LeaderElection (Normal): east-liberty-blackbox-757f9b54b4-q7pjh became leader
I0411 13:30:16.167201 1 leaderelection.go:252] successfully acquired lease default/east-liberty-blackbox-election
I0411 13:30:16.167273 1 elector.go:210] [east-liberty-blackbox-757f9b54b4-q7pjh] started leading
I0412 19:25:18.324059 1 recorder.go:13] lock event LeaderElection (Normal): east-liberty-blackbox-757f9b54b4-q7pjh stopped leading
I0412 19:25:18.324122 1 leaderelection.go:288] failed to renew lease default/east-liberty-blackbox-election: failed to tryAcquireOrRenew context deadline exceeded
I0412 19:25:18.337305 1 elector.go:218] [east-liberty-blackbox-757f9b54b4-q7pjh] stepping down as leader
E0412 19:25:18.606373 1 leaderelection.go:367] Failed to update lock: Operation cannot be fulfilled on configmaps "east-liberty-blackbox-election": the object has been modified; please apply your changes to the latest version and try again
I0412 19:25:19.395127 1 elector.go:166] re-running election
I0412 19:25:19.415073 1 leaderelection.go:242] attempting to acquire leader lease default/east-liberty-blackbox-election...
I0412 19:25:19.420852 1 elector.go:233] new leader elected: east-liberty-blackbox-757f9b54b4-bv2mk
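One possible direction for the terminate-versus-retry question: keep re-running the election after an isolated loss, but exit after several short leadership terms in a row so that Kubernetes restarts the pod instead of the process retrying indefinitely against an API server it cannot reach. This is only a sketch of that policy; the function, names, and thresholds are hypothetical and not taken from the project.

```go
package main

import (
	"context"
	"log"
	"os"
	"time"
)

// Hypothetical policy: retry transient losses, but give up after several short
// terms in a row and let Kubernetes restart the pod.
const (
	maxQuickLosses = 5                // consecutive short terms before giving up
	quickTerm      = 30 * time.Second // a term shorter than this counts as a failure
)

// runElectionLoop re-runs the election whenever leadership is lost. runOnce is
// expected to block while this process is contending for or holding the lease
// (e.g. a wrapper around leaderelection.RunOrDie) and to return once leadership
// is lost.
func runElectionLoop(ctx context.Context, runOnce func(context.Context)) {
	quickLosses := 0
	for ctx.Err() == nil {
		start := time.Now()
		runOnce(ctx)

		if time.Since(start) < quickTerm {
			quickLosses++
		} else {
			quickLosses = 0 // a reasonably long term resets the counter
		}
		if quickLosses >= maxQuickLosses {
			log.Printf("lost leadership %d times in under %s each; exiting so the pod is restarted", quickLosses, quickTerm)
			os.Exit(1)
		}
		log.Printf("re-running election")
	}
}

func main() {
	// Stand-in for the real election: each "term" lasts one second, so the loop
	// above exits after maxQuickLosses iterations.
	runElectionLoop(context.Background(), func(ctx context.Context) {
		time.Sleep(1 * time.Second)
	})
}
```

A policy like this would have turned the repeated 1-2 second terms in the first log excerpt into a pod restart, while leaving the long healthy term (13:30:16 to the following day) untouched.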