This repository has been archived by the owner on May 22, 2023. It is now read-only.
The Keep ECDSA client offers plenty of metrics and diagnostics for monitoring the health of the operator's own node. However, there is no obvious way to monitor the health of third-party nodes. This matters especially when the node shares an n-of-n threshold keep with a node that is offline. An easy way to determine which nodes are offline, and what the impact is, could help operators alert each other before a signature is requested from a keep.
One option is to log a warning when a node sees a peer drop from its peer list for more than N minutes while that peer still has an active stake or active keeps. We could also limit the warnings to peers with which the operated node shares active keeps.
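The warning rule above could be sketched roughly as follows. This is a minimal illustration in Python rather than client code: `PeerWatch`, `observe`, the peer identifiers, and the `shared_keeps` mapping are all hypothetical names, and the sketch assumes the node can already tell which peers are connected and how many active keeps it shares with each of them.

```python
import logging
from dataclasses import dataclass, field

logger = logging.getLogger("peer_watch")


@dataclass
class PeerWatch:
    """Tracks when each peer was last seen connected (hypothetical helper)."""

    threshold_seconds: float  # the "N minutes" from the proposal, in seconds
    last_seen: dict = field(default_factory=dict)
    warned: set = field(default_factory=set)

    def observe(self, connected, shared_keeps, now):
        """Record the current peer list and return newly overdue peers.

        connected:    set of peer ids currently connected
        shared_keeps: dict of peer id -> number of active keeps shared
                      with the operated node (assumed to be available)
        now:          current time in seconds
        """
        for peer in connected:
            self.last_seen[peer] = now
            # A peer that reconnects may trigger a fresh warning later.
            self.warned.discard(peer)

        overdue = []
        for peer, seen in self.last_seen.items():
            if peer in connected:
                continue
            # Limit warnings to peers we share active keeps with.
            if shared_keeps.get(peer, 0) == 0:
                continue
            if now - seen > self.threshold_seconds and peer not in self.warned:
                logger.warning(
                    "peer %s not seen for %.0f s; shares %d active keep(s)",
                    peer, now - seen, shared_keeps[peer],
                )
                self.warned.add(peer)
                overdue.append(peer)
        return sorted(overdue)
```

Calling `observe` on every peer-list refresh yields one warning per outage rather than one per refresh, which keeps the logs readable. Whether a single warning or periodic reminders are preferable is a design choice left open here.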
Another option, requiring no change in the client, is a remote telemetry service. The node exposes diagnostics with the list of connected peers which, together with the graph, can be used to identify offline operators that still have active keeps. This option could be enhanced further by modeling the network topology for operators who opt in to the mechanism and submit their diagnostics periodically.
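As a rough sketch of what such a telemetry service might compute, the Python function below combines peer lists submitted by opted-in operators with keep membership data. The function name `find_offline_members` and both input shapes are assumptions made for illustration; the actual diagnostics format and the source of keep membership are not specified here.

```python
def find_offline_members(reports, keeps):
    """Return keep members that no reporting node can see.

    reports: dict of reporter id -> set of peer ids it is connected to
             (assumed shape of the submitted diagnostics)
    keeps:   dict of keep id -> set of member ids for active keeps
             (assumed to come from on-chain or indexed data)

    Returns a dict of member id -> sorted list of affected keep ids.
    """
    # A node counts as online if it reported itself or was seen by a reporter.
    online = set(reports)
    for peers in reports.values():
        online |= set(peers)

    affected = {}
    for keep_id, members in keeps.items():
        for member in members:
            if member not in online:
                affected.setdefault(member, []).append(keep_id)
    return {member: sorted(ids) for member, ids in affected.items()}
```

A member missing from every submitted peer list is only a candidate for being offline, since it may simply not be connected to any reporting node. The more operators opt in, the more reliable the result becomes.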