Add health check service #1112

Open
donkirkby opened this issue Jan 13, 2021 · 0 comments
@donkirkby
Member

We had a server mysteriously power off, and nobody noticed for a few days. Add a health check service that pings all our ongoing services around the network on a regular schedule, somewhere between every 10 minutes and every hour.

I think the best method is the --host option of systemctl, but healthchecks.io might also be useful. It probably makes sense to configure the service with a YAML file that lists hosts and service names, along with some kind of schedule and notification lists. Some services might be more convenient to monitor through the modification date of a log file. We'd probably need two copies on separate hosts so they can monitor each other.
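To make the idea concrete, here is a rough Python sketch of the two kinds of check described above: asking a remote host whether a unit is active through `systemctl --host`, and treating a log file as healthy if it was modified recently. Everything named here is an assumption for illustration, not an existing part of the project: the function names, the `checks` list layout (which a YAML loader could produce from a config file), and the example host and service names. Scheduling and notifications are left out.

```python
import subprocess
import time
from pathlib import Path


def build_check_command(host, service):
    """Build the systemctl call for one remote service (hypothetical helper).

    `systemctl --host` connects over SSH, and `is-active --quiet`
    exits with status 0 only when the unit is active.
    """
    return ["systemctl", "--host", host, "is-active", "--quiet", service]


def is_service_active(host, service, run=subprocess.run, timeout=30):
    """Return True if the service reports active on the remote host.

    A timeout or a failure to launch the command counts as unhealthy,
    which also covers a host that has powered off.
    """
    try:
        result = run(build_check_command(host, service), timeout=timeout)
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def is_log_fresh(path, max_age_seconds, now=None):
    """Return True if the log file was modified within max_age_seconds."""
    now = time.time() if now is None else now
    try:
        modified = Path(path).stat().st_mtime
    except OSError:
        return False  # a missing or unreadable log counts as unhealthy
    return now - modified <= max_age_seconds


def find_failures(checks, run=subprocess.run):
    """Run every service check and describe the ones that failed."""
    failures = []
    for check in checks:
        if not is_service_active(check["host"], check["service"], run=run):
            failures.append(f'{check["service"]} on {check["host"]}')
    return failures


if __name__ == "__main__":
    # Example entries only; these hosts and services are made up.
    example_checks = [
        {"host": "healthcheck@server1.example.org", "service": "nginx.service"},
        {"host": "healthcheck@server2.example.org", "service": "cron.service"},
    ]
    for failure in find_failures(example_checks):
        print("FAILED:", failure)
```

The `run` parameter exists only so the checks can be tested without real SSH access. Treating timeouts as failures is a deliberate choice in this sketch, since an unreachable host is exactly the case the issue is about.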

Think through the security implications, because systemctl --host connects over SSH. Maybe configure a very limited account for health check access.

@donkirkby donkirkby added this to the Near future milestone Jan 13, 2021