RFE: output log message when contacting the update server recovers #407
Comments
I don't think that point 2 is useful. We already emit such a message at To that extent, a whole tutorial in fcos-docs showing how to plumb Zincati metrics into Prometheus and Grafana would definitely be more useful. I'll try to find some time to write that.
What if we limited it to just an INFO message with a link to the metrics docs webpage? That webpage could then provide both the dumb
Regarding point 1, it's kinda reasonable, even though metrics are really the right tool for this. Specifically, your client observed a total of three backend glitches in more than 10 days. What you don't see there is the number of non-error requests it completed in the same timespan. Metrics capture both, help you put this in perspective (my quick math on your logs gives a 0.09% error rate), and let you track the non-error counter in real time. An info message like "successful contact after X errors" can be added to break up long streaks, agreed.
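To make the two ideas in this comment concrete, here is a minimal sketch of how a client could count both outcomes and emit a single message when an error streak ends. Everything in it is hypothetical: `ContactTracker`, its fields, and the message wording are made up for illustration and are not taken from Zincati's code.

```rust
/// Hypothetical tracker for update-server requests (illustration only,
/// not Zincati's actual implementation).
#[derive(Debug, Default)]
struct ContactTracker {
    /// Errors seen since the last successful request.
    consecutive_errors: u64,
    /// All failed requests since startup.
    total_errors: u64,
    /// All successful requests since startup.
    total_successes: u64,
}

impl ContactTracker {
    /// Record a failed request and extend the current error streak.
    fn record_error(&mut self) {
        self.consecutive_errors += 1;
        self.total_errors += 1;
    }

    /// Record a successful request. Returns a recovery message only when
    /// this success ends a streak of one or more errors.
    fn record_success(&mut self) -> Option<String> {
        self.total_successes += 1;
        // Read the streak length and reset it to zero in one step.
        let streak = std::mem::take(&mut self.consecutive_errors);
        if streak > 0 {
            Some(format!(
                "successful contact with update server after {} error(s)",
                streak
            ))
        } else {
            None
        }
    }

    /// Share of failed requests, as a percentage of all requests so far.
    fn error_rate_percent(&self) -> f64 {
        let total = self.total_errors + self.total_successes;
        if total == 0 {
            0.0
        } else {
            100.0 * self.total_errors as f64 / total as f64
        }
    }
}
```

A caller would log the returned string at INFO level whenever `record_success` yields `Some`, so routine successes stay quiet and only a recovery produces a line. The error rate is derived from the same two counters that a metrics endpoint would expose, which is why the logs alone cannot give that perspective.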
Related: Zincati now exposes its current state via a systemd
Feature Request
As a Linux user I've been conditioned to look at logs. When I'm interested in how Zincati is doing I check systemd/journald and I see:
My first thought is, "oh no, the update server must be having trouble. Or maybe my node networking is off. Either way, is my node going to receive updates?".
Arguably the user (me) should know that the better way to find out about the running state is to check the exposed metrics, but that is not easily discoverable on the system without reading the docs. I have two proposals to band-aid over the user's knee-jerk reaction and also lead them to the right way to check things.
Desired Feature
A crude example would be something like:
This is a bit verbose and is the most info we could possibly provide. A variant of this should suffice.