While migrating the transit relay to a new host, I noticed that munin wasn't reporting any events for the first hour (under the "events since reboot" plugin named `wormhole_transit_events`). The server writes the actual (accurate) timestamp of reboot into the usagedb, and the munin plugin uses a SQL query that only looks for events with a timestamp greater than the reboot time. However, the recorded event timestamps are blurred (rounded to the nearest 3600 seconds), which makes events from the first hour look like they happened before the last reboot, so the munin script ignores them.
For now, I just manually changed the "rebooted" timestamp to be one second less than the blurred value of the current time (so just before the last blur point).
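To make the arithmetic concrete, here is a minimal sketch of the blurring and of the manual workaround. It assumes the blur rounds *down* to the start of the 3600-second window (which is what "just before the last blur point" suggests); the `blur` and `workaround_rebooted` helpers and the sample timestamps are hypothetical, not taken from the relay's code.

```python
BLUR_SECONDS = 3600  # blur window size from the report


def blur(timestamp, blur_seconds=BLUR_SECONDS):
    """Round a timestamp down to the start of its blur window (assumed behaviour)."""
    return blur_seconds * (int(timestamp) // blur_seconds)


def workaround_rebooted(now, blur_seconds=BLUR_SECONDS):
    """Manual workaround: one second before the most recent blur point."""
    return blur(now, blur_seconds) - 1


# Hypothetical timestamps, in seconds.
reboot = 10_000   # accurate reboot time written by the server
event = 10_500    # an event 500 seconds after the reboot

blurred_event = blur(event)                      # falls back to the window start
ignored_by_plugin = not (blurred_event > reboot)  # the plugin's current comparison

patched_reboot = workaround_rebooted(reboot)
counted_after_workaround = blurred_event > patched_reboot

print(blurred_event, ignored_by_plugin, patched_reboot, counted_after_workaround)
```

With these numbers the blurred event time lands before the accurate reboot time, so the `>` comparison drops it, while the patched "rebooted" value lets it through.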
The long-term fix will be to add an extra field to the `current` DB table, with the blurred reboot time, and have the munin plugins compare against that instead of the actual reboot time. Also, the plugins should compare `event_time >= blurred_reboot`, instead of the current `event_time > reboot`, since everything in that first half-ish hour window should be counted.
This will cause some inaccuracies, as some events will be double-counted. I suspect there's a hard tradeoff to be made, between double-counting some events vs never-counting some events.
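The two comparisons can be tried side by side against an in-memory SQLite database. This is only a sketch: the `events` table, the `event_time`, `rebooted`, and `blurred_reboot` column names, and the `count_events` helper are hypothetical stand-ins for the real usagedb schema and plugin query, and the blur is assumed to round down.

```python
import sqlite3

BLUR_SECONDS = 3600


def blur(timestamp):
    """Round down to the start of the blur window (assumed behaviour)."""
    return BLUR_SECONDS * (int(timestamp) // BLUR_SECONDS)


def count_events(db, use_blurred_reboot):
    """Count 'events since reboot' with either the current or the proposed comparison."""
    if use_blurred_reboot:
        query = ("SELECT COUNT(*) FROM events "
                 "WHERE event_time >= (SELECT blurred_reboot FROM current)")
    else:
        query = ("SELECT COUNT(*) FROM events "
                 "WHERE event_time > (SELECT rebooted FROM current)")
    return db.execute(query).fetchone()[0]


db = sqlite3.connect(":memory:")
# Hypothetical schema: one row of server state, one row per blurred event.
db.execute("CREATE TABLE current (rebooted INTEGER, blurred_reboot INTEGER)")
db.execute("CREATE TABLE events (event_time INTEGER)")

reboot = 10_000
db.execute("INSERT INTO current VALUES (?, ?)", (reboot, blur(reboot)))

# One event shortly before the reboot, two after it; only blurred times are stored.
real_times = [9_000, 10_500, 10_900]
db.executemany("INSERT INTO events VALUES (?)", [(blur(t),) for t in real_times])

current_count = count_events(db, use_blurred_reboot=False)
proposed_count = count_events(db, use_blurred_reboot=True)
print(current_count, proposed_count)
```

In this example two events really happened after the reboot. The current comparison reports fewer than that (it misses the event whose blurred time fell before the reboot), and the proposed comparison reports more (it also picks up the pre-reboot event that shares the same blur window), which is the tradeoff described above.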