STATUS/SGE: Automatically report on scheduled downtimes #149

Open
HenrikBengtsson opened this issue Jun 5, 2024 · 1 comment

HenrikBengtsson commented Jun 5, 2024

SGE maintenance windows are scheduled using SGE calendars. We can use that information to automatically populate docs/hpc/status/incidents-upcoming.md to give a heads-up to users.
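
As a rough illustration of the final step, turning a calendar's dates into a heads-up line could look like the sketch below. The function name `upcoming_notice` and the bullet layout are hypothetical; the actual format of docs/hpc/status/incidents-upcoming.md is not shown in this issue, so treat the output as a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: print a Markdown bullet for a downtime that is not over yet.
# Arguments: calendar name, start date, end date, today's date (all YYYY-MM-DD).
# The bullet layout is made up for illustration only.
upcoming_notice() {
  # ISO 8601 dates compare correctly as plain numbers once the dashes are dropped
  end_num=$(printf '%s' "$3" | tr -d '-')
  today_num=$(printf '%s' "$4" | tr -d '-')
  if [ "$end_num" -lt "$today_num" ]; then
    return 0   # downtime already ended: print nothing
  fi
  printf '%s\n' "- **$1**: scheduled downtime $2 to $3"
}

# Still ahead on 2024-06-05, so a bullet is printed
upcoming_notice maint_downtime 2024-06-17 2024-06-18 2024-06-05
# Already over on 2024-06-05, so nothing is printed
upcoming_notice maint_downtime 2023-10-30 2023-11-03 2024-06-05
```

Passing today's date as an argument, rather than calling `date` inside the function, keeps the helper easy to test.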

All available SGE calendars:

$ qconf -scall
beegfs_outage
cc_outage
maint_downtime
n106_outage
rowA_downtime

Example of one of the SGE calendars:

$ qconf -scal maint_downtime
calendar_name    maint_downtime
year             30.10.2023,31.10.2023,1.11.2023,2.11.2023,3.11.2023=off
week             NONE

This describes a one-time event (week = NONE) that ran from 2023-10-30 through 2023-11-03.

Our upcoming downtime, 2024-06-17T09:00 to 2024-06-18, is encoded as:

$ qconf -scal maint_downtime
calendar_name    maint_downtime
year             17.6.2024,18.6.2024=off
week             NONE
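
A minimal sketch of how the first and last day of such a one-time calendar could be pulled out of the `year` line. The function name `calendar_span` is hypothetical, and the sketch assumes only D.M.YYYY dates matter; any time-of-day parts of the entry are ignored.

```shell
#!/bin/sh
# Hypothetical helper: read 'qconf -scal <name>' output on stdin and print the
# earliest and latest date of the 'year' entry as "YYYY-MM-DD YYYY-MM-DD".
# Assumption: dates appear as D.M.YYYY; time-of-day ranges are not considered.
calendar_span() {
  grep -E '^year[[:space:]]' |
    grep -oE '[0-9]{1,2}[.][0-9]{1,2}[.][0-9]{4}' |
    while IFS=. read -r day month year; do
      # Strip one leading zero so printf does not read the value as octal
      printf '%s-%02d-%02d\n' "$year" "${month#0}" "${day#0}"
    done |
    sort |
    sed -n '1p;$p' |
    paste -sd ' ' -
}

# On the cluster this would be: qconf -scal maint_downtime | calendar_span
printf 'year             17.6.2024,18.6.2024=off\n' | calendar_span
# prints "2024-06-17 2024-06-18"
```

Because ISO 8601 dates sort lexicographically, a plain `sort` is enough to find the first and last day. A calendar with a single date prints that date twice.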

HenrikBengtsson commented Sep 14, 2024

How to list the definitions of all calendars:

$ mapfile -t cals < <(qconf -scall)
$ for cal in "${cals[@]}"; do qconf -scal "${cal}"; done
calendar_name    beegfs_outage
year             8.7.2019-12.7.2019=off
week             NONE
calendar_name    cc_outage
year             23.10.2019=1-7=off
week             NONE
calendar_name    maint_downtime
year             17.6.2024,18.6.2024,19.6.2024,20.6.2024=off
week             NONE
calendar_name    n106_outage
year             4.8.2021=14:30-23:59=off 5.8.2021=0:00-8:00=off
week             NONE
calendar_name    rowA_downtime
year             27.4.2022=8-15=off
week             NONE

Ditto, but with ISO 8601 dates and HH:MM timestamps:

$ for cal in "${cals[@]}"; do qconf -scal "${cal}" | sed -E 's/\b([[:digit:]]+)[.]([[:digit:]]+)[.]([[:digit:]]+)\b/\3-\2-\1/g' | sed -E 's/-([[:digit:]])\b/-0\1/g' | sed -E 's/\b([[:digit:]]):/0\1:/g'; done
calendar_name    beegfs_outage
year             2019-07-08-2019-07-12=off
week             NONE
calendar_name    cc_outage
year             2019-10-23=1-07=off
week             NONE
calendar_name    maint_downtime
year             2024-06-17,2024-06-18,2024-06-19,2024-06-20=off
week             NONE
calendar_name    n106_outage
year             2021-08-04=14:30-23:59=off 2021-08-05=00:00-08:00=off
week             NONE
calendar_name    rowA_downtime
year             2022-04-27=8-15=off
week             NONE
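
Note that in the output above the hour range of cc_outage comes out as `1-07`, because the second `sed` also pads the `-7` of `1-7`, while `8-15` in rowA_downtime is left untouched. One possible variant that zero-pads only the date components and H:MM times, and leaves bare hour ranges as they are, is sketched below. The function name `iso_calendar` is hypothetical, and GNU sed is assumed for `\b`.

```shell
#!/bin/sh
# Hypothetical filter: rewrite D.M.YYYY dates as YYYY-MM-DD and H:MM as HH:MM.
# Bare hour ranges such as 1-7 are left exactly as SGE prints them.
iso_calendar() {
  sed -E \
    -e 's/\b([0-9]{1,2})[.]([0-9]{1,2})[.]([0-9]{4})\b/\3-0\2-0\1/g' \
    -e 's/-0([0-9]{2})/-\1/g' \
    -e 's/\b([0-9]):/0\1:/g'
}

# On the cluster this would be: qconf -scal cc_outage | iso_calendar
printf 'year             23.10.2019=1-7=off\n' | iso_calendar
# prints "year             2019-10-23=1-7=off"
```

The first expression prefixes every month and day with a zero while reordering, and the second removes that zero again where the component already had two digits. That keeps the padding confined to the dates.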
