SRE-3945: Add service impacts to Datadog alerts in Slack #38
This commit enriches the Datadog alert messages forwarded to Slack. Alert titles are now bold, and each message includes the date, the alert priority and scope, a link to the specific event, and a list of the affected services, so responders can triage alerts faster. In the Datadog processor, the alert structure now includes 'Services', which lists the services affected by the alert. The processor also queries a Prometheus endpoint for the list of services affected by the product specified in the alert scope.
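As a rough illustration of the service lookup described above, here is a minimal Python sketch. It is not the code from this PR: the `Alert` fields other than the services list, the `product:` scope tag format, the `service_info` metric, and the `product` and `service` label names are all assumptions made for the example. Only the `/api/v1/query` path and the response shape come from the standard Prometheus HTTP API.

```python
from dataclasses import dataclass, field
from typing import List, Optional
from urllib.parse import urlencode


@dataclass
class Alert:
    """Hypothetical alert shape; the PR only confirms the services list."""
    title: str
    priority: str
    scope: str  # assumed format: comma-separated "key:value" tags
    event_url: str
    services: List[str] = field(default_factory=list)


def product_from_scope(scope: str) -> Optional[str]:
    """Return the value of the (assumed) 'product' tag in the scope, if any."""
    for tag in scope.split(","):
        key, _, value = tag.strip().partition(":")
        if key == "product" and value:
            return value
    return None


def build_query_url(base_url: str, product: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API.

    'service_info' and its 'product' label are placeholder names.
    """
    query = f'service_info{{product="{product}"}}'
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': query})}"


def services_from_response(payload: dict) -> List[str]:
    """Collect unique 'service' label values from a Prometheus query response."""
    if payload.get("status") != "success":
        return []
    names = {
        series.get("metric", {}).get("service")
        for series in payload.get("data", {}).get("result", [])
    }
    return sorted(name for name in names if name)
```

In this sketch the HTTP call itself is left out so the parsing logic can be exercised without a live Prometheus server; the processor would fetch the URL from `build_query_url` and pass the decoded JSON to `services_from_response`.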
The Slack message template still reads the channel and thread IDs from the same GitLab pipeline variables as before, but under updated identifiers. It also checks more snapshot URLs than before when fetching chart images. These enrichments put the needed context directly in the Slack message, so responders no longer have to jump back to Datadog or GitLab for additional investigation.
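The template behavior described above might look roughly like the following Python sketch. The variable names `SLACK_CHANNEL_ID` and `SLACK_THREAD_TS`, the field labels, and the `is_image` check are hypothetical; the PR does not state the actual identifiers or how a snapshot is validated. The `*bold*` and `<url|label>` forms are standard Slack mrkdwn.

```python
import os
from typing import Callable, Iterable, List, Mapping, Optional, Tuple


def slack_target(env: Mapping[str, str] = os.environ) -> Tuple[Optional[str], Optional[str]]:
    """Read channel and thread IDs from pipeline variables (names are assumed)."""
    return env.get("SLACK_CHANNEL_ID"), env.get("SLACK_THREAD_TS")


def first_available_snapshot(
    candidates: Iterable[str],
    is_image: Callable[[str], bool],
) -> Optional[str]:
    """Return the first candidate snapshot URL that passes the given check."""
    for url in candidates:
        if url and is_image(url):
            return url
    return None


def format_alert(
    title: str,
    date: str,
    priority: str,
    scope: str,
    event_url: str,
    services: List[str],
) -> str:
    """Render the alert as Slack mrkdwn: bold title, details, event link."""
    service_list = ", ".join(services) if services else "unknown"
    return "\n".join([
        f"*{title}*",
        f"Date: {date}",
        f"Priority: {priority}",
        f"Scope: {scope}",
        f"Services: {service_list}",
        f"<{event_url}|View event in Datadog>",
    ])
```

Passing the snapshot check in as a callable keeps the fallback order separate from the network request, so more candidate URLs can be added without changing the selection logic.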