[Response Ops] Break down task run success SLI into observability and security alerting types #165989

Open
ymao1 opened this issue Sep 7, 2023 · 4 comments
Labels
Feature:Task Manager Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams)

Comments

@ymao1
Contributor

ymao1 commented Sep 7, 2023

With #163652 we added SLI metrics for task run success, broken down into alerting and action task types. We think it'd be useful to further break down the alerting task types into security alerting task types and observability alerting task types. This would help us narrow down where to focus our investigations when those SLOs are breached.

Currently the metrics look like this:

{
  "task_run": {
    "timestamp": "2023-09-06T13:43:52.205Z",
    "value": {
      "by_type": {
        "alerting": {
          "success": 1,
          "total": 1
        },
        "alerting:<ruleTypeId>": {
          "success": 1,
          "total": 1
        }
      }
    }
  }
}

It'd be useful to add groupings for alerting_security and alerting_observability.
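One way this rollup could work is sketched below. It is a rough illustration, not the Task Manager implementation: the function names, the `SuccessCounts` and `ByType` types, and especially the rule type ID prefixes used to decide which solution a rule belongs to are all assumptions made for the example. It sums the per-rule-type `alerting:<ruleTypeId>` counters into the two proposed groupings.

```typescript
// Hypothetical sketch: none of these names come from the Kibana codebase.
interface SuccessCounts {
  success: number;
  total: number;
}

type ByType = Record<string, SuccessCounts>;

// Assumed prefixes, for illustration only. A real mapping from rule type to
// solution would have to come from wherever rule types are registered.
const SECURITY_PREFIXES = ['siem.'];
const OBSERVABILITY_PREFIXES = ['apm.', 'metrics.', 'logs.', 'slo.'];

// Returns the grouping key for a rule type ID, or undefined if it belongs
// to neither of the two groupings.
function groupFor(ruleTypeId: string): string | undefined {
  if (SECURITY_PREFIXES.some((p) => ruleTypeId.startsWith(p))) {
    return 'alerting_security';
  }
  if (OBSERVABILITY_PREFIXES.some((p) => ruleTypeId.startsWith(p))) {
    return 'alerting_observability';
  }
  return undefined;
}

// Copies the existing by_type entries and adds summed counters for the
// alerting_security and alerting_observability groupings.
function addAlertingGroups(byType: ByType): ByType {
  const result: ByType = { ...byType };
  const prefix = 'alerting:';

  for (const [key, counts] of Object.entries(byType)) {
    if (!key.startsWith(prefix)) continue;

    const group = groupFor(key.slice(prefix.length));
    if (group === undefined) continue;

    const current = result[group] || { success: 0, total: 0 };
    result[group] = {
      success: current.success + counts.success,
      total: current.total + counts.total,
    };
  }

  return result;
}
```

Existing keys are left untouched, so anything already reading `alerting` or `alerting:<ruleTypeId>` would keep working. Rule types that match neither prefix list are counted only under the existing keys.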

@ymao1 ymao1 added Feature:Task Manager Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) labels Sep 7, 2023
@elasticmachine
Contributor

Pinging @elastic/response-ops (Team:ResponseOps)

@mikecote
Contributor

mikecote commented Sep 7, 2023

@kobelb should we prioritize this for 8.11?

@mikecote mikecote moved this from Awaiting Triage to Todo in AppEx: ResponseOps - Execution & Connectors Sep 7, 2023
@kobelb
Contributor

kobelb commented Sep 7, 2023

> @kobelb should we prioritize this for 8.11?

Yes. Otherwise, we're going to have to manually investigate and route these issues.

@ymao1
Contributor Author

ymao1 commented Oct 10, 2023

We might be able to do this in the SLO itself by filtering on project type.


4 participants