[8.x] [SLO] Exclude stale slos from healthy count on overview (#201027) (#201831)

# Backport

This will backport the following commits from `main` to `8.x`:
- [[SLO] Exclude stale slos from healthy count on overview (#201027)](#201027)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Backport metadata:
- Source commit: `a92103b2a9c06e3af30dea591ac769b995c78145`, committed 2024-11-26T16:23:20Z by Justin Kambic
- Source pull request: [#201027](https://github.com/elastic/kibana/pull/201027), "[SLO] Exclude stale slos from healthy count on overview"
- Labels: `release_note:enhancement`, `v9.0.0`, `backport:prev-minor`, `ci:project-deploy-observability`, `Team:obs-ux-management`, `v8.17.0`
- Branch label mapping: `^v9.0.0$` to `main`, `^v8.18.0$` to `8.x`, `^v(\d+).(\d+).\d+$` to `$1.$2`
- Source branch: `main`; suggested target branches: `8.17`
- Target pull request states: `main` (`v9.0.0`) is `MERGED`; `8.17` (`v8.17.0`) is `NOT_CREATED`

Source commit message:

[SLO] Exclude stale slos from healthy count on overview (#201027)

## Summary

Resolves #198911.

The result is achieved by nesting a new filter agg inside the existing
`HEALTHY` agg to remove any stale SLOs from the ultimate result.

This required a modification of the parsing code on the ES response to
include a new `not_stale` key. The original `success` total is preserved
in the `doc_count` of that agg, but is no longer referenced.

The filter for the `not_stale` agg I have added is the logical inverse
of the filter we're using to determine stale SLOs:

```json
{
  "range": {
    "summaryUpdatedAt": {
      "gte": "now-48h"
    }
  }
}
```

_Reviewer note: I also changed the spelling of a UI component; this should be
a completely transparent change._

## Example

### Before

This is my local instance running on `main`:

<img width="1116" alt="image"
src="https://github.com/user-attachments/assets/80f86426-c7f1-4847-830f-a311c865a225">

### After

This is my local instance running on this PR branch:

<img width="1120" alt="image"
src="https://github.com/user-attachments/assets/2c4c4f26-2407-41ca-bf01-9ca730bbfab2">

### Proof query works

You can replicate these results by including a similar agg on a query
against SLO data. I added a terms agg to the `stale` agg to determine
how many SLOs I need to remove. The number of `HEALTHY` SLOs showing up
in `stale` should match the difference between the total `doc_count`
from `healthy` and the `doc_count` in the `not_stale` sub-aggregation.

#### Query

You can run this example aggregation:

```json
{
  "aggs": {
    "stale": {
      "filter": {
        "range": {
          "summaryUpdatedAt": {
            "lt": "now-48h"
          }
        }
      },
      "aggs": {
        "by_status": {
          "terms": {
            "field": "status"
          }
        }
      }
    },
    "healthy": {
      "filter": {
        "term": {
          "status": "HEALTHY"
        }
      },
      "aggs": {
        "not_stale": {
          "filter": {
            "range": {
              "summaryUpdatedAt": {
                "gte": "now-48h"
              }
            }
          }
        }
      }
    }
  }
}
```

#### Relevant output

Here's a subset of my example query output. You can see that
`stale.by_status.buckets[1]` contains a total of 2 docs, which is the
difference between `healthy.doc_count` and
`healthy.not_stale.doc_count`.

```json
{
  "stale": {
    "doc_count": 7,
    "by_status": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "VIOLATED",
          "doc_count": 5
        },
        {
          "key": "HEALTHY",
          "doc_count": 2
        }
      ]
    }
  },
  "healthy": {
    "doc_count": 9,
    "not_stale": {
      "doc_count": 7
    }
  }
}
```

Co-authored-by: Justin Kambic <[email protected]>
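The parsing change described in the summary can be illustrated with a small TypeScript sketch. This is not the code from the PR: the interface and function names (`OverviewAggs`, `healthyCount`, `staleHealthyCount`) are hypothetical, and the only assumption about the response is that it has the shape shown in the example output above.

```typescript
// Hypothetical types that mirror the example aggregation output in the
// commit message; the real Kibana types may differ.
interface StatusBucket {
  key: string;
  doc_count: number;
}

interface OverviewAggs {
  stale: {
    doc_count: number;
    by_status: { buckets: StatusBucket[] };
  };
  healthy: {
    doc_count: number; // every HEALTHY SLO, stale or not
    not_stale: { doc_count: number }; // HEALTHY SLOs updated in the last 48h
  };
}

// Read the healthy count from the nested `not_stale` filter instead of the
// outer `doc_count`, so that stale SLOs are excluded.
function healthyCount(aggs: OverviewAggs): number {
  return aggs.healthy.not_stale.doc_count;
}

// Count stale SLOs that still report HEALTHY. The bucket is looked up by key
// rather than by position, since bucket order is not guaranteed here.
function staleHealthyCount(aggs: OverviewAggs): number {
  let total = 0;
  for (const bucket of aggs.stale.by_status.buckets) {
    if (bucket.key === 'HEALTHY') {
      total += bucket.doc_count;
    }
  }
  return total;
}

// The example output from the commit message.
const example: OverviewAggs = {
  stale: {
    doc_count: 7,
    by_status: {
      buckets: [
        { key: 'VIOLATED', doc_count: 5 },
        { key: 'HEALTHY', doc_count: 2 },
      ],
    },
  },
  healthy: { doc_count: 9, not_stale: { doc_count: 7 } },
};

console.log(healthyCount(example)); // 7
console.log(staleHealthyCount(example)); // 2
console.log(example.healthy.doc_count - healthyCount(example)); // 2
```

The last two values agree, which is the consistency check the commit message describes: the stale `HEALTHY` bucket accounts for the gap between `healthy.doc_count` and `healthy.not_stale.doc_count`.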