[Stack Monitoring] [Alerting] Investigate why "Missing monitoring data" rule is much slower than other rules on the same cluster #123844

Closed
jasonrhodes opened this issue Jan 26, 2022 · 8 comments
Assignees
Labels
Feature:Alerting/RuleTypes Issues related to specific Alerting Rules Types Feature:Stack Monitoring performance Team:Infra Monitoring UI - DEPRECATED DEPRECATED - Label for the Infra Monitoring UI team. Use Team:obs-ux-infra_services

Comments

jasonrhodes commented Jan 26, 2022

While using APM data to investigate the performance of alerting rules, we discovered that the "Missing monitoring data" rule appears to run much slower than the other rules in a given cluster (info provided by @lizozom).

APM graphs: three screenshots taken 2022-01-26 at 11:07:00 AM, 11:07:17 AM, and 11:07:31 AM.

Acceptance Criteria

  • Look into why this rule is slower than other rules (e.g. is this query doing something much different in ES, or are we processing data outside of ES?)
  • If the fix is simple, we can fix it here. If not, log an implementation bugfix ticket to fix the problem.
@botelastic added the needs-team label on Jan 26, 2022
@jasonrhodes added the Feature:Alerting/RuleTypes, Feature:Stack Monitoring, Team: Actionable Observability - DEPRECATED, and Team:Infra Monitoring UI - DEPRECATED labels and removed the needs-team label on Jan 26, 2022
@elasticmachine

Pinging @elastic/infra-monitoring-ui (Team:Infra Monitoring UI)

neptunian commented Jan 28, 2022

I haven't looked into this much, but I noticed today while looking at this rule's query that, unlike the others, it doesn't have a term filter for the metricset in the bool query. I noticed this while writing unit tests for the queries and added one. I also left a comment: https://github.com/elastic/kibana/pull/124033/files#r794930698
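
For readers unfamiliar with the pattern, here is a minimal sketch of what adding a term filter for the metricset to a bool query can look like. It is not the code from the linked PR: the `withMetricsetFilter` helper, the `metricset.name` field, the `timestamp` field, and the `node_stats` value are all assumptions chosen for illustration.

```typescript
// Minimal shapes for an Elasticsearch bool query; these are NOT the Kibana types.
type Clause = Record<string, unknown>;

interface BoolQuery {
  bool: {
    filter: Clause[];
  };
}

// Hypothetical helper: returns a copy of the query with a term filter on the
// metricset appended. The default field name is an assumption.
function withMetricsetFilter(
  query: BoolQuery,
  metricset: string,
  field: string = 'metricset.name'
): BoolQuery {
  return {
    bool: {
      ...query.bool,
      // Filter clauses only narrow the matched documents; they do not affect scoring.
      filter: [...query.bool.filter, { term: { [field]: metricset } }],
    },
  };
}

// A query that only restricts the time range (assumed field name: timestamp).
const base: BoolQuery = {
  bool: {
    filter: [{ range: { timestamp: { gte: 'now-1d' } } }],
  },
};

// The same query, additionally restricted to one metricset (assumed value).
const filtered = withMetricsetFilter(base, 'node_stats');
console.log(JSON.stringify(filtered, null, 2));
```

The helper copies the query rather than mutating it, so a unit test can compare the filtered and unfiltered versions side by side.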

lizozom commented Jan 31, 2022

@neptunian do you believe this should improve performance?
Maybe we could test on cloud staging once those changes are merged.
I'm sure that testing this kind of change will help us understand how we want to monitor and alert on performance degradations in the future.

@neptunian neptunian self-assigned this Feb 1, 2022
@neptunian

@lizozom Yes, the filter should improve performance. I'll let you know when it's merged.

@neptunian

@lizozom The change to the query was merged!

lizozom commented Feb 2, 2022

@neptunian any thoughts on how to benchmark this?

@neptunian

I'll set up APM locally with some test data and see what I find.

@emma-raffenne removed the Team: Actionable Observability - DEPRECATED label on Feb 22, 2022
@neptunian

We found that having a default 1 day lookback does slow down the query when there is a lot of data in the range. @jasonrhodes did a comparison:
[Charts: Pre-Change Latency (ms) and Post-Change Latency (ms)]

We've opened another issue #126709 to discuss improvements so I am closing this issue.
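
To make the lookback point concrete, here is a rough sketch of a range filter whose window is a parameter rather than a fixed 1 day. It does not reflect how the rule builds its query or what was decided in #126709: the `buildLookbackFilter` helper, the `timestamp` field, and the 15-minute example value are assumptions.

```typescript
// Shape of an Elasticsearch range filter on a date field, using epoch milliseconds.
interface RangeFilter {
  range: Record<string, { gte: number; lte: number; format: 'epoch_millis' }>;
}

const ONE_DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper: builds a time-range filter ending at `nowMs` and
// reaching back `lookbackMs`. The default field name is an assumption.
function buildLookbackFilter(
  nowMs: number,
  lookbackMs: number = ONE_DAY_MS,
  field: string = 'timestamp'
): RangeFilter {
  if (lookbackMs <= 0) {
    throw new RangeError('lookbackMs must be positive');
  }
  return {
    range: {
      [field]: { gte: nowMs - lookbackMs, lte: nowMs, format: 'epoch_millis' },
    },
  };
}

// Fixed reference time so the example is deterministic.
const now = Date.UTC(2022, 1, 1);

// Default window: the last day of data.
const dayWindow = buildLookbackFilter(now);

// Narrower window (assumed value): the last 15 minutes of data.
const shortWindow = buildLookbackFilter(now, 15 * 60 * 1000);

console.log(JSON.stringify({ dayWindow, shortWindow }, null, 2));
```

A narrower window means fewer documents fall inside the range, which is the effect described above when there is a lot of data in the default 1 day range.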
