Kibana alert fires when it should not have due to temporary disconnect of remote CCS connection #168293
Comments
Pinging @elastic/response-ops (Team:ResponseOps)
Can you provide the rule type, and parameters used in the rule?
potentially related to #168293
The action being used was iterating over the
@henrikno I talked to @ymao1 and @pmuellr about this issue. We have other SDHs related to this problem, but for those we do not have access to the data as we do here. To find a solution we need to investigate, and to do that we need to log a little more information in the message, like that. Do you think that's possible? And will we be able to access this Kibana?
Created a dedicated investigation issue for this (#175980) and linking this issue for the rule definition.
Kibana version:
8.10.2
Elasticsearch version:
8.10.2
Server OS version:
Elastic Cloud
Original install method (e.g. download page, yum, from source, etc.):
Elastic Cloud
Describe the bug:
We have an alert that queries, over a remote CCS connection, for a specific document showing up at least 8 times within 10 minutes. The alert triggers, but when we check, zero documents match the query, and we did not delete any documents. The history does not say that the query failed; it shows up as "Succeeded", with no information about what triggered it. The only hint that something iffy happened is that the query took 15 seconds instead of the normal 1-2 seconds.
Steps to reproduce:
Expected behavior:
I expected the alert not to fire because there were no hits. Or, at the very least, it should give context showing that it fired because it could not get results.
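To illustrate the kind of context meant here, below is a minimal sketch of how a search response could be checked for signs that the results are incomplete. It is not how Kibana's rule executor works. The `SearchResponseMeta` interface and the `incompleteReason` function are hypothetical names, and the sketch assumes the remote cluster is configured so that an unavailable cluster is reported as skipped in `_clusters` rather than failing the whole request.

```typescript
// Hypothetical, trimmed-down view of the metadata in an Elasticsearch
// search response. Only the fields inspected below are modelled.
interface SearchResponseMeta {
  timed_out: boolean;
  _shards: { total: number; successful: number; skipped: number; failed: number };
  // Present on cross-cluster searches.
  _clusters?: { total: number; successful: number; skipped: number };
}

// Returns a human-readable reason when the result set may be incomplete,
// or null when nothing in the metadata suggests missing data.
function incompleteReason(meta: SearchResponseMeta): string | null {
  if (meta.timed_out) {
    return "search timed out";
  }
  if (meta._shards.failed > 0) {
    return `${meta._shards.failed} shard(s) failed`;
  }
  if (meta._clusters && meta._clusters.skipped > 0) {
    return `${meta._clusters.skipped} remote cluster(s) skipped`;
  }
  return null;
}
```

A rule could attach the returned reason to the alert context, so that the execution history shows more than "Succeeded" when a remote cluster was unreachable.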
The ideal scenario would be to not trigger on a transient issue, but to trigger if the issue is sustained (for a configurable time). For instance, this seems to trigger when we do an upgrade, but then resolves itself.
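A rough sketch of the "sustained for a configurable time" behaviour described above. This is only an illustration of the request, not an existing Kibana feature. The class name `SustainedFailureGate` and the `graceMs` parameter are hypothetical.

```typescript
// Tracks consecutive failed rule executions and reports true only once
// the failure has lasted for at least `graceMs` milliseconds.
class SustainedFailureGate {
  private readonly graceMs: number;
  private firstFailureAt: number | null = null;

  constructor(graceMs: number) {
    this.graceMs = graceMs;
  }

  // Record one rule execution at time `nowMs`. A successful execution
  // resets the gate, so a transient failure never triggers.
  record(failed: boolean, nowMs: number): boolean {
    if (!failed) {
      this.firstFailureAt = null;
      return false;
    }
    if (this.firstFailureAt === null) {
      this.firstFailureAt = nowMs;
    }
    return nowMs - this.firstFailureAt >= this.graceMs;
  }
}
```

With a grace period of five minutes, a single failed execution during an upgrade would be ignored, while a connection that stays down across several executions would eventually trigger.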
Screenshots (if relevant):
Provide logs and/or server output (if relevant):
Any additional context: