[Metricbeat] Improve the `elasticsearch` module when used for Stack Monitoring #39058
While investigating the root cause of indexing failures (also reported here in the past), we discovered that when Metricbeat is used to feed Stack Monitoring, its `elasticsearch` module ships `elasticsearch.shard` documents with concrete IDs built from the current cluster state (i.e., `state_uuid`) and some other constant data. Since the cluster state doesn't change at the same pace as Metricbeat collection rounds (10s by default), the same IDs are regenerated round after round and version conflicts happen all the time.

Those version conflicts are probably a side effect of switching to data streams in 8.0.0 (i.e., put-if-absent semantics with a concrete ID) and weren't apparent earlier, when the data was stored in simple indexes. Since each `elasticsearch.shard` document describes a shard placement in the cluster, the logic makes sense: there's no point re-indexing a document whose content hasn't changed since the last collection round.

However, we could/should go one step further and detect whether the cluster state has changed between two collection rounds. I'm naively thinking about "simply" comparing the old and new `state_uuid`, but it might be more involved than that. Anyway, if there's no change, there's no point in even rebuilding those documents and sending them again, since we know they'll bounce, generate a version conflict, and increase the indexing failure counter for no reason. On top of that, it wastes network bandwidth and CPU/RAM resources on the ES side. For big clusters with many thousands of shards, that can make a big difference.

Related issue: #36547 (comment)
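To make the "compare the old and new `state_uuid`" idea concrete, here is a minimal Go sketch. It is not taken from the Metricbeat code base: `stateTracker`, `publishFunc`, and `collect` are hypothetical names, and the sketch assumes a single metricset instance calls `collect` sequentially, once per collection round. As noted above, the real change may well be more involved.

```go
package main

// publishFunc is a hypothetical stand-in for whatever rebuilds and ships
// the elasticsearch.shard documents for a given cluster state.
type publishFunc func(stateUUID string) error

// stateTracker remembers the state_uuid of the last successfully published
// round. It is not safe for concurrent use (assumption: one caller per
// metricset, one call per collection round).
type stateTracker struct {
	lastStateUUID string
}

// collect calls publish only when stateUUID differs from the one seen in the
// previous successful round. It reports whether documents were published.
func (t *stateTracker) collect(stateUUID string, publish publishFunc) (bool, error) {
	// Same cluster state as last round: the documents would get the same IDs
	// and bounce with a version conflict, so skip building them entirely.
	// An empty UUID is never treated as "unchanged".
	if stateUUID != "" && stateUUID == t.lastStateUUID {
		return false, nil
	}
	if err := publish(stateUUID); err != nil {
		// Do not record the UUID, so the next round retries.
		return false, err
	}
	t.lastStateUUID = stateUUID
	return true, nil
}
```

In this sketch the tracker lives only in memory, so the first round after a Metricbeat restart would still publish (and possibly conflict) once; every later round with an unchanged `state_uuid` would be skipped.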