elasticsearch-xpack shard metricset not sending metrics #26314

Closed
ndtreviv opened this issue Jun 15, 2021 · 11 comments
Labels
Stalled, Team:Integrations

Comments

@ndtreviv

I'm trying to ship my monitoring stats to a separate cluster using metricbeat, but the shard metricset isn't being sent:
[Screenshot: 2021-06-15 at 09 26 16]

There are no errors in the logs (follow the link to the discussion forum post and you can see some logs there).

elasticsearch-xpack config (vanilla):

# Module: elasticsearch
# Docs: https://www.elastic.co/guide/en/beats/metricbeat/7.9/metricbeat-module-elasticsearch.html

- module: elasticsearch
  xpack.enabled: true
  period: 10s
  hosts: ["http://localhost:9200"]
  #username: "user"
  #password: "secret"

metricbeat.yml:

metricbeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false

setup.template.settings:
  index.number_of_shards: 1
  index.number_of_replicas: 0
  index.codec: best_compression

setup.kibana:
  host: "[redacted]:5601"
  ssl.verification_mode: none
  protocol: "http"

output.elasticsearch:
  hosts: ["[redacted]:9200"]
  ssl.verification_mode: none
  protocol: "http"

processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~
  - add_fields:
      target: node
      fields:
        name: "es-live-1"
        display_name: "[redacted]:es-live-1"
        public_hostname: "[redacted]"
        cluster_name: "cluster-live"
@botelastic botelastic bot added the needs_team label Jun 15, 2021
@ChrsMark ChrsMark added the Team:Integrations label Jun 15, 2021
@elasticmachine
Collaborator

Pinging @elastic/integrations (Team:Integrations)

@botelastic botelastic bot removed the needs_team label Jun 15, 2021
@ndtreviv
Author

On the discuss forum someone noted that the module config specified an older version. This was because I had upgraded metricbeat on this node, but I get the same behaviour on fresh installs with the following config:

# Module: elasticsearch
# Docs: https://www.elastic.co/guide/en/beats/metricbeat/7.x/metricbeat-module-elasticsearch.html

- module: elasticsearch
  xpack.enabled: true
  period: 10s
  hosts: ["http://localhost:9200"]
  #username: "user"
  #password: "secret"
  metricsets:
    - node
    - node_stats
    - cluster_stats
    - index
    - index_recovery
    - shard
    - index_summary
    - pending_tasks

@ndtreviv
Author

I had some sporadic shard monitoring data come into my monitoring cluster the past couple of nights, and it correlates with status changes.

In other words: I only got updated shard data when a shard changed status.

So now I'm wondering whether this is a feature of the shard metricset?

It's not clear in the documentation at all, and it's incompatible with my grafana dashboards, unfortunately. But that seems to be the most likely explanation so far...

Can anyone confirm this?

@ndtreviv
Author

Someone has confirmed on the discuss group that this is the case. This then becomes a documentation issue on the metricbeat shard metricset page - much easier to resolve!

@predogma
Contributor

predogma commented Jul 9, 2021

@ndtreviv The elasticsearch module in metricbeat has two configurations, depending on purpose: one for Stack Monitoring (xpack.enabled: true) and one based on explicitly listed metricsets.

The elasticsearch module can be used to collect metrics shown in our Stack Monitoring UI in Kibana. To enable this usage, set xpack.enabled: true and remove any metricsets from the module’s configuration.

With the monitoring config (xpack.enabled: true), the .monitoring-es-* indices are populated, and they are what Kibana > Stack Monitoring uses to render its views for that Elasticsearch cluster.

If you instead set xpack.enabled: false and define metricsets, the data should be written to metricbeat-*.
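
As a rough illustration of the two modes described above, the module config could look like one of the following. This is only a sketch: the hosts and period values are copied from the config earlier in this issue, and the choice of the shard metricset in the second variant is just an example.

```yaml
# Variant 1: Stack Monitoring mode.
# No metricsets are listed; data should land in .monitoring-es-*.
- module: elasticsearch
  xpack.enabled: true
  period: 10s
  hosts: ["http://localhost:9200"]

# Variant 2: plain metricsets mode (use instead of variant 1, not alongside it).
# Only the listed metricsets are collected; data should land in metricbeat-*.
- module: elasticsearch
  xpack.enabled: false
  period: 10s
  hosts: ["http://localhost:9200"]
  metricsets:
    - shard
```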

@ndtreviv
Copy link
Author

@predogma I specifically specified metricsets to try and get around this error: #26284

It currently keeps throwing errors about collecting ml_job stats, which I don't want (I have ml turned off on my cluster specifically). I thought at the time that specifying metricsets and excluding this one would help me. I guess not!
Thanks for clarifying.

@ndtreviv
Author

Anyway, the fix to this issue is to update the documentation on the metricsets to make it clear what data each set draws in.

For example: The shard metricset does not bring in all data about shards. It only reports shards that have changed status, as and when they change, and so far I've only seen it report shards that have STARTED, not shards changing to RELOCATING or UNASSIGNED.

@botelastic

botelastic bot commented Jul 19, 2022

Hi!
We just realized that we haven't looked into this issue in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough 👍.
Thank you for your contribution!

@botelastic botelastic bot added Stalled and removed Stalled labels Jul 19, 2022
@Agraphie

👍

@botelastic

botelastic bot commented Sep 21, 2023

Hi!
We just realized that we haven't looked into this issue in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough 👍.
Thank you for your contribution!

@botelastic botelastic bot added the Stalled label Sep 21, 2023
@botelastic botelastic bot closed this as completed Mar 19, 2024
@consulthys
Contributor

There might be one possible explanation of this behavior.

Looking at the source code of the shard metricset, we can see that the cluster state is retrieved every x seconds (10 by default) and the routing table is iterated over every time. Hence we could expect to see one document per shard being indexed over and over.

Looking at the way the document ID is generated, we can see that the cluster state_uuid is used as prefix of that ID.

This means that

  1. if the cluster state doesn't change very often, the state_uuid is stable and the shard document IDs stay constant. As a result, the first document is indexed and the subsequent ones are rejected (due to put-if-absent semantics).
  2. if the cluster state changes more often, shard documents might get indexed much more frequently.

In the first case, you can probably correlate this with a higher amount of indexing failures.
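
To make the two cases concrete, here is a small in-memory simulation. It is a sketch under assumptions, not the metricset's actual code: the only detail taken from the explanation above is that the document ID is prefixed with the cluster state_uuid and that writes are put-if-absent. The rest of the ID layout, the `Shard` fields, and the `CreateOnlyIndex` class are hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shard:
    index: str
    number: int
    primary: bool
    node: str


def doc_id(state_uuid: str, shard: Shard) -> str:
    # Hypothetical ID layout: only the state_uuid prefix comes from the
    # discussion above; the remaining parts are assumed for illustration.
    role = "p" if shard.primary else "r"
    return f"{state_uuid}:{shard.node}:{shard.index}:{shard.number}:{role}"


class CreateOnlyIndex:
    """In-memory stand-in for an index written with put-if-absent semantics."""

    def __init__(self) -> None:
        self.docs = {}
        self.rejected = 0

    def create(self, _id: str, doc: dict) -> bool:
        if _id in self.docs:
            # Same ID seen before: the write is rejected, nothing new is stored.
            self.rejected += 1
            return False
        self.docs[_id] = doc
        return True


def simulate(state_uuids, shards):
    """Run one collection period per state_uuid; return (indexed, rejected)."""
    index = CreateOnlyIndex()
    for state_uuid in state_uuids:
        for shard in shards:
            index.create(doc_id(state_uuid, shard), {"state": "STARTED"})
    return len(index.docs), index.rejected


shards = [
    Shard("logs", 0, True, "node-1"),
    Shard("logs", 0, False, "node-2"),
]

# Case 1: the cluster state never changes across six collection periods.
stable = simulate(["uuid-a"] * 6, shards)
# Case 2: the cluster state changes before every collection period.
changing = simulate([f"uuid-{i}" for i in range(6)], shards)

print(stable)    # (2, 10): one document per shard, ten rejected writes
print(changing)  # (12, 0): every write produces a new document
```

With a stable cluster state, only the first period produces documents, which would match the observation earlier in the thread that shard data only shows up when something changes.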

6 participants