Log based distribution metric counts each event 10 times #392

Open
jannikschaper opened this issue Nov 21, 2024 · 0 comments

In our project, we noticed a discrepancy between the distribution metric values reported by Google Cloud itself and those exported to Prometheus by stackdriver_exporter. On further investigation, we narrowed the misbehavior down to what the title states: each event appears to be counted exactly 10 times, inflating the metric values to 10x the correct value.

We triggered three events one after another. In Google Cloud, the _count metric correctly jumps to 1 for each event as it comes in:

image

In Prometheus, using stackdriver_exporter, the _count metric incorrectly jumps to 10 for each event as it comes in:

image

Later, we also triggered the same request twice, and the stackdriver_exporter metric value indeed increased first to 10, then to 20:

image
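
To make the arithmetic explicit, here is a minimal Python sketch that checks whether the exported series is a constant multiple of the expected one. The observed values 10 and 20 come from the test above; the expected cumulative values 1 and 2 are our assumption of what a correct count would be after each of the two requests, and `inflation_factor` is a hypothetical helper name, not part of any library.

```python
from typing import Optional, Sequence


def inflation_factor(expected: Sequence[float],
                     observed: Sequence[float]) -> Optional[float]:
    """Return observed/expected if that ratio is the same at every point, else None."""
    if len(expected) != len(observed) or not expected:
        raise ValueError("series must be non-empty and of equal length")
    # Collect the distinct ratios; a single distinct value means a constant factor.
    ratios = {obs / exp for exp, obs in zip(expected, observed) if exp != 0}
    return ratios.pop() if len(ratios) == 1 else None


# Assumed correct cumulative count after each of the two identical requests.
expected_count = [1, 2]
# Cumulative _count reported through stackdriver_exporter in the test above.
observed_count = [10, 20]

print(inflation_factor(expected_count, observed_count))  # 10.0
```

A constant factor across every sample suggests systematic over-counting rather than stray duplicate events, although the sketch by itself says nothing about the cause.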


We set up stackdriver_exporter via Helm, using the prometheus-stackdriver-exporter chart 4.6.2, which corresponds to stackdriver_exporter 0.16.0.

Our stackdriver_exporter (Helm) configuration:

prometheus-stackdriver-exporter:
  nameOverride: 'gcloud-metrics-exporter'
  fullnameOverride: 'gcloud-metrics-exporter'
  stackdriver:
    projectId: [redacted]
    serviceAccountKey: [redacted]
    metrics:
      typePrefixes: [redacted]
      # Workaround needed for accurate histogram metrics
      # https://github.com/prometheus-community/stackdriver_exporter?tab=readme-ov-file#what-to-know-about-aggregating-delta-metrics
      aggregateDeltas: true
      # Default value from https://github.com/prometheus-community/helm-charts/blob/f2aeaf773cd22ae2bffb7ec846b06eadf4169387/charts/prometheus-stackdriver-exporter/values.yaml#L88, kept for lack of better judgment
      aggregateDeltasTTL: '30m'
  serviceMonitor:
    enabled: true
    namespace: monitoring
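
For context on what aggregateDeltas is meant to do, the following Python sketch shows the general idea of turning delta samples into a running total. It is purely illustrative and not a diagnosis: `DeltaPoint`, `DeltaAggregator`, and the sample values are hypothetical, and this is not the exporter's actual code. It only shows that if one delta point were added once per poll instead of once per interval, a constant inflation factor like the one we see could result.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DeltaPoint:
    """One delta sample: number of events in the interval ending at end_time."""
    end_time: int  # hypothetical interval end, in seconds
    count: int


@dataclass
class DeltaAggregator:
    """Hypothetical running-total aggregator (not stackdriver_exporter's code)."""
    total: int = 0
    seen_end_times: set = field(default_factory=set)

    def add(self, point: DeltaPoint, dedupe: bool = True) -> int:
        # With dedupe on, a point whose interval was already added is skipped.
        if dedupe and point.end_time in self.seen_end_times:
            return self.total
        self.seen_end_times.add(point.end_time)
        self.total += point.count
        return self.total


def aggregate(points, dedupe: bool) -> int:
    """Feed all points through a fresh aggregator and return the final total."""
    agg = DeltaAggregator()
    for point in points:
        agg.add(point, dedupe=dedupe)
    return agg.total


# Assumption for illustration: one event, returned unchanged by 10 polls.
polls = [DeltaPoint(end_time=60, count=1)] * 10

print(aggregate(polls, dedupe=True))   # 1
print(aggregate(polls, dedupe=False))  # 10
```

Whether anything like this happens in stackdriver_exporter 0.16.0 is unknown to us; the sketch only frames what we would expect from delta aggregation.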