/health endpoint returns 404 #386

Open
Dezarin opened this issue Nov 14, 2024 · 3 comments

Comments

Dezarin commented Nov 14, 2024

I tried updating the stackdriver-exporter container to v0.17.0, but the pod never reaches a running state because it fails its liveness probes. Queries to the /metrics endpoint work as expected, but /health returns 404. I looked for an updated Helm chart for v0.17.0, but one has not been released yet.

foo:~# curl 10.244.1.17:9255/metrics
# HELP go_gc_duration_seconds A summary of the wall-time pause (stop-the-world) duration in garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 0
go_gc_duration_seconds{quantile="0.25"} 0
go_gc_duration_seconds{quantile="0.5"} 0
go_gc_duration_seconds{quantile="0.75"} 0
go_gc_duration_seconds{quantile="1"} 0
<truncated>
foo:~#  curl 10.244.1.17:9255/health
404 page not found

Containers:
  prometheus-stackdriver-exporter:
    Container ID:  containerd://0e46d42f0432dfd3dcc37acd6f9b78edbcc6ae1c71c907b89d2ea67d55aa4269
    Image:         prometheuscommunity/stackdriver-exporter:v0.17.0
    Image ID:      docker.io/prometheuscommunity/stackdriver-exporter@sha256:ca514180d5f5e4997e78f94ad23a08d7ad81b932485bd2152c98504cb38c1fdb
    Port:          9255/TCP
    Host Port:     0/TCP
    Command:
      stackdriver_exporter
    Args:
      --google.project-id=<REMOVED>
      --monitoring.metrics-interval=5m
      --monitoring.metrics-offset=0s
      --monitoring.metrics-type-prefixes=compute.googleapis.com/instance/cpu
      --stackdriver.backoff-jitter=1s
      --stackdriver.http-timeout=10s
      --stackdriver.max-backoff=5s
      --stackdriver.max-retries=0
      --stackdriver.retry-statuses=503
      --web.listen-address=:9255
      --web.telemetry-path=/metrics
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    2
      Started:      Thu, 14 Nov 2024 10:55:31 -0700
      Finished:     Thu, 14 Nov 2024 10:56:21 -0700
    Ready:          False
    Restart Count:  5
    Liveness:       http-get http://:http/health delay=30s timeout=10s period=10s #success=1 #failure=3
    Readiness:      http-get http://:http/health delay=10s timeout=10s period=10s #success=1 #failure=3

initharrington commented Nov 18, 2024

I upgraded to v0.17.0 this morning and see the same issue with the /health endpoint. Deploying with https://github.com/prometheus-community/helm-charts/tree/main/charts/prometheus-stackdriver-exporter (chart v4.6.2) just causes the pod to crashloop.

Dezarin commented Nov 18, 2024

As a workaround, you can change the liveness and readiness probes in the chart to point to / instead of /health, and the pod will reach a ready state. The /health endpoint should still be restored, though.
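
For anyone applying this, a rough sketch of what the adjusted probes might look like on the container spec. It assumes the chart renders standard Kubernetes probe fields; the timings and the named port `http` are copied from the pod description earlier in this thread. Whether the chart exposes the probe path as a value is not confirmed here, so this may mean editing the rendered Deployment or the chart template directly.

```yaml
# Sketch only: standard Kubernetes probe fields on the exporter container.
# The only change from the failing configuration is path: / instead of /health.
livenessProbe:
  httpGet:
    path: /            # was /health, which returns 404 on v0.17.0
    port: http         # named container port (9255/TCP in the pod description)
  initialDelaySeconds: 30
  timeoutSeconds: 10
  periodSeconds: 10
  successThreshold: 1
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /            # was /health
    port: http
  initialDelaySeconds: 10
  timeoutSeconds: 10
  periodSeconds: 10
  successThreshold: 1
  failureThreshold: 3
```

Note that probing / only confirms the HTTP server is answering; it is a stopgap until /health is available again.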

thinkjk commented Dec 5, 2024

I'm also running into this issue.
