Exporter fails when stackdriver updates their help strings #361

Open
henry-fn opened this issue Aug 21, 2024 · 0 comments


henry-fn commented Aug 21, 2024

Recently the help string for some metrics changed; one example is stackdriver_gce_instance_compute_googleapis_com_instance_uptime_total. This caused the exporter to fail with messages like the one below (newlines added for readability):

34 error(s) occurred:
* [from Gatherer #2] collected metric stackdriver_gce_instance_compute_googleapis_com_instance_uptime_total label:{name:"instance_id" value:"REDACTED"} label:{name:"instance_name" value:"REDACTED"} label:{name:"project_id" value:"REDACTED"} label:{name:"unit" value:"s"} label:{name:"zone" value:"REDACTED"} gauge:{value:11520} timestamp_ms:1724192100000 

has help 

"Elapsed time since the VM was started, in seconds. After sampling, data is not visible for up to 120 seconds. When VM is Stopped (https://cloud.google.com/compute/docs/instances/stop-start-instance#stop-vm-google-cloud), the time is not calculated. On starting the VM again, the timer will reset to 0 for that VM." 

but should have 

"Elapsed time since the VM was started, in seconds."
... and many more of the same format

If this is something Stackdriver is getting wrong, I can open a Google support case instead. I raised it here first because the Prometheus client does not allow these help strings to change (the check at https://github.com/prometheus/client_golang/blob/b5361fed217651b4d855961b47481209ac0745a0/prometheus/registry.go#L640 causes the underlying failure), so either this project or Google support seemed like the right place to handle it.

I am not sure what the correct solution is, but this is a really annoying issue: we essentially have to ignore whole groups of metrics, at least until the help string versions converge or age out. If there is a suggestion on how to fix this, I can try to contribute as well; I just wasn't sure where to start with this one.
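
One possible direction, sketched here only as an illustration and not taken from the exporter's code: pin the first help string seen for each metric name and reuse it for every later sample, so the registry never sees two different help strings for the same metric. The names `helpCache`, `newHelpCache`, and `Pin` are hypothetical, and the sketch assumes the exporter has a single place where it builds metric descriptions from the Stackdriver descriptor.

```go
package main

import (
	"fmt"
	"sync"
)

// helpCache remembers the first help string seen for each metric name.
// Hypothetical helper for illustration; it is not part of the exporter.
type helpCache struct {
	mu   sync.Mutex
	seen map[string]string
}

func newHelpCache() *helpCache {
	return &helpCache{seen: make(map[string]string)}
}

// Pin returns the help string first recorded for name. If none has been
// recorded yet, it stores help and returns it unchanged.
func (c *helpCache) Pin(name, help string) string {
	c.mu.Lock()
	defer c.mu.Unlock()
	if pinned, ok := c.seen[name]; ok {
		return pinned
	}
	c.seen[name] = help
	return help
}

func main() {
	cache := newHelpCache()
	name := "stackdriver_gce_instance_compute_googleapis_com_instance_uptime_total"

	// The first help string seen for the metric is the one that gets pinned.
	first := cache.Pin(name, "Elapsed time since the VM was started, in seconds.")

	// A later, longer help string for the same metric is replaced by the pinned one.
	second := cache.Pin(name, "Elapsed time since the VM was started, in seconds. After sampling, data is not visible for up to 120 seconds.")

	fmt.Println(first == second)
}
```

The trade-off is that the exposed help text would lag behind whatever Stackdriver currently reports until the exporter restarts, which seems preferable to dropping the metrics entirely.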
