panic: duplicate label names #85

Open
acondrat opened this issue May 4, 2020 · 13 comments

@acondrat
Member

acondrat commented May 4, 2020

Looks like the exporter crashes when a metric has duplicate label names.

time="2020-05-04T13:49:10Z" level=info msg="Starting stackdriver_exporter (version=0.7.0, branch=HEAD, revision=a339261e716271d77f6dc73d1998600d6d31089b)" source="stackdriver_exporter.go:136"
time="2020-05-04T13:49:10Z" level=info msg="Build context (go=go1.14.2, user=root@6bfda044714a, date=20200501-12:39:15)" source="stackdriver_exporter.go:137"
time="2020-05-04T13:49:10Z" level=info msg="Listening on :9255" source="stackdriver_exporter.go:163"
panic: duplicate label names

goroutine 162 [running]:
github.com/prometheus/client_golang/prometheus.MustNewConstHistogram(...)
	/app/vendor/github.com/prometheus/client_golang/prometheus/histogram.go:619
github.com/prometheus-community/stackdriver_exporter/collectors.(*TimeSeriesMetrics).newConstHistogram(0xc000377a18, 0xc0005fe280, 0x50, 0xc000a47600, 0xe, 0x10, 0xc0003e4870, 0xc000a51da0, 0xc000a47700, 0xe, ...)
	/app/collectors/monitoring_metrics.go:94 +0x19a
github.com/prometheus-community/stackdriver_exporter/collectors.(*TimeSeriesMetrics).completeHistogramMetrics(0xc000377a18)
	/app/collectors/monitoring_metrics.go:186 +0x1c7
github.com/prometheus-community/stackdriver_exporter/collectors.(*TimeSeriesMetrics).Complete(0xc000377a18)
	/app/collectors/monitoring_metrics.go:149 +0x39
github.com/prometheus-community/stackdriver_exporter/collectors.(*MonitoringCollector).reportTimeSeriesMetrics(0xc0000b88f0, 0xc000940000, 0xc000356b00, 0xc0001a4f00, 0xc000940000, 0x0)
	/app/collectors/monitoring_collector.go:370 +0x10a3
github.com/prometheus-community/stackdriver_exporter/collectors.(*MonitoringCollector).reportMonitoringMetrics.func1.1(0xc000412640, 0xc0000b88f0, 0xe1193bb, 0xed6421353, 0x0, 0xe1193bb, 0xed642147f, 0x0, 0xc0003645a0, 0xc000356b00, ...)
	/app/collectors/monitoring_collector.go:223 +0x5e7
created by github.com/prometheus-community/stackdriver_exporter/collectors.(*MonitoringCollector).reportMonitoringMetrics.func1
	/app/collectors/monitoring_collector.go:197 +0x3f3

This seems to have been introduced in v0.7.0, as I don't see the same issue in v0.6.0.

@SuperQ
Contributor

SuperQ commented May 4, 2020

I wonder if this is a side effect of #50

@omerlh

omerlh commented May 25, 2020

I'm having the exact same issue here. I tested version 0.6.0 and could not reproduce it there, so it was probably introduced in either 0.7.0 or 0.8.0 (0.7.0 introduced #50).

@SuperQ
Contributor

SuperQ commented May 25, 2020

Can you try building and running with #50 reverted?

@omerlh

omerlh commented May 25, 2020

It still crashes with the same error:

panic: duplicate label names

goroutine 99 [running]:
github.com/prometheus/client_golang/prometheus.MustNewConstMetric(...)
/Users/omerlh/go/pkg/mod/github.com/prometheus/[email protected]/prometheus/value.go:106
github.com/prometheus-community/stackdriver_exporter/collectors.(*TimeSeriesMetrics).newConstMetric(0xc000183a10, 0xc0005c8500, 0x4d, 0x34f92458, 0xed65d7041, 0x0, 0xc000285480, 0x8, 0x8, 0x2, ...)
/Users/omerlh/dev/stackdriver_exporter/collectors/monitoring_metrics.go:138 +0x204
github.com/prometheus-community/stackdriver_exporter/collectors.(*TimeSeriesMetrics).completeConstMetrics(0xc000183a10)
/Users/omerlh/dev/stackdriver_exporter/collectors/monitoring_metrics.go:179 +0x1dd
github.com/prometheus-community/stackdriver_exporter/collectors.(*TimeSeriesMetrics).Complete(0xc000183a10)
/Users/omerlh/dev/stackdriver_exporter/collectors/monitoring_metrics.go:160 +0x2b
github.com/prometheus-community/stackdriver_exporter/collectors.(*MonitoringCollector).reportTimeSeriesMetrics(0xc000156000, 0xc00011e200, 0xc000140500, 0xc000128480, 0xc00011e200, 0x0)
/Users/omerlh/dev/stackdriver_exporter/collectors/monitoring_collector.go:400 +0x13c4
github.com/prometheus-community/stackdriver_exporter/collectors.(*MonitoringCollector).reportMonitoringMetrics.func1.1(0xc000606620, 0xc000156000, 0x34f92458, 0xed65d6f15, 0x0, 0x34f92458, 0xed65d7041, 0x0, 0xc000756540, 0xc000140500, ...)
/Users/omerlh/dev/stackdriver_exporter/collectors/monitoring_collector.go:253 +0x6d7
created by github.com/prometheus-community/stackdriver_exporter/collectors.(*MonitoringCollector).reportMonitoringMetrics.func1
/Users/omerlh/dev/stackdriver_exporter/collectors/monitoring_collector.go:227 +0x2bb

@SuperQ
Contributor

SuperQ commented May 25, 2020

So I guess it's not #50 then, something else with the client_golang upgrade.

@SuperQ
Contributor

SuperQ commented May 25, 2020

Can you include more details? Like the flags you're using with the exporter?

@acondrat
Member Author

Please find my setup below. I was getting the duplicate label panic with the logging.googleapis.com/user prefix. All other prefixes seem fine.

spec:
  containers:
  - command:
    - stackdriver_exporter
    env:
    - name: STACKDRIVER_EXPORTER_MONITORING_METRICS_TYPE_PREFIXES
      value: bigtable.googleapis.com/cluster,loadbalancing.googleapis.com/https/request_count,custom.googleapis.com,logging.googleapis.com/user
    - name: STACKDRIVER_EXPORTER_MONITORING_METRICS_INTERVAL
      value: 5m
    - name: STACKDRIVER_EXPORTER_MONITORING_METRICS_OFFSET
      value: 0s
    - name: STACKDRIVER_EXPORTER_WEB_LISTEN_ADDRESS
      value: :9255
    - name: STACKDRIVER_EXPORTER_WEB_TELEMETRY_PATH
      value: /metrics
    - name: STACKDRIVER_EXPORTER_MAX_RETRIES
      value: "0"
    - name: STACKDRIVER_EXPORTER_HTTP_TIMEOUT
      value: 10s
    - name: STACKDRIVER_EXPORTER_MAX_BACKOFF_DURATION
      value: 5s
    - name: STACKDRIVER_EXPORTER_BACKODFF_JITTER_BASE
      value: 1s
    - name: STACKDRIVER_EXPORTER_RETRY_STATUSES
      value: "503"
    image: prometheuscommunity/stackdriver-exporter:v0.7.0

@dgarcdu

dgarcdu commented Jun 15, 2020

Same issue here with v0.9.1:

level=info ts=2020-06-15T11:40:31.592Z caller=stackdriver_exporter.go:136 msg="Starting stackdriver_exporter" version="(version=0.9.1, branch=HEAD, revision=770b1be3d430ef9768f30a2a5d2e35557e464f3c)"
level=info ts=2020-06-15T11:40:31.592Z caller=stackdriver_exporter.go:137 msg="Build context" build_context="(go=go1.14.4, user=root@faf330a7765b, date=20200602-12:12:58)"
level=info ts=2020-06-15T11:40:31.592Z caller=stackdriver_exporter.go:158 msg="Listening on" address=:9255
panic: duplicate label names

goroutine 9602 [running]:
github.com/prometheus/client_golang/prometheus.MustNewConstMetric(...)
	/app/vendor/github.com/prometheus/client_golang/prometheus/value.go:106
github.com/prometheus-community/stackdriver_exporter/collectors.(*TimeSeriesMetrics).newConstMetric(0xc000e37a10, 0xc0004c2460, 0x4d, 0x12059df0, 0xed67954c8, 0x0, 0xc000871b00, 0xe, 0x10, 0x2, ...)
	/app/collectors/monitoring_metrics.go:139 +0x204
github.com/prometheus-community/stackdriver_exporter/collectors.(*TimeSeriesMetrics).completeConstMetrics(0xc000e37a10)
	/app/collectors/monitoring_metrics.go:180 +0x1dd
github.com/prometheus-community/stackdriver_exporter/collectors.(*TimeSeriesMetrics).Complete(0xc000e37a10)
	/app/collectors/monitoring_metrics.go:161 +0x2b
github.com/prometheus-community/stackdriver_exporter/collectors.(*MonitoringCollector).reportTimeSeriesMetrics(0xc00087b080, 0xc000901a00, 0xc000d2fe00, 0xc00165ef00, 0xc000901a00, 0x0)
	/app/collectors/monitoring_collector.go:414 +0x13c4
github.com/prometheus-community/stackdriver_exporter/collectors.(*MonitoringCollector).reportMonitoringMetrics.func1.1(0xc002207920, 0xc00087b080, 0x12059e1f, 0xed679539c, 0x0, 0x12059e1f, 0xed67954c8, 0x0, 0xc000fac720, 0xc000d2fe00, ...)
	/app/collectors/monitoring_collector.go:267 +0x6d7
created by github.com/prometheus-community/stackdriver_exporter/collectors.(*MonitoringCollector).reportMonitoringMetrics.func1
	/app/collectors/monitoring_collector.go:241 +0x3f3

Edit 2020/07/14: I can confirm that the issue is still present in v0.10.0:


goroutine 477 [running]:
github.com/prometheus/client_golang/prometheus.MustNewConstMetric(...)
	/app/vendor/github.com/prometheus/client_golang/prometheus/value.go:107
github.com/prometheus-community/stackdriver_exporter/collectors.(*TimeSeriesMetrics).newConstMetric(0xc0005cda10, 0xc0010ad7c0, 0x43, 0x1e63a5d8, 0xed69f2568, 0x0, 0xc001fc1580, 0x8, 0x8, 0x2, ...)
	/app/collectors/monitoring_metrics.go:139 +0x204
github.com/prometheus-community/stackdriver_exporter/collectors.(*TimeSeriesMetrics).completeConstMetrics(0xc0005cda10)
	/app/collectors/monitoring_metrics.go:180 +0x1dd
github.com/prometheus-community/stackdriver_exporter/collectors.(*TimeSeriesMetrics).Complete(0xc0005cda10)
	/app/collectors/monitoring_metrics.go:161 +0x2b
github.com/prometheus-community/stackdriver_exporter/collectors.(*MonitoringCollector).reportTimeSeriesMetrics(0xc0001e2540, 0xc001c10630, 0xc0005f4500, 0xc0003940c0, 0xc001c10630, 0x0)
	/app/collectors/monitoring_collector.go:406 +0x13c4
github.com/prometheus-community/stackdriver_exporter/collectors.(*MonitoringCollector).reportMonitoringMetrics.func1.1(0xc0004f8f60, 0xc0001e2540, 0x1e63a902, 0xed69f24b4, 0x0, 0x1e63a902, 0xed69f25e0, 0x0, 0xc000720780, 0xc0005f4500, ...)
	/app/collectors/monitoring_collector.go:259 +0x6d7
created by github.com/prometheus-community/stackdriver_exporter/collectors.(*MonitoringCollector).reportMonitoringMetrics.func1
	/app/collectors/monitoring_collector.go:233 +0x3f3

@jakubbujny

jakubbujny commented Sep 9, 2020

I've debugged this, as I have the same problem.

The root cause is a log-based custom metric whose label extractors produce labels with the same names as the ones GCP logging already injects by default.

Example:
You are on GKE and have a custom log-based metric with an extractor that maps the field resource.labels.cluster_name to the label cluster_name. For custom metrics on GKE, cluster_name is already reported by default by GCP, so the label ends up duplicated, which causes the panic.

Workaround: delete the custom extractors, which are technically not needed.

Edit:
As far as I can see, project_id is also injected by default.
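
The collision described above can be sketched in a few lines of Go. This is a stdlib-only illustration, not the exporter's code: `collidingLabels` and the sample label maps are hypothetical, and the assumption is that the labels from the metric (user-defined extractors) and the labels from the monitored resource end up in a single label set.

```go
package main

import (
	"fmt"
	"sort"
)

// collidingLabels returns the label names that appear both in the metric's
// own labels (for example, from log-based metric extractors) and in the
// labels attached to the monitored resource.
// Hypothetical helper, for illustration only.
func collidingLabels(metricLabels, resourceLabels map[string]string) []string {
	var dups []string
	for name := range metricLabels {
		if _, ok := resourceLabels[name]; ok {
			dups = append(dups, name)
		}
	}
	sort.Strings(dups) // map iteration order is random; sort for stable output
	return dups
}

func main() {
	// Labels produced by user-defined extractors on a log-based metric.
	metricLabels := map[string]string{
		"cluster_name": "prod",
		"severity":     "ERROR",
	}
	// Labels assumed to be attached by default to the monitored resource.
	resourceLabels := map[string]string{
		"cluster_name": "prod",
		"project_id":   "my-project",
	}
	fmt.Println(collidingLabels(metricLabels, resourceLabels))
}
```

With the sample maps, only cluster_name is reported, which matches the GKE example: the extractor adds a name that is already present on the resource.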

@hanikesn

hanikesn commented Feb 9, 2021

We had the exact same issue as above, having duplicated project_id ourselves. But we discovered another issue with duplicate labels after enabling audit logs for Spanner:

* [from Gatherer #2] collected metric "stackdriver_spanner_instance_logging_googleapis_com_log_entry_count" { label:<name:"instance_config" value:"" > label:<name:"instance_id" value:"instance-east-1" > label:<name:"location" value:"us-east1" > label:<name:"log" value:"cloudaudit.googleapis.com/data_access" > label:<name:"project_id" value:"production" > label:<name:"severity" value:"INFO" > label:<name:"unit" value:"1" > gauge:<value:10527 > timestamp_ms:1612880903770 } was collected before with the same name and label values
* [from Gatherer #2] collected metric "stackdriver_spanner_instance_logging_googleapis_com_byte_count" { label:<name:"instance_config" value:"" > label:<name:"instance_id" value:"instance-east-1" > label:<name:"location" value:"us-east1" > label:<name:"log" value:"cloudaudit.googleapis.com/data_access" > label:<name:"project_id" value:"production" > label:<name:"severity" value:"INFO" > label:<name:"unit" value:"By" > gauge:<value:2.2907337e+07 > timestamp_ms:1612880903770 } was collected before with the same name and label values

I think it would make sense to make the exporter more robust: only report duplicate labels on the CLI and export an error metric instead of crashing.

EDIT: Same issue as in: #103
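
One way to read the suggestion above is to drop repeated label names before building the metric, log them, and count them, rather than letting a `Must...` constructor panic. A minimal stdlib-only sketch under that assumption follows. `dedupeLabels` and `duplicateLabelErrors` are hypothetical names and not part of the exporter, and keeping the first occurrence of a name is an arbitrary choice. It covers only repeated label names (the panic); the "was collected before" errors above concern whole series repeating and would need a different check.

```go
package main

import (
	"fmt"
	"log"
)

// duplicateLabelErrors stands in for an error counter that a real exporter
// might expose as a metric. Hypothetical; not part of stackdriver_exporter.
var duplicateLabelErrors int

// dedupeLabels keeps the first occurrence of each label name and drops later
// repeats, logging and counting each drop. keys and values must have the
// same length.
func dedupeLabels(keys, values []string) ([]string, []string) {
	seen := make(map[string]struct{}, len(keys))
	outKeys := make([]string, 0, len(keys))
	outValues := make([]string, 0, len(values))
	for i, k := range keys {
		if _, dup := seen[k]; dup {
			duplicateLabelErrors++
			log.Printf("dropping duplicate label %q (value %q)", k, values[i])
			continue
		}
		seen[k] = struct{}{}
		outKeys = append(outKeys, k)
		outValues = append(outValues, values[i])
	}
	return outKeys, outValues
}

func main() {
	keys := []string{"project_id", "cluster_name", "project_id"}
	values := []string{"production", "east", "production"}
	k, v := dedupeLabels(keys, values)
	fmt.Println(k, v, duplicateLabelErrors)
}
```

Silently dropping a label can hide a misconfigured log-based metric, which is why the sketch both logs and counts each drop instead of only discarding it.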

@dgarcdu

dgarcdu commented Feb 9, 2021

> I've debugged this, as I have the same problem.
>
> The root cause is a log-based custom metric whose label extractors produce labels with the same names as the ones GCP logging already injects by default.
>
> Example:
> You are on GKE and have a custom log-based metric with an extractor that maps the field resource.labels.cluster_name to the label cluster_name. For custom metrics on GKE, cluster_name is already reported by default by GCP, so the label ends up duplicated, which causes the panic.
>
> Workaround: delete the custom extractors, which are technically not needed.
>
> Edit:
> As far as I can see, project_id is also injected by default.

We finally solved this by going over all our log-based metrics. Took a while, as we have quite a few, but we removed the duplicate labels and have not had any problem since.

@gidesh
Contributor

gidesh commented Apr 20, 2022

I've opened PR #153, which should fix this. Can someone review it and merge it if possible?

@JediNight

JediNight commented May 20, 2022

I'm still seeing this issue, although I'm not getting a panic in the container logs; the error shows up on the /metrics page. I tried @jakubbujny's suggestion of removing the label and label extractors, but that didn't work.

I'm trying to scrape a counter-type log-based metric for human-initiated GKE admin events, defined with this filter: protoPayload.methodName=~"google.container.v1.ClusterManager.*" NOT protoPayload.methodName:"get" NOT protoPayload.methodName:"list" protoPayload.authenticationInfo.principalEmail:*
