Onboarding/Quick Start with self-signed HTTPS certificate creates a scraper filling the logs with errors #6956

Open
ilario opened this issue Sep 25, 2024 · 0 comments
Labels: kind/bug (Something isn't working), team/ui

ilario commented Sep 25, 2024

About the bug

Steps to reproduce:

  1. Add the InfluxData Debian repository https://repos.influxdata.com/ (I am using Debian Bullseye, oldstable, on arm64)
  2. Install influxdb2 (I have version 2.7.10-1)
  3. Create a self-signed HTTPS certificate as described in https://docs.influxdata.com/influxdb/v2/admin/security/enable-tls/, making sure to include an IP.1 = 10.1.2.3 line in the alt_names section, and set the locations of the .key and .crt files in /etc/influxdb/config.toml
  4. Restart influxd
  5. Connect (in my case to https://10.1.2.3:8086), follow the tutorial, and click Quick Start at the end.
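
For readers who want to reproduce step 3, here is a minimal sketch of an OpenSSL config with an IP entry in alt_names, together with the matching `openssl req` invocation. The section names (`dn`, `v3_req`), the config file name `san.cnf`, and the 365-day validity are assumptions made for illustration. They are not taken from the InfluxDB documentation, whose template may differ.

```python
from typing import List


def render_openssl_config(ip: str) -> str:
    """Return an OpenSSL config whose alt_names section lists the server IP."""
    return "\n".join([
        "[req]",
        "distinguished_name = dn",      # assumed section name
        "x509_extensions = v3_req",     # assumed section name
        "prompt = no",
        "",
        "[dn]",
        f"CN = {ip}",
        "",
        "[v3_req]",
        "subjectAltName = @alt_names",
        "",
        "[alt_names]",
        f"IP.1 = {ip}",                 # the line this report is about
        "",
    ])


def openssl_command(config_path: str, key_path: str, cert_path: str,
                    days: int = 365) -> List[str]:
    """Build (but do not run) an openssl call for a self-signed certificate."""
    return [
        "openssl", "req", "-x509", "-nodes", "-newkey", "rsa:2048",
        "-days", str(days),
        "-config", config_path,
        "-keyout", key_path,
        "-out", cert_path,
    ]


if __name__ == "__main__":
    print(render_openssl_config("10.1.2.3"))
    print(" ".join(openssl_command(
        "san.cnf",  # hypothetical file holding the config printed above
        "/etc/ssl/influxdb-selfsigned.key",
        "/etc/ssl/influxdb-selfsigned.crt",
    )))
```

The key and certificate paths are the ones that appear as tls-key and tls-cert in the config dump further down.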

Expected behavior:
Either nothing happens, or a scraper is created and works. In the second case, the scraper should use the .crt file location indicated in /etc/influxdb/config.toml.
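
To illustrate the second option: an HTTPS client accepts a self-signed certificate only if that certificate is explicitly trusted. The sketch below shows the general idea in Python. It is not the scraper's actual code, and `scraper_tls_context` and `ca_file` are hypothetical names, with `ca_file` standing for the tls-cert path from config.toml.

```python
import ssl
from typing import Optional


def scraper_tls_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Build a client TLS context that still verifies the server certificate.

    ca_file is a hypothetical parameter: the path of the self-signed
    certificate, e.g. the tls-cert value from config.toml.
    """
    # Starts from the system CA store with verification and hostname
    # checking enabled.
    context = ssl.create_default_context()
    if ca_file is not None:
        # Additionally trust the self-signed certificate itself.
        context.load_verify_locations(cafile=ca_file)
    return context
```

Called without `ca_file`, the context only trusts the system CAs, so a connection to a self-signed server fails verification. Called with the certificate path, verification stays on and the certificate is accepted, provided it also covers the address being contacted.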

Actual behavior:
A "new target" scraper gets created, but it fails due to the certificate.

The scraper is created by the handleQuickStart function in src/onboarding/components/CompletionStep.tsx.

If the IP.1 line is missing from the OpenSSL config file, this error floods the logs every 10 seconds:

host influxd-systemd-start.sh[11782]: ts=2024-09-25T10:13:42.054832Z lvl=error msg="Unable to gather" log_id=0rr6jG30000 service=scraper scraper-name="new target" error="Get \"https://10.1.2.3:8086/metrics\": tls: failed to verify certificate: x509: cannot validate certificate for 10.1.2.3 because it doesn't contain any IP SANs"
host influxd-systemd-start.sh[11782]: ts=2024-09-25T10:13:42.054669Z lvl=info msg="http: TLS handshake error from 10.1.2.3:35328: remote error: tls: bad certificate" log_id=0rr6jG30000 service=http

With the IP.1 line present, the scraper fills the system logs with errors like:

host influxd-systemd-start.sh[15260]: ts=2024-09-25T15:53:06.463054Z lvl=error msg="Unable to gather" log_id=0rrOi2Hl000 service=scraper scraper-name="new target" error="Get \"https://10.1.2.3:8086/metrics\": tls: failed to verify certificate: x509: certificate signed by unknown authority"
host influxd-systemd-start.sh[15260]: ts=2024-09-25T15:53:06.462927Z lvl=info msg="http: TLS handshake error from 10.1.2.3:52166: remote error: tls: bad certificate" log_id=0rrOi2Hl000 service=http
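
The two failure causes quoted above can be told apart mechanically when triaging a log. Here is a rough sketch, assuming the log lines look like the samples in this report. The function names and the cause labels are made up for illustration, and the matching is by plain substring rather than a full logfmt parser.

```python
from __future__ import annotations

import re
from collections import Counter
from typing import Iterable, Optional

# Substrings copied from the error messages quoted in this report,
# mapped to illustrative labels.
CAUSES = {
    "doesn't contain any IP SANs": "missing IP SAN",
    "certificate signed by unknown authority": "untrusted certificate",
}

SCRAPER_NAME = re.compile(r'scraper-name="([^"]*)"')


def classify(line: str) -> Optional[tuple[str, str]]:
    """Return (scraper name, cause) for a scraper TLS failure, else None."""
    if "service=scraper" not in line or 'msg="Unable to gather"' not in line:
        return None
    match = SCRAPER_NAME.search(line)
    name = match.group(1) if match else "?"
    for needle, cause in CAUSES.items():
        if needle in line:
            return (name, cause)
    return None


def summarize(lines: Iterable[str]) -> Counter:
    """Count scraper TLS failures per (scraper name, cause)."""
    return Counter(hit for hit in map(classify, lines) if hit is not None)
```

Feeding the journal output through `summarize` gives one count per scraper and cause, which makes it easy to see whether the certificate is missing the IP SAN or is simply not trusted.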

Visual Proof:

Screenshot 2024-09-25 175647

About your environment

Environment info:

# uname -srm
Linux 5.10.180-olimex aarch64
# influxd version
InfluxDB v2.7.10 (git: f302d9730c) build_date: 2024-08-16T20:19:39Z

Config:

# INFLUXD_CONFIG_PATH=/etc/influxdb/config.toml influxd print-config
Command "print-config" is deprecated, use the influx-cli command server-config to display the configuration values from the running server
assets-path: ""
bolt-path: /var/lib/influxdb/influxd.bolt
e2e-testing: false
engine-path: /var/lib/influxdb/engine
feature-flags: {}
flux-log-enabled: false
hardening-enabled: false
http-bind-address: :8086
http-idle-timeout: 3m0s
http-read-header-timeout: 10s
http-read-timeout: 0s
http-write-timeout: 0s
influxql-max-select-buckets: 0
influxql-max-select-point: 0
influxql-max-select-series: 0
instance-id: ""
key-name: ""
log-level: info
metrics-disabled: false
nats-max-payload-bytes: 0
nats-port: 0
no-tasks: false
pprof-disabled: false
query-concurrency: 1024
query-initial-memory-bytes: 0
query-max-memory-bytes: 0
query-memory-bytes: 0
query-queue-size: 1024
reporting-disabled: true
secret-store: bolt
session-length: 60
session-renew-disabled: false
sqlite-path: ""
storage-cache-max-memory-size: 1073741824
storage-cache-snapshot-memory-size: 26214400
storage-cache-snapshot-write-cold-duration: 10m0s
storage-compact-full-write-cold-duration: 4h0m0s
storage-compact-throughput-burst: 50331648
storage-max-concurrent-compactions: 0
storage-max-index-log-file-size: 1048576
storage-no-validate-field-size: false
storage-retention-check-interval: 30m0s
storage-series-file-max-concurrent-snapshot-compactions: 0
storage-series-id-set-cache-size: 0
storage-shard-precreator-advance-period: 30m0s
storage-shard-precreator-check-interval: 10m0s
storage-tsm-use-madv-willneed: false
storage-validate-keys: false
storage-wal-fsync-delay: 0s
storage-wal-max-concurrent-writes: 0
storage-wal-max-write-delay: 10m0s
storage-write-timeout: 10s
store: disk
strong-passwords: false
template-file-urls-disabled: false
testing-always-allow-setup: false
tls-cert: /etc/ssl/influxdb-selfsigned.crt
tls-key: /etc/ssl/influxdb-selfsigned.key
tls-min-version: "1.2"
tls-strict-ciphers: false
tracing-type: ""
ui-disabled: false
vault-addr: ""
vault-cacert: ""
vault-capath: ""
vault-client-cert: ""
vault-client-key: ""
vault-client-timeout: 0s
vault-max-retries: 0
vault-skip-verify: false
vault-tls-server-name: ""
vault-token: ""

Logs:

Sep 25 16:20:16 host influxd-systemd-start.sh[15261]: Command "print-config" is deprecated, use the influx-cli command server-config to display the configuration values from the running server
Sep 25 16:20:16 host influxd-systemd-start.sh[15260]: ts=2024-09-25T14:20:16.328170Z lvl=info msg="Welcome to InfluxDB" log_id=0rrOi2Hl000 version=v2.7.10 commit=f302d9730c build_date=2024-08-16T20:19:39Z log_level=info
Sep 25 16:20:16 host influxd-systemd-start.sh[15260]: ts=2024-09-25T14:20:16.334073Z lvl=info msg="Resources opened" log_id=0rrOi2Hl000 service=bolt path=/var/lib/influxdb/influxd.bolt
Sep 25 16:20:16 host influxd-systemd-start.sh[15260]: ts=2024-09-25T14:20:16.334750Z lvl=info msg="Resources opened" log_id=0rrOi2Hl000 service=sqlite path=/var/lib/influxdb/influxd.sqlite
Sep 25 16:20:16 host influxd-systemd-start.sh[15260]: ts=2024-09-25T14:20:16.360204Z lvl=info msg="Using data dir" log_id=0rrOi2Hl000 service=storage-engine service=store path=/var/lib/influxdb/engine/data
Sep 25 16:20:16 host influxd-systemd-start.sh[15260]: ts=2024-09-25T14:20:16.360384Z lvl=info msg="Compaction settings" log_id=0rrOi2Hl000 service=storage-engine service=store max_concurrent_compactions=2 throughput_bytes_per_second=50331648 throughput_bytes_per_second_burst=50331648
Sep 25 16:20:16 host influxd-systemd-start.sh[15260]: ts=2024-09-25T14:20:16.360443Z lvl=info msg="Open store (start)" log_id=0rrOi2Hl000 service=storage-engine service=store op_name=tsdb_open op_event=start
Sep 25 16:20:16 host influxd-systemd-start.sh[15260]: ts=2024-09-25T14:20:16.360856Z lvl=info msg="Open store (end)" log_id=0rrOi2Hl000 service=storage-engine service=store op_name=tsdb_open op_event=end op_elapsed=0.416ms
Sep 25 16:20:16 host influxd-systemd-start.sh[15260]: ts=2024-09-25T14:20:16.361035Z lvl=info msg="Starting retention policy enforcement service" log_id=0rrOi2Hl000 service=retention check_interval=30m
Sep 25 16:20:16 host influxd-systemd-start.sh[15260]: ts=2024-09-25T14:20:16.361144Z lvl=info msg="Starting precreation service" log_id=0rrOi2Hl000 service=shard-precreation check_interval=10m advance_period=30m
Sep 25 16:20:16 host influxd-systemd-start.sh[15260]: ts=2024-09-25T14:20:16.366554Z lvl=info msg="Starting query controller" log_id=0rrOi2Hl000 service=storage-reads concurrency_quota=1024 initial_memory_bytes_quota_per_query=9223372036854775807 memory_bytes_quota_per_query=9223372036854775807 max_memory_bytes=0 queue_size=1024
Sep 25 16:20:16 host influxd-systemd-start.sh[15260]: ts=2024-09-25T14:20:16.384141Z lvl=info msg="Configuring InfluxQL statement executor (zeros indicate unlimited)." log_id=0rrOi2Hl000 max_select_point=0 max_select_series=0 max_select_buckets=0
Sep 25 16:20:16 host influxd-systemd-start.sh[15260]: ts=2024-09-25T14:20:16.425344Z lvl=info msg=Listening log_id=0rrOi2Hl000 service=tcp-listener transport=https addr=:8086 port=8086
Sep 25 16:20:16 host influxd-systemd-start.sh[15285]: Command "print-config" is deprecated, use the influx-cli command server-config to display the configuration values from the running server
Sep 25 16:20:16 host influxd-systemd-start.sh[15301]: Command "print-config" is deprecated, use the influx-cli command server-config to display the configuration values from the running server
Sep 25 16:20:16 host influxd-systemd-start.sh[15259]: TLS cert and key found -- using https
Sep 25 16:20:17 host influxd-systemd-start.sh[15259]: InfluxDB started
Sep 25 16:50:16 host influxd-systemd-start.sh[15260]: ts=2024-09-25T14:50:16.361527Z lvl=info msg="Retention policy deletion check (start)" log_id=0rrOi2Hl000 service=retention op_name=retention_delete_check op_event=start
Sep 25 16:50:16 host influxd-systemd-start.sh[15260]: ts=2024-09-25T14:50:16.361805Z lvl=info msg="Pruning shard groups after retention check (start)" log_id=0rrOi2Hl000 service=retention op_name=retention_delete_check op_name=retention_prune_shard_groups op_event=start
Sep 25 16:50:16 host influxd-systemd-start.sh[15260]: ts=2024-09-25T14:50:16.361892Z lvl=info msg="Pruning shard groups after retention check (end)" log_id=0rrOi2Hl000 service=retention op_name=retention_delete_check op_name=retention_prune_shard_groups op_event=end op_elapsed=0.097ms
Sep 25 16:50:16 host influxd-systemd-start.sh[15260]: ts=2024-09-25T14:50:16.361996Z lvl=info msg="Retention policy deletion check (end)" log_id=0rrOi2Hl000 service=retention op_name=retention_delete_check op_event=end op_elapsed=0.562ms
Sep 25 17:20:16 host influxd-systemd-start.sh[15260]: ts=2024-09-25T15:20:16.361496Z lvl=info msg="Retention policy deletion check (start)" log_id=0rrOi2Hl000 service=retention op_name=retention_delete_check op_event=start
Sep 25 17:20:16 host influxd-systemd-start.sh[15260]: ts=2024-09-25T15:20:16.361699Z lvl=info msg="Pruning shard groups after retention check (start)" log_id=0rrOi2Hl000 service=retention op_name=retention_delete_check op_name=retention_prune_shard_groups op_event=start
Sep 25 17:20:16 host influxd-systemd-start.sh[15260]: ts=2024-09-25T15:20:16.361815Z lvl=info msg="Pruning shard groups after retention check (end)" log_id=0rrOi2Hl000 service=retention op_name=retention_delete_check op_name=retention_prune_shard_groups op_event=end op_elapsed=0.101ms
Sep 25 17:20:16 host influxd-systemd-start.sh[15260]: ts=2024-09-25T15:20:16.361888Z lvl=info msg="Retention policy deletion check (end)" log_id=0rrOi2Hl000 service=retention op_name=retention_delete_check op_event=end op_elapsed=0.524ms
Sep 25 17:38:36 host influxd-systemd-start.sh[15260]: ts=2024-09-25T15:38:36.579485Z lvl=info msg="http: TLS handshake error from 10.1.2.3:59736: remote error: tls: bad certificate" log_id=0rrOi2Hl000 service=http
Sep 25 17:38:36 host influxd-systemd-start.sh[15260]: ts=2024-09-25T15:38:36.579552Z lvl=error msg="Unable to gather" log_id=0rrOi2Hl000 service=scraper scraper-name="new target" error="Get \"https://10.1.2.3:8086/metrics\": tls: failed to verify certificate: x509: certificate signed by unknown authority"
Sep 25 17:38:46 host influxd-systemd-start.sh[15260]: ts=2024-09-25T15:38:46.458903Z lvl=info msg="http: TLS handshake error from 10.1.2.3:35368: remote error: tls: bad certificate" log_id=0rrOi2Hl000 service=http
Sep 25 17:38:46 host influxd-systemd-start.sh[15260]: ts=2024-09-25T15:38:46.458998Z lvl=error msg="Unable to gather" log_id=0rrOi2Hl000 service=scraper scraper-name="new target" error="Get \"https://10.1.2.3:8086/metrics\": tls: failed to verify certificate: x509: certificate signed by unknown authority"
Sep 25 17:38:56 host influxd-systemd-start.sh[15260]: ts=2024-09-25T15:38:56.458261Z lvl=info msg="http: TLS handshake error from 10.1.2.3:54814: remote error: tls: bad certificate" log_id=0rrOi2Hl000 service=http
Sep 25 17:38:56 host influxd-systemd-start.sh[15260]: ts=2024-09-25T15:38:56.458336Z lvl=error msg="Unable to gather" log_id=0rrOi2Hl000 service=scraper scraper-name="new target" error="Get \"https://10.1.2.3:8086/metrics\": tls: failed to verify certificate: x509: certificate signed by unknown authority"