3.2.0 not working after update from 3.1.10 #9593

Closed
USBAkimbo opened this issue Nov 14, 2024 · 18 comments
Labels: bug, waiting-for-user (Waiting for more information, tests or requested changes)

Comments

@USBAkimbo
USBAkimbo commented Nov 14, 2024

Bug Report

Describe the bug

  • I have VMs running fluent-bit and they were on version 3.1.10
  • Yesterday at 17:10 UTC, I ran my Ansible playbook which includes updating fluent-bit using the apt module
  • This upgraded the agents to 3.2.0, but then I noticed this in my logs
Nov 14 09:16:27 h-we-vm-01 fluent-bit[311004]: [2024/11/14 09:16:27] [error] [engine] chunk '311004-1731575775.597403200.flb' cannot be retried: task_id=8, input=tail.0 > output=http.0
Nov 14 09:16:28 h-we-vm-01 fluent-bit[311004]: [2024/11/14 09:16:28] [error] [tls] error: unexpected EOF
Nov 14 09:16:28 h-we-vm-01 fluent-bit[311004]: [2024/11/14 09:16:28] [error] [output:http:http.0] no upstream connections available to my.seq.server:443
Nov 14 09:16:28 h-we-vm-01 fluent-bit[311004]: [2024/11/14 09:16:28] [error] [tls] error: unexpected EOF
Nov 14 09:16:28 h-we-vm-01 fluent-bit[311004]: [2024/11/14 09:16:28] [error] [output:http:http.0] no upstream connections available to my.seq.server:443
Nov 14 09:16:28 h-we-vm-01 fluent-bit[311004]: [2024/11/14 09:16:28] [error] [tls] error: unexpected EOF
Nov 14 09:16:28 h-we-vm-01 fluent-bit[311004]: [2024/11/14 09:16:28] [error] [output:http:http.0] no upstream connections available to my.seq.server:443
Nov 14 09:16:28 h-we-vm-01 fluent-bit[311004]: [2024/11/14 09:16:28] [error] [engine] chunk '311004-1731575776.597385227.flb' cannot be retried: task_id=2, input=tail.0 > output=http.0
Nov 14 09:16:28 h-we-vm-01 fluent-bit[311004]: [2024/11/14 09:16:28] [error] [engine] chunk '311004-1731575778.597509586.flb' cannot be retried: task_id=10, input=tail.0 > output=http.0
Nov 14 09:16:28 h-we-vm-01 fluent-bit[311004]: [2024/11/14 09:16:28] [ warn] [engine] failed to flush chunk '311004-1731575787.969547933.flb', retry in 10 seconds: task_id=7, input=tail.0 > output=http.0 (out>
  • These errors suggest that something is wrong with my server or cert, yet both were working fine before the upgrade
  • Uninstalling and downgrading to 3.1.10 fixes this issue

To Reproduce

  • Use this fluent-bit.conf with Seq (this is jinja formatted so please adjust the server URL and API key accordingly)
[INPUT]
    Name                    tail
    Parser                  simple
    Path                    /var/log/*.log, /var/log/*/*.log
    Path_Key                file_path

[FILTER]
    Name                    modify
    Match                   *
    Rename                  log @m
    Add                     hostname ${HOSTNAME}

[OUTPUT]
    Name                    http
    Match                   *
    Host                    {{ SEQ_SERVER_URL }}
    Port                    443
    TLS                     On
    URI                     ingest/clef
    Header                  X-Seq-ApiKey {{ SEQ_API_KEY }}
    Format                  json_lines
    Json_date_key           @t
    Json_date_format        iso8601
    Log_response_payload    False
  • And use this parsers.conf
[PARSER]
    Name            simple
    Format          regex
    Regex           ^(?<time>[^ ]+) (?<message>.+)$
    Time_Key        time
    Time_Format     %Y-%m-%dT%H:%M:%S.%L%z
  • Install the agent version 3.1.10 and use the above config on an Ubuntu 22.04 system
  • Logs should flow
  • Now update to 3.2.0
  • Logs will stop flowing
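
The startup logs later in this thread include `parser 'simple' is not registered`, which suggests the parsers file is not being loaded alongside this configuration. This is separate from the TLS failure. A minimal sketch of one way to register it, assuming the file is saved as `parsers.conf` next to `fluent-bit.conf` (that path is an assumption, not taken from the report):

```text
[SERVICE]
    # Assumed path; adjust to wherever parsers.conf actually lives
    Parsers_File            parsers.conf
```

With the parsers file loaded, the `Parser simple` line in the `[INPUT]` section should be able to resolve the parser by name.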

Expected behavior

  • Logs being sent to my Seq log server

Screenshots

  • N/A

Your Environment

  • Version used: 3.1.10 and 3.2.0
  • Configuration: See above
  • Environment name and version (e.g. Kubernetes? What version?): Ubuntu 22.04 running on Azure VMs
  • Server type and version: D4as Azure VM
  • Operating System and version: Ubuntu 22.04
  • Filters and plugins: None, see above config as that's the only config that's used

Additional context

  • I notice no releases in GitHub for 3.2.0 but there is a manifest on https://packages.fluentbit.io/3.2.0/
  • I got very unlucky here - it looks like 3.2.0 was released at 17:00 which was 10 mins before my Ansible run
@patrick-stephens
Contributor

To help debugging can you clarify if this is a self-signed cert, where/how the specific cert is installed for this server and anything else that may be relevant around the actual TLS/SSL config?

@USBAkimbo
Author

USBAkimbo commented Nov 14, 2024

The cert is a LetsEncrypt cert that's valid until 2025-02

Cert is installed on the load balancer that fluent-bit is hitting

If I do a curl I get a "cert is valid" response

The cert also has the full chain - the TL;DR is I'm using LEGO ACME to get my cert using a DNS challenge

That then outputs a full chain PFX and that's the cert that I use - all other systems have no problem with it and neither does fluent-bit on a 3.1.x build

@patrick-stephens
Contributor

Can you also check the SSL library dependencies of the two versions? I wonder if it is related to that too. I can probably do it myself, but my setup may not exactly match yours, and it will be quicker if you do :)

@patrick-stephens
Contributor

You could also try tls.debug 4 to see if it gives any more details about what is failing and why: https://docs.fluentbit.io/manual/administration/transport-security
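
For reference, a sketch of where that option could go in the http output from the original report. `tls.debug` is the option described on the linked transport security page, with 4 as its most verbose level; every other line is copied unchanged from the reporter's configuration.

```text
[OUTPUT]
    Name                    http
    Match                   *
    Host                    {{ SEQ_SERVER_URL }}
    Port                    443
    TLS                     On
    # Added for troubleshooting only: most verbose TLS debug level
    TLS.debug               4
    URI                     ingest/clef
    Header                  X-Seq-ApiKey {{ SEQ_API_KEY }}
    Format                  json_lines
    Json_date_key           @t
    Json_date_format        iso8601
    Log_response_payload    False
```

The extra line is meant to be removed again once the failure has been captured, since verbose TLS logging can be noisy.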

@pierrebeaucamp

We're seeing the same behaviour while trying to output to Datadog fyi (i.e. same [tls] error: unexpected EOF). Downgrading to 3.1.10 fixes this.

@patrick-stephens
Contributor

patrick-stephens commented Nov 15, 2024

The changelog for 3.2.0 shows some TLS changes but nothing obviously relevant: https://github.com/fluent/fluent-bit/releases/tag/v3.2.0

There are debugging improvements in there though so please could you capture the full debug when it fails?
#9593 (comment)
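
One possible way to capture more detail, as a sketch: raising the service log level in addition to setting `tls.debug` on the output. `Log_Level` is a standard `[SERVICE]` key; whether it surfaces extra TLS detail for this particular failure is an assumption.

```text
[SERVICE]
    # Temporarily raise verbosity while reproducing the failure
    Log_Level               debug
```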

@patrick-stephens
Contributor

Looks like we've figured it out thanks to @leonardo-albertovich so will be updating and hopefully have a new release soon.

@USBAkimbo
Author

Hey, apologies, I was away yesterday

I see that a new release of 3.2.0 was pushed an hour ago - I'll give it a push now and report back

@patrick-stephens
Contributor

No, that's the current release - the GitHub release page just wasn't up yet, see: #9595

@USBAkimbo
Author

Ah, that explains why it's not working hahaha

This is what tls.debug 4 shows

Works fine on 3.1.10 so I'll wait for the fix

Nov 15 15:31:16 vm-01 fluent-bit[806363]: * https://fluentbit.io
Nov 15 15:31:16 vm-01 fluent-bit[806363]: ______ _                  _    ______ _ _           _____  _____
Nov 15 15:31:16 vm-01 fluent-bit[806363]: |  ___| |                | |   | ___ (_) |         |____ |/ __  \
Nov 15 15:31:16 vm-01 fluent-bit[806363]: | |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   __   / /`' / /'
Nov 15 15:31:16 vm-01 fluent-bit[806363]: |  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / /   \ \  / /
Nov 15 15:31:16 vm-01 fluent-bit[806363]: | |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V /.___/ /./ /___
Nov 15 15:31:16 vm-01 fluent-bit[806363]: \_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/ \____(_)_____/
Nov 15 15:31:16 vm-01 fluent-bit[806363]: [2024/11/15 15:31:16] [ info] [fluent bit] version=3.2.0, commit=, pid=806363
Nov 15 15:31:16 vm-01 fluent-bit[806363]: [2024/11/15 15:31:16] [ info] [storage] ver=1.5.2, type=memory, sync=normal, checksum=off, max_chunks_up=128
Nov 15 15:31:16 vm-01 fluent-bit[806363]: [2024/11/15 15:31:16] [ info] [simd    ] disabled
Nov 15 15:31:16 vm-01 fluent-bit[806363]: [2024/11/15 15:31:16] [ info] [cmetrics] version=0.9.9
Nov 15 15:31:16 vm-01 fluent-bit[806363]: [2024/11/15 15:31:16] [ info] [ctraces ] version=0.5.7
Nov 15 15:31:16 vm-01 fluent-bit[806363]: [2024/11/15 15:31:16] [ info] [input:tail:tail.0] initializing
Nov 15 15:31:16 vm-01 fluent-bit[806363]: [2024/11/15 15:31:16] [ info] [input:tail:tail.0] storage_strategy='memory' (memory only)
Nov 15 15:31:16 vm-01 fluent-bit[806363]: [2024/11/15 15:31:16] [error] [input:tail:tail.0] parser 'simple' is not registered
Nov 15 15:31:16 vm-01 fluent-bit[806363]: [2024/11/15 15:31:16] [ info] [sp] stream processor started
Nov 15 15:31:16 vm-01 fluent-bit[806363]: [2024/11/15 15:31:16] [ info] [output:http:http.0] worker #0 started
Nov 15 15:31:16 vm-01 fluent-bit[806363]: [2024/11/15 15:31:16] [ info] [input:tail:tail.0] inotify_fs_add(): inode=6987 watch_fd=1 name=/var/log/alternatives.log
Nov 15 15:31:16 vm-01 fluent-bit[806363]: [2024/11/15 15:31:16] [ info] [input:tail:tail.0] inotify_fs_add(): inode=8501 watch_fd=2 name=/var/log/auth.log
Nov 15 15:31:16 vm-01 fluent-bit[806363]: [2024/11/15 15:31:16] [ info] [input:tail:tail.0] inotify_fs_add(): inode=66744 watch_fd=3 name=/var/log/cloud-init-output.log
Nov 15 15:31:16 vm-01 fluent-bit[806363]: [2024/11/15 15:31:16] [ info] [output:http:http.0] worker #1 started
Nov 15 15:31:16 vm-01 fluent-bit[806363]: [2024/11/15 15:31:16] [ info] [input:tail:tail.0] inotify_fs_add(): inode=66742 watch_fd=4 name=/var/log/cloud-init.log
Nov 15 15:31:16 vm-01 fluent-bit[806363]: [2024/11/15 15:31:16] [ info] [input:tail:tail.0] inotify_fs_add(): inode=13414 watch_fd=5 name=/var/log/dpkg.log
Nov 15 15:31:16 vm-01 fluent-bit[806363]: [2024/11/15 15:31:16] [ info] [input:tail:tail.0] inotify_fs_add(): inode=8491 watch_fd=6 name=/var/log/kern.log
Nov 15 15:31:16 vm-01 fluent-bit[806363]: [2024/11/15 15:31:16] [ info] [input:tail:tail.0] inotify_fs_add(): inode=14255 watch_fd=7 name=/var/log/ubuntu-advantage.log
Nov 15 15:31:16 vm-01 fluent-bit[806363]: [2024/11/15 15:31:16] [ info] [input:tail:tail.0] inotify_fs_add(): inode=66823 watch_fd=8 name=/var/log/waagent.log
Nov 15 15:31:16 vm-01 fluent-bit[806363]: [2024/11/15 15:31:16] [ info] [input:tail:tail.0] inotify_fs_add(): inode=8106 watch_fd=9 name=/var/log/apt/history.log
Nov 15 15:31:16 vm-01 fluent-bit[806363]: [2024/11/15 15:31:16] [ info] [input:tail:tail.0] inotify_fs_add(): inode=13312 watch_fd=10 name=/var/log/apt/term.log
Nov 15 15:31:16 vm-01 fluent-bit[806363]: [2024/11/15 15:31:16] [ info] [input:tail:tail.0] inotify_fs_add(): inode=1584062 watch_fd=11 name=/var/log/audit/audit.log
Nov 15 15:31:16 vm-01 fluent-bit[806363]: [2024/11/15 15:31:16] [ info] [input:tail:tail.0] inotify_fs_add(): inode=66884 watch_fd=12 name=/var/log/landscape/sysinfo.log
Nov 15 15:31:16 vm-01 fluent-bit[806363]: [2024/11/15 15:31:16] [ info] [input:tail:tail.0] inotify_fs_add(): inode=6545 watch_fd=13 name=/var/log/unattended-upgrades/unattended-upgrades-dpkg.log
Nov 15 15:31:16 vm-01 fluent-bit[806363]: [2024/11/15 15:31:16] [ info] [input:tail:tail.0] inotify_fs_add(): inode=6894 watch_fd=14 name=/var/log/unattended-upgrades/unattended-upgrades-shutdown.log
Nov 15 15:31:16 vm-01 fluent-bit[806363]: [2024/11/15 15:31:16] [ info] [input:tail:tail.0] inotify_fs_add(): inode=13418 watch_fd=15 name=/var/log/unattended-upgrades/unattended-upgrades.log
Nov 15 15:31:16 vm-01 fluent-bit[806363]: [2024/11/15 15:31:16] [ info] [input:tail:tail.0] inotify_fs_add(): inode=774146 watch_fd=16 name=/var/log/zabbix/zabbix_agent2.log
Nov 15 15:35:02 vm-01 fluent-bit[806363]: [2024/11/15 15:35:02] [error] [tls] error: unexpected EOF
Nov 15 15:35:02 vm-01 fluent-bit[806363]: [2024/11/15 15:35:02] [error] [output:http:http.0] no upstream connections available to seq.mydomain.com:443
Nov 15 15:35:02 vm-01 fluent-bit[806363]: [2024/11/15 15:35:02] [ warn] [engine] failed to flush chunk '806363-1731684901.722104878.flb', retry in 6 seconds: task_id=0, input=tail.0 > output=http.0 (out_>
Nov 15 15:35:08 vm-01 fluent-bit[806363]: [2024/11/15 15:35:08] [error] [tls] error: unexpected EOF
Nov 15 15:35:08 vm-01 fluent-bit[806363]: [2024/11/15 15:35:08] [error] [output:http:http.0] no upstream connections available to seq.mydomain.com:443
Nov 15 15:35:08 vm-01 fluent-bit[806363]: [2024/11/15 15:35:08] [error] [engine] chunk '806363-1731684901.722104878.flb' cannot be retried: task_id=0, input=tail.0 > output=http.0
Nov 15 15:44:28 vm-01 fluent-bit[806363]: [2024/11/15 15:44:28] [error] [tls] error: unexpected EOF
Nov 15 15:44:28 vm-01 fluent-bit[806363]: [2024/11/15 15:44:28] [error] [output:http:http.0] no upstream connections available to seq.mydomain.com:443
Nov 15 15:44:28 vm-01 fluent-bit[806363]: [2024/11/15 15:44:28] [ warn] [engine] failed to flush chunk '806363-1731685467.628311738.flb', retry in 9 seconds: task_id=0, input=tail.0 > output=http.0 (out_>
Nov 15 15:44:37 vm-01 fluent-bit[806363]: [2024/11/15 15:44:37] [error] [tls] error: unexpected EOF
Nov 15 15:44:37 vm-01 fluent-bit[806363]: [2024/11/15 15:44:37] [error] [output:http:http.0] no upstream connections available to seq.mydomain.com:443
Nov 15 15:44:37 vm-01 fluent-bit[806363]: [2024/11/15 15:44:37] [error] [engine] chunk '806363-1731685467.628311738.flb' cannot be retried: task_id=0, input=tail.0 > output=http.0
Nov 15 15:44:39 vm-01 fluent-bit[806363]: [2024/11/15 15:44:39] [error] [tls] error: unexpected EOF
Nov 15 15:44:39 vm-01 fluent-bit[806363]: [2024/11/15 15:44:39] [error] [output:http:http.0] no upstream connections available to seq.mydomain.com:443
Nov 15 15:44:39 vm-01 fluent-bit[806363]: [2024/11/15 15:44:39] [ warn] [engine] failed to flush chunk '806363-1731685478.675660998.flb', retry in 8 seconds: task_id=0, input=tail.0 > output=http.0 (out_>
Nov 15 15:44:47 vm-01 fluent-bit[806363]: [2024/11/15 15:44:47] [error] [tls] error: unexpected EOF
Nov 15 15:44:47 vm-01 fluent-bit[806363]: [2024/11/15 15:44:47] [error] [output:http:http.0] no upstream connections available to seq.mydomain.com:443
Nov 15 15:44:47 vm-01 fluent-bit[806363]: [2024/11/15 15:44:47] [error] [engine] chunk '806363-1731685478.675660998.flb' cannot be retried: task_id=0, input=tail.0 > output=http.0

@kyleli666

kyleli666 commented Nov 16, 2024

Same issue with opensearch output, and tls.debug 4 somehow does not give more output with opensearch output. No issue in 3.1.10.

[2024/11/16 14:27:13] [error] [tls] error: unexpected EOF
[2024/11/16 14:27:13] [error] [tls] error: unexpected EOF
[2024/11/16 14:27:13] [ info] [engine] chunk '1-1731767232.345829558.flb' is not retried (no retry config): task_id=3, input=tail.0 > output=opensearch.0 (out_id=0)
[2024/11/16 14:27:13] [ info] [engine] chunk '1-1731767232.328776521.flb' is not retried (no retry config): task_id=1, input=tail.0 > output=opensearch.0 (out_id=0)
[2024/11/16 14:27:13] [error] [tls] error: unexpected EOF
[2024/11/16 14:27:13] [error] [tls] error: unexpected EOF
[2024/11/16 14:27:13] [ info] [engine] chunk '1-1731767233.87877619.flb' is not retried (no retry config): task_id=9, input=tail.0 > output=opensearch.0 (out_id=0)
[2024/11/16 14:27:13] [ info] [engine] chunk '1-1731767233.78788597.flb' is not retried (no retry config): task_id=8, input=tail.0 > output=opensearch.0 (out_id=0)
[2024/11/16 14:27:13] [error] [tls] error: unexpected EOF
[2024/11/16 14:27:13] [error] [tls] error: unexpected EOF
[2024/11/16 14:27:13] [ info] [engine] chunk '1-1731767232.379334048.flb' is not retried (no retry config): task_id=7, input=tail.0 > output=opensearch.0 (out_id=0)
[2024/11/16 14:27:13] [ info] [engine] chunk '1-1731767233.95803594.flb' is not retried (no retry config): task_id=10, input=tail.0 > output=opensearch.0 (out_id=0)
[2024/11/16 14:27:13] [error] [tls] error: unexpected EOF
    [OUTPUT]
        Name                opensearch
        Match               es.*
        Host                search-xxxxx.xxxxxx.es.amazonaws.com
        Port                443
        tls                 On
        tls.debug           4
        AWS_Auth            On
        AWS_Region          ap-southeast-1
        Retry_Limit         1
        Include_Tag_Key     On
        Tag_Key             @log_name
        Logstash_Format     On
        Logstash_DateFormat %Y%m%d.%H
        Logstash_Prefix     xxxxx.fluent
        Suppress_Type_Name  On
        Workers             1
        Compress            gzip

@patrick-stephens
Contributor

https://github.com/fluent/fluent-bit/releases/tag/v3.2.1 should fix this I believe - all praise to @leonardo-albertovich

@patrick-stephens patrick-stephens added the waiting-for-user Waiting for more information, tests or requested changes label Nov 18, 2024
@ElectricWeasel

I just upgraded to 3.2.1 - unfortunately the problem persists. I do not have precise debug output, but it looks similar to the previous reports...

Nov 18 15:14:23 petra7.xxxx fluent-bit[2061988]: [2024/11/18 15:14:23] [error] [engine] chunk '2061988-1731939252.70591213.flb' cannot be retried: task_id=0, input=node_exporter_metrics.1 > output=opentelemetry.1
Nov 18 15:14:27 petra7.xxxx fluent-bit[2061988]: [2024/11/18 15:14:27] [error] [output:opentelemetry:opentelemetry.1] otel.xxxx:443, HTTP status=0
Nov 18 15:14:27 petra7.xxxx fluent-bit[2061988]: [2024/11/18 15:14:27] [ warn] [engine] failed to flush chunk '2061988-1731939267.70938234.flb', retry in 11 seconds: task_id=0, input=node_exporter_metrics.1 > output=opentelemetry.1 (out_id=1)

@kyleli666

Version 3.2.1 solves my issue with opensearch. Thanks @patrick-stephens @leonardo-albertovich and also @USBAkimbo ~

@USBAkimbo
Author

I can confirm that 3.2.1 has resolved my issues

Many thanks!

Nov 18 14:38:15 h-we-vm-01 fluent-bit[105730]: ______ _                  _    ______ _ _           _____  _____
Nov 18 14:38:15 h-we-vm-01 fluent-bit[105730]: |  ___| |                | |   | ___ (_) |         |____ |/ __  \
Nov 18 14:38:15 h-we-vm-01 fluent-bit[105730]: | |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   __   / /`' / /'
Nov 18 14:38:15 h-we-vm-01 fluent-bit[105730]: |  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / /   \ \  / /
Nov 18 14:38:15 h-we-vm-01 fluent-bit[105730]: | |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V /.___/ /./ /___
Nov 18 14:38:15 h-we-vm-01 fluent-bit[105730]: \_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/ \____(_)_____/
Nov 18 14:38:15 h-we-vm-01 fluent-bit[105730]: [2024/11/18 14:38:15] [ info] [fluent bit] version=3.2.1, commit=, pid=105730
Nov 18 14:38:15 h-we-vm-01 fluent-bit[105730]: [2024/11/18 14:38:15] [ info] [storage] ver=1.5.2, type=memory, sync=normal, checksum=off, max_chunks_up=128
Nov 18 14:38:15 h-we-vm-01 fluent-bit[105730]: [2024/11/18 14:38:15] [ info] [simd    ] disabled
Nov 18 14:38:15 h-we-vm-01 fluent-bit[105730]: [2024/11/18 14:38:15] [ info] [cmetrics] version=0.9.9
Nov 18 14:38:15 h-we-vm-01 fluent-bit[105730]: [2024/11/18 14:38:15] [ info] [ctraces ] version=0.5.7
Nov 18 14:38:15 h-we-vm-01 fluent-bit[105730]: [2024/11/18 14:38:15] [ info] [input:tail:tail.0] initializing
Nov 18 14:38:15 h-we-vm-01 fluent-bit[105730]: [2024/11/18 14:38:15] [ info] [input:tail:tail.0] storage_strategy='memory' (memory only)
Nov 18 14:38:15 h-we-vm-01 fluent-bit[105730]: [2024/11/18 14:38:15] [error] [input:tail:tail.0] parser 'simple' is not registered
Nov 18 14:38:15 h-we-vm-01 fluent-bit[105730]: [2024/11/18 14:38:15] [ info] [sp] stream processor started
Nov 18 14:38:15 h-we-vm-01 fluent-bit[105730]: [2024/11/18 14:38:15] [ info] [input:tail:tail.0] inotify_fs_add(): inode=6987 watch_fd=1 name=/var/log/alternatives.log
Nov 18 14:38:15 h-we-vm-01 fluent-bit[105730]: [2024/11/18 14:38:15] [ info] [input:tail:tail.0] inotify_fs_add(): inode=13510 watch_fd=2 name=/var/log/auth.log
Nov 18 14:38:15 h-we-vm-01 fluent-bit[105730]: [2024/11/18 14:38:15] [ info] [input:tail:tail.0] inotify_fs_add(): inode=66744 watch_fd=3 name=/var/log/cloud-init-output.log
Nov 18 14:38:15 h-we-vm-01 fluent-bit[105730]: [2024/11/18 14:38:15] [ info] [input:tail:tail.0] inotify_fs_add(): inode=66742 watch_fd=4 name=/var/log/cloud-init.log
Nov 18 14:38:15 h-we-vm-01 fluent-bit[105730]: [2024/11/18 14:38:15] [ info] [input:tail:tail.0] inotify_fs_add(): inode=13414 watch_fd=5 name=/var/log/dpkg.log
Nov 18 14:38:15 h-we-vm-01 fluent-bit[105730]: [2024/11/18 14:38:15] [ info] [input:tail:tail.0] inotify_fs_add(): inode=13481 watch_fd=6 name=/var/log/kern.log
Nov 18 14:38:15 h-we-vm-01 fluent-bit[105730]: [2024/11/18 14:38:15] [ info] [input:tail:tail.0] inotify_fs_add(): inode=14255 watch_fd=7 name=/var/log/ubuntu-advantage.log
Nov 18 14:38:15 h-we-vm-01 fluent-bit[105730]: [2024/11/18 14:38:15] [ info] [input:tail:tail.0] inotify_fs_add(): inode=66823 watch_fd=8 name=/var/log/waagent.log
Nov 18 14:38:15 h-we-vm-01 fluent-bit[105730]: [2024/11/18 14:38:15] [ info] [input:tail:tail.0] inotify_fs_add(): inode=8106 watch_fd=9 name=/var/log/apt/history.log
Nov 18 14:38:15 h-we-vm-01 fluent-bit[105730]: [2024/11/18 14:38:15] [ info] [input:tail:tail.0] inotify_fs_add(): inode=13312 watch_fd=10 name=/var/log/apt/term.log
Nov 18 14:38:15 h-we-vm-01 fluent-bit[105730]: [2024/11/18 14:38:15] [ info] [input:tail:tail.0] inotify_fs_add(): inode=1583508 watch_fd=11 name=/var/log/audit/audit.log
Nov 18 14:38:15 h-we-vm-01 fluent-bit[105730]: [2024/11/18 14:38:15] [ info] [input:tail:tail.0] inotify_fs_add(): inode=66884 watch_fd=12 name=/var/log/landscape/sysinfo.log
Nov 18 14:38:15 h-we-vm-01 fluent-bit[105730]: [2024/11/18 14:38:15] [ info] [input:tail:tail.0] inotify_fs_add(): inode=6545 watch_fd=13 name=/var/log/unattended-upgrades/unattended-upgrades-dpkg.log
Nov 18 14:38:15 h-we-vm-01 fluent-bit[105730]: [2024/11/18 14:38:15] [ info] [input:tail:tail.0] inotify_fs_add(): inode=6894 watch_fd=14 name=/var/log/unattended-upgrades/unattended-upgrades-shutdown.log
Nov 18 14:38:15 h-we-vm-01 fluent-bit[105730]: [2024/11/18 14:38:15] [ info] [input:tail:tail.0] inotify_fs_add(): inode=13418 watch_fd=15 name=/var/log/unattended-upgrades/unattended-upgrades.log
Nov 18 14:38:15 h-we-vm-01 fluent-bit[105730]: [2024/11/18 14:38:15] [ info] [output:http:http.0] worker #0 started
Nov 18 14:38:15 h-we-vm-01 fluent-bit[105730]: [2024/11/18 14:38:15] [ info] [output:http:http.0] worker #1 started
Nov 18 14:38:15 h-we-vm-01 fluent-bit[105730]: [2024/11/18 14:38:15] [ info] [input:tail:tail.0] inotify_fs_add(): inode=774150 watch_fd=16 name=/var/log/zabbix/zabbix_agent2.log
Nov 18 14:38:16 h-we-vm-01 fluent-bit[105730]: [2024/11/18 14:38:16] [ info] [output:http:http.0] seq.domain:443, HTTP status=201
Nov 18 14:38:17 h-we-vm-01 fluent-bit[105730]: [2024/11/18 14:38:17] [ info] [output:http:http.0] seq.domain:443, HTTP status=201
Nov 18 14:38:18 h-we-vm-01 fluent-bit[105730]: [2024/11/18 14:38:18] [ info] [output:http:http.0] seq.domain:443, HTTP status=201
Nov 18 14:38:20 h-we-vm-01 fluent-bit[105730]: [2024/11/18 14:38:20] [ info] [output:http:http.0] seq.domain:443, HTTP status=201

@patrick-stephens
Contributor

patrick-stephens commented Nov 18, 2024

@ElectricWeasel Can you double check it's definitely 3.2.1 in the log and if so send over the logs?

@ElectricWeasel

Unfortunately yes, I double-checked and it's from v3.2.1.

Nov 18 15:57:36 petra7.xxxx.xxx fluent-bit[2074062]: Fluent Bit v3.2.1
Nov 18 15:57:36 petra7.xxxx.xxx fluent-bit[2074062]: * Copyright (C) 2015-2024 The Fluent Bit Authors
Nov 18 15:57:36 petra7.xxxx.xxx fluent-bit[2074062]: * Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
Nov 18 15:57:36 petra7.xxxx.xxx fluent-bit[2074062]: * https://fluentbit.io
Nov 18 15:57:36 petra7.xxxx.xxx fluent-bit[2074062]: ______ _                  _    ______ _ _           _____  _____
Nov 18 15:57:36 petra7.xxxx.xxx fluent-bit[2074062]: |  ___| |                | |   | ___ (_) |         |____ |/ __  \
Nov 18 15:57:36 petra7.xxxx.xxx fluent-bit[2074062]: | |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   __   / /`' / /'
Nov 18 15:57:36 petra7.xxxx.xxx fluent-bit[2074062]: |  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / /   \ \  / /
Nov 18 15:57:36 petra7.xxxx.xxx fluent-bit[2074062]: | |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V /.___/ /./ /___
Nov 18 15:57:36 petra7.xxxx.xxx fluent-bit[2074062]: \_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/ \____(_)_____/
Nov 18 15:57:36 petra7.xxxx.xxx fluent-bit[2074062]: [2024/11/18 15:57:36] [ info] [fluent bit] version=3.2.1, commit=, pid=2074062
Nov 18 15:57:36 petra7.xxxx.xxx fluent-bit[2074062]: [2024/11/18 15:57:36] [ info] [storage] ver=1.5.2, type=memory+filesystem, sync=normal, checksum=off, max_chunks_up=128
Nov 18 15:57:36 petra7.xxxx.xxx fluent-bit[2074062]: [2024/11/18 15:57:36] [ info] [storage] backlog input plugin: storage_backlog.3
Nov 18 15:57:36 petra7.xxxx.xxx fluent-bit[2074062]: [2024/11/18 15:57:36] [ info] [simd    ] disabled
Nov 18 15:57:36 petra7.xxxx.xxx fluent-bit[2074062]: [2024/11/18 15:57:36] [ info] [cmetrics] version=0.9.9
Nov 18 15:57:36 petra7.xxxx.xxx fluent-bit[2074062]: [2024/11/18 15:57:36] [ info] [ctraces ] version=0.5.7
Nov 18 15:57:36 petra7.xxxx.xxx fluent-bit[2074062]: [2024/11/18 15:57:36] [ info] [input:systemd:systemd.0] initializing
Nov 18 15:57:36 petra7.xxxx.xxx fluent-bit[2074062]: [2024/11/18 15:57:36] [ info] [input:systemd:systemd.0] storage_strategy='filesystem' (memory + filesystem)
Nov 18 15:57:36 petra7.xxxx.xxx fluent-bit[2074062]: [2024/11/18 15:57:36] [ info] [input:systemd:systemd.0] seek_cursor=s=b923fc5084f94faeb06a2ef99a188c9c;i=857... OK
Nov 18 15:57:36 petra7.xxxx.xxx fluent-bit[2074062]: [2024/11/18 15:57:36] [ info] [input:node_exporter_metrics:node_exporter_metrics.1] initializing
Nov 18 15:57:36 petra7.xxxx.xxx fluent-bit[2074062]: [2024/11/18 15:57:36] [ info] [input:node_exporter_metrics:node_exporter_metrics.1] storage_strategy='memory' (memory only)
Nov 18 15:57:36 petra7.xxxx.xxx fluent-bit[2074062]: [2024/11/18 15:57:36] [ info] [input:node_exporter_metrics:node_exporter_metrics.1] path.procfs = /proc
Nov 18 15:57:36 petra7.xxxx.xxx fluent-bit[2074062]: [2024/11/18 15:57:36] [ info] [input:node_exporter_metrics:node_exporter_metrics.1] path.sysfs  = /sys
Nov 18 15:57:36 petra7.xxxx.xxx fluent-bit[2074062]: [2024/11/18 15:57:36] [ info] [input:node_exporter_metrics:node_exporter_metrics.1] thread instance initialized
Nov 18 15:57:36 petra7.xxxx.xxx fluent-bit[2074062]: [2024/11/18 15:57:36] [ info] [input:forward:forward.2] initializing
Nov 18 15:57:36 petra7.xxxx.xxx fluent-bit[2074062]: [2024/11/18 15:57:36] [ info] [input:forward:forward.2] storage_strategy='filesystem' (memory + filesystem)
Nov 18 15:57:36 petra7.xxxx.xxx fluent-bit[2074062]: [2024/11/18 15:57:36] [ info] [input:forward:forward.2] listening on unix:///run/fluentd-forward.sock
Nov 18 15:57:36 petra7.xxxx.xxx fluent-bit[2074062]: [2024/11/18 15:57:36] [ info] [input:storage_backlog:storage_backlog.3] initializing
Nov 18 15:57:36 petra7.xxxx.xxx fluent-bit[2074062]: [2024/11/18 15:57:36] [ info] [input:storage_backlog:storage_backlog.3] storage_strategy='memory' (memory only)
Nov 18 15:57:36 petra7.xxxx.xxx fluent-bit[2074062]: [2024/11/18 15:57:36] [ info] [input:storage_backlog:storage_backlog.3] queue memory limit: 95.4M
Nov 18 15:57:36 petra7.xxxx.xxx fluent-bit[2074062]: [2024/11/18 15:57:36] [ info] [sp] stream processor started
Nov 18 15:57:52 petra7.xxxx.xxx fluent-bit[2074062]: [2024/11/18 15:57:52] [error] [output:opentelemetry:opentelemetry.1] otel.xxxx.xxx:443, HTTP status=0
Nov 18 15:57:52 petra7.xxxx.xxx fluent-bit[2074062]: [2024/11/18 15:57:52] [ warn] [engine] failed to flush chunk '2074062-1731941872.70900888.flb', retry in 6 seconds: task_id=0, input=node_exporter_metrics.1 > output=opentelemetry.1 (out_id=1)
Nov 18 15:57:59 petra7.xxxx.xxx fluent-bit[2074062]: [2024/11/18 15:57:59] [error] [engine] chunk '2074062-1731941872.70900888.flb' cannot be retried: task_id=0, input=node_exporter_metrics.1 > output=opentelemetry.1

@patrick-stephens
Contributor

Looks like there is a duplicate for the OTEL specifically on #9613 so going to close this and move the OTEL issues over to that.
