input: Add ingestion_paused metrics to confirm whether an input plugin is paused or not #8044

Merged

Conversation

cosmo0920
Contributor

@cosmo0920 cosmo0920 commented Oct 16, 2023

To let users observe whether input plugins are paused, this PR adds an ingestion_paused metric (initially named any_overlimit) that indicates whether an input is paused.


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

  • Example configuration file for the change
$ bin/fluent-bit -i dummy -o stdout -Y -H -P 2021
  • Debug log output from testing the change
Fluent Bit v2.2.0
* Copyright (C) 2015-2023 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2023/10/16 15:18:06] [ info] Configuration:
[2023/10/16 15:18:06] [ info]  flush time     | 1.000000 seconds
[2023/10/16 15:18:06] [ info]  grace          | 5 seconds
[2023/10/16 15:18:06] [ info]  daemon         | 0
[2023/10/16 15:18:06] [ info] ___________
[2023/10/16 15:18:06] [ info]  inputs:
[2023/10/16 15:18:06] [ info]      dummy
[2023/10/16 15:18:06] [ info] ___________
[2023/10/16 15:18:06] [ info]  filters:
[2023/10/16 15:18:06] [ info] ___________
[2023/10/16 15:18:06] [ info]  outputs:
[2023/10/16 15:18:06] [ info]      stdout.0
[2023/10/16 15:18:06] [ info] ___________
[2023/10/16 15:18:06] [ info]  collectors:
[2023/10/16 15:18:06] [ info] [fluent bit] version=2.2.0, commit=8401076f11, pid=79753
[2023/10/16 15:18:06] [debug] [engine] coroutine stack size: 36864 bytes (36.0K)
[2023/10/16 15:18:06] [ info] [storage] ver=1.2.0, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2023/10/16 15:18:06] [ info] [cmetrics] version=0.6.3
[2023/10/16 15:18:06] [ info] [ctraces ] version=0.3.1
[2023/10/16 15:18:06] [ info] [input:dummy:dummy.0] initializing
[2023/10/16 15:18:06] [ info] [input:dummy:dummy.0] storage_strategy='memory' (memory only)
[2023/10/16 15:18:06] [debug] [dummy:dummy.0] created event channels: read=21 write=22
[2023/10/16 15:18:06] [debug] [stdout:stdout.0] created event channels: read=23 write=24
[2023/10/16 15:18:06] [ info] [output:stdout:stdout.0] worker #0 started
[2023/10/16 15:18:06] [ info] [http_server] listen iface=0.0.0.0 tcp_port=2021
[2023/10/16 15:18:06] [ info] [sp] stream processor started
[2023/10/16 15:18:07] [debug] [input chunk] update output instances with new chunk size diff=36, records=1, input=dummy.0
[2023/10/16 15:18:08] [debug] [task] created task=0x109c0adb0 id=0 OK
[2023/10/16 15:18:08] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0
[0] dummy.0: [[1697437087.743990000, {}], {"message"=>"dummy"}]
[2023/10/16 15:18:08] [debug] [out flush] cb_destroy coro_id=0
[2023/10/16 15:18:08] [debug] [input chunk] update output instances with new chunk size diff=36, records=1, input=dummy.0
[2023/10/16 15:18:08] [debug] [task] destroy task=0x109c0adb0 (task_id=0)
[2023/10/16 15:18:09] [debug] [task] created task=0x109c0acc0 id=0 OK
[2023/10/16 15:18:09] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0
[0] dummy.0: [[1697437088.743829000, {}], {"message"=>"dummy"}]
[2023/10/16 15:18:09] [debug] [out flush] cb_destroy coro_id=1
[2023/10/16 15:18:09] [debug] [input chunk] update output instances with new chunk size diff=36, records=1, input=dummy.0
[2023/10/16 15:18:09] [debug] [task] destroy task=0x109c0acc0 (task_id=0)
[2023/10/16 15:18:10] [debug] [task] created task=0x109c0abd0 id=0 OK
[2023/10/16 15:18:10] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0
[0] dummy.0: [[1697437089.740780000, {}], {"message"=>"dummy"}]
[2023/10/16 15:18:10] [debug] [out flush] cb_destroy coro_id=2
[2023/10/16 15:18:10] [debug] [input chunk] update output instances with new chunk size diff=36, records=1, input=dummy.0
[2023/10/16 15:18:10] [debug] [task] destroy task=0x109c0abd0 (task_id=0)
[2023/10/16 15:18:11] [debug] [task] created task=0x109c0aae0 id=0 OK
[2023/10/16 15:18:11] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0
[0] dummy.0: [[1697437090.744696000, {}], {"message"=>"dummy"}]
[2023/10/16 15:18:11] [debug] [out flush] cb_destroy coro_id=3
[2023/10/16 15:18:11] [debug] [task] destroy task=0x109c0aae0 (task_id=0)
[2023/10/16 15:18:11] [debug] [input chunk] update output instances with new chunk size diff=36, records=1, input=dummy.0
[2023/10/16 15:18:12] [debug] [task] created task=0x109c0a9f0 id=0 OK
[2023/10/16 15:18:12] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0
[0] dummy.0: [[1697437091.743928000, {}], {"message"=>"dummy"}]
[2023/10/16 15:18:12] [debug] [out flush] cb_destroy coro_id=4
[2023/10/16 15:18:12] [debug] [input chunk] update output instances with new chunk size diff=36, records=1, input=dummy.0
[2023/10/16 15:18:12] [debug] [task] destroy task=0x109c0a9f0 (task_id=0)
^C[2023/10/16 15:18:13] [engine] caught signal (SIGINT)
[2023/10/16 15:18:13] [ info] [input] pausing dummy.0
[2023/10/16 15:18:13] [ info] [output:stdout:stdout.0] thread worker #0 stopping...
[2023/10/16 15:18:13] [ info] [output:stdout:stdout.0] thread worker #0 stopped

And from another terminal:

$ curl localhost:2021/api/v2/metrics
2023-10-17T05:36:39.133296767Z fluentbit_uptime{hostname="Hiroshi-no-MacBook-Air-M2.local"} = 1
2023-10-17T05:36:38.127135918Z fluentbit_input_bytes_total{name="dummy.0"} = 0
2023-10-17T05:36:38.127135918Z fluentbit_input_records_total{name="dummy.0"} = 0
2023-10-17T05:36:38.128055678Z fluentbit_output_proc_records_total{name="stdout.0"} = 0
2023-10-17T05:36:38.128055678Z fluentbit_output_proc_bytes_total{name="stdout.0"} = 0
2023-10-17T05:36:38.128055678Z fluentbit_output_errors_total{name="stdout.0"} = 0
2023-10-17T05:36:38.128055678Z fluentbit_output_retries_total{name="stdout.0"} = 0
2023-10-17T05:36:38.128055678Z fluentbit_output_retries_failed_total{name="stdout.0"} = 0
2023-10-17T05:36:38.128055678Z fluentbit_output_dropped_records_total{name="stdout.0"} = 0
2023-10-17T05:36:38.128055678Z fluentbit_output_retried_records_total{name="stdout.0"} = 0
2023-10-17T05:36:39.133296767Z fluentbit_process_start_time_seconds{hostname="Hiroshi-no-MacBook-Air-M2.local"} = 1697520998
2023-10-17T05:36:39.133296767Z fluentbit_build_info{hostname="Hiroshi-no-MacBook-Air-M2.local",version="2.2.0",os="macos"} = 1697520998
2023-10-17T05:36:39.133296767Z fluentbit_hot_reloaded_times{hostname="Hiroshi-no-MacBook-Air-M2.local"} = 0
2023-10-17T05:36:39.133506269Z fluentbit_storage_chunks = 0
2023-10-17T05:36:39.133506269Z fluentbit_storage_mem_chunks = 0
2023-10-17T05:36:39.133506269Z fluentbit_storage_fs_chunks = 0
2023-10-17T05:36:39.133506269Z fluentbit_storage_fs_chunks_up = 0
2023-10-17T05:36:39.133506269Z fluentbit_storage_fs_chunks_down = 0
2023-10-17T05:36:38.127135918Z fluentbit_input_ingestion_paused{name="dummy.0"} = 0
2023-10-17T05:36:38.127135918Z fluentbit_input_storage_overlimit{name="dummy.0"} = 0
2023-10-17T05:36:38.127135918Z fluentbit_input_storage_memory_bytes{name="dummy.0"} = 0
2023-10-17T05:36:38.127135918Z fluentbit_input_storage_chunks{name="dummy.0"} = 0
2023-10-17T05:36:38.127135918Z fluentbit_input_storage_chunks_up{name="dummy.0"} = 0
2023-10-17T05:36:38.127135918Z fluentbit_input_storage_chunks_down{name="dummy.0"} = 0
2023-10-17T05:36:38.127135918Z fluentbit_input_storage_chunks_busy{name="dummy.0"} = 0
2023-10-17T05:36:38.127135918Z fluentbit_input_storage_chunks_busy_bytes{name="dummy.0"} = 0
2023-10-17T05:36:38.128055678Z fluentbit_output_upstream_total_connections{name="stdout.0"} = 0
2023-10-17T05:36:38.128055678Z fluentbit_output_upstream_busy_connections{name="stdout.0"} = 0
  • Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

@cosmo0920 cosmo0920 temporarily deployed to pr October 16, 2023 06:26 — with GitHub Actions Inactive
@cosmo0920 cosmo0920 temporarily deployed to pr October 16, 2023 06:57 — with GitHub Actions Inactive
@cosmo0920 cosmo0920 force-pushed the cosmo0920-add-metrics-whether-input-plugin-is-paused-or-not branch from 5e51d60 to 9baf76f Compare October 16, 2023 07:59
@cosmo0920 cosmo0920 temporarily deployed to pr October 16, 2023 07:59 — with GitHub Actions Inactive
@cosmo0920 cosmo0920 force-pushed the cosmo0920-add-metrics-whether-input-plugin-is-paused-or-not branch from 9baf76f to 9056ecf Compare October 16, 2023 08:01
@cosmo0920 cosmo0920 temporarily deployed to pr October 16, 2023 08:01 — with GitHub Actions Inactive
@cosmo0920 cosmo0920 marked this pull request as ready for review October 16, 2023 08:22
@cosmo0920 cosmo0920 temporarily deployed to pr October 16, 2023 08:29 — with GitHub Actions Inactive
@edsiper
Member

edsiper commented Oct 16, 2023

Thanks for opening this PR.

I think the name any_overlimit does not clearly answer the question "is the input plugin paused?". I would suggest changing it to something that represents the status, e.g. ingestion_paused.

@cosmo0920
Contributor Author

Sounds reasonable. I'll change the metric name to a more suitable one.

@cosmo0920 cosmo0920 temporarily deployed to pr October 17, 2023 05:36 — with GitHub Actions Inactive
@cosmo0920 cosmo0920 changed the title from "input: Add any overlimit metrics to confirm whether an input plugin is paused or not" to "input: Add ingestion_paused metrics to confirm whether an input plugin is paused or not" Oct 17, 2023
@cosmo0920 cosmo0920 temporarily deployed to pr October 17, 2023 06:08 — with GitHub Actions Inactive
@edsiper edsiper added this to the Fluent Bit v2.2.0 milestone Oct 17, 2023
@edsiper edsiper merged commit 92b9053 into master Oct 17, 2023
44 of 45 checks passed
@edsiper edsiper deleted the cosmo0920-add-metrics-whether-input-plugin-is-paused-or-not branch October 17, 2023 16:18
leonardo-albertovich pushed a commit that referenced this pull request Nov 3, 2023
@yeya24

yeya24 commented Feb 16, 2024

Hi, may I know if this metric is also exposed in the Prometheus metrics, or is it only available via the api/v2/metrics API?
I think it would be very useful to have it exposed in Prometheus format.

@cosmo0920
Contributor Author

> Hi, may I know if this metric is also exposed in the Prometheus metrics, or is it only available via the api/v2/metrics API?
> I think it would be very useful to have it exposed in Prometheus format.

Hi, this is only available via the api/v2/metrics API.
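
Since the metric is reported through api/v2/metrics, a script can read the pause state from that endpoint's text output. The following is a minimal sketch, not part of this PR. It assumes the line format shown in the curl output earlier in this thread, and it assumes that a non-zero value means the input is paused (the sample above only shows 0 for a running input). The names `paused_inputs` and `fetch_metrics` are hypothetical helpers introduced for illustration.

```python
import re
from urllib.request import urlopen

# Matches lines such as:
#   2023-10-17T05:36:38.127135918Z fluentbit_input_ingestion_paused{name="dummy.0"} = 0
# The format is assumed from the curl output shown in this thread.
LINE_RE = re.compile(
    r'^\S+\s+fluentbit_input_ingestion_paused\{name="([^"]+)"\}\s*=\s*(\S+)$'
)


def paused_inputs(metrics_text):
    """Map each input instance name to True (paused) or False (running).

    Assumption: any non-zero metric value means the input is paused.
    """
    result = {}
    for line in metrics_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            result[match.group(1)] = float(match.group(2)) != 0
    return result


def fetch_metrics(url="http://localhost:2021/api/v2/metrics"):
    """Fetch the metrics text; the port matches the -P 2021 flag used above."""
    with urlopen(url) as resp:
        return resp.read().decode("utf-8")
```

With Fluent Bit running as in the test command above, `paused_inputs(fetch_metrics())` would return a dictionary keyed by input instance name, such as `dummy.0`.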
