Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

grep: add common grep APIs #7336

Open
wants to merge 6 commits into
base: master
Choose a base branch
from
Open

grep: add common grep APIs #7336

wants to merge 6 commits into from

Conversation

nokute78
Copy link
Collaborator

@nokute78 nokute78 commented May 7, 2023

This patch is to add a common API to grep.

Currently filter_grep and filter_log_to_metrics implement grep individually.
This patch adds src/flb_grep.c and filter_grep/filter_log_to_metrics use the API.


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change; please submit the following in a comment:

  • [N/A] Example configuration file for the change
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • [N/A] Run local packaging test showing all targets (including any new ones) build.
  • [N/A] Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • [N/A] Documentation required for this feature

Backporting

  • [N/A] Backport to latest stable release.

Debug/Valgrind output

flb-rt-filter_grep:

$ valgrind --leak-check=full bin/flb-rt-filter_grep
==76954== Memcheck, a memory error detector
==76954== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==76954== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==76954== Command: bin/flb-rt-filter_grep
==76954== 
Test regex...                                   [2023/05/07 20:23:40] [ info] [fluent bit] version=2.1.3, commit=f18463f53a, pid=76954
[2023/05/07 20:23:40] [ info] [storage] ver=1.2.0, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2023/05/07 20:23:40] [ info] [cmetrics] version=0.6.1
[2023/05/07 20:23:40] [ info] [ctraces ] version=0.3.0
[2023/05/07 20:23:40] [ info] [input:lib:lib.0] initializing
[2023/05/07 20:23:40] [ info] [input:lib:lib.0] storage_strategy='memory' (memory only)
[2023/05/07 20:23:40] [ info] [output:stdout:stdout.0] worker #0 started
[2023/05/07 20:23:40] [ info] [sp] stream processor started

(snip)

Test issue_5209...                              [2023/05/07 20:23:57] [ info] [fluent bit] version=2.1.3, commit=f18463f53a, pid=76954
[2023/05/07 20:23:57] [ info] [storage] ver=1.2.0, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2023/05/07 20:23:57] [ info] [cmetrics] version=0.6.1
[2023/05/07 20:23:57] [ info] [ctraces ] version=0.3.0
[2023/05/07 20:23:57] [ info] [input:lib:lib.0] initializing
[2023/05/07 20:23:57] [ info] [input:lib:lib.0] storage_strategy='memory' (memory only)
[2023/05/07 20:23:57] [ info] [sp] stream processor started
[2023/05/07 20:23:59] [ warn] [engine] service will shutdown in max 5 seconds
[2023/05/07 20:23:59] [ info] [engine] service has stopped (0 pending tasks)
[ OK ]
SUCCESS: All unit tests have passed.
==76954== 
==76954== HEAP SUMMARY:
==76954==     in use at exit: 0 bytes in 0 blocks
==76954==   total heap usage: 63,588 allocs, 63,588 frees, 117,081,880 bytes allocated
==76954== 
==76954== All heap blocks were freed -- no leaks are possible
==76954== 
==76954== For lists of detected and suppressed errors, rerun with: -s
==76954== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

flb-rt-filter_log_to_metrics:

$ valgrind --leak-check=full bin/flb-rt-filter_log_to_metrics 
==76988== Memcheck, a memory error detector
==76988== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==76988== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==76988== Command: bin/flb-rt-filter_log_to_metrics
==76988== 
Test counter_k8s...                             [ OK ]
Test counter...                                 [ OK ]
Test counter_k8s_two_tuples...                  [ OK ]
Test gauge...                                   [ OK ]
Test histogram...                               [ OK ]
Test counter_regex...                           [ OK ]
Test regex_empty_label_keys...                  [ OK ]
SUCCESS: All unit tests have passed.
==76988== 
==76988== HEAP SUMMARY:
==76988==     in use at exit: 0 bytes in 0 blocks
==76988==   total heap usage: 16,727 allocs, 16,727 frees, 9,858,669 bytes allocated
==76988== 
==76988== All heap blocks were freed -- no leaks are possible
==76988== 
==76988== For lists of detected and suppressed errors, rerun with: -s
==76988== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

flb-it-grep:

$ valgrind --leak-check=full bin/flb-it-grep 
==77007== Memcheck, a memory error detector
==77007== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==77007== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==77007== Command: bin/flb-it-grep
==77007== 
Test create_destroy...                          [ OK ]
==77007== Warning: invalid file descriptor -1 in syscall close()
Test legacy_regex_match...                      [ OK ]
==77007== Warning: invalid file descriptor -1 in syscall close()
Test legacy_regex_unmatch...                    [ OK ]
==77007== Warning: invalid file descriptor -1 in syscall close()
Test legacy_exclude_match...                    [ OK ]
==77007== Warning: invalid file descriptor -1 in syscall close()
Test legacy_exclude_unmatch...                  [ OK ]
==77007== Warning: invalid file descriptor -1 in syscall close()
Test or_regex_match...                          [ OK ]
==77007== Warning: invalid file descriptor -1 in syscall close()
Test or_regex_unmatch...                        [ OK ]
==77007== Warning: invalid file descriptor -1 in syscall close()
Test or_exclude_match...                        [ OK ]
==77007== Warning: invalid file descriptor -1 in syscall close()
Test or_exclude_unmatch...                      [ OK ]
==77007== Warning: invalid file descriptor -1 in syscall close()
Test or_regexs_match...                         [ OK ]
==77007== Warning: invalid file descriptor -1 in syscall close()
Test or_regexs_unmatch...                       [ OK ]
==77007== Warning: invalid file descriptor -1 in syscall close()
Test or_excludes_match...                       [ OK ]
==77007== Warning: invalid file descriptor -1 in syscall close()
Test or_excludes_unmatch...                     [ OK ]
==77007== Warning: invalid file descriptor -1 in syscall close()
Test and_regex_match...                         [ OK ]
==77007== Warning: invalid file descriptor -1 in syscall close()
Test and_regex_unmatch...                       [ OK ]
==77007== Warning: invalid file descriptor -1 in syscall close()
Test and_exclude_match...                       [ OK ]
==77007== Warning: invalid file descriptor -1 in syscall close()
Test and_exclude_unmatch...                     [ OK ]
==77007== Warning: invalid file descriptor -1 in syscall close()
Test and_regexs_match...                        [ OK ]
==77007== Warning: invalid file descriptor -1 in syscall close()
Test and_regexs_unmatch...                      [ OK ]
==77007== Warning: invalid file descriptor -1 in syscall close()
Test and_excludes_match...                      [ OK ]
==77007== Warning: invalid file descriptor -1 in syscall close()
Test and_excludes_unmatch...                    [ OK ]
==77007== Warning: invalid file descriptor -1 in syscall close()
Test or_regex_exclude_rules...                  [2023/05/07 20:26:06] [error] Both 'regex' and 'exclude' are set.
[ OK ]
==77007== Warning: invalid file descriptor -1 in syscall close()
Test and_regex_exclude_rules...                 [2023/05/07 20:26:06] [error] Both 'regex' and 'exclude' are set.
[ OK ]
==77007== Warning: invalid file descriptor -1 in syscall close()
SUCCESS: All unit tests have passed.
==77007== 
==77007== HEAP SUMMARY:
==77007==     in use at exit: 0 bytes in 0 blocks
==77007==   total heap usage: 20,310 allocs, 20,310 frees, 2,538,241 bytes allocated
==77007== 
==77007== All heap blocks were freed -- no leaks are possible
==77007== 
==77007== For lists of detected and suppressed errors, rerun with: -s
==77007== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

@nokute78 nokute78 temporarily deployed to pr May 7, 2023 11:28 — with GitHub Actions Inactive
@nokute78 nokute78 temporarily deployed to pr May 7, 2023 11:28 — with GitHub Actions Inactive
@nokute78 nokute78 temporarily deployed to pr May 7, 2023 11:28 — with GitHub Actions Inactive
@nokute78 nokute78 temporarily deployed to pr May 7, 2023 11:30 — with GitHub Actions Inactive
@nokute78 nokute78 temporarily deployed to pr May 7, 2023 11:30 — with GitHub Actions Inactive
@nokute78 nokute78 temporarily deployed to pr May 7, 2023 11:30 — with GitHub Actions Inactive
@nokute78 nokute78 temporarily deployed to pr May 7, 2023 11:45 — with GitHub Actions Inactive
@nokute78 nokute78 temporarily deployed to pr May 7, 2023 11:45 — with GitHub Actions Inactive
@nokute78 nokute78 temporarily deployed to pr May 7, 2023 11:46 — with GitHub Actions Inactive
@leonardo-albertovich
Copy link
Collaborator

Hi @nokute78, please let me know when you are done with this PR so I can review it.

I was about to do it but I noticed you pushed some changes and cancelled the process.

I was going to point out some coding style changes such as adding a space between the structure name and opening bracket and keeping the bracken in the same line.

Additionally, when subscripting the first character of strings please use variable[0] rather than *variable to improve legibility.

Other than that it's a great improvement, thank you very much.

@leonardo-albertovich
Copy link
Collaborator

There are a few sds related things I forgot to mention in my previous message :

  1. If possible replace flb_sds with cfl_sds so we don't introduce more instances of flb_sds usage.
  2. There are a few points where you are creating or concatenating to sds strings where you don't check the result of the functions and that's something we need to do.
  3. Instead of using flb_sds_cat use flb_sds_cat_safe

@nokute78 nokute78 temporarily deployed to pr May 7, 2023 11:57 — with GitHub Actions Inactive
@nokute78 nokute78 temporarily deployed to pr May 7, 2023 11:58 — with GitHub Actions Inactive
@nokute78 nokute78 temporarily deployed to pr May 7, 2023 11:58 — with GitHub Actions Inactive
@nokute78 nokute78 temporarily deployed to pr May 7, 2023 12:18 — with GitHub Actions Inactive
@nokute78 nokute78 temporarily deployed to pr May 7, 2023 12:27 — with GitHub Actions Inactive
@nokute78 nokute78 temporarily deployed to pr May 7, 2023 12:27 — with GitHub Actions Inactive
@nokute78 nokute78 temporarily deployed to pr May 7, 2023 12:28 — with GitHub Actions Inactive
@nokute78 nokute78 temporarily deployed to pr May 7, 2023 12:48 — with GitHub Actions Inactive
@nokute78
Copy link
Collaborator Author

nokute78 commented May 7, 2023

@leonardo-albertovich Could you review it and add review comment to clarify line No?
I've fixed CI build error.

There are a few sds related things

These codes came from filter_grep. I just moved from the plugin code.
https://github.com/fluent/fluent-bit/blob/v2.1.2/plugins/filter_grep/grep.c#L119

I fixed to use flb_sds_cat_safe and check the result.

If possible replace flb_sds with cfl_sds so we don't introduce more instances of flb_sds usage.

What is the reason ?
Currently, cfl_sds doesn't support some APIs like flb_sds_snprintf. I think we need to support them at first.

@leonardo-albertovich
Copy link
Collaborator

I understand that the code comes from filter_grep and because of that I don't expect you to change it.

The reason why we are switching from flb_sds to cfl_sds is we are moving the core data structures to a neutral place (cfl) as an effort to deduplicate code and standardize.

Other things such as mk_list and flb_kv have already been moved and others such as atomics are in the process of being moved over as well.

In the end, the goal is to have a more homogeneous code base and a reasonable dependency graph.

@nokute78 nokute78 temporarily deployed to pr May 19, 2023 23:16 — with GitHub Actions Inactive
Copy link
Collaborator

@leonardo-albertovich leonardo-albertovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is ready to be merged @edsiper

@nokute78
Copy link
Collaborator Author

Ping.

@nokute78
Copy link
Collaborator Author

@edsiper ping

@nokute78 nokute78 temporarily deployed to pr September 16, 2023 00:00 — with GitHub Actions Inactive
@nokute78 nokute78 temporarily deployed to pr September 16, 2023 00:00 — with GitHub Actions Inactive
@nokute78 nokute78 temporarily deployed to pr September 16, 2023 00:00 — with GitHub Actions Inactive
@nokute78
Copy link
Collaborator Author

nokute78 commented Sep 16, 2023

I merged from master.

Valgrind output of flb_it_grep

$ valgrind --leak-check=full bin/flb-it-grep 
==17819== Memcheck, a memory error detector
==17819== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==17819== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==17819== Command: bin/flb-it-grep
==17819== 
Test create_destroy...                          [ OK ]
==17819== Warning: invalid file descriptor -1 in syscall close()
Test legacy_regex_match...                      [ OK ]
==17819== Warning: invalid file descriptor -1 in syscall close()
Test legacy_regex_unmatch...                    [ OK ]
==17819== Warning: invalid file descriptor -1 in syscall close()
Test legacy_exclude_match...                    [ OK ]
==17819== Warning: invalid file descriptor -1 in syscall close()
Test legacy_exclude_unmatch...                  [ OK ]
==17819== Warning: invalid file descriptor -1 in syscall close()
Test or_regex_match...                          [ OK ]
==17819== Warning: invalid file descriptor -1 in syscall close()
Test or_regex_unmatch...                        [ OK ]
==17819== Warning: invalid file descriptor -1 in syscall close()
Test or_exclude_match...                        [ OK ]
==17819== Warning: invalid file descriptor -1 in syscall close()
Test or_exclude_unmatch...                      [ OK ]
==17819== Warning: invalid file descriptor -1 in syscall close()
Test or_regexs_match...                         [ OK ]
==17819== Warning: invalid file descriptor -1 in syscall close()
Test or_regexs_unmatch...                       [ OK ]
==17819== Warning: invalid file descriptor -1 in syscall close()
Test or_excludes_match...                       [ OK ]
==17819== Warning: invalid file descriptor -1 in syscall close()
Test or_excludes_unmatch...                     [ OK ]
==17819== Warning: invalid file descriptor -1 in syscall close()
Test and_regex_match...                         [ OK ]
==17819== Warning: invalid file descriptor -1 in syscall close()
Test and_regex_unmatch...                       [ OK ]
==17819== Warning: invalid file descriptor -1 in syscall close()
Test and_exclude_match...                       [ OK ]
==17819== Warning: invalid file descriptor -1 in syscall close()
Test and_exclude_unmatch...                     [ OK ]
==17819== Warning: invalid file descriptor -1 in syscall close()
Test and_regexs_match...                        [ OK ]
==17819== Warning: invalid file descriptor -1 in syscall close()
Test and_regexs_unmatch...                      [ OK ]
==17819== Warning: invalid file descriptor -1 in syscall close()
Test and_excludes_match...                      [ OK ]
==17819== Warning: invalid file descriptor -1 in syscall close()
Test and_excludes_unmatch...                    [ OK ]
==17819== Warning: invalid file descriptor -1 in syscall close()
Test or_regex_exclude_rules...                  [2023/09/16 09:00:51] [error] Both 'regex' and 'exclude' are set.
[ OK ]
==17819== Warning: invalid file descriptor -1 in syscall close()
Test and_regex_exclude_rules...                 [2023/09/16 09:00:51] [error] Both 'regex' and 'exclude' are set.
[ OK ]
==17819== Warning: invalid file descriptor -1 in syscall close()
SUCCESS: All unit tests have passed.
==17819== 
==17819== HEAP SUMMARY:
==17819==     in use at exit: 0 bytes in 0 blocks
==17819==   total heap usage: 20,448 allocs, 20,448 frees, 2,563,288 bytes allocated
==17819== 
==17819== All heap blocks were freed -- no leaks are possible
==17819== 
==17819== For lists of detected and suppressed errors, rerun with: -s
==17819== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

@nokute78 nokute78 temporarily deployed to pr September 16, 2023 00:30 — with GitHub Actions Inactive
Copy link
Contributor

This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days.

@github-actions github-actions bot added the Stale label Dec 17, 2023
@nokute78
Copy link
Collaborator Author

nokute78 commented Feb 2, 2024

Fix conflict.

Copy link
Contributor

github-actions bot commented May 3, 2024

This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days.

@github-actions github-actions bot added the Stale label May 3, 2024
@github-actions github-actions bot removed the Stale label Aug 16, 2024
Copy link
Contributor

This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days.

@github-actions github-actions bot added the Stale label Nov 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants