replace function does not working propertly #20988

Closed
Jerrimikkihvatai opened this issue Aug 2, 2024 · 2 comments
Labels
type: bug A code related bug.

Comments

Jerrimikkihvatai commented Aug 2, 2024

A note for the community

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Problem

Hello!
I am trying to rename keys in events from the journald source, because some fields have one or more underscores as a prefix and Elasticsearch/OpenSearch treats such fields as system fields.
A typical message looks like this:

{
  "PRIORITY": "6",
  "SYSLOG_FACILITY": "0",
  "SYSLOG_IDENTIFIER": "kernel",
  "_BOOT_ID": "32ab6df56d264b1ea2ab02f1e7116e2a",
  "_KERNEL_DEVICE": "+pci:0000:0c:00.0",
  "_KERNEL_SUBSYSTEM": "pci",
  "_MACHINE_ID": "fec6cb209a734e489e0ce888d3f974fe",
  "_SOURCE_MONOTONIC_TIMESTAMP": "14276040468",
  "_TRANSPORT": "kernel",
  "_UDEV_SYSNAME": "0000:0c:00.0",
  "__MONOTONIC_TIMESTAMP": "14276284013",
  "__REALTIME_TIMESTAMP": "1722420917883538",
  "host": "host.mycompany.com",
  "message": "Just a message",
  "source_type": "journald",
  "vector_msg_fetch_time": "2024-07-31T10:15:17.883538Z"
}

I want to get this:

{
  "PRIORITY": "6",
  "SYSLOG_FACILITY": "0",
  "SYSLOG_IDENTIFIER": "kernel",
  "BOOT_ID": "32ab6df56d264b1ea2ab02f1e7116e2a",
  "KERNEL_DEVICE": "+pci:0000:0c:00.0",
  "KERNEL_SUBSYSTEM": "pci",
  "MACHINE_ID": "fec6cb209a734e489e0ce888d3f974fe",
  "SOURCE_MONOTONIC_TIMESTAMP": "14276040468",
  "TRANSPORT": "kernel",
  "UDEV_SYSNAME": "0000:0c:00.0",
  "MONOTONIC_TIMESTAMP": "14276284013",
  "REALTIME_TIMESTAMP": "1722420917883538",
  "host": "host.mycompany.com",
  "message": "Just a message",
  "source_type": "journald",
  "vector_msg_fetch_time": "2024-07-31T10:15:17.883538Z"
}

My config looks like this:

...
transforms:
  trf_journald:
    drop_on_error: true
    type: remap
    inputs:
      - src_journald
    source: |-
      . = map_keys(., recursive: true) -> |key| { 
        replace(key, r'_*(?P<field_name>[A-Z0-9]+(?:_[A-Z0-9]+)*)', "$field_name") 
      }
      .REALTIME_TIMESTAMP = from_unix_timestamp!(to_int!(.REALTIME_TIMESTAMP), unit: "microseconds")
...

I found this expression in the Vector docs:
https://vector.dev/docs/reference/vrl/functions/#replace-examples-replace-with-capture-groups
I tested this expression in Vector's VRL processor, but when I added it to my config file, Vector refused to start with this error message:

2024-08-02T08:32:51.701370Z ERROR vector::cli: Configuration error. error=Missing environment variable in config. name = "field_name"

I also tried the replace_with function, but Vector then reports that there is no match.field_name variable.

Configuration

...
transforms:
  trf_journald:
    drop_on_error: true
    type: remap
    inputs:
      - src_journald
    source: |-
      . = map_keys(., recursive: true) -> |key| { 
        replace(key, r'_*(?P<field_name>[A-Z0-9]+(?:_[A-Z0-9]+)*)', "$field_name") 
      }
      .REALTIME_TIMESTAMP = from_unix_timestamp!(to_int!(.REALTIME_TIMESTAMP), unit: "microseconds")
...

Version

Vector image 0.39/0.40-debian

Debug Output

2024-08-02T08:32:51.701370Z ERROR vector::cli: Configuration error. error=Missing environment variable in config. name = "field_name"

Example Data

No response

Additional Context

No response

References

No response

@Jerrimikkihvatai Jerrimikkihvatai added the type: bug A code related bug. label Aug 2, 2024
@Jerrimikkihvatai Jerrimikkihvatai changed the title replace function do not working propertly replace function does not working propertly Aug 2, 2024
@iFurySt
Contributor

iFurySt commented Aug 2, 2024

Hi, you need to use $$field_name instead of $field_name, like this:

# vector.toml

[sources.my_source]
  type = "file"
  include = ["/etc/vector/x.log"]

[transforms.parse_json]
  type = "remap"
  inputs = ["my_source"]
  source = '''
    . = parse_json!(string!(.message))
  '''

[transforms.remap_keys]
  type = "remap"
  inputs = ["parse_json"]
  source = '''
    . = map_keys(., recursive: true) -> |key| { replace(key, r'^_*(?P<field_name>[A-Z0-9]+(?:_[A-Z0-9]+)*)', "$$field_name") }
  '''

[sinks.my_sink]
  type = "console"
  inputs = ["remap_keys"]
  encoding.codec = "json"

Reference: replace

The pattern argument accepts regular expression capture groups. Note: in a configuration file, use $$foo instead of $foo, because $foo is interpreted as an environment variable reference when the config is loaded.
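
Applied to the YAML config from the original report, the same fix might look like the sketch below. It is only a sketch: it shows just the renaming step, assumes the rest of the config is unchanged, and borrows the `^` anchor from the TOML example above. The comment avoids writing a dollar sign followed by a name, on the assumption that interpolation runs over the raw config text, comments included.

```yaml
transforms:
  trf_journald:
    type: remap
    drop_on_error: true
    inputs:
      - src_journald
    source: |-
      # The doubled dollar sign escapes Vector's environment variable
      # interpolation, so VRL receives a single-dollar capture group reference.
      . = map_keys(., recursive: true) -> |key| {
        replace(key, r'^_*(?P<field_name>[A-Z0-9]+(?:_[A-Z0-9]+)*)', "$$field_name")
      }
```

With this pattern, keys such as `_BOOT_ID` and `__REALTIME_TIMESTAMP` should lose their leading underscores, while lowercase keys such as `host` and `message` do not match and stay as they are.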

@jszwedko
Member

jszwedko commented Aug 2, 2024

Thanks for responding to this question @iFurySt and opening the PR to update the docs. I'll close this out, but let me know if you have more questions @Jerrimikkihvatai .

@jszwedko jszwedko closed this as not planned Aug 2, 2024