Filestream returns error when an input with duplicated ID is created #41954

Merged

Conversation


@AndersonQ AndersonQ commented Dec 9, 2024

Proposed commit message

The Filestream input now enforces the uniqueness of its ID at startup. When the manager creates an input, it checks whether another input with the same ID has already been created; if so, it logs the duplicated input configuration and returns an error.

An empty Filestream input ID is a valid ID and is treated like any other, so if there are two or more Filestream inputs without an ID, only the first one to be created will run and the others will error.

Filestream inputs now accept allow_deprecated_id_duplication: true to keep the previous behaviour of running multiple inputs with the same ID.
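
For illustration, a minimal sketch of what opting out might look like (the IDs and paths are hypothetical; the PR does not spell out whether the flag must be set on every duplicated input or only on the later ones, so the sketch sets it on both):

    - type: filestream
      id: shared-id                           # intentionally duplicated ID
      allow_deprecated_id_duplication: true   # keep the old behaviour of running both inputs
      paths:
        - /var/log/app-a/*.log
    - type: filestream
      id: shared-id
      allow_deprecated_id_duplication: true
      paths:
        - /var/log/app-b/*.log

As the logs later in this PR show, inputs running under this flag still log an error and have metrics collection disabled.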

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Disruptive User Impact

This is a breaking change: if there are two or more Filestream inputs with the same ID, only the first one to be created will run; all others will fail on creation.
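
For illustration, the straightforward remediation is to give each input its own ID; a sketch with hypothetical IDs and paths:

    - type: filestream
      id: app-a-logs        # unique per input
      paths:
        - /var/log/app-a/*.log
    - type: filestream
      id: app-b-logs        # unique per input
      paths:
        - /var/log/app-b/*.log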

Author's Checklist

  • Test with Elastic-Agent
  • Add allow_deprecated_id_duplication and tests for it.

How to test this PR locally

Testing input reload

  1. Create the following filebeat.yml and inputs.d/filestream.yml

    filebeat.yml

    filebeat.config.inputs:
      enabled: true
      path: inputs.d/*.yml
      reload.enabled: true
      reload.period: 1s
    
    output.discard.enabled: true
    
    logging:
      level: info
      metrics:
        enabled: false

    inputs.d/filestream.yml

    - type: filestream
      id: duplicated-id
      enabled: true
      paths:
        - /tmp/duplicated-id-1/*.log
    - type: filestream
      id: duplicated-id
      enabled: true
      paths:
        - /tmp/duplicated-id-2/*.log

  2. Build filebeat from this PR

  3. Start Filebeat

  4. Look at the logs; you should see the following errors:

    error logs

    {
      "log.level": "error",
      "@timestamp": "2024-12-16T18:23:03.539-0500",
      "log.logger": "input",
      "log.origin": {
        "function": "github.com/elastic/beats/v7/filebeat/input/filestream/internal/input-logfile.(*InputManager).Create",
        "file.name": "input-logfile/manager.go",
        "file.line": 180
      },
      "message": "filestream input 'duplicated-id' is duplicated: input will NOT start",
      "service.name": "filebeat",
      "input.cfg": "{\n  \"enabled\": true,\n  \"id\": \"duplicated-id\",\n  \"paths\": [\n    \"/tmp/duplicated-id-2/*.log\"\n  ],\n  \"type\": \"filestream\"\n}",
      "ecs.version": "1.6.0"
    }
    
    {
      "log.level": "error",
      "@timestamp": "2024-12-16T18:23:03.539-0500",
      "log.logger": "reload",
      "log.origin": {
        "function": "github.com/elastic/beats/v7/libbeat/cfgfile.(*RunnerList).Reload",
        "file.name": "cfgfile/list.go",
        "file.line": 138
      },
      "message": "Error creating runner from config: filestream input with ID 'duplicated-id' already exists, this will lead to data duplication, please use a different ID",
      "service.name": "filebeat",
      "ecs.version": "1.6.0"
    }
    

  5. You can also check the IDs and the number of running inputs via Filebeat's HTTP endpoint (not enabled by the filebeat.yml above; see the note below):

    tiago@millennium-falcon ~ % curl -s localhost:5066/inputs/|jq '.[].id'
    "duplicated-id"
    
    tiago@millennium-falcon ~ % curl -s localhost:5066/inputs/|jq '.[].id'|wc -l
    1
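
    Note: this step queries Filebeat's HTTP endpoint, which the filebeat.yml above does not enable. A minimal addition to it (the same settings used in the autodiscover example below) would be:

        http.enabled: true
        http.port: 5066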
    

Testing Autodiscover

Any autodiscover provider will do; this example uses Docker for simplicity.

  1. Start Filebeat with the following configuration

    filebeat.yml

    filebeat.autodiscover.providers:
      - type: docker
        hints.enabled: true
        hints.default_config:
          type: filestream
          id: docker-id
          prospector:
            scanner:
              fingerprint.enabled: true
              symlinks: true
          file_identity.fingerprint: ~
          paths:
            - /var/lib/docker/containers/${data.container.id}/*.log
    
    output.discard.enabled: true
    
    http.enabled: true
    http.port: 5066

  2. Make sure you have at least two Docker containers running

  3. Look at the logs and make sure the following entry appears:

    autodiscover log error

    {
      "log.level": "error",
      "@timestamp": "2024-12-18T16:43:14.602-0500",
      "log.logger": "autodiscover",
      "log.origin": {
        "function": "github.com/elastic/beats/v7/libbeat/autodiscover.(*Autodiscover).worker",
        "file.name": "autodiscover/autodiscover.go",
        "file.line": 175
      },
      "message": "all new inputs failed to start with a non-retriable error",
      "service.name": "filebeat",
      "error": {
        "message": "Error creating runner from config: ErrNonReloadable: filestream input with ID 'docker-id' already exists, this will lead to data duplication, please use a different ID\nError creating runner from config: ErrNonReloadable: filestream input with ID 'docker-id' already exists, this will lead to data duplication, please use a different ID\nError creating runner from config: ErrNonReloadable: filestream input with ID 'docker-id' already exists, this will lead to data duplication, please use a different ID"
      },
      "ecs.version": "1.6.0"
    }
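
One way to avoid the duplicated-ID error under autodiscover is to make the ID unique per container. A sketch, assuming ${data.container.id} expands inside id the same way it already does inside paths in the configuration above (only the id value changes):

    hints.default_config:
      type: filestream
      id: docker-${data.container.id}   # unique per container instead of a fixed ID
      prospector:
        scanner:
          fingerprint.enabled: true
          symlinks: true
      file_identity.fingerprint: ~
      paths:
        - /var/lib/docker/containers/${data.container.id}/*.log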

Testing allow_deprecated_id_duplication

  1. Follow any of the above testing procedures, but add allow_deprecated_id_duplication: true to the filestream configuration

  2. Look at the logs and make sure the following entry appears:

    running duplicated ID error

    {
      "log.level": "error",
      "@timestamp": "2024-12-18T17:09:08.819-0500",
      "log.logger": "input",
      "log.origin": {
        "function": "github.com/elastic/beats/v7/filebeat/input/filestream/internal/input-logfile.(*InputManager).Create",
        "file.name": "input-logfile/manager.go",
        "file.line": 184
      },
      "message": "filestream input with ID 'duplicated-id' already exists, this will lead to data duplication, please use a different ID. Metrics collection has been disabled on this input. The input will start only because 'allow_deprecated_id_duplication' is set to true",
      "service.name": "filebeat",
      "ecs.version": "1.6.0"
    }

Related issues


Logs

Warning logged during config validation when 'allow_deprecated_id_duplication' is enabled

{
  "log.level": "warn",
  "@timestamp": "2024-12-18T17:09:08.819-0500",
  "log.logger": "input.filestream",
  "log.origin": {
    "function": "github.com/elastic/beats/v7/filebeat/input/filestream.(*config).Validate",
    "file.name": "filestream/config.go",
    "file.line": 150
  },
  "message": "setting `allow_deprecated_id_duplication` will lead to data duplication and incomplete input metrics, it's use is highly discouraged.",
  "service.name": "filebeat",
  "ecs.version": "1.6.0"
}

Error logged by Filestream's input manager when an input has a duplicated ID and 'allow_deprecated_id_duplication' is enabled

{
  "log.level": "error",
  "@timestamp": "2024-12-18T17:09:08.819-0500",
  "log.logger": "input",
  "log.origin": {
    "function": "github.com/elastic/beats/v7/filebeat/input/filestream/internal/input-logfile.(*InputManager).Create",
    "file.name": "input-logfile/manager.go",
    "file.line": 184
  },
  "message": "filestream input with ID 'duplicated-id' already exists, this will lead to data duplication, please use a different ID. Metrics collection has been disabled on this input. The  input will start only because 'allow_deprecated_id_duplication' is set to true",
  "service.name": "filebeat",
  "ecs.version": "1.6.0"
}

Error logged by the input reloader (inputs.d directory) when a Filestream input has a duplicated ID

{
  "log.level": "error",
  "@timestamp": "2024-12-16T18:23:03.539-0500",
  "log.logger": "reload",
  "log.origin": {
    "function": "github.com/elastic/beats/v7/libbeat/cfgfile.(*RunnerList).Reload",
    "file.name": "cfgfile/list.go",
    "file.line": 138
  },
  "message": "Error creating runner from config: filestream input with ID 'duplicated-id' already exists, this will lead to data duplication, please use a different ID",
  "service.name": "filebeat",
  "ecs.version": "1.6.0"
}

Error logged by Filestream's input manager when an input has a duplicated ID

{
  "log.level": "error",
  "@timestamp": "2024-12-16T18:23:03.539-0500",
  "log.logger": "input",
  "log.origin": {
    "function": "github.com/elastic/beats/v7/filebeat/input/filestream/internal/input-logfile.(*InputManager).Create",
    "file.name": "input-logfile/manager.go",
    "file.line": 180
  },
  "message": "filestream input 'duplicated-id' is duplicated: input will NOT start",
  "service.name": "filebeat",
  "input.cfg": "{\n  \"enabled\": true,\n  \"id\": \"duplicated-id\",\n  \"paths\": [\n    \"/tmp/duplicated-id-2/*.log\"\n  ],\n  \"type\": \"filestream\"\n}",
  "ecs.version": "1.6.0"
}

Error logged by autodiscover when Filestream inputs with duplicated IDs cannot be created

{
  "log.level": "error",
  "@timestamp": "2024-12-18T16:43:14.602-0500",
  "log.logger": "autodiscover",
  "log.origin": {
    "function": "github.com/elastic/beats/v7/libbeat/autodiscover.(*Autodiscover).worker",
    "file.name": "autodiscover/autodiscover.go",
    "file.line": 175
  },
  "message": "all new inputs failed to start with a non-retriable error",
  "service.name": "filebeat",
  "error": {
    "message": "Error creating runner from config: ErrNonReloadable: filestream input with ID 'docker-id' already exists, this will lead to data duplication, please use a different ID\nError creating runner from config: ErrNonReloadable: filestream input with ID 'docker-id' already exists, this will lead to data duplication, please use a different ID\nError creating runner from config: ErrNonReloadable: filestream input with ID 'docker-id' already exists, this will lead to data duplication, please use a different ID"
  },
  "ecs.version": "1.6.0"
}

@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Dec 9, 2024

mergify bot commented Dec 9, 2024

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @AndersonQ? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-8./d is the label to automatically backport to the 8./d branch. /d is the digit


mergify bot commented Dec 9, 2024

backport-8.x has been added to help with the transition to the new branch 8.x.
If you don't need it please use backport-skip label and remove the backport-8.x label.

@mergify mergify bot added the backport-8.x Automated backport to the 8.x branch with mergify label Dec 9, 2024
@jlind23 jlind23 assigned belimawr and unassigned AndersonQ Dec 9, 2024
@jlind23

jlind23 commented Dec 9, 2024

@belimawr assigning this to you as discussed over slack.

@belimawr belimawr added the Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team label Dec 16, 2024
@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Dec 16, 2024
@belimawr belimawr changed the title WIP: filestream do not run input with duplicated ID Filestream returns error when an input with duplicated ID is created Dec 16, 2024
@belimawr belimawr force-pushed the 41938-filestream-do-not-run-duplicated-id branch from d15f99f to 510f68a Compare December 17, 2024 15:00
@@ -0,0 +1,123 @@
// Licensed to Elasticsearch B.V. under one or more contributor


This file was actually renamed from reload_test.go. However, because a test was added to reload_test.go, Git got confused and thinks this file is new and the other one was edited.

@belimawr belimawr added breaking change and removed backport-8.x Automated backport to the 8.x branch with mergify labels Dec 17, 2024
@belimawr belimawr marked this pull request as ready for review December 17, 2024 15:07
@belimawr belimawr requested a review from a team as a code owner December 17, 2024 15:07
@belimawr belimawr requested review from belimawr and rdner December 17, 2024 15:07
@elasticmachine

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)


mergify bot commented Dec 17, 2024

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @AndersonQ? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-8./d is the label to automatically backport to the 8./d branch. /d is the digit


mergify bot commented Dec 17, 2024

backport-8.x has been added to help with the transition to the new branch 8.x.
If you don't need it please use backport-skip label and remove the backport-8.x label.


mergify bot commented Dec 26, 2024

This pull request is now in conflicts. Could you fix it? 🙏
To fixup this pull request, you can check out it locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b 41938-filestream-do-not-run-duplicated-id upstream/41938-filestream-do-not-run-duplicated-id
git merge upstream/main
git push upstream 41938-filestream-do-not-run-duplicated-id

@belimawr belimawr force-pushed the 41938-filestream-do-not-run-duplicated-id branch from 144e61a to e5d6bdf Compare January 2, 2025 16:27
@belimawr belimawr requested a review from mauri870 January 2, 2025 16:27

@mauri870 mauri870 left a comment


LGTM. I tested the first two cases manually (input reload and autodiscover) and confirmed the log messages appear in the output.

For some reason this test became flaky. The root of the flakiness is not in the test itself; it is in how a rename operation is detected. Even though this test uses `os.Rename`, it does not seem to be an atomic operation. https://www.man7.org/linux/man-pages/man2/rename.2.html does not make it clear whether 'renameat' (used by `os.Rename`) is atomic.

On a flaky execution, the file is perceived as removed and then a new file is created, both with the same inode. This happens on a system that does not reuse inodes as soon as they're freed. Because the file is detected as removed, its state is also removed. Then, when more data is added, only the offset of the new data is tracked by the registry, causing the test to fail.

A workaround is to not remove the state when the file is removed, hence `clean_removed: false` is set in the test config.
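
For reference, a minimal sketch of a filestream test input using that workaround (the ID and path are hypothetical):

    - type: filestream
      id: rename-test
      paths:
        - /tmp/rename-test/*.log
      clean_removed: false   # keep the registry state even if the file appears removed
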
@belimawr

belimawr commented Jan 3, 2025

The failing tests seem to be unrelated. I created a standalone PR (#42213) to fix them; however, I also included the fix commit in this PR to unblock it.

@jlind23 jlind23 merged commit e3e2332 into elastic:main Jan 4, 2025
140 of 142 checks passed
Labels
  • backport-skip (Skip notification from the automated backport with mergify)
  • breaking change
  • Team:Elastic-Agent-Data-Plane (Label for the Agent Data Plane team)
Projects
None yet
6 participants