Osquerybeat: Fix data_stream configuration, enforce the default values used before 8.6.0 #34246

Merged on Jan 13, 2023 (4 commits)

aleksmaus
Contributor

@aleksmaus aleksmaus commented Jan 12, 2023

What does this PR do?

Fixes the data_stream values for type and dataset.

Version 8.6.0 sets the wrong values. This started after the switch to the V2 libbeat management.CreateInputsFromStreams implementation, which inserts the policy processor:

  processors:
  - add_fields:
      fields:
        dataset: generic
        namespace: default
        type: osquery
      target: data_stream

This results in the following data_stream properties:

"data_stream": {
  "namespace": "default",
  "type": "osquery",
  "dataset": "generic"
},

This PR sets the expected default values on the data stream before the configuration transformation takes place, restoring the values used before 8.6.0:

"data_stream": {
  "namespace": "default",
  "type": "logs",
  "dataset": "osquery_manager.result"
},

These fields are mapped as constant_keyword and should never change.
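
As a rough illustration of the approach, the sketch below forces type and dataset back to the defaults on a copy of the data stream while leaving the namespace alone. The DataStream struct, the constant names, and enforceDefaults are hypothetical stand-ins written for this example; they are not the osquerybeat types. The default values are the ones shown in the expected output above.

```go
package main

import "fmt"

// DataStream is a hypothetical stand-in for the data_stream section of an
// input; it is not the real type used by osquerybeat.
type DataStream struct {
	Type      string
	Dataset   string
	Namespace string
}

// Defaults taken from the expected data_stream values in the PR description.
// The constant names are assumptions made for this sketch.
const (
	defaultType    = "logs"
	defaultDataset = "osquery_manager.result"
)

// enforceDefaults returns a copy of ds with type and dataset forced to the
// defaults. The namespace is preserved as configured.
func enforceDefaults(ds DataStream) DataStream {
	ds.Type = defaultType
	ds.Dataset = defaultDataset
	return ds
}

func main() {
	// Values that 8.6.0 ended up with, per the description above.
	in := DataStream{Type: "osquery", Dataset: "generic", Namespace: "default"}
	out := enforceDefaults(in)
	fmt.Printf("%+v\n", out)
}
```

Because the struct is passed by value, the caller's original data stream is left untouched.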

Why is it important?

This fixes the breakage in the Osquery Logstash integration. Logstash routes data based on the data_stream "hint", so the output was going to osquery-generic-default instead of logs-osquery_manager.result-default.
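
The two destination names follow from the {type}-{dataset}-{namespace} data stream naming scheme. A minimal sketch of that composition is below; dataStreamName is a hypothetical helper for illustration, not a function from libbeat or Logstash.

```go
package main

import "fmt"

// dataStreamName joins the three data_stream fields using the
// {type}-{dataset}-{namespace} naming scheme. Hypothetical helper.
func dataStreamName(typ, dataset, namespace string) string {
	return typ + "-" + dataset + "-" + namespace
}

func main() {
	// Values produced by 8.6.0.
	fmt.Println(dataStreamName("osquery", "generic", "default"))
	// Values expected by the integration.
	fmt.Println(dataStreamName("logs", "osquery_manager.result", "default"))
}
```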

The defect also makes osquery result documents from 8.6.0 incompatible with those from previous releases. Because the fields are constant, indexing osquery result documents can fail in environments with mixed agent versions or after an upgrade, with errors such as:

[constant_keyword] field [data_stream.type] only accepts values that are equal to the value defined in the mappings [logs], but got [osquery]
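
To make the failure mode concrete, here is a small sketch that mimics the constant_keyword rule: a document is accepted only if the field value equals the value fixed in the mapping. checkConstant is a hypothetical function written for this example and is not Elasticsearch code; the error text simply mirrors the message quoted above.

```go
package main

import "fmt"

// checkConstant mimics the constant_keyword rule: the incoming value must
// equal the value defined in the mapping. Hypothetical illustration only.
func checkConstant(field, mapped, got string) error {
	if got != mapped {
		return fmt.Errorf("[constant_keyword] field [%s] only accepts values that are equal to the value defined in the mappings [%s], but got [%s]", field, mapped, got)
	}
	return nil
}

func main() {
	// A document from an agent older than 8.6.0 matches the mapping.
	fmt.Println(checkConstant("data_stream.type", "logs", "logs"))
	// A document from an 8.6.0 agent is rejected.
	fmt.Println(checkConstant("data_stream.type", "logs", "osquery"))
}
```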

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Related issues

Screenshots

Verified the document data_stream is set correctly when sending data to elasticsearch output

[Four screenshots taken Jan 12, 2023]

Verified the document data_stream is set correctly when sending data to logstash output

[Four screenshots taken Jan 12, 2023]

@aleksmaus aleksmaus added the bug and backport-v8.6.0 labels Jan 12, 2023
@aleksmaus aleksmaus requested a review from a team as a code owner January 12, 2023 18:22
@botelastic botelastic bot added the needs_team label Jan 12, 2023
@aleksmaus aleksmaus added the Team:Elastic-Agent and Team:Elastic-Agent-Control-Plane labels Jan 12, 2023
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent (Team:Elastic-Agent)

@botelastic botelastic bot removed the needs_team label Jan 12, 2023
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

@aleksmaus aleksmaus requested a review from cmacknz January 12, 2023 18:23
@elasticmachine
Collaborator

elasticmachine commented Jan 12, 2023

💚 Build Succeeded

Build stats

  • Start Time: 2023-01-13T16:05:26.522+0000

  • Duration: 44 min 54 sec

Test stats 🧪

Test Results
Failed 0
Passed 1632
Skipped 0
Total 1632

💚 Flaky test report

Tests succeeded.

ds := *rawIn.GetDataStream()
ds.Dataset = config.DefaultDataset
ds.Type = config.DefaultType
datastream = &ds
Member

This works, but if the dataset or type ever becomes configurable it will break. That's probably unlikely though.

rawIn.Streams = streams

procs := defaultProcessors()

Member

@cmacknz cmacknz Jan 12, 2023

I can't comment directly on this, but this block below can be removed:

for iter := range modules {
	modules[iter]["type"] = "log"
}

That is trying to set an input type, which I don't think matters for osquerybeat. It is just adding an arbitrary type: log into the generated config that osquerybeat will be ignoring.

Contributor Author

yeah, don't remember the reason for this. can clean up later, separately from this more urgent fix

Member

I think it is a copy/paste error from one of the other Beats, agree this can be removed separately.

@cmacknz
Member

cmacknz commented Jan 12, 2023

You should probably have a changelog entry for this one.

@cmacknz
Member

cmacknz commented Jan 12, 2023

The packaging failure is unrelated, caused by elastic/golang-crossbuild#232

@aleksmaus aleksmaus merged commit f9ed028 into elastic:main Jan 13, 2023
mergify bot pushed a commit that referenced this pull request Jan 13, 2023
…s used before 8.6.0 (#34246)

* Osquerybeat: Fix data_stream configuration, enforce the default values used before 8.6.0

* Added changelog entry

(cherry picked from commit f9ed028)
aleksmaus added a commit that referenced this pull request Jan 14, 2023
…s used before 8.6.0 (#34246) (#34262)

* Osquerybeat: Fix data_stream configuration, enforce the default values used before 8.6.0

* Added changelog entry

(cherry picked from commit f9ed028)

Co-authored-by: Aleksandr Maus <[email protected]>
chrisberkhout pushed a commit that referenced this pull request Jun 1, 2023
…s used before 8.6.0 (#34246)

* Osquerybeat: Fix data_stream configuration, enforce the default values used before 8.6.0

* Added changelog entry
Labels
backport-v8.6.0, bug, Team:Elastic-Agent, Team:Elastic-Agent-Control-Plane
3 participants