[Fleet] Enhance integration pipelines by adding a additional @custom pipelines for global, type, and integration processors #168019

Closed
Harmlos opened this issue Oct 4, 2023 · 12 comments · Fixed by #170270
Assignees
Labels
enhancement New value added to drive a business result QA:Validated Issue has been validated by QA Team:Fleet Team label for Observability Data Collection Fleet team

Comments

@Harmlos

Harmlos commented Oct 4, 2023

We should add a new global@custom pipeline processor to the pipelines created for integrations, similar to the {type}-{integration}.{dataset}@custom pipeline we have today. This should use the ignore_missing_pipeline: true option to not require the custom pipeline to exist.

While we're at it, we should also add a type-level and an integration-level pipeline. In total, the custom pipelines should be called in this order, from least to most specific:

  • global@custom
  • {type}@custom
  • {type}-{integration}@custom
  • {type}-{integration}.{dataset}@custom - already exists
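
As a rough illustration of the list above, the processors appended to an integration pipeline could be generated as follows. This is a minimal sketch, not Fleet's actual implementation: the function name `custom_pipeline_processors` and its parameters are hypothetical, and only the pipeline names and the `ignore_missing_pipeline` option come from this issue.

```python
def custom_pipeline_processors(data_type, integration, dataset):
    """Return pipeline processors for the proposed @custom hooks,
    ordered from least to most specific (hypothetical helper)."""
    names = [
        "global@custom",
        f"{data_type}@custom",
        f"{data_type}-{integration}@custom",
        f"{data_type}-{integration}.{dataset}@custom",  # already exists today
    ]
    # ignore_missing_pipeline lets ingestion proceed when the user
    # has not created a given custom pipeline.
    return [
        {"pipeline": {"name": name, "ignore_missing_pipeline": True}}
        for name in names
    ]


for processor in custom_pipeline_processors("logs", "nginx", "access"):
    print(processor["pipeline"]["name"])
```

Because every processor sets `ignore_missing_pipeline`, users only need to create the levels they actually want to customize.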

We will also need to bump the installation format version to force Fleet to update all the existing pipelines on Kibana upgrade.

Original description

Describe the feature:
Add a call to the custom pipeline .fleet_final_pipeline-1@custom at the end of the managed pipeline .fleet_final_pipeline-1.

Describe a specific use case for the feature:
This will enable the addition of some ingest processors to all data coming from customer agents without the need to manually control the managed pipeline after each Elasticsearch update.

In our case, we are adding an enrich policy to append tags for network locations. However, modifying the managed ingest pipeline may introduce potential issues, especially after updates, and may result in the loss of enrichment data before the next update.

@botelastic botelastic bot added the needs-team Issues missing a team label label Oct 4, 2023
@Harmlos Harmlos changed the title [Fleet] Enchance .fleet_final_pipeline-1 by adding a custom pipeline for customer data [Fleet] Enchance .fleet_final_pipeline-1 by adding a custom pipeline for customer processors Oct 5, 2023
@jsanz jsanz added enhancement New value added to drive a business result Team:Fleet Team label for Observability Data Collection Fleet team labels Oct 13, 2023
@elasticmachine
Contributor

Pinging @elastic/fleet (Team:Fleet)

@botelastic botelastic bot removed the needs-team Issues missing a team label label Oct 13, 2023
@kpollich
Member

@joshdover @nimarezainia - Curious for your thoughts on this. Seems like a pretty valid use case that may overlap with our other conversation around more granular levels of customization throughout Fleet.

@joshdover
Contributor

We haven't discussed how we should add more levels of granularity to the custom ingest pipelines, but I'd rather not do it on the final pipeline. Instead, I think we should consider adding some additional pipeline processors to the end of the main pipeline added for each integration for consistency:

  • logs@custom
  • logs-nginx@custom
  • logs-nginx.access@custom - we already have this

We need to think about execution order here. For mappings/settings, we prefer that the more specific template takes precedence. I think the same would make sense here, and it is what I put in my suggestion.
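
To make the precedence point concrete, here is a small simulation, assuming each custom pipeline can be modeled as a function over a document. The helper `run_custom_pipelines` and the field names are hypothetical; the sketch only shows that running the most specific pipeline last lets it override earlier values, and that undefined pipelines are skipped.

```python
def run_custom_pipelines(doc, call_order, defined):
    """Apply the custom pipelines in call order; pipelines that the user
    never defined are skipped, mirroring ignore_missing_pipeline: true."""
    result = dict(doc)
    for name in call_order:
        pipeline = defined.get(name)
        if pipeline is None:
            continue  # missing pipeline: nothing to run
        result = pipeline(result)
    return result


call_order = ["logs@custom", "logs-nginx@custom", "logs-nginx.access@custom"]

# Hypothetical user-defined pipelines; logs-nginx@custom is left undefined.
defined = {
    "logs@custom": lambda d: {**d, "team": "platform", "env": "prod"},
    "logs-nginx.access@custom": lambda d: {**d, "team": "web"},
}

out = run_custom_pipelines({"message": "GET /"}, call_order, defined)
print(out)
```

The type-level pipeline sets defaults for every `logs` data stream, and the dataset-level pipeline, which runs last, overrides `team` for this one data stream only.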

cc @felixbarny

@felixbarny
Member

This makes sense to me.
Similar to elastic/elasticsearch#97664, I wish there was something built-in in Elasticsearch that would allow us to do something like that for the built-in logs-*-* template so that we can do what you've proposed for data streams that are not managed via Fleet.

@Harmlos
Author

Harmlos commented Oct 18, 2023

In our case, we use a global pipeline so that a single configuration enriches all events, regardless of which integration they enter the system through.
We tried to configure this through per-integration settings, but those were difficult to change.

Also, every new integration would need its own enrichment settings, whereas with a global pipeline the settings are made only once.
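
For illustration, a `global@custom` pipeline for this kind of use case might look roughly like the request body built below. This is only a sketch under assumptions: the enrich policy name `network-locations`, the match field `source.ip`, and the target field `network.location` are hypothetical placeholders, not values taken from this thread.

```python
import json


def global_custom_pipeline(policy_name, match_field, target_field):
    """Build a request body for PUT _ingest/pipeline/global@custom
    containing a single enrich processor (hypothetical helper)."""
    return {
        "description": "Enrich all Fleet-managed events (example only)",
        "processors": [
            {
                "enrich": {
                    "policy_name": policy_name,
                    "field": match_field,
                    "target_field": target_field,
                    # Skip documents that lack the match field.
                    "ignore_missing": True,
                }
            }
        ],
    }


# All three arguments are placeholder values.
body = global_custom_pipeline("network-locations", "source.ip", "network.location")
print("PUT _ingest/pipeline/global@custom")
print(json.dumps(body, indent=2))
```

Defined once, such a pipeline would apply to every integration that calls `global@custom`, so adding a new integration would need no extra enrichment settings.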

@joshdover
Copy link
Contributor

@Harmlos My proposal could also be extended to include a global@custom pipeline to accomplish the same, similar to what has been proposed for data stream mappings.

One of the reasons I want to avoid extending the final pipeline is that it includes some functionality that not all users want (agent ID verification), and we may make it possible to disable it in the future. That could break the custom pipelines, which I'd like to avoid. Another reason is consistency with the existing `@custom` pipelines.

@joshdover joshdover changed the title [Fleet] Enchance .fleet_final_pipeline-1 by adding a custom pipeline for customer processors [Fleet] Enhance integration pipelines by adding a global@custom pipeline for global processors Oct 24, 2023
@joshdover
Contributor

Updated the description with my proposal, including adding additional processors for other levels of granularity

@joshdover joshdover changed the title [Fleet] Enhance integration pipelines by adding a global@custom pipeline for global processors [Fleet] Enhance integration pipelines by adding a additional @custom pipelines for global, type, and integration processors Oct 25, 2023
@nchaulet nchaulet self-assigned this Oct 30, 2023
@kpollich kpollich added the QA:Needs Validation Issue needs to be validated by QA label Nov 8, 2023
@kpollich
Member

@nchaulet Could we provide any detailed testing instructions for QA here? I think the docs that @kilfoyle put together should do a good job explaining how to use/test, but is there anything else on top of those that would help test here?

elastic/ingest-docs#675

@nchaulet
Member

@kpollich Yes, the docs linked here will make good test instructions. There are also some automated integration tests for this, so I'm not sure manual tests are needed here.

@kpollich
Member

@amolnater-qasource - If you're able to walk through the docs above and verify the steps laid out there that should suffice for test cases here.

@harshitgupta-qasource

Hi Team,

We have executed 04 testcases under the Feature test run for the 8.12.0 release at the link:

Status:

PASS: 04

Build details:
VERSION: 8.12.0 BC4
BUILD: 70016
COMMIT: c2fda47
Artifact Link: https://staging.elastic.co/8.12.0-e9640208/summary-8.12.0.html

As the testing is completed on this feature, we are marking this as QA:Validated.

Please let us know if anything else is required from our end.
Thanks

@harshitgupta-qasource harshitgupta-qasource added QA:Validated Issue has been validated by QA and removed QA:Needs Validation Issue needs to be validated by QA labels Jan 2, 2024
kpollich added a commit that referenced this issue Jan 25, 2024
…ns + add descriptions to each pipeline (#175448)

## Summary

Closes #175254
Ref #168019
Ref #170270

In 8.12.0, Fleet unintentionally shipped a breaking change in
#170270 for APM users who make use
of a custom `traces-apm` data stream. If a user had previously defined
this ingest pipeline to customize documents ingested for the
`traces-apm` data stream (defined
[here](https://github.com/elastic/integrations/blob/9a36183f0bd12e39a957d2f7bd65f3de4ee685b1/packages/apm/data_stream/traces/manifest.yml#L2-L3)),
then they would unexpectedly see that pipeline called when documents
were ingested to the `traces-apm.rum` and `traces-apm.sampled`
data streams as well.

This PR addresses this collision by adding a `.package` suffix to the
"package level" ingest pipeline introduced in 8.12.0.

So, in 8.12.0 a processor would be defined as such on the
`traces-apm.rum` or `traces-apm.sampled` ingest pipeline

```
{
  "pipeline": {
    "name": "traces-apm@custom",
    "ignore_missing_pipeline": true
  }
},
```

This PR replaces the pipeline with one that looks as follows:

```
{
  "pipeline": {
    "name": "traces-apm.package@custom",
    "ignore_missing_pipeline": true,
    "description": "[Fleet] Pipeline for all data streams of type `traces` defined by the `apm` integration"
  }
},
```

**To be clear: this is a breaking change if you have defined the
`traces-apm@custom` pipeline on 8.12. In 8.12.1, it will no longer be
called for documents ingested to the `traces-apm`, `traces-apm.rum`, or
`traces-apm.sampled` data streams. You will need to rename your pipeline
to `traces-apm.package@custom` to preserve this behavior.**
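
Ingest pipelines have no rename operation, so in practice the rename means re-creating the pipeline under the new name. The sketch below only assembles the request from an existing definition and does not talk to a cluster. The helper `copy_pipeline_request` and the sample pipeline contents are hypothetical; it assumes the input has the shape returned by `GET _ingest/pipeline/<name>`, where the definition is keyed by pipeline name.

```python
def copy_pipeline_request(get_response, old_name, new_name):
    """Given a GET _ingest/pipeline/<old_name> response body, build the
    PUT request that re-creates the pipeline under new_name."""
    definition = get_response[old_name]
    return ("PUT", f"/_ingest/pipeline/{new_name}", definition)


# Hypothetical existing pipeline, shaped like a GET response.
existing = {
    "traces-apm@custom": {
        "description": "hypothetical user-defined pipeline",
        "processors": [{"set": {"field": "labels.example", "value": "demo"}}],
    }
}

method, path, body = copy_pipeline_request(
    existing, "traces-apm@custom", "traces-apm.package@custom"
)
print(method, path)
```

Whether to delete the old pipeline afterwards is left out of the sketch on purpose, since that depends on whether anything else still references it.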

This change also applies to `logs-elastic_agent.*` ingest pipelines. See
[this comment](https://github.com/elastic/kibana/issues/175254#issuecomment-1906202137)
for more information.

There is still technically room for a collision, though it's unlikely,
if the data stream name is `package`. This will be handled by a package
spec validation proposed in
elastic/package-spec#699.
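
The naming collision described in this summary, and the residual case that remains after the fix, can be checked with a few lines of code. This is a sketch that assumes the dataset-level hook is named `{type}-{dataset}@custom`. The helper names are hypothetical, and the `apm.package` dataset is invented purely to show the remaining edge case.

```python
def dataset_pipeline(data_type, dataset):
    """Dataset-level hook, e.g. traces-apm.rum@custom."""
    return f"{data_type}-{dataset}@custom"


def package_pipeline(data_type, package, suffixed):
    """Package-level hook: unsuffixed as shipped in 8.12.0,
    with the `.package` suffix after this PR."""
    suffix = ".package" if suffixed else ""
    return f"{data_type}-{package}{suffix}@custom"


def colliding_datasets(data_type, package, datasets, suffixed):
    """Return datasets whose own hook has the same name as the package-level hook."""
    package_level = package_pipeline(data_type, package, suffixed)
    return [d for d in datasets if dataset_pipeline(data_type, d) == package_level]


apm = ["apm", "apm.rum", "apm.sampled"]
print(colliding_datasets("traces", "apm", apm, suffixed=False))  # ['apm']
print(colliding_datasets("traces", "apm", apm, suffixed=True))   # []
# Invented dataset name illustrating the residual collision:
print(colliding_datasets("traces", "apm", apm + ["apm.package"], suffixed=True))  # ['apm.package']
```

The first call reproduces the 8.12.0 problem, the second shows the suffix removing it, and the third shows why a package-spec validation for the name `package` is still useful.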

---------

Co-authored-by: Kibana Machine <[email protected]>
kibanamachine pushed a commit to kibanamachine/kibana that referenced this issue Jan 25, 2024
(cherry picked from commit 9fe5a66)
kibanamachine referenced this issue Jan 25, 2024
…oid collisions + add descriptions to each pipeline (#175448) (#175547)

# Backport

This will backport the following commits from `main` to `8.12`:
- [[Fleet] Update Fleet's custom ingest pipeline names to avoid
collisions + add descriptions to each pipeline
(#175448)](#175448)


### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)


Co-authored-by: Kyle Pollich <[email protected]>
CoenWarmer pushed a commit to CoenWarmer/kibana that referenced this issue Feb 15, 2024