[Fleet] flag package policy SO to trigger agent policy bump #200536
@juliaElastic I am wondering whether this could introduce a new bug with an infinite bump loop (we had similar issues with preconfiguration, where we sometimes miss things during comparison). I am wondering if we could have a mechanism where we explicitly say that we want a policy bump during the migration, maybe a new property
Thanks for the suggestion, I think it makes sense to add an explicit flag to avoid accidentally triggering updates in an infinite loop.
}

export async function _updatePackagePoliciesThatNeedBump(logger: Logger) {
  // TODO spaces?
we probably need to use the SO for the correct space here
appContextService.getInternalUserESClient(),
packagePoliciesToBump.items.map((item) => ({
  ...item,
  bump_agent_policy_revision: false,
The flag has to be set to false, otherwise an update will happen on every Fleet setup.
This update triggers the agent policy bump anyway, so no need to bump separately.
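The flag handling described above can be sketched in isolation. This is a minimal in-memory model, not the Fleet implementation: the `PackagePolicyDoc` shape, the `runBumpPass` function, and the revision map are hypothetical, and the agent policy bump is done explicitly here, whereas in the PR it happens as a side effect of the package policy update.

```typescript
// Hypothetical, simplified document shape; not the real Fleet saved object type.
interface PackagePolicyDoc {
  id: string;
  agentPolicyId: string;
  bump_agent_policy_revision?: boolean;
}

// One pass, as it might run during setup: find flagged package policies,
// reset the flag, and bump each affected agent policy once.
function runBumpPass(
  packagePolicies: PackagePolicyDoc[],
  agentPolicyRevisions: Record<string, number>
): number {
  const flagged = packagePolicies.filter((p) => p.bump_agent_policy_revision === true);
  const alreadyBumped: Record<string, boolean> = {};

  for (const policy of flagged) {
    // Resetting the flag is what keeps the next pass from finding the same policy again.
    policy.bump_agent_policy_revision = false;

    if (!alreadyBumped[policy.agentPolicyId]) {
      alreadyBumped[policy.agentPolicyId] = true;
      agentPolicyRevisions[policy.agentPolicyId] =
        (agentPolicyRevisions[policy.agentPolicyId] || 0) + 1;
    }
  }

  return flagged.length;
}
```

Running the pass a second time on the same data finds no flagged policies, so no further bump happens.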
… src/core/server/integration_tests/ci_checks'
@nchaulet While testing with spaces, I have something to confirm. Do we need to support both package policy saved object types? Also noticed that locally this API seems to throw an error:
Yes, we need to support both saved object types, as the feature will be opt-in for users. You should be able to trigger the migration with this call (internal call need a
}

async function getPackagePoliciesToBump() {
  return await packagePolicyService.list(appContextService.getInternalUserSOClient(), {
In order to query package policies from all spaces, we need to query with soClient for each space. For this, we need to query all spaces first.
I think similarly the deploy policies task doesn't work correctly, because the logic only queries agent policies from the default space: https://github.com/elastic/kibana/blob/main/x-pack/plugins/fleet/server/services/setup/fleet_server_policies_enrollment_keys.ts#L35
I'll query from all spaces like this:
kibana/x-pack/plugins/fleet/server/services/agent_policy.ts
Lines 920 to 927 in b3f27a9
.getInternalUserSOClientWithoutSpaceExtension()
.find<AgentPolicySOAttributes>({
  type: savedObjectType,
  fields: ['revision', 'data_output_id', 'monitoring_output_id'],
  searchFields: ['data_output_id', 'monitoring_output_id'],
  search: escapeSearchQueryPhrase(outputId),
  perPage: SO_SEARCH_LIMIT,
  namespaces: ['*'],
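Once results from all spaces come back in a single list, they presumably have to be grouped per space so that each group can be updated with a client scoped to that space. A minimal sketch of such grouping follows; the `PackagePolicyLike` shape, the `spaceIds` field, and the `indexBySpace` helper are assumptions made for illustration, not Fleet APIs.

```typescript
// Hypothetical, simplified shape; the real package policy type has many more fields.
interface PackagePolicyLike {
  id: string;
  spaceIds?: string[];
}

// Groups policies by the first space they belong to, falling back to a default space
// when no space information is present (an assumption of this sketch).
function indexBySpace(
  policies: PackagePolicyLike[],
  defaultSpaceId: string = 'default'
): Record<string, PackagePolicyLike[]> {
  const bySpace: Record<string, PackagePolicyLike[]> = {};

  for (const policy of policies) {
    const spaceId =
      policy.spaceIds && policy.spaceIds.length > 0 ? policy.spaceIds[0] : defaultSpaceId;
    if (!bySpace[spaceId]) {
      bySpace[spaceId] = [];
    }
    bySpace[spaceId].push(policy);
  }

  return bySpace;
}
```

The resulting record can then be iterated space by space, with one space-scoped update call per entry.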
  { id, version, attributes }: SavedObject<PackagePolicySOAttributes>,
  namespaces?: string[]
): PackagePolicy => {
  const { bump_agent_policy_revision: bumpAgentPolicyRevision, ...restAttributes } = attributes;
`bump_agent_policy_revision` is only added to the SO type; it is removed from `PackagePolicy` to avoid it leaking out in the API responses.
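The pattern of keeping a field on the stored type while leaving it out of the API-facing type can be sketched with rest destructuring and `Omit`. The type names and the `toApiPackagePolicy` function below are hypothetical stand-ins, not the actual Fleet types.

```typescript
// Hypothetical stored attributes: includes the internal flag.
interface StoredPackagePolicyAttributes {
  name: string;
  enabled: boolean;
  bump_agent_policy_revision?: boolean;
}

// Hypothetical API-facing type: same fields minus the internal flag, plus the id.
type ApiPackagePolicy = Omit<StoredPackagePolicyAttributes, 'bump_agent_policy_revision'> & {
  id: string;
};

function toApiPackagePolicy(
  id: string,
  attributes: StoredPackagePolicyAttributes
): ApiPackagePolicy {
  // Pull the internal flag out so it is not copied into the response object.
  const { bump_agent_policy_revision: _internalFlag, ...restAttributes } = attributes;
  return { id, ...restAttributes };
}
```

Because the flag is absent from the returned type as well as from the returned object, callers cannot read it by accident.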
…t --include-path /api/status --include-path /api/alerting/rule/ --include-path /api/alerting/rules --include-path /api/actions --include-path /api/security/role --include-path /api/spaces --include-path /api/fleet --update'
… src/core/server/integration_tests/ci_checks'
Pinging @elastic/fleet (Team:Fleet)
Fleet SO update LGTM
code LGTM 🚀
LGTM! only additive (non-breaking) changes in the mappings
I'm requesting a change to the way the task cancellation is working, explained in a comment.
}

await runWithCache(async () => {
  await _updatePackagePoliciesThatNeedBump(appContextService.getLogger(), cancelled);
This isn't going to work for handling cancellation while the task is running: `cancelled` will always be false here.
Here's an example of handling this correctly: continue to capture it locally, but provide a function to test it, and pass the function to the "inner" functions rather than the value:
Lines 392 to 408 in ebd3f0d
({ taskInstance }: { taskInstance: ConcreteTaskInstance }) => {
  let cancelled = false;
  const isCancelled = () => cancelled;
  return {
    run: async () =>
      runTask({
        getRiskScoreService,
        isCancelled,
        logger,
        taskInstance,
        telemetry,
        entityAnalyticsConfig,
      }),
    cancel: async () => {
      cancelled = true;
    },
  };
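The difference between passing the boolean and passing a function can be reproduced in a small synchronous sketch. Everything here (`createTask`, `processByValue`, `processByFunction`, the `OnItem` callback) is hypothetical and only illustrates the closure behaviour; it is not the Task Manager API.

```typescript
// Hypothetical callback, invoked once per processed item.
type OnItem = (index: number) => void;

// Receives a copy of the boolean taken when the call starts,
// so a later cancel() is never visible inside the loop.
function processByValue(items: string[], cancelled: boolean, onItem: OnItem): number {
  let processed = 0;
  for (let i = 0; i < items.length; i++) {
    if (cancelled) break;
    onItem(i);
    processed++;
  }
  return processed;
}

// Receives a function that reads the current state on every iteration.
function processByFunction(items: string[], isCancelled: () => boolean, onItem: OnItem): number {
  let processed = 0;
  for (let i = 0; i < items.length; i++) {
    if (isCancelled()) break;
    onItem(i);
    processed++;
  }
  return processed;
}

function createTask(items: string[]) {
  let cancelled = false;
  const isCancelled = () => cancelled;
  return {
    cancel: () => {
      cancelled = true;
    },
    runByValue: (onItem: OnItem) => processByValue(items, cancelled, onItem),
    runByFunction: (onItem: OnItem) => processByFunction(items, isCancelled, onItem),
  };
}
```

If `cancel()` is called while the second of four items is being handled, `runByValue` still processes all four, while `runByFunction` stops before the third.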
const start = Date.now();

for (const [spaceId, packagePolicies] of Object.entries(packagePoliciesIndexedBySpace)) {
  if (cancelled) {
Here is where you'd check `isCancelled()` vs the boolean value. And note that this is a good sort of place to put this, in case the array being processed ends up being extremely large.
ResponseOps changes LGTM; thx for making the change to the task cancellation!
💛 Build succeeded, but was flaky
Starting backport for target branches: 8.x
💔 All backports failed
…200536)

Closes elastic#193352

Update: Using a new SO field `bump_agent_policy_revision` in package policy type to mark package policies for update, this will trigger an agent policy revision bump.

The feature supports both legacy and new package policy SO types, and queries policies from all spaces.

To test, add a model version change to the package policy type and save. After Fleet setup is run, the agent policies using the package policies should be bumped and deployed. The same effect can be achieved by manually updating a package policy SO and loading Fleet UI to trigger setup.

```
'2': {
  changes: [
    {
      type: 'data_backfill',
      backfillFn: (doc) => {
        return { attributes: { ...doc.attributes, bump_agent_policy_revision: true } };
      },
    },
  ],
},

curl -sk -XPOST --user fleet_superuser:password -H 'content-type:application/json' \
  -H'x-elastic-product-origin:fleet' \
  http://localhost:9200/.kibana_ingest/_update_by_query -d '
{ "query": { "match": { "type": "fleet-package-policies" } },"script": { "source": "ctx._source[\"fleet-package-policies\"].bump_agent_policy_revision = true", "lang": "painless" } }'
```

```
[2024-11-20T14:40:30.064+01:00][INFO ][plugins.fleet] Found 1 package policies that need agent policy revision bump
[2024-11-20T14:40:31.933+01:00][DEBUG][plugins.fleet] Updated 1 package policies in space space1 in 1869ms, bump 1 agent policies
[2024-11-20T14:40:35.056+01:00][DEBUG][plugins.fleet] Deploying 1 policies
[2024-11-20T14:40:35.493+01:00][DEBUG][plugins.fleet] Deploying policies: 7f108cf2-4cf0-4a11-8df4-fc69d00a3484:10
```

TODO:
- the same flag has to be added on agent policy and output types, and the task extended to update them - I plan to do this in another pr, so that this doesn't become too big
- add integration test if possible

Tested with 500 agent policies split to 2 spaces, 1 integration per policy and bumping the flag in a new saved object model version, the bump task took about 6s. The deploy policies step is async, took about 30s.

```
[2024-11-20T15:53:55.628+01:00][INFO ][plugins.fleet] Found 501 package policies that need agent policy revision bump
[2024-11-20T15:53:57.881+01:00][DEBUG][plugins.fleet] Updated 250 package policies in space space1 in 2253ms, bump 250 agent policies
[2024-11-20T15:53:59.926+01:00][DEBUG][plugins.fleet] Updated 251 package policies in space default in 4298ms, bump 251 agent policies
[2024-11-20T15:54:01.186+01:00][DEBUG][plugins.fleet] Deploying 250 policies
[2024-11-20T15:54:29.989+01:00][DEBUG][plugins.fleet] Deploying policies: test-policy-space1-1:4, ...
[2024-11-20T15:54:33.538+01:00][DEBUG][plugins.fleet] Deploying policies: policy-elastic-agent-on-cloud:4, test-policy-default-1:4, ...
```

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios

---------

Co-authored-by: kibanamachine <[email protected]>
just tested this locally, adding
…0536) (#201542) Backport #200536 to 8.x branch --------- Co-authored-by: kibanamachine <[email protected]>
Summary
Closes #193352
Update:
Using a new SO field `bump_agent_policy_revision` in the package policy type to mark package policies for update; this will trigger an agent policy revision bump.
The feature supports both legacy and new package policy SO types, and queries policies from all spaces.
To test, add a model version change to the package policy type and save. After Fleet setup is run, the agent policies using the package policies should be bumped and deployed.
The same effect can be achieved by manually updating a package policy SO and loading Fleet UI to trigger setup.
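TODO:
- the same flag has to be added on agent policy and output types, and the task extended to update them - I plan to do this in another pr, so that this doesn't become too big
- add integration test if possible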
Scale testing
Tested with 500 agent policies split across 2 spaces, with 1 integration per policy, bumping the flag in a new saved object model version: the bump task took about 6s.
The deploy policies step is async and took about 30s.
Checklist
- [x] Unit or functional tests were updated or added to match the most common scenarios