Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Security Solution] Fix losing data upon prebuilt rule upgrade to a new version in which the rule's type is different #176421

Merged
merged 7 commits into from
Feb 12, 2024

Conversation

maximpn
Copy link
Contributor

@maximpn maximpn commented Feb 7, 2024

Fixes: #169480

Summary

This PR fixes losing the following rule data upon prebuilt rule upgrade to a new version in which the rule's type is different

  • Saved Object id
  • exceptions list (default and shared)
  • Timeline id
  • Timeline title

Details

The problem occurs when user upgrades a prebuilt rule to a newer version which has a different rule type.

Checking the code it's not so hard to find upgradeRule() function which performs prebuilt rule upgrade. It has the following comment

If we're trying to change the type of a prepackaged rule, we need to delete the old one and replace it with the new rule, keeping the enabled setting, actions, throttle, id, and exception lists from the old rule.

Looking below in the code it's clear that only enabled state and actions get restored upon rule upgrade. Missing to restore exceptions lists leads to disappearing exceptions upon rule upgrade.

On top of this execution results and execution events also get lost due to missing to restore saved object id. Execution log isn't gone anywhere but can't be bound to a new id. Direct links to rule details page won't work neither after upgrade.

This PR fixes the problem by restoring rule bound data after upgrade.

FTR tests were restructured to accommodate extra tests to cover this bug fix.

Checklist

  • Unit or functional tests were updated or added to match the most common scenarios
  • Ran in Flaky test runner for ESS and Serverless and no flakiness has been revealed

@maximpn maximpn added bug Fixes for quality problems that affect the customer experience release_note:fix impact:high Addressing this issue will have a high level of impact on the quality/strength of our product. Team:Detections and Resp Security Detection Response Team Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. Team:Detection Rule Management Security Detection Rule Management Team Feature:Prebuilt Detection Rules Security Solution Prebuilt Detection Rules area v8.13.0 v8.12.2 labels Feb 7, 2024
@maximpn maximpn self-assigned this Feb 7, 2024
@maximpn maximpn force-pushed the fix-losing-exception-upon-rule-upgrade branch from 88bf61e to c49b541 Compare February 9, 2024 11:30
@maximpn
Copy link
Contributor Author

maximpn commented Feb 9, 2024

/ci

@maximpn maximpn requested a review from jpdjere February 9, 2024 16:22
@maximpn maximpn marked this pull request as ready for review February 9, 2024 16:22
@maximpn maximpn requested review from a team as code owners February 9, 2024 16:22
@elasticmachine
Copy link
Contributor

Pinging @elastic/security-detections-response (Team:Detections and Resp)

@elasticmachine
Copy link
Contributor

Pinging @elastic/security-solution (Team: SecuritySolution)

@elasticmachine
Copy link
Contributor

Pinging @elastic/security-detection-rule-management (Team:Detection Rule Management)

@jpdjere
Copy link
Contributor

jpdjere commented Feb 12, 2024

Question to both @maximpn and @banderror

The PR description states:

On top of this execution results and execution events also get lost due to missing to restore saved object id. Execution log isn't gone anywhere but can't be bound to a new id. Direct links to rule details page won't work neither after upgrade.

And this PR changes the behaviour of a rule being re-created by passing the same id as the existing rule.

However, what I see in the execution results endpoint is that we search and aggregate by ruleId from the params and not by saved object id.

So, if I'm understanding this correctly, this won't help to recover old execution results anyways in the case of rule updates with type change, since (again, if I remember correctly) the TRaDE team changes the rule_id when that happens.

But my main doubt is about the behaviour of the endpoint. Should we be searching/aggregating by SO id instead of ruleId? Especially since ruleId can be duplicate.

Copy link
Contributor

@jpdjere jpdjere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a couple nits and one major doubt. But it's not a blocker for this PR anyways, so LGTM

Then the rule bound data should be preserved
```

CASEs: generated alerts, exception lists (rule exception list, shared exception list, endpoint exception list), timeline reference, actions, enabled state, execution results and execution events.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was explained today that Examples is the keyword to use in Gherkin (not "Cases") :)

#176474 (comment)

const supertest = getService('supertest');
const log = getService('log');

describe('@ess @serverless @skipInQA install prebuilt rules from package w/ historical versions with mock rule assets', () => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

w/ -> with

Two characters difference and much more clarity. Same with w/o -> without

@maximpn maximpn force-pushed the fix-losing-exception-upon-rule-upgrade branch from 8e3b050 to 0045030 Compare February 12, 2024 11:44
@maximpn
Copy link
Contributor Author

maximpn commented Feb 12, 2024

@jpdjere here is an answer on your question

But my main doubt is about the behaviour of the endpoint. Should we be searching/aggregating by SO id instead of ruleId? Especially since ruleId can be duplicate.

Checking rule details page and type definitions here and here it becomes clear that Saved Object id is used. And it's super confusing it's named ruleId. It's good there is a comment to clarify the meaning.

@maximpn maximpn enabled auto-merge (squash) February 12, 2024 12:44
@banderror
Copy link
Contributor

in the case of rule updates with type change, since (again, if I remember correctly) the TRaDE team changes the rule_id when that happens

@jpdjere In most cases, TRADE don't change rule_id when they update a rule's type. This is why we handle this situation in our code in the first place.

@jpdjere
Copy link
Contributor

jpdjere commented Feb 12, 2024

@maximpn

Thanks 👍

Sorry, got mixed up when reading the endpoint handler code. It's the ruleId from the request params that we are using, not from the rule params.

@kibana-ci
Copy link
Collaborator

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

  • 💚 Build #192573 succeeded 8e3b0503bee3813f8f26d2fea997ac4e44a626d4
  • 💔 Build #192546 failed dfc023da43b1eb5cc69d48957722d3b553e59991
  • 💔 Build #192519 failed c49b5416ce486c5792316661b3744910e93dfbeb

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @maximpn

@maximpn maximpn merged commit ffdcc34 into elastic:main Feb 12, 2024
37 checks passed
@banderror
Copy link
Contributor

@maximpn Can you please run the updated tests through https://ci-stats.kibana.dev/trigger_flaky_test_runner/1? This is part of our DoD.

@kibanamachine
Copy link
Contributor

💔 All backports failed

Status Branch Result
8.12 Backport failed because of merge conflicts

Manual backport

To create the backport manually run:

node scripts/backport --pr 176421

Questions ?

Please refer to the Backport tool documentation

@maximpn maximpn deleted the fix-losing-exception-upon-rule-upgrade branch February 12, 2024 14:43
@maximpn
Copy link
Contributor Author

maximpn commented Feb 12, 2024

@banderror

Can you please run the updated tests through https://ci-stats.kibana.dev/trigger_flaky_test_runner/1? This is part of our DoD.

Here is a successful result. I've added it to the PR's checklist as well. It's interesting that Flaky test runner doesn't allow to specify more than 100 executions while our DoD says 200 executions required.

@maximpn
Copy link
Contributor Author

maximpn commented Feb 13, 2024

💚 All backports created successfully

Status Branch Result
8.12

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

maximpn added a commit to maximpn/kibana that referenced this pull request Feb 13, 2024
…ew version in which the rule's type is different (elastic#176421)

**Fixes:** elastic#169480

## Summary

This PR fixes losing the following rule data upon prebuilt rule upgrade to a new version in which the rule's type is different

- Saved Object id
- exceptions list (default and shared)
- Timeline id
- Timeline title

## Details

The problem occurs when user upgrades a prebuilt rule to a newer version which has a different rule type.

Checking the code it's not so hard to find [`upgradeRule()`](https://github.com/elastic/kibana/blob/main/x-pack/plugins/security_solution/server/lib/detection_engine/prebuilt_rules/logic/rule_objects/upgrade_prebuilt_rules.ts#L49) function which performs prebuilt rule upgrade. It has the following comment

> If we're trying to change the type of a prepackaged rule, we need to delete the old one and replace it with the new rule, keeping the enabled setting, actions, throttle, id, and exception lists from the old rule.

Looking below in the code it's clear that only enabled state and actions get restored upon rule upgrade. Missing to restore `exceptions lists` leads to disappearing exceptions upon rule upgrade.

On top of this `execution results` and `execution events` also get lost due to missing to restore saved object `id`. Execution log isn't gone anywhere but can't be bound to a new id. Direct links to rule details page won't work neither after upgrade.

This PR fixes the problem by restoring rule bound data after upgrade.

FTR tests were restructured to accommodate extra tests to cover this bug fix.

### Checklist

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios

(cherry picked from commit ffdcc34)

# Conflicts:
#	x-pack/plugins/security_solution/docs/testing/test_plans/detection_response/prebuilt_rules/installation_and_upgrade.md
#	x-pack/test/security_solution_api_integration/test_suites/detections_response/default_license/prebuilt_rules/management/install_and_upgrade_prebuilt_rules.ts
#	x-pack/test/security_solution_api_integration/test_suites/detections_response/default_license/prebuilt_rules/management/install_prebuilt_rules.ts
#	x-pack/test/security_solution_api_integration/test_suites/detections_response/default_license/prebuilt_rules/management/install_prebuilt_rules_with_historical_versions.ts
#	x-pack/test/security_solution_api_integration/test_suites/detections_response/default_license/prebuilt_rules/management/upgrade_prebuilt_rules.ts
#	x-pack/test/security_solution_api_integration/test_suites/detections_response/default_license/prebuilt_rules/management/upgrade_prebuilt_rules_with_historical_versions.ts
#	x-pack/test/security_solution_api_integration/test_suites/detections_response/rules_management/rule_import_export/trial_license_complete_tier/export_rules.ts
maximpn added a commit that referenced this pull request Feb 13, 2024
… to a new version in which the rule's type is different (#176421) (#176811)

# Backport

This will backport the following commits from `main` to `8.12`:
- [[Security Solution] Fix losing data upon prebuilt rule upgrade to a
new version in which the rule's type is different
(#176421)](#176421)

<!--- Backport version: 8.9.8 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Maxim
Palenov","email":"[email protected]"},"sourceCommit":{"committedDate":"2024-02-12T13:09:23Z","message":"[Security
Solution] Fix losing data upon prebuilt rule upgrade to a new version in
which the rule's type is different (#176421)\n\n**Fixes:**
https://github.com/elastic/kibana/issues/169480\r\n\r\n##
Summary\r\n\r\nThis PR fixes losing the following rule data upon
prebuilt rule upgrade to a new version in which the rule's type is
different\r\n\r\n- Saved Object id\r\n- exceptions list (default and
shared)\r\n- Timeline id\r\n- Timeline title\r\n\r\n##
Details\r\n\r\nThe problem occurs when user upgrades a prebuilt rule to
a newer version which has a different rule type.\r\n\r\nChecking the
code it's not so hard to find
[`upgradeRule()`](https://github.com/elastic/kibana/blob/main/x-pack/plugins/security_solution/server/lib/detection_engine/prebuilt_rules/logic/rule_objects/upgrade_prebuilt_rules.ts#L49)
function which performs prebuilt rule upgrade. It has the following
comment\r\n\r\n> If we're trying to change the type of a prepackaged
rule, we need to delete the old one and replace it with the new rule,
keeping the enabled setting, actions, throttle, id, and exception lists
from the old rule.\r\n\r\nLooking below in the code it's clear that only
enabled state and actions get restored upon rule upgrade. Missing to
restore `exceptions lists` leads to disappearing exceptions upon rule
upgrade.\r\n\r\nOn top of this `execution results` and `execution
events` also get lost due to missing to restore saved object `id`.
Execution log isn't gone anywhere but can't be bound to a new id. Direct
links to rule details page won't work neither after upgrade.\r\n\r\nThis
PR fixes the problem by restoring rule bound data after
upgrade.\r\n\r\nFTR tests were restructured to accommodate extra tests
to cover this bug fix. \r\n\r\n### Checklist\r\n\r\n- [x] [Unit or
functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common
scenarios","sha":"ffdcc34d0d4f05aad8ad979775e8b0f503af313d","branchLabelMapping":{"^v8.13.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["bug","release_note:fix","impact:high","Team:Detections
and Resp","Team: SecuritySolution","Team:Detection Rule
Management","Feature:Prebuilt Detection
Rules","v8.13.0","v8.12.2"],"number":176421,"url":"https://github.com/elastic/kibana/pull/176421","mergeCommit":{"message":"[Security
Solution] Fix losing data upon prebuilt rule upgrade to a new version in
which the rule's type is different (#176421)\n\n**Fixes:**
https://github.com/elastic/kibana/issues/169480\r\n\r\n##
Summary\r\n\r\nThis PR fixes losing the following rule data upon
prebuilt rule upgrade to a new version in which the rule's type is
different\r\n\r\n- Saved Object id\r\n- exceptions list (default and
shared)\r\n- Timeline id\r\n- Timeline title\r\n\r\n##
Details\r\n\r\nThe problem occurs when user upgrades a prebuilt rule to
a newer version which has a different rule type.\r\n\r\nChecking the
code it's not so hard to find
[`upgradeRule()`](https://github.com/elastic/kibana/blob/main/x-pack/plugins/security_solution/server/lib/detection_engine/prebuilt_rules/logic/rule_objects/upgrade_prebuilt_rules.ts#L49)
function which performs prebuilt rule upgrade. It has the following
comment\r\n\r\n> If we're trying to change the type of a prepackaged
rule, we need to delete the old one and replace it with the new rule,
keeping the enabled setting, actions, throttle, id, and exception lists
from the old rule.\r\n\r\nLooking below in the code it's clear that only
enabled state and actions get restored upon rule upgrade. Missing to
restore `exceptions lists` leads to disappearing exceptions upon rule
upgrade.\r\n\r\nOn top of this `execution results` and `execution
events` also get lost due to missing to restore saved object `id`.
Execution log isn't gone anywhere but can't be bound to a new id. Direct
links to rule details page won't work neither after upgrade.\r\n\r\nThis
PR fixes the problem by restoring rule bound data after
upgrade.\r\n\r\nFTR tests were restructured to accommodate extra tests
to cover this bug fix. \r\n\r\n### Checklist\r\n\r\n- [x] [Unit or
functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common
scenarios","sha":"ffdcc34d0d4f05aad8ad979775e8b0f503af313d"}},"sourceBranch":"main","suggestedTargetBranches":["8.12"],"targetPullRequestStates":[{"branch":"main","label":"v8.13.0","labelRegex":"^v8.13.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/176421","number":176421,"mergeCommit":{"message":"[Security
Solution] Fix losing data upon prebuilt rule upgrade to a new version in
which the rule's type is different (#176421)\n\n**Fixes:**
https://github.com/elastic/kibana/issues/169480\r\n\r\n##
Summary\r\n\r\nThis PR fixes losing the following rule data upon
prebuilt rule upgrade to a new version in which the rule's type is
different\r\n\r\n- Saved Object id\r\n- exceptions list (default and
shared)\r\n- Timeline id\r\n- Timeline title\r\n\r\n##
Details\r\n\r\nThe problem occurs when user upgrades a prebuilt rule to
a newer version which has a different rule type.\r\n\r\nChecking the
code it's not so hard to find
[`upgradeRule()`](https://github.com/elastic/kibana/blob/main/x-pack/plugins/security_solution/server/lib/detection_engine/prebuilt_rules/logic/rule_objects/upgrade_prebuilt_rules.ts#L49)
function which performs prebuilt rule upgrade. It has the following
comment\r\n\r\n> If we're trying to change the type of a prepackaged
rule, we need to delete the old one and replace it with the new rule,
keeping the enabled setting, actions, throttle, id, and exception lists
from the old rule.\r\n\r\nLooking below in the code it's clear that only
enabled state and actions get restored upon rule upgrade. Missing to
restore `exceptions lists` leads to disappearing exceptions upon rule
upgrade.\r\n\r\nOn top of this `execution results` and `execution
events` also get lost due to missing to restore saved object `id`.
Execution log isn't gone anywhere but can't be bound to a new id. Direct
links to rule details page won't work neither after upgrade.\r\n\r\nThis
PR fixes the problem by restoring rule bound data after
upgrade.\r\n\r\nFTR tests were restructured to accommodate extra tests
to cover this bug fix. \r\n\r\n### Checklist\r\n\r\n- [x] [Unit or
functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common
scenarios","sha":"ffdcc34d0d4f05aad8ad979775e8b0f503af313d"}},{"branch":"8.12","label":"v8.12.2","labelRegex":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"}]}]
BACKPORT-->
@banderror
Copy link
Contributor

It's interesting that Flaky test runner doesn't allow to specify more than 100 executions while our DoD says 200 executions required.

@maximpn Yes, looks like it's a new limitation of the flaky runner. We could still create several flaky runs to reach these numbers (200 for ESS and 200 for Serverless), right? Probably we should chat about that with the rest of the team at Tech Time.

CoenWarmer pushed a commit to CoenWarmer/kibana that referenced this pull request Feb 15, 2024
…ew version in which the rule's type is different (elastic#176421)

**Fixes:** elastic#169480

## Summary

This PR fixes losing the following rule data upon prebuilt rule upgrade to a new version in which the rule's type is different

- Saved Object id
- exceptions list (default and shared)
- Timeline id
- Timeline title

## Details

The problem occurs when user upgrades a prebuilt rule to a newer version which has a different rule type.

Checking the code it's not so hard to find [`upgradeRule()`](https://github.com/elastic/kibana/blob/main/x-pack/plugins/security_solution/server/lib/detection_engine/prebuilt_rules/logic/rule_objects/upgrade_prebuilt_rules.ts#L49) function which performs prebuilt rule upgrade. It has the following comment

> If we're trying to change the type of a prepackaged rule, we need to delete the old one and replace it with the new rule, keeping the enabled setting, actions, throttle, id, and exception lists from the old rule.

Looking below in the code it's clear that only enabled state and actions get restored upon rule upgrade. Missing to restore `exceptions lists` leads to disappearing exceptions upon rule upgrade.

On top of this `execution results` and `execution events` also get lost due to missing to restore saved object `id`. Execution log isn't gone anywhere but can't be bound to a new id. Direct links to rule details page won't work neither after upgrade.

This PR fixes the problem by restoring rule bound data after upgrade.

FTR tests were restructured to accommodate extra tests to cover this bug fix. 

### Checklist

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
fkanout pushed a commit to fkanout/kibana that referenced this pull request Mar 4, 2024
…ew version in which the rule's type is different (elastic#176421)

**Fixes:** elastic#169480

## Summary

This PR fixes losing the following rule data upon prebuilt rule upgrade to a new version in which the rule's type is different

- Saved Object id
- exceptions list (default and shared)
- Timeline id
- Timeline title

## Details

The problem occurs when user upgrades a prebuilt rule to a newer version which has a different rule type.

Checking the code it's not so hard to find [`upgradeRule()`](https://github.com/elastic/kibana/blob/main/x-pack/plugins/security_solution/server/lib/detection_engine/prebuilt_rules/logic/rule_objects/upgrade_prebuilt_rules.ts#L49) function which performs prebuilt rule upgrade. It has the following comment

> If we're trying to change the type of a prepackaged rule, we need to delete the old one and replace it with the new rule, keeping the enabled setting, actions, throttle, id, and exception lists from the old rule.

Looking below in the code it's clear that only enabled state and actions get restored upon rule upgrade. Missing to restore `exceptions lists` leads to disappearing exceptions upon rule upgrade.

On top of this `execution results` and `execution events` also get lost due to missing to restore saved object `id`. Execution log isn't gone anywhere but can't be bound to a new id. Direct links to rule details page won't work neither after upgrade.

This PR fixes the problem by restoring rule bound data after upgrade.

FTR tests were restructured to accommodate extra tests to cover this bug fix. 

### Checklist

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Fixes for quality problems that affect the customer experience Feature:Prebuilt Detection Rules Security Solution Prebuilt Detection Rules area impact:high Addressing this issue will have a high level of impact on the quality/strength of our product. release_note:fix Team:Detection Rule Management Security Detection Rule Management Team Team:Detections and Resp Security Detection Response Team Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. v8.12.2 v8.13.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants