-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Skip rule execution when it has a newer typeVersion than Kibana instance #172161
Skip rule execution when it has a newer typeVersion than Kibana instance #172161
Conversation
@@ -941,7 +943,6 @@ export class TaskRunner< | |||
|
|||
return { interval: retryInterval }; | |||
}), | |||
monitoring: this.ruleMonitoring.getMonitoring(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This field is redundant and not used by TaskManager.
const { data } = await this.task.loadIndirectParams(); | ||
if (data) { | ||
// validate indirect params e.g. connector or rule.params | ||
if (indirectParamsSchema && indirectParams) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Patrick left a note about this check in this PR's POC.
When there is a version mismatch the task is skipped 100 times (default number) then doesn't return a skipError in order to let the rule run.
So it comes to this line and does indirectParamsSchema
validation as well. As it has already reached the maxAttempts
just logs a warning and doesn't return any skipError either.
We may skip this validation when there is already an error (hasValidationError). WDYT?
Pinging @elastic/response-ops (Team:ResponseOps) |
@@ -212,7 +212,7 @@ export default function executionStatusAlertTests({ getService }: FtrProviderCon | |||
}) | |||
.expect(200); | |||
|
|||
executionStatus = await waitForStatus(alertId, new Set(['error'])); | |||
executionStatus = await waitForStatus(alertId, new Set(['error']), 50000); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this waits for the task execution to be skipped 2 times and then run the rule, increased the delay to make it long enough for 2 skips.
@@ -315,7 +314,7 @@ export class RuleTypeRegistry { | |||
spaceId: schema.string(), | |||
consumer: schema.maybe(schema.string()), | |||
}), | |||
indirectParamsSchema: rawRuleSchemaV1, | |||
indirectParamsSchema: ruleType.validate.params, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we no longer validating the rule schema and just the rule parameters? Seems like changes to the rule schema could also potentially be backwards incompatible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually version check replaces the schema check, but i wanted to use it for params validation rather than removing the whole feature. Maybe removing is just better as we already have another issue for params, config and secrets validation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And the rule schema is just used for modelVersions which validates the rule just before creating the SO in ES.
This is to force developers to update the schema and bump the version.
Can you provide instructions for verifying this PR? Under what conditions should it succeed? Under what conditions should it fail? I am familiar at a high level but it would be helpful to document these things as well. |
e441fb8
to
08777fd
Compare
Added how-to-verify section to the description |
I'm not seeing the expected results from To verify: - the rule seems. to continue running without extra logging. Perhaps because I had a 1s interval? Will try again with a 1m. Also I figured I'd disable the rule before making the code change, make the code change, let Kibana restart and spam the console, wait for it to finish, then re-enable the rule in hopes of more clearly seeing the messages. Should work, right? |
Opss, I was confused... edit: There was one more missing thing, you need to pull the PR again. |
…sion' into 166967-skip-tasks-with-wrong-version
# Conflicts: # x-pack/plugins/task_manager/server/task_running/task_runner.ts
💚 Build Succeeded
Metrics [docs]Public APIs missing comments
Unknown metric groupsAPI count
History
To update your PR or re-run it, just comment with: cc @ersin-erdal |
The feature has been reverted. |
Resolves: #166967
This PR adds a new logic to skip task execution by using the
modelVersions
of saved objects.schemas.create
in modelVersions are used by Kibana to validate the SO that they are attached before saving them.So it will force developers to bump the versions whenever there is a change in the SO (add, remove etc.)
We have already saved the latestRuleVersion in ruleTypeRegistry with the following PR #171927
And with this PR, we also save the
typeVersion
in a rule on its create, update and execute actions, so we can pass the rule type version to TaskManager withloadIndirectParams
.Then we can compare the
latestRuleVersion
withtypeVersion
in TaskManager and skip the execution iftypeVersion
of a rule is greater thanlatestRuleVersion
. This means the rule has already been updated by a newer Kibana version.But before passing
typeVersion
to TM, we recursively call theisCompatibleWithPreviousVersion
method of the rule versions until one of them return false (or up to version 1) then return that version to TM.isCompatibleWithPreviousVersion
takes the rawRule as param, so a developer can decide if the new version is compatible with the previous one on run time.For instance: The new field
muted
is added to the rule SO. And a developer wants to if the fetched rule SO is compatible with the previous version, so they can do:Another change that is introduced by tis PR:
As we are replacing the indirectParams validation with version check in TM, we decided to validate only the
rule.params
rather than the whole rawRule. Therefore we returnrule.params
asindirectParams
andruleType.validate.params
asindirectParamsSchema
Note: we do
indirectParamsSchema.validate(indirectParams)
in TM.To verify:
Enable skipping feature in config:
Create a rule and check if it has the
typeVersion:1
field,replace
typeVersion
withtypeVersion: 2
(current latest rule type version is 1) in hereTM should be skipping the rule execution and print 2 warnings:
Task (xxx/xxx) has a newer version(2) than expected(1)
Task Manager has skipped executing the Task (xxx/xxx) x times as it has invalid params.
after 10 skips, the should be executed.
typeVersion field should be addedd to the rules from an older Kibana version (after upgrade) on the first execution.