[ResponseOps] implement task claiming strategy mget #180485
/ci
Taking $ qaf rac alert-load \
--rule-count 200 \
--rule-interval 1m \
--run-minutes 10 \
--percent-firing 0 \
--es-url https://keepkibana-pr-180485-elasticsearch-ea83fb.es.eu-west-1.aws.qa.elastic.cloud \
--kibana-url https://keepkibana-pr-180485-elasticsearch-ea83fb.kb.eu-west-1.aws.qa.elastic.cloud \
--username testing-internal \
--password [secret-here]

That command will create 200 rules and run them for 10 minutes, then produce some dashboards showing stats. Nice! The TM stats don't seem to be there; I'm guessing that's because the TM health report is not really available for serverless. I'm going to take a look at the event log directly instead ...
resolves: elastic#155770

Adds a new task claiming strategy `mget`, which can be used instead of the default one, `default`. Add the following to your `kibana.yml` to enable it: `xpack.task_manager.claim_strategy: 'mget'`
/ci
I removed the cluster / project auto-deployments; they're hard to control, and I figure using the custom images will be good enough.
/ci
/ci
I changed the default task claimer to the one implemented here, to make it easy to test in cloud without overrides. Interesting to see the FT failures - I thought there would be more!
/ci
x-pack/plugins/task_manager/server/task_claimers/strategy_mget.ts
My last commit before my last comment-based merge upstream had some functional test errors. Hoping they're fixed with the revert of the metrics PR 🤞🏻.
I got all the feedback I needed from my side, thanks @pmuellr for getting this through ❤️ Once the PR is merged, we should create the follow-up issues, as we'll most likely prioritize them right after.
LGTM! Thanks for addressing comments
@elasticmachine merge upstream
@elasticmachine merge upstream
…only claimable task types (#185894)

## Summary

This PR updates the overdue metrics collector to filter to only claimable task types. I borrowed the `OneOfTaskTypes` clause from #180485

```
// a task type that's not excluded (may be removed or not)
OneOfTaskTypes('task.taskType', searchedTypes),
```

### Checklist

- [ ] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
…o only claimable task types (#185892)

## Summary

This PR updates the overdue metrics collector to filter to only claimable task types. I borrowed the `OneOfTaskTypes` clause from #180485

```
// a task type that's not excluded (may be removed or not)
OneOfTaskTypes('task.taskType', searchedTypes),
```

### Checklist

- [ ] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
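As a rough illustration of what filtering to "only claimable task types" could look like, here is a minimal sketch that builds an Elasticsearch `terms` filter over `task.taskType` and combines it with an overdue check. The helper names `oneOfTaskTypes` and `overdueClaimableQuery`, the `task.runAt` field, and the overall query shape are assumptions made for this example; the real `OneOfTaskTypes` clause in the task manager plugin may be built differently.

```typescript
// Hypothetical stand-in for a terms-style clause; the real OneOfTaskTypes
// helper in Kibana may have a different signature and return shape.
interface TermsClause {
  terms: Record<string, string[]>;
}

// Match documents whose `field` equals any of the given task types.
function oneOfTaskTypes(field: string, types: string[]): TermsClause {
  return { terms: { [field]: types } };
}

// Assumed shape of an "overdue" filter: the task has a claimable type
// AND its scheduled run time (assumed field: task.runAt) is in the past.
function overdueClaimableQuery(claimableTypes: string[], now: Date) {
  return {
    bool: {
      filter: [
        oneOfTaskTypes('task.taskType', claimableTypes),
        { range: { 'task.runAt': { lte: now.toISOString() } } },
      ],
    },
  };
}
```

Keeping the type filter as its own small function means the same clause could be shared between a claimer and a metrics collector, which is the kind of reuse the commit message describes.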
@elasticmachine merge upstream
💛 Build succeeded, but was flaky
resolves: #181325
Summary
Adds a new task claiming strategy `mget`, which can be used instead of the default one, `default`. Add the following to your `kibana.yml` to enable it:

Work still to be done
To Verify
Add `xpack.task_manager.claim_strategy: 'unsafe_mget'` to your `kibana.dev.yml` and you'll be using the new claimer. The most obvious problem is the task manager health report has null/0 values for the task claim section (not sure why, yet).

You can set TM debug logging on, and should see a message every claim cycle:
Here's the logging stanza:
When running multiple Kibana instances, you'll see cases where all the potential tasks are stale, or the updates were in conflict, and nothing actually gets claimed (noted in things to fix, above).
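To make the stale and conflict cases concrete, here is a minimal sketch of an optimistic claim cycle against an in-memory store. Everything in it (`FakeTaskStore`, `claimTasks`, the `version` counter, the `ClaimResult` shape) is hypothetical and exists only for illustration; it is not the code in `strategy_mget.ts`, and the real claimer talks to Elasticsearch rather than a `Map`.

```typescript
// Hypothetical in-memory stand-in for the task document store; not Kibana's API.
interface TaskDoc {
  id: string;
  version: number; // bumped on every successful update
  status: 'idle' | 'claiming';
  ownerId?: string;
}

class FakeTaskStore {
  private docs = new Map<string, TaskDoc>();

  constructor(initial: TaskDoc[]) {
    for (const doc of initial) this.docs.set(doc.id, { ...doc });
  }

  // Multi-get: return a copy of the current state of each doc that exists.
  mget(ids: string[]): TaskDoc[] {
    const found: TaskDoc[] = [];
    for (const id of ids) {
      const doc = this.docs.get(id);
      if (doc) found.push({ ...doc });
    }
    return found;
  }

  // Optimistic update: succeeds only if the caller saw the latest version.
  update(id: string, expectedVersion: number, ownerId: string): boolean {
    const doc = this.docs.get(id);
    if (!doc || doc.version !== expectedVersion) return false; // conflict
    this.docs.set(id, { ...doc, status: 'claiming', ownerId, version: doc.version + 1 });
    return true;
  }
}

interface ClaimResult {
  claimed: string[];
  stale: string[];
  conflicts: string[];
}

// `candidates` are the results of an earlier, possibly outdated, search.
function claimTasks(store: FakeTaskStore, candidates: TaskDoc[], ownerId: string): ClaimResult {
  const result: ClaimResult = { claimed: [], stale: [], conflicts: [] };
  const current = new Map<string, TaskDoc>();
  for (const doc of store.mget(candidates.map((c) => c.id))) current.set(doc.id, doc);

  for (const candidate of candidates) {
    const latest = current.get(candidate.id);
    // Stale: the doc changed (or vanished) since the search ran.
    if (!latest || latest.version !== candidate.version || latest.status !== 'idle') {
      result.stale.push(candidate.id);
      continue;
    }
    if (store.update(candidate.id, latest.version, ownerId)) {
      result.claimed.push(candidate.id);
    } else {
      result.conflicts.push(candidate.id); // another instance won the race
    }
  }
  return result;
}

// Two instances act on the same search results; the second finds everything stale.
const store = new FakeTaskStore([
  { id: 'a', version: 1, status: 'idle' },
  { id: 'b', version: 1, status: 'idle' },
]);
const searchResults = store.mget(['a', 'b']);
const first = claimTasks(store, searchResults, 'kibana-1');
const second = claimTasks(store, searchResults, 'kibana-2');
```

In this sketch the second instance claims nothing because both candidates changed after its search, which mirrors the "nothing actually gets claimed" cycles described above. A conflict would only appear if another writer updated a document between the `mget` and the `update`, which a single-threaded example cannot show directly.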
If you want to dig a little further, a command-line tool is available in `x-pack/plugins/task_manager/server/manual_tests/get_rule_run_event_logs.js`. It is used to pull rule run execution documents from multiple clusters at once, and provide some augmented info in them, regarding workers, idle time, etc. The idea is to do an A/B test of using the new task claimer vs default, then see how the runs compare.