Fleet policy deployment on multiple Agents in a consecutive/sequenced 1by1 way #474
Comments
@jen-huang @nimarezainia I think this should be moved to the Kibana repository; it is a kind of canary deployment.
The following might also be useful to solve first: Let me better describe the ER opened by Gema and our current use case: 3 (or more to come) Agents running on 3 physical hosts (each with more than generous CPU cores and RAM), all using the same Fleet policy with multiple (20+) inputs (integrations). If we make a change to that already applied policy (e.g. switching the output from default to an output with dead letter), the Agents do not wait for each other to finish the change and return to a healthy state. All of them might transition to an unhealthy state because of a wrong custom input config or anything else. We would like to avoid that. At the moment there is no sequencing or awareness option, in either Fleet or the Agent, that would let the other Agents keep working (living) when a policy change fails on one Agent. The current unsequenced approach makes sense when you have a huge number of Agents and do not care if some break, but not for this use case. At least one should live. @ruflin showed a very interesting possible feature/enhancement in one of his presentations: I would like to have something similar available, but use our own LB to determine whether the Agents and their underlying inputs are alive.
@zez3 you have noted some of the enhancements regarding the reporting of an integration's status, which are the first steps toward what is being requested here. We will be looking at implementing more control over how the policy gets distributed. You can track this public issue: elastic/kibana#108267 Right now, as you noted, an updated policy is rolled out to all the agents in that policy. We have to define a way for one tranche of agents to get the update first, then another tranche. Would that be acceptable in this case, or do you think we need to update a tranche and wait for human intervention before carrying on to the next? (In some cases we, as a platform, wouldn't know whether an input is operating correctly.)
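To make the two options in the question above concrete, here is a minimal sketch of a tranche-based rollout with an optional approval gate between tranches. It is not Fleet or Kibana code: `apply_policy` and `approve_next` are hypothetical callbacks standing in for whatever would push a policy to an agent and whatever (a human prompt, an automated check) would decide whether to continue.

```python
from typing import Callable, List, Sequence


def split_into_tranches(agents: Sequence[str], size: int) -> List[List[str]]:
    """Split the agent list into consecutive groups of at most `size` agents."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [list(agents[i:i + size]) for i in range(0, len(agents), size)]


def rollout_in_tranches(
    agents: Sequence[str],
    size: int,
    apply_policy: Callable[[str], None],          # hypothetical: push the policy to one agent
    approve_next: Callable[[List[str]], bool],    # hypothetical: gate between tranches
) -> List[str]:
    """Update one tranche at a time; stop if the gate rejects the next tranche.

    Returns the agents that received the new policy.
    """
    updated: List[str] = []
    tranches = split_into_tranches(agents, size)
    for index, tranche in enumerate(tranches):
        for agent in tranche:
            apply_policy(agent)
            updated.append(agent)
        is_last = index == len(tranches) - 1
        # Ask for approval only when there is another tranche to update.
        if not is_last and not approve_next(tranche):
            break
    return updated
```

An `approve_next` that always returns `True` corresponds to the fully automatic variant; one that waits for an operator corresponds to the human-intervention variant.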
I guess the update attempt (verification step) should perhaps be based on OS, because the same integration (e.g. System, AV, others?) can be applied to Windows, Linux, and Mac. Regarding "as a platform" (not my case), perhaps it should silently fail, as it does now... I need more info to comment on this. Also, the feature discussed here is perhaps not desired by everyone. I think it should/could be activated on demand, either for specific policies or globally in Fleet.
Describe the enhancement:
Looking for a way to deploy a Fleet policy to multiple underlying Agents in a consecutive/sequenced way.
Describe a specific use case for the enhancement or feature:
A Fleet policy configured on multiple Agents needs some changes, but those changes might break the Agents' health status.
We would like changes to be rolled out one Agent at a time, preferably waiting until each Agent is healthy again before moving on to the next.
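The requested behaviour could look roughly like the sketch below: apply the change to one Agent, poll until it reports healthy, and stop at the first failure so that the remaining Agents keep running the old policy. This only illustrates the idea and is not an existing Fleet or Agent feature; `apply_policy` and `is_healthy` are hypothetical callbacks (the health check could, for example, be backed by your own load balancer), and the timeout values are arbitrary.

```python
import time
from typing import Callable, List, Optional, Sequence, Tuple


def wait_until_healthy(
    agent: str,
    is_healthy: Callable[[str], bool],   # hypothetical health probe
    timeout_s: float,
    poll_s: float,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the health probe until it succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if is_healthy(agent):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_s)


def rollout_one_by_one(
    agents: Sequence[str],
    apply_policy: Callable[[str], None],  # hypothetical: push the policy to one agent
    is_healthy: Callable[[str], bool],
    timeout_s: float = 300.0,
    poll_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[List[str], Optional[str], List[str]]:
    """Update agents sequentially and halt on the first one that stays unhealthy.

    Returns (updated_and_healthy, failed_agent_or_None, untouched_agents).
    """
    updated: List[str] = []
    for position, agent in enumerate(agents):
        apply_policy(agent)
        if not wait_until_healthy(agent, is_healthy, timeout_s, poll_s, sleep):
            # Stop here: agents after this one never receive the change.
            return updated, agent, list(agents[position + 1:])
        updated.append(agent)
    return updated, None, []
```

With three Agents, a broken change would take down at most the first Agent it is applied to, which satisfies the "at least one should live" requirement from the discussion.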