Currently, when the Elastic Agent checks in with Fleet Server, it doesn't send the policy_id or revision of the policy that it is currently running. Fleet Server infers this information from the fact that the Elastic Agent ACK'd the policy change notification, but there are many cases where the two can become out of sync.
VM Snapshot
1. VM is snapshotted
2. New policy revision occurs
3. ACK'd by Elastic Agent (stored new revision in Fleet)
4. VM is rolled back
Now the running Elastic Agent policy is the old version, but to Fleet it is the new version.
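The drift in this scenario can be reproduced with a small model. This is only a sketch: `Agent`, `FleetRecord`, and `simulate_vm_rollback` are hypothetical names invented for illustration, not types from Elastic Agent or Fleet Server. It assumes Fleet tracks the revision purely from ACKs, as described above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Agent:
    """Hypothetical stand-in for the policy state held on the agent's VM."""
    running_revision: int

    def snapshot(self) -> int:
        # Capture the state a VM snapshot would preserve.
        return self.running_revision

    def restore(self, snap: int) -> None:
        # Roll back to the snapshotted state; Fleet is not informed.
        self.running_revision = snap

    def apply_and_ack(self, revision: int) -> int:
        # Apply the new revision and return the revision being ACK'd.
        self.running_revision = revision
        return revision


@dataclass
class FleetRecord:
    """Hypothetical stand-in for what Fleet stores per agent (ACK-based)."""
    acked_revision: int

    def on_ack(self, revision: int) -> None:
        self.acked_revision = revision


def simulate_vm_rollback() -> Tuple[int, int]:
    agent = Agent(running_revision=1)
    fleet = FleetRecord(acked_revision=1)

    snap = agent.snapshot()                # VM is snapshotted
    fleet.on_ack(agent.apply_and_ack(2))   # new revision is applied and ACK'd
    agent.restore(snap)                    # VM is rolled back

    # (what the agent runs, what Fleet believes)
    return agent.running_revision, fleet.acked_revision
```

After the rollback the two values differ, and nothing in an ACK-only flow gives Fleet a chance to notice.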
Bad Error Case
This is an unusual case, but a coding issue could cause it.
1. New revision is sent to Elastic Agent
2. Policy fails to be saved to disk (because of a coding issue or a filesystem problem)
3. Policy revision is ACK'd anyway (shouldn't happen, but if it does...)
Elastic Agent is now running the old version of the policy, but Fleet Server believes that it's the new revision.
Backup/restore of fleet.enc
The same can happen when fleet.enc is backed up and later restored:
1. fleet.enc is backed up
2. New policy revision occurs
3. ACK'd by Elastic Agent (stored new revision in Fleet)
4. fleet.enc is replaced with the backup from step 1
5. Elastic Agent is restarted
Elastic Agent is now running the old version of the policy, but Fleet Server believes that it's the new revision.
How to solve it?
Upon check-in, the Elastic Agent should send its current policy ID and revision. Fleet Server then compares them to what it expects and, if they do not match, sends the correct policy.
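A minimal sketch of that comparison might look like the following. `PolicyRef` and `policy_to_send` are hypothetical names used only for illustration, and the sketch assumes that any mismatch in either the policy ID or the revision (including an agent that reports nothing at all) should trigger a resend.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PolicyRef:
    """Hypothetical pair identifying a policy and its revision."""
    policy_id: str
    revision: int


def policy_to_send(
    expected: PolicyRef, reported: Optional[PolicyRef]
) -> Optional[PolicyRef]:
    """Return the policy the agent should receive, or None if it is in sync.

    `expected` is what the server believes the agent should run;
    `reported` is what the agent sent on check-in (None if it sent nothing).
    """
    if reported == expected:
        return None  # agent is already on the expected policy and revision
    return expected  # any difference means the expected policy is resent
```

With this shape, a rolled-back agent that reports revision 1 while the server expects revision 2 gets the expected policy again on its next check-in, without relying on the earlier ACK.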
The Fleet Server stores this information by the fact that the Elastic Agent ACK'd the policy change notification, but there are many cases where this could become out of sync.
This is another example of a place where, as with upgrades, we don't actually need explicit ACKs, because the actual state of the agent can be detected from the check-in payload.
More and more I think the way ACKs are used in action processing needs to be completely revisited or just designed out of the system. We frequently have an ACK and an actual state change or action result that are two separate updates to the system state, allowing one of them to not happen regardless of the other. We should eliminate this problem.
I believe we already have that with ACK. Elastic Agent should not ACK a revision that it cannot apply. Or are you thinking of something different than I am?
This is more the case where it did ACK, but the policy was then rolled back outside the control of the Elastic Agent.
We need to send the policy_id and revision on check-in so Fleet Server can check whether the agent is on the correct policy and the correct revision. If either is incorrect, it should send a new policy update action with the policy information.