Handling Agent's heartbeat #36

christian-pinto · 2024-08-16T15:23:16Z

Implement the heartbeat event for each agent to log when it was last seen. How much time can pass between heartbeats before we stop considering an agent alive? Once that time has passed, what do we do with the agent and its associated resources?

christian-pinto · 2024-08-16T15:24:24Z

In the library we should have a routine for keeping track of agents' heartbeats:

- verify whether agents are timing out
- should this be a service internal to the sunfish core or an external one?
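
To make the tracking routine concrete, here is a minimal sketch of what it could look like. The names `HeartbeatTracker`, `record_heartbeat`, `timed_out_agents`, and the `timeout_s` parameter are all hypothetical and not part of the existing library; the timeout value itself is still the open question above. The clock is injectable so the logic can be tested without sleeping.

```python
import time
from typing import Callable, Dict, List


class HeartbeatTracker:
    """Hypothetical helper: remembers when each agent was last seen."""

    def __init__(self, timeout_s: float,
                 clock: Callable[[], float] = time.monotonic):
        # timeout_s is an assumed, configurable threshold in seconds.
        self.timeout_s = timeout_s
        self._clock = clock
        self._last_seen: Dict[str, float] = {}

    def record_heartbeat(self, agent_id: str) -> None:
        # Called whenever a heartbeat event arrives from an agent.
        self._last_seen[agent_id] = self._clock()

    def timed_out_agents(self) -> List[str]:
        # Agents whose last heartbeat is older than the timeout.
        now = self._clock()
        return sorted(
            agent_id
            for agent_id, seen in self._last_seen.items()
            if now - seen > self.timeout_s
        )
```

Whether this runs inside the core or as an external service mostly changes who calls `timed_out_agents()` periodically; the bookkeeping itself would be the same.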

We should also figure out what to do with the resources of an allegedly failed agent (missed heartbeat):

- heartbeat messages could be lost, so a request for proof of agent liveness is required, perhaps in the form of an event sent to the agent
- resources owned by a failed agent could still be functional. One approach would be to leave attached resources untouched, unless other clients report a failure, and to stop allocating resources from the failing agent until it is operational again.
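
A rough sketch of the policy described above, under stated assumptions: `AgentRegistry`, `AgentRecord`, `AgentState`, and the `probe` callable are hypothetical names, and `probe` stands in for whatever liveness event we end up sending to the agent. A missed heartbeat triggers a probe first; only if the probe fails is the agent excluded from new allocations, and its attached resources are left as they are.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, Set


class AgentState(Enum):
    ALIVE = "alive"
    UNRESPONSIVE = "unresponsive"


@dataclass
class AgentRecord:
    state: AgentState = AgentState.ALIVE
    # Resources currently attached through this agent (illustrative IDs).
    attached_resources: Set[str] = field(default_factory=set)


class AgentRegistry:
    """Hypothetical registry applying the 'probe, then stop allocating' policy."""

    def __init__(self, probe: Callable[[str], bool]):
        # probe(agent_id) returns True if the agent answers the liveness request.
        self._probe = probe
        self.agents: Dict[str, AgentRecord] = {}

    def handle_missed_heartbeat(self, agent_id: str) -> AgentState:
        record = self.agents[agent_id]
        if self._probe(agent_id):
            # The heartbeat was probably lost; the agent is still alive.
            record.state = AgentState.ALIVE
        else:
            # Stop new allocations but leave attached resources untouched.
            record.state = AgentState.UNRESPONSIVE
        return record.state

    def can_allocate_from(self, agent_id: str) -> bool:
        return self.agents[agent_id].state is AgentState.ALIVE
```

An unresponsive agent would go back to `ALIVE` on its next successful probe or heartbeat. Handling failure reports from other clients is not covered here.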
