Handling Agent's heartbeat #36

Open
christian-pinto opened this issue Aug 16, 2024 · 1 comment
@christian-pinto
Collaborator

Implement a heartbeat event for each agent so that the time it was last seen is logged. How long can the gap between heartbeats be before an agent is no longer considered alive? Once that time has passed, what do we do with the agent and its associated resources?

@christian-pinto
Collaborator Author

christian-pinto commented Aug 16, 2024

In the library we should have a routine for keeping track of agents' heartbeats:

  • verify whether agents are timing out
  • should this be a service internal to the Sunfish core or an external one?
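
To make the tracking routine concrete, here is a minimal sketch of what it could look like. Everything in it is an assumption for discussion: `HeartbeatTracker`, `record_heartbeat`, `timed_out_agents`, and the `timeout_s` parameter are hypothetical names, not existing Sunfish APIs, and the timeout value is still the open question above. The same class could be driven from inside the core or from an external service.

```python
import time
from typing import Callable, Dict, List


class HeartbeatTracker:
    """Hypothetical helper: remembers when each agent was last seen."""

    def __init__(self, timeout_s: float,
                 clock: Callable[[], float] = time.monotonic):
        # timeout_s is an assumed, configurable value; the issue leaves it open.
        self.timeout_s = timeout_s
        self._clock = clock  # injectable so the logic can be tested without sleeping
        self._last_seen: Dict[str, float] = {}

    def record_heartbeat(self, agent_id: str) -> None:
        """Log the time of the latest heartbeat received from agent_id."""
        self._last_seen[agent_id] = self._clock()

    def timed_out_agents(self) -> List[str]:
        """Return the agents whose last heartbeat is older than timeout_s."""
        now = self._clock()
        return sorted(
            agent_id
            for agent_id, seen in self._last_seen.items()
            if now - seen > self.timeout_s
        )
```

A periodic task would call `timed_out_agents()` and hand the result to whatever policy we settle on for suspected agents.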

We should also figure out what to do with resources of an allegedly failed agent (missed heartbeat)

  • heartbeat messages can be lost, so a missed heartbeat alone does not prove failure and a request for proof of liveness is required, perhaps in the form of an event sent to the agent
  • resources owned by a failed agent could still be functional. One approach is to leave attached resources untouched unless other clients report a failure, and to stop allocating resources from the failing agent until it is operational again.
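
The two points above can be sketched as a small state machine. This is only an illustration under assumptions: `AgentState`, `AgentLiveness`, and the `send_probe` callback are hypothetical names, and how the probe event is actually delivered to an agent is not decided here. Note that nothing in the sketch detaches or releases resources; it only gates new allocations.

```python
from enum import Enum
from typing import Callable, Dict


class AgentState(Enum):
    ALIVE = "alive"
    SUSPECT = "suspect"  # heartbeat missed, liveness probe sent, no answer yet
    FAILED = "failed"    # probe went unanswered as well


class AgentLiveness:
    """Hypothetical policy: probe before declaring an agent failed."""

    def __init__(self, send_probe: Callable[[str], None]):
        # send_probe is an assumed callback that sends a liveness event to the agent.
        self._send_probe = send_probe
        self._state: Dict[str, AgentState] = {}

    def on_heartbeat(self, agent_id: str) -> None:
        # A heartbeat or a probe reply both prove the agent is alive.
        self._state[agent_id] = AgentState.ALIVE

    def on_missed_heartbeat(self, agent_id: str) -> None:
        # The heartbeat may simply have been lost, so ask for proof of liveness.
        self._state[agent_id] = AgentState.SUSPECT
        self._send_probe(agent_id)

    def on_probe_timeout(self, agent_id: str) -> None:
        # Only a suspected agent that also ignores the probe is marked failed.
        if self._state.get(agent_id) is AgentState.SUSPECT:
            self._state[agent_id] = AgentState.FAILED

    def can_allocate_from(self, agent_id: str) -> bool:
        # New allocations are paused unless the agent is known to be alive;
        # resources that are already attached are left untouched.
        return self._state.get(agent_id) is AgentState.ALIVE
```

An agent that later sends a heartbeat again returns to `ALIVE` through `on_heartbeat`, which matches the idea of resuming allocation once it is operational.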
