Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat(watcher): add event retry queue #218

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

amirRamirfatahi
Copy link
Collaborator

@amirRamirfatahi amirRamirfatahi commented Nov 21, 2024

bin to handle failures without disrupting the watcher's flow of reading events.

Pre-submission Checklist

For tests to work you need a working neo4j and redis instance with the example dataset in docker/db-graph

  • [*] Testing: Implement and pass new tests for the new features/fixes, cargo test.
  • Performance: Ensure new code has relevant performance benchmarks, cargo bench

@amirRamirfatahi amirRamirfatahi changed the title Adds a retry bin to handle failures without disrupting the watcher's flow of reading events. Adds a retry Nov 21, 2024
@amirRamirfatahi amirRamirfatahi changed the title Adds a retry Adds a retry mechanism Nov 21, 2024
@SHAcollision
Copy link
Collaborator

SHAcollision commented Nov 21, 2024

Hey @amirRamirfatahi, thanks for putting together this PR!

A couple of different things here intertwining in this PR with different value/priority: 1. error curation (we need now) and 2. retry queue (future need)

  • Error Curation: It could be useful to start curating a list of errors to identify watcher bugs when handling real user events. The approach in this PR has some potential, though dumping detailed errors straight to the terminal and manually collecting them might be the simplest option for now. Still, feel free to experiment and collect some Errors/bugs that we can act on.

This relates directly to #169 (comment) and I would say this is pretty high priority. Quoting from the linked comment:

This is a janitorial task: compile an Issue with unexpected errors, their logs and if possible, how to reproduce them.

  • Retry Queue: A retry mechanism could be handy in the future, but it feels premature at the moment. We’d need to carefully define conditions for retries, e.g., "Homeserver not found" (retry in X hours) vs. validation errors (never retry). I think we can only be careful enough on defining "what to retry" after we make an exploration of the Errors we get in the watcher. That said, when we scale to multiple homeservers, the retry service concept will definitely be worth revisiting. Low priority at the moment.

I think we can let this PR in paused until the complexity is actually merited, as going in the weeds to review in depth, solve tests issues and merge this code subtracts work-power from other tasks.

@SHAcollision SHAcollision changed the title Adds a retry mechanism Feat(watcher): add event retry queue Nov 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants