Reset federation retry timer after receiving a message from the (formerly) offline homeserver #17769

Closed
aerusso opened this issue Sep 30, 2024 · 4 comments

Comments

aerusso commented Sep 30, 2024

Description:

If homeserver A goes down, and a user from homeserver B tries to contact someone on A, B will (correctly) notice that A is offline. If A comes back online, B (as far as I can tell) will wait until its retry timer expires before attempting to contact A again.

However, if a user on A sends a message to a user on B, B becomes aware that A is online again. At this point, it makes sense for B to start retrying sending messages to A.

This would significantly improve the user experience. Until these retry timers expire, users on A will be able to send messages to users on B, but not vice versa.
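The requested behaviour can be sketched roughly as follows. This is a minimal illustration, not Synapse's actual implementation: the class `DestinationBackoff`, its method names, and the interval values are all hypothetical, chosen only to show a per-destination backoff that is cleared when inbound traffic from that destination arrives.

```python
from dataclasses import dataclass


@dataclass
class DestinationBackoff:
    """Hypothetical per-destination retry state (not a Synapse class)."""

    min_interval: float = 60.0     # assumed first backoff, in seconds
    max_interval: float = 3600.0   # assumed upper bound on the backoff
    multiplier: float = 2.0        # assumed growth factor per failure
    retry_interval: float = 0.0    # 0 means "not currently backing off"
    retry_after: float = 0.0       # earliest time another attempt is allowed

    def record_failure(self, now: float) -> None:
        # Start at min_interval, then grow geometrically up to max_interval.
        if self.retry_interval == 0:
            self.retry_interval = self.min_interval
        else:
            self.retry_interval = min(
                self.retry_interval * self.multiplier, self.max_interval
            )
        self.retry_after = now + self.retry_interval

    def record_success(self) -> None:
        # A successful outbound request clears the backoff.
        self.retry_interval = 0.0
        self.retry_after = 0.0

    def record_inbound_traffic(self) -> None:
        # The proposal in this issue: a message received from the remote
        # server shows it is reachable again, so clear the backoff early.
        self.record_success()

    def may_send(self, now: float) -> bool:
        return now >= self.retry_after
```

With this sketch, B would call `record_inbound_traffic()` for A when a message from A arrives, after which `may_send()` returns true immediately instead of only once `retry_after` has passed.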

daedric7 commented Oct 4, 2024

I was under the impression that the backoff behaviour is exactly that: it gets reset to 0 as soon as an event from the MIA server is received.

Do you have any evidence that that's not the case?

aerusso commented Oct 5, 2024

Yes. Quite literally the example I gave was my observation using 1.115.0. I manage two homeservers, took one down, and accidentally sent a test message to it before bringing it back up. As an add-on to this problem, the encryption keys for some messages were never federated between the clients, leading to apparently permanently undecryptable messages on one of the clients.

That said, I have only tested this with E2E encryption. Could that cause this to happen? Can you point to the line that re-sends outstanding requests?

daedric7 commented Oct 5, 2024

OK, a practical test:

[image]

My Synapse believes that the server is down (and it is; the server failed to boot).

Server came up:

[image]

I did nothing else other than make the server boot.

aerusso commented Oct 5, 2024

I can't seem to reproduce this on 1.116. Maybe something changed since 1.115? I'll reopen this if I can get a reproducer.

Thanks for bearing with me.

aerusso closed this as completed Oct 5, 2024