panic in pubsub lib Topic #2554
This happened in production on Aug 29, 2024 at 2:10:40.517 pm AEST. Production logs:
|
Line in the lib: https://github.com/alecthomas/types/blob/92ffae5908acce44483cd09ed1c7918fea61f7d8/pubsub/pubsub.go#L197

We use the lib in a few places, but judging from the linked CI logs, it could be:
I don't think it can be |
…#2563)

fixes #2554

Current theory is this:
- The cluster has a low number of active modules (let's say 0).
- A gRPC call comes into the controller to PullSchema.
- This causes us to subscribe to schema changes with a chan whose length is the count of currently active modules (in our example, 0).
- The lib then creates an extra buffer to queue up messages while the subscriber processes messages, with the buffer being the same size as the provided chan (again, 0).
- When a schema change message is received by the subscriber, we stream the message back over gRPC. This may take time.
- While this is happening, another schema update may occur. The lib will not be able to ack the message because the buffer size is too small (0), so it will wait for the message to be received.
- The lib will time out if the networking is not done fast enough.

Repro'd by doing the following:
- Added a sleep step before sending the schema update over the network.
- Started up FTL without any modules.
- Ran `ftl schema get --watch`.
- Ran `ftl deploy` with a bunch of modules.
- Hit the `ack timout` panic.

Could not repro after making the chan length always be a decent size.
Running `just e2e-frontend` in CI is getting a panic here: https://github.com/TBD54566975/ftl/actions/runs/10622270938/job/29446142300#step:6:1238

This command is running `ftl dev --recreate` under the hood.

Note: this is not necessarily part of FTL's pubsub feature. It is caused by something using @alecthomas's pubsub library.