Setting Global Timeouts beyond 60000 causes 500 error #989

Open
alex8bitw opened this issue Sep 30, 2024 · 6 comments

Comments

@alex8bitw

When we set:

betweenBytesTimeout
firstByteTimeout
connectTimeout

from 60000 to 180000, the Compute service failed completely. According to the Fastly documentation, these values can go up to 4294967296: https://js-compute-reference-docs.edgecompute.app/docs/fastly:experimental/allowDynamicBackends#parameters
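
For reference, a minimal sketch of the configuration described above. The three option names are the ones listed in this issue; `buildTimeouts` is a hypothetical helper, and the upper bound is the limit as quoted in this report, not a verified SDK constraint. The `allowDynamicBackends` call appears only as a comment because it needs the Compute runtime.

```javascript
// Upper limit as quoted in this issue (assumption, not checked against the SDK).
const MAX_TIMEOUT_MS = 4294967296;

// Hypothetical helper: applies one value (in milliseconds) to all three
// timeout options and rejects values outside the quoted range.
function buildTimeouts(ms) {
  if (!Number.isInteger(ms) || ms < 0 || ms > MAX_TIMEOUT_MS) {
    throw new RangeError(`timeout out of range: ${ms}`);
  }
  return {
    connectTimeout: ms,
    firstByteTimeout: ms,
    betweenBytesTimeout: ms,
  };
}

const timeouts = buildTimeouts(180000);

// Inside a Compute service this object would be passed along the lines of:
//   import { allowDynamicBackends } from "fastly:experimental";
//   allowDynamicBackends(timeouts);
```

A value of 180000 passes this range check, so the failure reported here is not explained by the documented limit alone.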


@guybedford
Member

I believe that when we previously diagnosed changes on this service, the cause was that raising these options can result in the Fastly Compute systems overloading your backend, which then causes failures.

I believe @ulyssa previously suggested that the fix is to better control or restrict the number of concurrent connections made to your backend across the network, to avoid overloading it.

@ulyssa

ulyssa commented Oct 1, 2024

Yes. When you increase the connect timeout or another backend timeout, it is possible for all of the slots available to a given service to be used up by instances of the service that are waiting on a request to the backend. If they hit the max guest runtime (~120s), they'll get killed and a 5xx error will be generated for them.
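
To make the arithmetic concrete, here is a small sketch that flags timeouts longer than the guest runtime limit. The 120000 ms figure is the approximate value mentioned in this comment, and `timeoutsOutlivingGuest` is a hypothetical helper, not part of the SDK.

```javascript
// Approximate max guest runtime from the comment above (assumption).
const MAX_GUEST_RUNTIME_MS = 120000;

// Returns the names of the timeouts that are longer than the guest runtime
// limit, i.e. where the guest could be killed before the timeout fires.
function timeoutsOutlivingGuest(timeouts, maxGuestRuntimeMs = MAX_GUEST_RUNTIME_MS) {
  return Object.entries(timeouts)
    .filter(([, ms]) => ms > maxGuestRuntimeMs)
    .map(([name]) => name);
}

const reported = {
  connectTimeout: 180000,
  firstByteTimeout: 180000,
  betweenBytesTimeout: 180000,
};
const previous = {
  connectTimeout: 60000,
  firstByteTimeout: 60000,
  betweenBytesTimeout: 60000,
};

const risky = timeoutsOutlivingGuest(reported); // all three names
const safe = timeoutsOutlivingGuest(previous);  // empty array
```

With the values from this issue, every timeout at 180000 exceeds the roughly 120 s limit, while the earlier 60000 setting does not.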

When dealing with origin requests that might take a while to send a body, as long as the origin returns the response headers, you can send the response body handle and exit the guest. (I'm not entirely sure what this looks like in the JS SDK, but @guybedford may be able to point you at an example.) This frees up the guest slot and its memory, and allows a new incoming request to be handled while the response gets streamed in the background.
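
One possible shape for this in JavaScript, offered as an untested sketch: hand the origin's body stream to the client response without reading it. `passThrough` is a hypothetical helper and `"origin"` is a placeholder backend name. Whether the guest actually exits early in the JS SDK is not confirmed in this thread.

```javascript
// Builds the client response from the origin response without reading the
// body: the body stream is handed over as-is instead of being buffered.
function passThrough(originResponse) {
  return new Response(originResponse.body, {
    status: originResponse.status,
    headers: originResponse.headers,
  });
}

// In a Compute service the wiring might look like this (assumed):
//   addEventListener("fetch", (event) => event.respondWith(handle(event)));
//   async function handle(event) {
//     const originResponse = await fetch(event.request, { backend: "origin" });
//     return passThrough(originResponse);
//   }
```

The point of the sketch is that nothing calls `text()`, `json()`, or `arrayBuffer()` on the origin response, so the handler does not wait for the full body.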

@alex8bitw
Author

We didn't see any increase in connections at the origin. In fact, we saw the opposite: nothing was getting through at all.

@guybedford
Member

Do you have a test service that you use here? Can you replicate the issue on the test service with a single request? Are you able to check the error logs for that 500 request to determine the 500 error data?

@alex8bitw
Author

This was the original error from the Compute service. I'll set up a test next weekend.

stdout | b3c222fb | Error: DOMException { code: 19, message: "HTTP response timeout", name: "NetworkError", INDEX_SIZE_ERR: 1, DOMSTRING_SIZE_ERR: 2, HIERARCHY_REQUEST_ERR: 3, WRONG_DOCUMENT_ERR: 4, INVALID_CHARACTER_ERR: 5, NO_DATA_ALLOWED_ERR: 6, NO_MODIFICATION_ALLOWED_ERR: 7, NOT_FOUND_ERR: 8, NOT_SUPPORTED_ERR: 9, INUSE_ATTRIBUTE_ERR: 10, INVALID_STATE_ERR: 11, SYNTAX_ERR: 12, INVALID_MODIFICATION_ERR: 13, NAMESPACE_ERR: 14, INVALID_ACCESS_ERR: 15, VALIDATION_ERR: 16, TYPE_MISMATCH_ERR: 17, SECURITY_ERR: 18, NETWORK_ERR: 19, ABORT_ERR: 20, URL_MISMATCH_ERR: 21, QUOTA_EXCEEDED_ERR: 22, TIMEOUT_ERR: 23, INVALID_NODE_TYPE_ERR: 24, DATA_CLONE_ERR: 25 }

Unfortunately, this is not very informative as to the true cause.

@guybedford
Member

This is exactly the error produced when the timeout triggers. Yes, a test setup would help to confirm that this is the expected timeout behaviour.
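
As a sketch of how a service might recognise this error, the check below matches the `name` and `message` fields shown in the log line quoted in this thread. `isBackendTimeout` is a hypothetical helper, and matching on the message string is an assumption that could break if the SDK changes its wording.

```javascript
// Matches the error shape from the log in this issue:
// name "NetworkError", message "HTTP response timeout".
function isBackendTimeout(err) {
  return (
    err != null &&
    err.name === "NetworkError" &&
    err.message === "HTTP response timeout"
  );
}

// Example wiring inside a handler (assumed, "origin" is a placeholder):
//   try {
//     return await fetch(request, { backend: "origin" });
//   } catch (err) {
//     if (isBackendTimeout(err)) {
//       return new Response("Origin timed out", { status: 504 });
//     }
//     throw err;
//   }
```

Returning an explicit 504 for this case would make a triggered timeout easier to tell apart from other 500 errors in the logs.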

3 participants