Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error handling on fetch #69

Open
bflora2 opened this issue Jun 22, 2024 · 3 comments
Open

Error handling on fetch #69

bflora2 opened this issue Jun 22, 2024 · 3 comments

Comments

@bflora2
Copy link

bflora2 commented Jun 22, 2024

I use aws4fetch.esm.js (v1.0.18) in a Cloudflare worker to access DynamoDB. Occasionally I see the following exception thrown from the AWSClient.fetch method:

    Error: Network connection lost.
    at async AwsClient.fetch (aws4fetch.esm.js:69:19)

The problem is that exceptions thrown by await fetched in AWSClient.fetch short-circuit the retry loop. Short-circuiting the retry loop is fine for exceptions caused by a bad request, but it's not good for exceptions thrown due to transient network errors. Perhaps network exceptions could be caught in AWSClient.fetch so the retry loop can continue in those cases?

@mhart
Copy link
Owner

mhart commented Jun 22, 2024

The problem with these sorts of errors is you don't know whether they're safe to retry or not (or at least, aws4fetch wouldn't know generally). Eg, you might accidentally send duplicate messages over SNS or SQS, etc, etc.

It's possible I could introduce an opt-in option for it though. Aside from this error, are there any others you've observed?

@bflora2
Copy link
Author

bflora2 commented Jun 23, 2024

I've been using aws4fetch for at least a few years and I don't recall ever seeing it throw an exception for anything other than network errors. The MDN documentation page for fetch lists the reasons an exception might thrown, and network errors appears to be the only scenario listed that would warrant a retry.

In my case, I usually only see occasional instances of a single network error log appear even though the Cloudflare worker is normally being invoked multiple times per second (and always calls to AWS). So, retrying these AWS requests resulting in network errors would almost certainly be successful.

I suspect the majority of AWS calls are reads/writes to resources where a duplicate read/write would not cause an issue, and therefore, a retry on network error would be very desirable. In the scenarios you mentioned where a duplicate AWS call would be a problem, I think most devs would want to retry and deal with potential duplication issues on the AWS side. Otherwise, the dev has to choose between either not retrying or else attempting to communicate further with AWS to determine whether a retry is safe, neither of which seems like an optimal solution. In my case, I ignored the network errors, but only because the worker is performing non-critical work where a few dropped AWS calls per month is an insignificant issue. However, I will soon begin using aws4fetch for new high traffic Cloudflare workers whose work will be more critical, which has prompted me to finally look into the network errors.

If aws4fetch was new, and not yet released, I think retrying on network errors by default, while also providing an opt-out option, would be the most desirable choice. However, I can understand that you might be concerned about potential unintended consequences of making such a change given that aws4fetch is already in use. So, as you suggested, perhaps a new opt-in option is best. I'm sure whatever solution you might consider would be appreciated by us aws4fetch fans. :-) Thank you for listening.

@Manouchehri
Copy link
Contributor

@bflora2 What I recommend doing, is using .sign from aws4fetch, and then implementing whatever retry logic you want "normally".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants