Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend retryablehttp limits in SDK base client #945

Merged
merged 1 commit into from
Mar 27, 2024

Conversation

manicminer
Copy link
Contributor

@manicminer manicminer commented Mar 26, 2024

Explicitly set three fields in the retryablehttp.Client struct.

RetryWaitMin is the shortest wait time before attempting to retry the request, and is the same as the default, but set explicitly here for clarity.

RetryWaitMax is the longest wait time before attempting to retry the request. The default is 30 seconds, we extend this to 61 second in consideration of potential API throttling for especially long-lasting eventual consistency issues.

With these values, the retry curve looks something like this.

RetryMax is the maximum number of retries for a given API request, and is given the most consideration here. The default is 4, which is far too little for our purposes. Here, we set it to 16, which in conjunction with the above two field values, and when implemented using an exponential backoff algorithm, should take approximately 10 minutes to exhaust.

However, in consideration of really long lasting retries, we inspect the context and increase this if it looks like the retries will be exhausted before the context deadlines. The goal is to try and deadline the context just before the retries are exhausted.

For compatibility, since we don't strictly require a deadline to execute a request, we don't require one here.

Related: hashicorp/terraform-provider-azurerm#25397

Explicitly set three fields in the `retryablehttp.Client` struct.

`RetryWaitMin` is the shortest wait time before attempting to retry the
request, and is the same as the default, but set explicitly here for
clarity.

`RetryWaitMax` is the longest wait time before attempting to retry the
request. The default is 30 seconds, we extend this to 61 second in
consideration of potential API throttling for especially long-lasting
eventual consistency issues.

`RetryMax` is the maximum number of retries for a given API request, and
is given the most consideration here. The default is 4, which is far too
little for our purposes. Here, we set it to 16, which in conjunction
with the above two field values, and when implemented using an
exponential backoff algorithm, should take approximately 10 minutes to
exhaust.

However, in consideration of _really_ long lasting retries, we inspect
the context and increase this if it looks like the retries will be
exhausted before the context deadlines. The goal is to try and deadline
the context just before the retries are exhausted.

For compatibility, since we don't strictly require a deadline to execute
a request, we don't require one here.
@manicminer manicminer added enhancement New feature or request release-once-merged The SDK should be released once this PR is merged base-layer labels Mar 26, 2024
@manicminer manicminer requested a review from a team as a code owner March 26, 2024 18:47
Copy link
Contributor

@tombuildsstuff tombuildsstuff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One comment but otherwise LGTM 👍

@@ -672,6 +672,18 @@ func (c *Client) retryableClient(checkRetry retryablehttp.CheckRetry) (r *retrya
r.CheckRetry = checkRetry
r.ErrorHandler = RetryableErrorHandler
r.Logger = log.Default()
r.RetryWaitMin = 1 * time.Second
r.RetryWaitMax = 61 * time.Second
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW some ARM API's do have Retry-After headers with an excessive timeout (e.g. 10m) - but I don't see any harm in us polling more regularly given that extended interval so 👍

Copy link
Contributor Author

@manicminer manicminer Mar 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh if the response includes a Retry-After header, that is always respected/selected first before falling back to exponential backoff (i.e. these are not used in that case)

@manicminer manicminer merged commit 284232d into main Mar 27, 2024
6 checks passed
@manicminer manicminer deleted the sdk/retryable-limits branch March 27, 2024 16:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
base-layer enhancement New feature or request release-once-merged The SDK should be released once this PR is merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants