Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Log cancelled and failed requests #919

Merged
merged 2 commits into from
May 14, 2024
Merged

Log cancelled and failed requests #919

merged 2 commits into from
May 14, 2024

Conversation

mgyucht
Copy link
Contributor

@mgyucht mgyucht commented May 13, 2024

Changes

Currently, we do not log requests that failed to complete or that failed when the SDK started processing the response. This means that requests that timeout due to a client-side timeout or canceled context are not logged, meaning that there is no visibility in the log of these kinds of requests.

This PR changes the request processing and error handling to ensure that all requests are logged, even those that are cancelled or fail due to other errors. The general approach is to process as much of the response as possible until encountering an error, log what we've retrieved, and then do any post-processing.

Some changes are needed in the roundtrip stringer to handle the case of a nil response.

Tests

Unit tests verify that canceled requests or nil responses are logged.

  • make test passing
  • make fmt applied
  • relevant integration tests applied

@mgyucht mgyucht requested a review from pietern May 13, 2024 11:28
@mgyucht mgyucht changed the title Log cancelled requests Log cancelled and failed requests May 13, 2024
@codecov-commenter
Copy link

codecov-commenter commented May 13, 2024

Codecov Report

Attention: Patch coverage is 86.66667% with 4 lines in your changes are missing coverage. Please review.

Project coverage is 7.09%. Comparing base (27d08a6) to head (4deafb4).

Files Patch % Lines
apierr/errors.go 0.00% 4 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##            main    #919      +/-   ##
========================================
+ Coverage   7.08%   7.09%   +0.01%     
========================================
  Files        276     276              
  Lines      64376   64381       +5     
========================================
+ Hits        4563    4570       +7     
  Misses     59502   59502              
+ Partials     311     309       -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

apierr/errors.go Outdated Show resolved Hide resolved
apierr/errors.go Outdated Show resolved Hide resolved
httpclient/api_client_test.go Outdated Show resolved Hide resolved
assert.Equal(t, `[DEBUG] non-retriable error: Get "/a": request timed out after 30s of inactivity
[DEBUG] GET /a
< Error: Get "/a": request timed out after 30s of inactivity
`, bufLogger.String())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could eliminate the first line to avoid duplication, perhaps?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should keep it since there are two distinct things being logged: the raw request/response pair, and then the decision to not retry the error. I've reordered them so that there is a logical flow in the log output.

apierr/errors.go Outdated Show resolved Hide resolved
// client failed to read the response body.
// Otherwise the response body will include details about the error.
if len(r.ResponseBody) == 0 && r.Err != nil {
sb.WriteString(fmt.Sprintf(" (Error: %s)", r.Err))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a question, when length of response body is > 0 (line 86), is it always true that the Err would be nil?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessarily. We call the ErrorMapper function that parses the response into one of the SDK errors depending on the result. If the API responds with any of our standard error codes, r.Err will be set to that.

This is just refactored from above, so there is no behavior change here.

@mgyucht mgyucht added this pull request to the merge queue May 14, 2024
Merged via the queue into main with commit 63343c8 May 14, 2024
7 checks passed
@mgyucht mgyucht deleted the log-on-context-timeout branch May 14, 2024 08:15
mgyucht added a commit that referenced this pull request May 14, 2024
* Fixed codecov for repository ([#909](#909)).
* Add traceparent header to enable distributed tracing. ([#914](#914)).
* Log cancelled and failed requests ([#919](#919)).

Dependency updates:

 * Bump golang.org/x/net from 0.22.0 to 0.24.0 ([#884](#884)).
 * Bump golang.org/x/net from 0.17.0 to 0.23.0 in /examples/zerolog ([#896](#896)).
 * Bump golang.org/x/net from 0.21.0 to 0.23.0 in /examples/slog ([#897](#897)).
@mgyucht mgyucht mentioned this pull request May 14, 2024
github-merge-queue bot pushed a commit that referenced this pull request May 14, 2024
* Fixed codecov for repository
([#909](#909)).
* Add traceparent header to enable distributed tracing.
([#914](#914)).
* Log cancelled and failed requests
([#919](#919)).

Dependency updates:

* Bump golang.org/x/net from 0.22.0 to 0.24.0
([#884](#884)).
* Bump golang.org/x/net from 0.17.0 to 0.23.0 in /examples/zerolog
([#896](#896)).
* Bump golang.org/x/net from 0.21.0 to 0.23.0 in /examples/slog
([#897](#897)).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants