[8.16] Instrumentation: fix log trace inconsistent status code with timeout check when writing the response (backport #15123) #15164
Motivation/summary
This fix is branched from #15117.
This PR replaces `TimeoutMiddleware` with a proactive check for context timeout before the response result is written. The rationale is that a request timeout only matters if it happened before the response was written. Otherwise, the client won't care about the response anyway, and it's logical for the server to emit one of the two error signals consistently in self-instrumentation at this stage.

As a result of this change, self-instrumentation emits exactly one of the two signals: if the request timeout happened before the response was written, it emits a timeout error transaction with an error log; if the timeout happened afterwards, it emits the original error transaction with an error log.
This is an alternative to the error-chaining PR #15122, which would preserve both error logs. Update: after a brief discussion, we agreed to move forward with this option for the fix instead of the more complex error chaining.
Checklist
For functional changes, consider:
How to test these changes
This PR includes a unit test that encapsulates the condition needed to simulate the issue. To reproduce against a real instance of APM Server, follow the recipe from this comment: #14232 (comment).
Related issues
#15122
#15117
Fixes #14232
This is an automatic backport of pull request #15123 done by Mergify.