Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add traceparent header to enable distributed tracing. #914

Merged
merged 3 commits into from
May 8, 2024
Merged

Conversation

mgyucht
Copy link
Contributor

@mgyucht mgyucht commented May 7, 2024

Changes

Currently, it is difficult to correlate requests made by a client with a request made by a backend service, especially if the same request is made multiple times. The REST API supports the Trace Context standard which defines a standard way of propagating tracing information through HTTP headers.

This PR implements a traceparent generator to construct new traceparent headers in accordance with this standard. These traceparents are attached to the headers of each individual request. The resulting header is visible when debug logs are enabled and DATABRICKS_DEBUG_HEADERS is set.

Tests

Added a unit test to ensure that new traceparents are set for every request.

It's hard to test that a debug log actually contains this header without rewriting a lot of the matching logic for debug logs, but I did need to remove traceparent from those unit tests, suggesting that debug logging does include traceparent (like any other header).

Manually tested that the traceparent set by the client is visible to the server by checking our internal access logs.

  • make test passing
  • make fmt applied
  • relevant integration tests applied

@mgyucht mgyucht requested a review from hectorcast-db May 7, 2024 11:26
Copy link
Contributor

@hectorcast-db hectorcast-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a note, otherwise it looks good.

@mgyucht mgyucht enabled auto-merge May 7, 2024 15:23
@mgyucht mgyucht disabled auto-merge May 7, 2024 15:23
@@ -99,7 +99,7 @@ type Config struct {
// Use at your own risk or for unit testing purposes.
InsecureSkipVerify bool `name:"skip_verify" auth:"-"`

// Number of seconds for HTTP timeout. Default is 300 (5 minutes).
// Number of seconds for HTTP timeout. Default is 60 (1 minute).
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a typo from before, the default has always been 60 seconds.

@mgyucht mgyucht added this pull request to the merge queue May 8, 2024
Merged via the queue into main with commit 27d08a6 May 8, 2024
7 checks passed
@mgyucht mgyucht deleted the add-traceparent branch May 8, 2024 13:46
mgyucht added a commit that referenced this pull request May 14, 2024
* Fixed codecov for repository ([#909](#909)).
* Add traceparent header to enable distributed tracing. ([#914](#914)).
* Log cancelled and failed requests ([#919](#919)).

Dependency updates:

 * Bump golang.org/x/net from 0.22.0 to 0.24.0 ([#884](#884)).
 * Bump golang.org/x/net from 0.17.0 to 0.23.0 in /examples/zerolog ([#896](#896)).
 * Bump golang.org/x/net from 0.21.0 to 0.23.0 in /examples/slog ([#897](#897)).
@mgyucht mgyucht mentioned this pull request May 14, 2024
github-merge-queue bot pushed a commit that referenced this pull request May 14, 2024
* Fixed codecov for repository
([#909](#909)).
* Add traceparent header to enable distributed tracing.
([#914](#914)).
* Log cancelled and failed requests
([#919](#919)).

Dependency updates:

* Bump golang.org/x/net from 0.22.0 to 0.24.0
([#884](#884)).
* Bump golang.org/x/net from 0.17.0 to 0.23.0 in /examples/zerolog
([#896](#896)).
* Bump golang.org/x/net from 0.21.0 to 0.23.0 in /examples/slog
([#897](#897)).
github-merge-queue bot pushed a commit to databricks/cli that referenced this pull request Nov 1, 2024
## Changes

This PR adds the `cmd-exec-id` field to the user agent. This allows us
to correlate multiple HTTP requests made from the CLI.

### Why Not Use HTTP traceparent?
We considered using the traceparent header in HTTP as an alternative,
but it's not a good fit for our use case. Here's why:
1. Purpose of traceparent: It's designed to trace a single HTTP request
across a distributed system as it moves through subsystems and proxies.
2. Our requirement: We need to trace multiple HTTP requests made during
a single command execution in the CLI.

For more details about how traceparent itself works and how it's used in
the Go SDK, see
databricks/databricks-sdk-go#914.

## Tests
Unit test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants