Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When using M2M client credentials, after an hour, can no longer refresh token #671

Closed
thestephenstanton opened this issue Oct 25, 2023 · 9 comments
Assignees

Comments

@thestephenstanton
Copy link

Description
Getting this error:

inner token: Post \"https://accounts.cloud.databricks.com/oidc/accounts/*REDACTED*/v1/token\": context canceled"

I am not even touching the context, I'm simply passing the request context. So not sure how the context is canceled.

Also, I'm not canceling the context anywhere, which tells me the DBX SDK is somewhere.

Reproduction

I did this using a HTTP server so that the application can keep living and can call it at any time. Will leave out HTTP server setup for brevity.

Set up client (using env vars)

dbxAccountClient, err := databricks.NewAccountClient()

Handler

func (h handler) handle(w http.ResponseWriter, r *http.Request) {
	userInfo, err := h.dbxAccountClient.Users.GetByUserName(r.Context(), "stephen")
	if err != nil {
		slog.Error("failed to get user info", "error", err.Error())
		w.WriteHeader(http.StatusInternalServerError)
		return
	}

	fmt.Fprintf(w, "user info: %+v", userInfo)
}

Hit the endpoint first to trigger the lazy load of the token then wait 1 hour for JWT to expire and make the call again, you will get the error.

Expected behavior

Token should refresh.

Other Information

  • OS: MacOS
  • Version: v0.24.0
@mgyucht mgyucht self-assigned this Oct 25, 2023
@mgyucht
Copy link
Contributor

mgyucht commented Oct 25, 2023

When running your app, can you set the log level to trace as described in the issue template and attach logs? Feel free to attach the setup code as well, perhaps as a Github Gist, that way I can run your reproduction more easily.

We have retry logic with timeouts when refreshing the OAuth token. There could definitely be an issue with this logic. Is it possible that something changed about your network in this period that could have caused the token refresh logic to fail?

@mgyucht
Copy link
Contributor

mgyucht commented Oct 25, 2023

I am getting a bit closer. There are a couple of points:

  1. The OAuth2 library in Golang caches the context.Context specified at token source creation time and uses this context when refreshing the token.
  2. The Go SDK supports lazy authentication: if you haven't explicitly called Config.Authenticate(), the Go SDK will call that to load the credential provider for your client. The context used for this will be the same as the one provided in the request.

Is it possible your requests have any timeout/cancellation called by your web framework?

I think there should be a relatively simple fix: before starting your webserver, call Config.Authenticate() using a context with no timeout.

@thestephenstanton
Copy link
Author

@mgyucht I'll add those traces in the morning and report them back. I will also try the explicit authenticate call.

One thing I did notice that is a pretty big clue, if that initial call I do, if I use context.Background(), then I never have this problem. But if I use the r.Context(), it will have the problem. In other words, it looks like with this lazy load functionality, it is caching the first context it gets. And since that first context from r.Context() inevitably gets cancelled, every subsequent use of that context ends up saying that the context was cancelled.

All that said, that lines up with what you're saying about using Config.Authenticate() using a context with no timeout.

However, why would the authentication be caching a context?

@thestephenstanton
Copy link
Author

I'll also make a Gist tomorrow as well.

@mgyucht
Copy link
Contributor

mgyucht commented Oct 26, 2023

Auth caches the context because we use Golang's OAuth2 library, which caches the context used when constructing the Token Source. They don't expose a context parameter in the Token() method. See golang/oauth2#262 and golang/oauth2#388 (comment) for more context.

@thestephenstanton
Copy link
Author

Ok yeah, I see now. It makes sense why they added that so that you can have a timeout on the actual request for a token. However, it's almost a double-edged sword because whatever context you pass there will inevitably be canceled. And unless you use context.Background(), then you will run into this problem. Which seems like it defeats the whole purpose of passing the context in the first place.

I was going to suggest that here it be changed to context.Background(), however, that has its own drawbacks like never being able to stop the request if it hangs.

What are your thoughts?

@mgyucht
Copy link
Contributor

mgyucht commented Oct 26, 2023

I guess one other option you would have is to customize the HTTP client that is being used. The default HTTP client doesn't enforce any timeouts, but you could specify a custom client that adds timeouts to OAuth requests (see https://github.com/golang/oauth2/blob/master/internal/transport.go#L25 for where this is configured, and see https://cs.opensource.google/go/go/+/refs/tags/go1.21.3:src/net/http/client.go;l=106 for the Timeout field of net/http's Client struct). In this case, you would still need to pass this HTTP client via the context used by the first call to Authenticate().

@mgyucht
Copy link
Contributor

mgyucht commented Nov 1, 2023

For future reference: users who are using OAuth for a long-lived application should ensure that the Authenticate() method is called with a context that has no timeout to prevent this from happening.

@thestephenstanton do you have any other questions? If not, or if I don't hear back from you in the next week, I'll close this out.

@thestephenstanton
Copy link
Author

Sorry @mgyucht, I meant to comment back. I think that is fine now that it is semi-documented here. That is essentially what I am doing now and it is working. Thanks for the help! I'll close out the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants