Add features tests for gRPC retry behavior #435
This generally makes sense to me and thanks for some of the various cleanup that's in here. My only question is around the need for this proxy to be its own binary/process. We could save a little argparsing and CI code if it's not.
@@ -154,6 +204,22 @@ func (r *Runner) Run(ctx context.Context, patterns []string) error {
		return err
	}

	var fn func(context.Context, *cmd.Run) error
Suggested change:
-	var fn func(context.Context, *cmd.Run) error
+	var langRunFn func(context.Context, *cmd.Run) error
@@ -0,0 +1,752 @@
package main
It's not immediately obvious to me why this guy needs to be his own executable? Seemingly the runner could start this up internally rather than needing to start it as a subprocess?
Concur. @chronos-tachyon - take a peek at what I recently had to do with HTTP proxying: #448 (ignore the subprocess stuff which was because I had to test environment variables safely and unrelated to the HTTP proxy part). Some notes:
- Can have a feature's config opt-in to the proxy so if someone runs a specific feature without needing a proxy it's not started
- Can name it "tcp proxy" to disambiguate from http proxy (note, I called the other "http proxy" to avoid this confusion too)
- Can make a TCPProxy utility as a Go library that the command just happens to use. This can help properly encapsulate the tool (even though it'll only ever be used by the cmd) instead of using globals and subprocesses and such
- Features know if they need a proxy, so they can choose to, say, dial to that address instead of the default one. No need to affect existing running features by making them go through the proxy.
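A minimal sketch of what such a library could look like, not the code under review. `TCPProxy`, `StartTCPProxy`, and `roundTrip` are hypothetical names; the throwaway echo server only stands in for the real server so the example runs by itself. The proxy binds to an OS-assigned port and hands its address back, so a feature that needs it can dial that address while every other feature keeps using the direct server.

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// TCPProxy is a hypothetical in-process proxy: it listens on a local port
// and forwards every accepted connection to target.
type TCPProxy struct {
	target string
	l      net.Listener
}

// StartTCPProxy binds to an OS-assigned port and starts forwarding.
func StartTCPProxy(target string) (*TCPProxy, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return nil, err
	}
	p := &TCPProxy{target: target, l: l}
	go p.acceptLoop()
	return p, nil
}

// Addr is the address a feature dials when it wants to go through the proxy.
func (p *TCPProxy) Addr() string { return p.l.Addr().String() }

// Close stops accepting new connections.
func (p *TCPProxy) Close() error { return p.l.Close() }

func (p *TCPProxy) acceptLoop() {
	for {
		client, err := p.l.Accept()
		if err != nil {
			return // listener closed
		}
		go p.forward(client)
	}
}

func (p *TCPProxy) forward(client net.Conn) {
	defer client.Close()
	upstream, err := net.Dial("tcp", p.target)
	if err != nil {
		return
	}
	defer upstream.Close()
	go func() {
		io.Copy(upstream, client) // client -> server
		if tc, ok := upstream.(*net.TCPConn); ok {
			tc.CloseWrite() // propagate the client's EOF upstream
		}
	}()
	io.Copy(client, upstream) // server -> client
}

// roundTrip sends msg through a proxy placed in front of an echo server
// and returns what comes back.
func roundTrip(msg string) (string, error) {
	echo, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer echo.Close()
	go func() {
		for {
			c, err := echo.Accept()
			if err != nil {
				return
			}
			go func() { defer c.Close(); io.Copy(c, c) }()
		}
	}()
	proxy, err := StartTCPProxy(echo.Addr().String())
	if err != nil {
		return "", err
	}
	defer proxy.Close()
	conn, err := net.Dial("tcp", proxy.Addr())
	if err != nil {
		return "", err
	}
	defer conn.Close()
	if _, err := conn.Write([]byte(msg)); err != nil {
		return "", err
	}
	buf := make([]byte, len(msg))
	if _, err := io.ReadFull(conn, buf); err != nil {
		return "", err
	}
	return string(buf), nil
}

func main() {
	got, err := roundTrip("ping")
	if err != nil {
		panic(err)
	}
	fmt.Println(got)
}
```

Because the proxy is a plain value with `Addr` and `Close`, the runner could start it only for features whose config opts in, with no globals or subprocess plumbing.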
I couldn't do it in-process because, when I did, there were too many CPU-bound goroutines to allow the proxy server's goroutines to get serviced in a timely manner (delays on the order of 20 seconds).
That is very surprising. We test the HTTP proxy embedded in the main process (not the per-lang worker process) without any such delays. I wonder what is causing them. We shouldn't really have many CPU-bound goroutines here except when running Go features embedded (every other kind of feature run uses external processes). Also, can we consider only using the TCP proxy for features that need it (i.e. they can connect different clients to the proxy inside their tests instead of using the default client)?
- POST /reject
  Immediately reject incoming gRPC requests with UNAVAILABLE.

- POST /accept
  Accept incoming gRPC requests; this is the default.

- POST /freeze
  Block on incoming accepted gRPC requests.

- POST /thaw
  Process incoming accepted gRPC requests immediately; this is the default.
I assume posting to these switches to the corresponding "mode"? Might be worth mentioning this modality concept explicitly.
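One way to make the modality explicit is to model it as two independent switches (reject/accept and freeze/thaw) that the control endpoints flip. The sketch below is an assumption about how that could be structured, not the PR's implementation: `proxyModes`, `admit`, and `post` are hypothetical names, and `admit` stands in for whatever the gRPC side would call per request.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// proxyModes holds the two independent mode switches (hypothetical type).
// Zero values match the documented defaults: accept and thaw.
type proxyModes struct {
	mu     sync.Mutex
	cond   *sync.Cond
	reject bool
	frozen bool
}

func newProxyModes() *proxyModes {
	m := &proxyModes{}
	m.cond = sync.NewCond(&m.mu)
	return m
}

// set applies a mode change and wakes any requests blocked in admit.
func (m *proxyModes) set(apply func(*proxyModes)) {
	m.mu.Lock()
	apply(m)
	m.mu.Unlock()
	m.cond.Broadcast()
}

// admit reports false if the request should be rejected (UNAVAILABLE in
// gRPC terms). Accepted requests block for as long as the proxy is frozen.
// In this sketch the reject switch is only consulted on arrival.
func (m *proxyModes) admit() bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.reject {
		return false
	}
	for m.frozen {
		m.cond.Wait()
	}
	return true
}

// handler exposes one POST endpoint per mode switch.
func (m *proxyModes) handler() http.Handler {
	mux := http.NewServeMux()
	toggle := func(path string, apply func(*proxyModes)) {
		mux.HandleFunc(path, func(w http.ResponseWriter, r *http.Request) {
			if r.Method != http.MethodPost {
				http.Error(w, "POST only", http.StatusMethodNotAllowed)
				return
			}
			m.set(apply)
			w.WriteHeader(http.StatusNoContent)
		})
	}
	toggle("/reject", func(s *proxyModes) { s.reject = true })
	toggle("/accept", func(s *proxyModes) { s.reject = false })
	toggle("/freeze", func(s *proxyModes) { s.frozen = true })
	toggle("/thaw", func(s *proxyModes) { s.frozen = false })
	return mux
}

// post issues an in-memory POST against h and returns the status code.
func post(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodPost, path, nil))
	return rec.Code
}

func main() {
	m := newProxyModes()
	h := m.handler()
	fmt.Println("default admits:", m.admit())
	post(h, "/reject")
	fmt.Println("after /reject admits:", m.admit())
	post(h, "/accept")
	fmt.Println("after /accept admits:", m.admit())
}
```

Keeping both switches behind one mutex and condition variable means a `/thaw` wakes every request that was parked while frozen.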
		}
	}()

	Info("HTTP control server is running on: %s", flagControl)
Nit: If the user passes 127.0.0.1:0, this would show 0 rather than the bound port, yes?
Not really a problem since we're predetermining available ports and using those, but thought I'd mention it since this can be run directly in theory.
When this is made a library as part of a runner, we should let this listener pick the random port and extract the address out of it instead of using the temporary-listener-free-port approach (which is a good approach in general, but won't be needed when this is not a separate process).
func HandleHelp(w http.ResponseWriter, r *http.Request) {
	if path.Clean(r.URL.Path) != "/" {
Nice touch
var (
	flagTrace bool
A bit in the same vein as asking why this is its own binary, it might be nice to move all of these into some FeaturesProxyServer struct or something
	s.gc = nil
	s.gs = nil
	s.l = nil
	s.wc = nil
	s.ws = nil
	s.quitCh = nil
Seems like it might be easy to miss any future fields here. I'm wondering if just having three top-level fields: The mutex, condvar, and an "everything else" container might be nice.
It could also potentially let you have some generic code for this pattern that the servers can share.
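A sketch of that layout, using Go generics (1.18+) so the pattern can be shared between servers. `guarded` and `controlState` are hypothetical names, and the state fields are placeholders rather than the PR's actual fields. Resetting becomes a single assignment of the zero value, so a field added later cannot be missed.

```go
package main

import (
	"fmt"
	"sync"
)

// guarded bundles the three top-level pieces: mutex, condvar, and an
// "everything else" container of type T.
type guarded[T any] struct {
	mu    sync.Mutex
	cond  *sync.Cond
	state T
}

func newGuarded[T any]() *guarded[T] {
	g := &guarded[T]{}
	g.cond = sync.NewCond(&g.mu)
	return g
}

// update mutates the state under the lock and wakes all waiters.
func (g *guarded[T]) update(f func(*T)) {
	g.mu.Lock()
	f(&g.state)
	g.mu.Unlock()
	g.cond.Broadcast()
}

// reset clears every field at once by assigning the zero value of T.
func (g *guarded[T]) reset() {
	g.update(func(s *T) {
		var zero T
		*s = zero
	})
}

// waitUntil blocks until ok reports true for the current state.
func (g *guarded[T]) waitUntil(ok func(*T) bool) {
	g.mu.Lock()
	defer g.mu.Unlock()
	for !ok(&g.state) {
		g.cond.Wait()
	}
}

// controlState is a placeholder for one server's resettable fields.
type controlState struct {
	running bool
	quitCh  chan struct{}
}

func main() {
	g := newGuarded[controlState]()
	g.update(func(s *controlState) {
		s.running = true
		s.quitCh = make(chan struct{})
	})
	g.reset()
	g.waitUntil(func(s *controlState) bool { return !s.running })
	fmt.Println("state cleared")
}
```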
All makes sense. High-level notes on the separate proxy process.
@@ -154,6 +204,22 @@ func (r *Runner) Run(ctx context.Context, patterns []string) error {
		return err
	}

	var fn func(context.Context, *cmd.Run) error
No disagreement with extracting the switch case content into methods, but why move the switch up here? If it's all the same, we can just leave the switch where it is and make the actual invocations in each case.
@@ -221,6 +287,49 @@ func (r *Runner) Run(ctx context.Context, patterns []string) error {
			return err
		}
	}
	r.config.DirectServer = r.config.Server
IMO the direct server should be the server and the proxy can be provided for features that know they need it (and the features can create clients to the proxy that know they need to test with the proxy)
func (s *ControlServer) serveThread() {
	defer s.finish()

	err := s.server.Serve(s.l)
Big fan of using HTTP here. I wish when the "summary server" thing was done HTTP was used there instead of homemade Temporal binary protocol, heh.
What was changed
Why?
We need tests that verify the client's retry behavior. A comment on features#326 says that @rross reports that sdk-java doesn't retry correctly on the first connection. Indeed, these new tests confirmed that a bug exists in sdk-java's GrpcRetryer, as the serverCapabilities.get() call can throw in a place where the code doesn't anticipate exceptions.
From here, there is some future work to be done to make these tests more generally useful: