Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

client:fix ttrpc send hang #173

Closed
wants to merge 1 commit into from

Conversation

ningmingxiao
Copy link

@ningmingxiao ningmingxiao commented Sep 9, 2024

fix:#174
@dmcgowan @fuweid @thaJeztah can you review my pr? thank you.

@dmcgowan
Copy link
Member

The client context scope should not be considered during send here. Since the context is at the client scope, it would be expected that unblocking sends would be done by closing the underlying connection. The sender could be legitimately waiting on writing to the connection.

What is the caller error that would cause a hang here? The caller has timed out by something is waiting for ttrpc to finish? Seems that should be handled by the caller not ttrpc.

@ningmingxiao
Copy link
Author

ningmingxiao commented Sep 11, 2024

I use containerd 1.7.3 https://github.com/containerd/containerd/blob/v1.7.3/services/tasks/local.go#L340
containerd try to connect shim to get container status, it hanged 1818 minutes.

ctx timeout is 2s.

func getProcessState(ctx context.Context, p runtime.Process) (*task.Process, error) {
	ctx, cancel := timeout.WithContext(ctx, stateTimeout)
	defer cancel()

	state, err := p.State(ctx)

createStream ctx never timeout. It should check timeout in dispatch func.

func (c *Client) createStream(flags uint8, b []byte) (*stream, error) {
...
	select {
	case <-c.ctx.Done():
// NewClient creates a new ttrpc client using the given connection
func NewClient(conn net.Conn, opts ...ClientOpts) *Client {
	ctx, cancel := context.WithCancel(context.Background())

@ningmingxiao ningmingxiao force-pushed the ttrpc_hang branch 4 times, most recently from 3d877a2 to 173f9f8 Compare September 11, 2024 03:21
Signed-off-by: ningmingxiao <[email protected]>
@kevpar
Copy link
Member

kevpar commented Oct 16, 2024

As I mentioned here, this may be a duplicate of an issue we already fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants