ClientConn: fix Dial using grpc.WithTimeout() #2737
Commit 955eb8a ("channelz: cleanup channel registration if Dial fails (grpc#2733)") moved a defer block earlier in DialContext() to ensure that cc.Close() was always called. This defer block also checks whether the ctx.Done() is true, and if so ensures the context error is returned.

If the dial options include a timeout, the original context gets replaced with a new context that has the timeout, and this gets a catchall `defer cancel()` to go with it. However, this cancel() now gets called before the cleanup defer block, so when the latter runs the context is always already cancelled.

Fix by splitting the larger defer block into two parts:

- The part that does cc.Close() stays near the beginning of the method.
- The part that checks ctx.Done() returns to below the `defer cancel()` call, and so gets invoked before it.
Thank you for your pull request. Before we can look at your contribution, we need to ensure all contributors are covered by a Contributor License Agreement. After the following items are addressed, please respond with a new comment here, and the automated system will re-verify.
Regards,
CNCF account created as a Google employee.
r.InitialState(resolver.State{Addresses: []resolver.Address{lisAddr}})
client, err := Dial(r.Scheme()+":///test.server", WithInsecure(), WithTimeout(5*time.Second))
close(dialDone)
if err != nil {
Are you still working on this test? It doesn't seem to test the error behavior.
Er, not really :-). I got it just far enough so it fails with the DialContext
timeout, but I don't know what else would be worth testing after that.
Oh, now I see what it's testing.
Thanks for the fix! LGTM.
Fixes #2736