-
Notifications
You must be signed in to change notification settings - Fork 698
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FIXED] Request or NextMsg with Context may return message after being canceled #387
Conversation
Context can be used to cancel blocking waiting for a message, and in the case that the context has already been canceled it is expected for it to return right away since the context no longer valid. Previous behavior was not deterministic since it was possible for a client to still receive a messages even though the context was already canceled. In order to fix this, now we check first whether the context has been canceled previous to waiting for the message to prevent successfully making requests with invalid contexts. Signed-off-by: Waldemar Quevedo <[email protected]>
Signed-off-by: Waldemar Quevedo <[email protected]>
context.go
Outdated
case <-ctx.Done(): | ||
return nil, ctx.Err() | ||
default: | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without this check, if a message was received fast enough then it would have succeeded sometimes even if the context.Done()
channel was already closed because of select
randomly choosing to receive the message from the channel:
https://github.com/nats-io/go-nats/blob/2f0e8883ae354dfdeb89680e43c8c6918d34b815/context.go#L66-L84
context.go
Outdated
@@ -31,6 +31,13 @@ func (nc *Conn) RequestWithContext(ctx context.Context, subj string, data []byte | |||
if nc == nil { | |||
return nil, ErrInvalidConnection | |||
} | |||
// Check whether the context is done already before making | |||
// the request. | |||
select { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any other way to check this state without a select etc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I usually check if ctx.Err() is nil. Not sure if that’s the intended correct way but seems to work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Godoc says
// If Done is not yet closed, Err returns nil.
// If Done is closed, Err returns a non-nil error explaining why:
// Canceled if the context was canceled
// or DeadlineExceeded if the context's deadline passed.
// After Err returns a non-nil error, successive calls to Err return the same error.
Err() error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, no need for select
here since there would be already an error when canceled. Thanks.
Signed-off-by: Waldemar Quevedo <[email protected]>
context.go
Outdated
@@ -33,10 +33,8 @@ func (nc *Conn) RequestWithContext(ctx context.Context, subj string, data []byte | |||
} | |||
// Check whether the context is done already before making | |||
// the request. | |||
select { | |||
case <-ctx.Done(): | |||
if ctx.Err() != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this condition change after this call but before we evaluate ctx->Done() below? Leading to the same problem as before but with a slimmer chance of happening? /cc @kozlovic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but that's ok. I believe the semantic here is that if the context has timed-out we need to fail the call, even if there are messages. Past that point it simply means that it has not yet expired and could expire at any moment after that. We may get one message if it is readily available and while the context just times out, but after that, we are guaranteed to always return timeout if using this context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In other words, the issue with prior code is that after a context timed out, we could randomly get a message or not by calling request several times. With this change, we won't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, yes the goal of the change is to guarantee that calling request or next message fails right away if context was canceled before entering the function.
LGTM |
1 similar comment
LGTM |
Currently behavior of context in the client is not deterministic when the context has already been canceled and is possible for a client to still receive messages from a request, as shown in the test occasionally failing in Travis:
In order to fix this, now we check first whether the context has been canceled already previous to waiting for the message to prevent successfully making requests using an invalid context.