thrift: deal with TStruct serialization failures #744
Codecov Report
@@ Coverage Diff @@
## dev #744 +/- ##
==========================================
+ Coverage 87.78% 88.17% +0.39%
==========================================
Files 40 40
Lines 4101 4101
==========================================
+ Hits 3600 3616 +16
+ Misses 382 370 -12
+ Partials 119 115 -4
*NOTE*: I need feedback on XXX lines. A Thrift server handler can return success with a TStruct that may return an error when writing its serialized form. This error is ignored, which causes an invalid third argument to be written (or not written at all). We can handle these errors earlier by buffering the serialized TStruct and sending a system error if serialization fails. Fixes #743.
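For readers following along, the buffering idea described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `serializable`, `goodStruct`, `badStruct`, `respondBuffered`, and `runBuffered` are hypothetical stand-ins for the Thrift struct, the arg3 writer, and the call to `SendSystemError`.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
)

// serializable is a hypothetical stand-in for a Thrift TStruct: anything
// that writes its serialized form and may fail while doing so.
type serializable interface {
	Write(w io.Writer) error
}

// goodStruct always serializes successfully.
type goodStruct struct{ payload string }

func (g goodStruct) Write(w io.Writer) error {
	_, err := io.WriteString(w, g.payload)
	return err
}

// badStruct emits a few bytes and then fails, for example because a
// required field is unset.
type badStruct struct{}

func (badStruct) Write(w io.Writer) error {
	if _, err := io.WriteString(w, "partial"); err != nil {
		return err
	}
	return errors.New("required field is unset")
}

// respondBuffered serializes resp into a temporary buffer first. The bytes
// are copied to arg3 only if serialization succeeds; otherwise
// sendSystemError is called and arg3 is left untouched.
func respondBuffered(resp serializable, arg3 io.Writer, sendSystemError func(error)) error {
	var buf bytes.Buffer
	if err := resp.Write(&buf); err != nil {
		sendSystemError(err)
		return err
	}
	_, err := buf.WriteTo(arg3)
	return err
}

// runBuffered is a small helper that returns what reached arg3 and which
// error, if any, was reported as a system error.
func runBuffered(resp serializable) (arg3Contents string, reported error) {
	var arg3 bytes.Buffer
	_ = respondBuffered(resp, &arg3, func(err error) { reported = err })
	return arg3.String(), reported
}

func main() {
	out, reported := runBuffered(goodStruct{payload: "ok"})
	fmt.Printf("good: arg3=%q reported=%v\n", out, reported)

	out, reported = runBuffered(badStruct{})
	fmt.Printf("bad:  arg3=%q reported=%v\n", out, reported)
}
```

The trade-off is an extra copy of every response, which is the performance concern raised in the review.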
7c16143
to
b07b24f
Compare
thrift/server.go
Outdated
@@ -185,6 +186,17 @@ func (s *Server) handle(origCtx context.Context, handler handler, method string,
	return err
}

// Buffer the response write so we can send a system error to the client.
// It's too late to send a system error by the time we write Arg3.
var respStruct bytes.Buffer // XXX should we make a buffer pool?
That’s a decision best made with data, but I expect pooling to yield a 20% improvement in latency.
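If pooling turns out to be worthwhile, one common shape for it is a `sync.Pool` of `bytes.Buffer` values. The sketch below is an assumption about how that could look, not code from this repository: `bufferPool`, `serializePooled`, and `roundTrip` are hypothetical names, and any latency benefit would still need to be confirmed with benchmarks.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"sync"
)

// bufferPool is a hypothetical pool of reusable response buffers.
var bufferPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// serializePooled borrows a buffer from the pool, lets write fill it, and
// copies the result to out only if write succeeded. The buffer goes back to
// the pool when the function returns.
func serializePooled(write func(*bytes.Buffer) error, out io.Writer) error {
	buf := bufferPool.Get().(*bytes.Buffer)
	buf.Reset() // a pooled buffer may still hold bytes from a previous use
	defer bufferPool.Put(buf)

	if err := write(buf); err != nil {
		return err
	}
	_, err := buf.WriteTo(out)
	return err
}

// roundTrip serializes payload through the pool and returns what came out.
func roundTrip(payload string) (string, error) {
	var out bytes.Buffer
	err := serializePooled(func(b *bytes.Buffer) error {
		_, werr := io.WriteString(b, payload)
		return werr
	}, &out)
	return out.String(), err
}

func main() {
	first, _ := roundTrip("first")
	second, _ := roundTrip("second")
	fmt.Println(first, second)
}
```

Resetting on `Get` rather than on `Put` keeps the sketch correct even if some caller forgets to clear a buffer before returning it.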
thrift/server.go
Outdated
var respStruct bytes.Buffer // XXX should we make a buffer pool?
wp = getProtocolWriter(&respStruct)
if err := resp.Write(wp.protocol); err != nil {
	call.Response().SendSystemError(err)
This can return an err, at which point it should be a tableflip.
Same goes for writing out the buffer, e.g. if the socket died. There are plenty of tableflip failure modes that will at least produce a server-side error.
I would like to avoid additional buffering if possible -- it's an easy fix, but it also has a performance impact that might not be necessary.
Have we tried other options:
- writing directly to arg3, but then sending an error halfway through?
- writing to arg3, but if it fails, logging an error and blackholing that response (no more bytes are sent and the response is never completed, so the client will see a timeout)?
If we have to buffer, then we probably should pool. We'll need benchmark numbers to compare, but considering most other things are pooled, it seems like this would have a visible performance impact.
thrift/server.go
Outdated
@@ -185,6 +186,17 @@ func (s *Server) handle(origCtx context.Context, handler handler, method string,
	return err
}

// Buffer the response write so we can send a system error to the client.
// It's too late to send a system error by the time we write Arg3.
Are we sure about this? I'm quite sure Muttley can send an error halfway through a response, and the clients see the error.
I'm still looking into this. I did a quick pass and wasn't sure we can terminate a half-written arg3 and then start another trailer arg.
I'm trying to avoid this case. Timeouts don't tell the client what happened and the server knows what happened.
As discussed previously, this eliminates buffering of the responses. It appears that TChannel can send system errors after partial writes to the output stream. This also adds a test for the partial writes case.
@prashantv I've removed the buffering of responses and added a test to verify
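As a rough illustration of the unbuffered approach and the partial-write case, the sketch below streams straight into arg3 and reports a system error once serialization fails partway through. It is not the test added in this PR: `serializable`, `partialStruct`, `countingWriter`, `respondStreaming`, and `simulate` are hypothetical, and the real system error would go through TChannel rather than a callback.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

var errSerialize = errors.New("simulated serialization failure")

// serializable is a hypothetical stand-in for a Thrift TStruct.
type serializable interface {
	Write(w io.Writer) error
}

// partialStruct writes okBytes bytes successfully and then fails.
type partialStruct struct{ okBytes int }

func (p partialStruct) Write(w io.Writer) error {
	if _, err := w.Write(make([]byte, p.okBytes)); err != nil {
		return err
	}
	return errSerialize
}

// countingWriter plays the role of arg3 and records how many bytes were
// already sent before the failure.
type countingWriter struct{ n int }

func (c *countingWriter) Write(p []byte) (int, error) {
	c.n += len(p)
	return len(p), nil
}

// respondStreaming writes resp directly to arg3 with no intermediate buffer.
// If serialization fails midway, the error is reported via sendSystemError
// even though some bytes have already gone out.
func respondStreaming(resp serializable, arg3 io.Writer, sendSystemError func(error)) error {
	if err := resp.Write(arg3); err != nil {
		sendSystemError(err)
		return err
	}
	return nil
}

// simulate returns the bytes written before the failure and the reported error.
func simulate(okBytes int) (written int, reported error) {
	w := &countingWriter{}
	_ = respondStreaming(partialStruct{okBytes: okBytes}, w, func(err error) { reported = err })
	return w.n, reported
}

func main() {
	written, reported := simulate(5)
	fmt.Printf("bytes already written: %d, system error: %v\n", written, reported)
}
```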
Nice tests!
err = writer.Close()
defer thriftProtocolPool.Put(wp)

if err := resp.Write(wp.protocol); err != nil {
(unrelated to this diff but just thought of this) @twilly just as a safety check, let's verify that the passed in protocol here is a TBinaryProtocol so that users don't end up calling Write with a custom protocol and expect it to work?
Agreed. Created an internal ticket to track this.
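One possible shape for that safety check is a type assertion that rejects anything other than the expected protocol type. The sketch below uses hypothetical stand-ins (`protocol`, `binaryProtocol`, `customProtocol`, `requireBinary`) rather than the real Thrift types, so treat it as an outline of the idea only.

```go
package main

import "fmt"

// protocol is a hypothetical stand-in for the Thrift protocol interface.
type protocol interface {
	Name() string
}

// binaryProtocol stands in for the only protocol the server supports.
type binaryProtocol struct{}

func (*binaryProtocol) Name() string { return "binary" }

// customProtocol stands in for a user-supplied protocol.
type customProtocol struct{}

func (*customProtocol) Name() string { return "custom" }

// requireBinary returns an error unless p is a *binaryProtocol, so a caller
// can fail fast instead of writing with an unsupported protocol.
func requireBinary(p protocol) error {
	if _, ok := p.(*binaryProtocol); !ok {
		return fmt.Errorf("unsupported protocol %T: expected *binaryProtocol", p)
	}
	return nil
}

func main() {
	fmt.Println(requireBinary(&binaryProtocol{}))
	fmt.Println(requireBinary(&customProtocol{}))
}
```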
A Thrift server handler can return success with a TStruct that may
return an error when writing its serialized form. This error is ignored,
which causes an invalid third argument to be written (or not written at
all).
We can better handle these kinds of errors earlier by buffering a
serialized TStruct and sending along a system error if the
serialization fails.
Fixes #743