Unexpected exceptions when server response or client request exceeds Max Message Size #3996
There's a lot here. To confirm I understand, there are two problems:
Is that right? |
That's correct. In addition: |
@srujann @ejona
But when the response message is larger than 4 MB, I also get the same exceptions. |
@ejona86 Is this bug going to be fixed any time soon? I am using version 1.35.1 and still seeing the issue where the client does not receive an informative message in the case where the client sends a request exceeding 4 MB. Server logs: Client logs: |
Python server: fine
grpc-java v1.44.0
OpenJDK 17.0.1 on macOS: all fine
OpenJDK 11.0.11 on Ubuntu:
Config:
val maxMsgBytes = DecimalByteUnit.MEGABYTES.toBytes(100)
val serviceConfig = mapOf(
"methodConfig" to listOf(
mapOf(
"name" to listOf(mapOf("service" to "yanic.Yanic")),
"retryPolicy" to mapOf(
"maxAttempts" to "20",
"initialBackoff" to "0.1s",
"maxBackoff" to "10s",
"backoffMultiplier" to "2",
"retryableStatusCodes" to listOf("UNAVAILABLE"),
),
"waitForReady" to true,
"maxRequestMessageBytes" to maxMsgBytes.toString(),
"maxResponseMessageBytes" to maxMsgBytes.toString(),
),
),
)
val chan =
Grpc.newChannelBuilderForAddress("localhost", port, InsecureChannelCredentials.create())
.enableRetry()
.defaultServiceConfig(serviceConfig)
.retryBufferSize(maxMsgBytes)
.maxInboundMessageSize(maxMsgBytes.toInt())
.maxInboundMetadataSize(maxMsgBytes.toInt())
.build()
val opts = CallOptions.DEFAULT
.withWaitForReady()
.withMaxInboundMessageSize(maxMsgBytes.toInt())
.withMaxOutboundMessageSize(maxMsgBytes.toInt()) |
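A hedged aside on the CallOptions above: the same limits can also be applied per-stub, since AbstractStub exposes equivalent setters; a minimal generic sketch (the helper name withLimits is invented for illustration):

import io.grpc.stub.AbstractStub

// Applies wait-for-ready and the message-size limits to any generated stub,
// mirroring the CallOptions configured above.
fun <T : AbstractStub<T>> withLimits(stub: T, maxMsgBytes: Int): T =
    stub.withWaitForReady()
        .withMaxInboundMessageSize(maxMsgBytes)
        .withMaxOutboundMessageSize(maxMsgBytes)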
I believe I have reproduced this issue on v1.64.0-SNAPSHOT (aka the tip of
So the client sees RST_STREAM with no context. If we want the client to receive the status, I think we'll need to explicitly send trailers (which should also serve to close the stream). I think that ryanpbrewster#1 works, but will clean it up and send it out for review. |
The difficulty with fixing this is that the deframer and framer run on different threads, and there's no way for us to enqueue work onto the framing thread. Cancellations/resets are already expected to be racy and handled cleanly. But if the deframer closes with trailers, that 1) can trigger in the middle of sending a message, which the receiver won't like (but that's probably okay), and 2) will require the Netty handler to notice and throw away later data and trailers created by the framer.
To explain (1) a bit more: when serializing protobuf, the framer produces chunks of data containing zero or more messages. A large message is split across multiple chunks. Those chunks are passed via a queue from the framing thread to the Netty handler, where they are sent. Any trailers triggered by the deframer will go into that same queue. |
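Purely to illustrate the ordering concern described above, a minimal sketch with invented names (WriteCmd, DataChunk, SendTrailers, outboundQueue); these are not grpc-java's actual internal types:

import java.util.concurrent.ConcurrentLinkedQueue

// Invented write commands for the sketch; grpc-java's real internals differ.
sealed interface WriteCmd
data class DataChunk(val streamId: Int, val bytes: ByteArray) : WriteCmd
data class SendTrailers(val streamId: Int, val statusDescription: String) : WriteCmd

// One queue shared by the framing thread (which enqueues the chunks of a possibly
// huge message) and, with the proposed change, the deframer (which enqueues trailers
// when it hits a decode error).
val outboundQueue = ConcurrentLinkedQueue<WriteCmd>()

// Netty-handler side of the sketch: once trailers for a stream have been written,
// later data chunks for that stream still sitting in the queue must be dropped.
fun drain(closedStreams: MutableSet<Int>) {
    while (true) {
        when (val cmd = outboundQueue.poll() ?: return) {
            is SendTrailers -> closedStreams += cmd.streamId  // write trailers, close the stream
            is DataChunk -> if (cmd.streamId !in closedStreams) {
                // write the chunk to the wire
            }
        }
    }
}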
If the deframer enqueues a command to send trailers + close, my understanding is that:
|
Well, my point was that right now you wouldn't, in certain cases. You will see an error about a truncated message. I still think that's okay, and it is a step in the right direction. We just may need future work in the receiver to expose "an error in the trailers if the trailers had an error, otherwise generate an error."
Once you get to a certain level, yes. But plenty of this code has never needed to deal with this, so 1) we need to audit and be prepared for broken code, and 2) consider how errors are handled. |
Makes sense, and understood. I'm going to open a PR and link it to this issue. If you don't mind, I'll tag you there and we can discuss the specific implementation. |
Today, deframer errors cancel the stream without communicating a status code to the peer. This change causes deframer errors to trigger a best-effort attempt to send trailers with a status code so that the peer understands why the stream is being closed. Fixes grpc#3996
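For the client side of that change, a minimal sketch of handling the propagated status, assuming the hello-world example stub; grpc-java reports messages that exceed the configured maximum as RESOURCE_EXHAUSTED:

import io.grpc.Status
import io.grpc.StatusRuntimeException
import io.grpc.examples.helloworld.GreeterGrpc
import io.grpc.examples.helloworld.HelloRequest

// Assumes the hello-world example stub; any generated stub behaves the same way.
fun callAndReport(stub: GreeterGrpc.GreeterBlockingStub, request: HelloRequest) {
    try {
        stub.sayHello(request)
    } catch (e: StatusRuntimeException) {
        // With trailers sent on deframer errors, an oversized message should surface
        // here as RESOURCE_EXHAUSTED with a description, rather than as an
        // uninformative stream reset.
        if (e.status.code == Status.Code.RESOURCE_EXHAUSTED) {
            println("message exceeded the configured max size: ${e.status.description}")
        } else {
            throw e
        }
    }
}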
Some unexpected exceptions are seen in the server log when a server response exceeds the max message size of the client channel, or vice versa. The client log does show valid errors.
This issue is consistently reproducible with a hello-world RPC example where the request/response is a huge string larger than the default max message size of 4 MiB.
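A minimal sketch of such a reproduction, assuming the grpc-java hello-world example stubs, a plaintext channel, and a placeholder port (the channel-builder call mirrors the config snippet quoted earlier; on older releases ManagedChannelBuilder.forAddress is the equivalent):

import io.grpc.Grpc
import io.grpc.InsecureChannelCredentials
import io.grpc.examples.helloworld.GreeterGrpc
import io.grpc.examples.helloworld.HelloRequest

fun main() {
    val port = 50051  // placeholder; must match the running hello-world server
    val channel = Grpc.newChannelBuilderForAddress("localhost", port, InsecureChannelCredentials.create())
        .build()
    // ~5 MiB payload, comfortably above the 4 MiB default max inbound message size.
    val hugeName = "x".repeat(5 * 1024 * 1024)
    // Expected: the receiving side rejects the oversized message; the issue is about the
    // noise in the server log and the uninformative status the client gets back.
    GreeterGrpc.newBlockingStub(channel)
        .sayHello(HelloRequest.newBuilder().setName(hugeName).build())
}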
What version of gRPC are you using?
1.9.0
When the server response exceeds the max message size
Unexpected exceptions in the server log:
Client log:
When the client request exceeds the max message size
A valid exception, followed by some unexpected exceptions, in the server log:
Client log: