Unexpected exceptions when server response or client request exceeds Max Message Size #3996

Closed
srujann opened this issue Jan 25, 2018 · 10 comments

@srujann

srujann commented Jan 25, 2018

Some unexpected exceptions appear in the server log when a server response exceeds the max message size of the client channel, or vice versa. The client log does show valid errors.

This issue is consistently reproducible with the hello world RPC example when the request/response is a huge string larger than the default max message size of 4 MiB.
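
A minimal sketch of that reproduction (illustrative code, not the stock example verbatim: it assumes the helloworld stubs, a local server on port 50051, and the usePlaintext() API of current grpc-java releases; on 1.9.0 the equivalent is usePlaintext(true)):

import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;
import io.grpc.examples.helloworld.GreeterGrpc;
import io.grpc.examples.helloworld.HelloReply;
import io.grpc.examples.helloworld.HelloRequest;

public class LargeMessageRepro {
  public static void main(String[] args) {
    ManagedChannel channel = ManagedChannelBuilder.forAddress("localhost", 50051)
        .usePlaintext()
        .build();
    GreeterGrpc.GreeterBlockingStub stub = GreeterGrpc.newBlockingStub(channel);

    // ~59 MB payload, far above the default limit of 4 MiB (4194304 bytes).
    char[] big = new char[59_000_000];
    java.util.Arrays.fill(big, 'x');
    HelloRequest request = HelloRequest.newBuilder().setName(new String(big)).build();

    // Fails with the statuses shown in the logs below; to hit the response-side case,
    // have the server echo a similarly large reply instead.
    HelloReply reply = stub.sayHello(request);
    System.out.println(reply.getMessage());

    channel.shutdownNow();
  }
}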

What version of gRPC are you using?

1.9.0

When server response exceeds max message size

Unexpected exceptions in the server log:

WARNING: Stream Error
io.netty.handler.codec.http2.Http2Exception$StreamException: Stream closed before write could take place
	at io.netty.handler.codec.http2.Http2Exception.streamError(Http2Exception.java:149)
	at io.netty.handler.codec.http2.DefaultHttp2RemoteFlowController$FlowState.cancel(DefaultHttp2RemoteFlowController.java:499)
	at io.netty.handler.codec.http2.DefaultHttp2RemoteFlowController$FlowState.cancel(DefaultHttp2RemoteFlowController.java:480)
	at io.netty.handler.codec.http2.DefaultHttp2RemoteFlowController$1.onStreamClosed(DefaultHttp2RemoteFlowController.java:105)
	at io.netty.handler.codec.http2.DefaultHttp2Connection.notifyClosed(DefaultHttp2Connection.java:349)
	at io.netty.handler.codec.http2.DefaultHttp2Connection$ActiveStreams.removeFromActiveStreams(DefaultHttp2Connection.java:985)
	at io.netty.handler.codec.http2.DefaultHttp2Connection$ActiveStreams.deactivate(DefaultHttp2Connection.java:941)
	at io.netty.handler.codec.http2.DefaultHttp2Connection$DefaultStream.close(DefaultHttp2Connection.java:497)
	at io.netty.handler.codec.http2.DefaultHttp2Connection$DefaultStream.close(DefaultHttp2Connection.java:503)
	at io.netty.handler.codec.http2.Http2ConnectionHandler.closeStream(Http2ConnectionHandler.java:587)
	at io.netty.handler.codec.http2.DefaultHttp2ConnectionDecoder$FrameReadListener.onRstStreamRead(DefaultHttp2ConnectionDecoder.java:356)
	at io.netty.handler.codec.http2.Http2InboundFrameLogger$1.onRstStreamRead(Http2InboundFrameLogger.java:80)
	at io.netty.handler.codec.http2.DefaultHttp2FrameReader.readRstStreamFrame(DefaultHttp2FrameReader.java:516)
	at io.netty.handler.codec.http2.DefaultHttp2FrameReader.processPayloadState(DefaultHttp2FrameReader.java:260)
	at io.netty.handler.codec.http2.DefaultHttp2FrameReader.readFrame(DefaultHttp2FrameReader.java:160)
	at io.netty.handler.codec.http2.Http2InboundFrameLogger.readFrame(Http2InboundFrameLogger.java:41)
	at io.netty.handler.codec.http2.DefaultHttp2ConnectionDecoder.decodeFrame(DefaultHttp2ConnectionDecoder.java:118)
	at io.netty.handler.codec.http2.Http2ConnectionHandler$FrameDecoder.decode(Http2ConnectionHandler.java:388)
	at io.netty.handler.codec.http2.Http2ConnectionHandler.decode(Http2ConnectionHandler.java:448)
	at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:489)
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:428)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:265)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:134)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
	at java.lang.Thread.run(Thread.java:745)

Client log:

INFO: Will try to greet  ...
Jan 24, 2018 6:44:16 PM io.grpc.internal.AbstractClientStream$TransportState inboundDataReceived
INFO: Received data on closed stream
Jan 24, 2018 6:44:16 PM io.grpc.internal.AbstractClientStream$TransportState inboundDataReceived
INFO: Received data on closed stream
Jan 24, 2018 6:44:16 PM io.grpc.internal.AbstractClientStream$TransportState inboundDataReceived
INFO: Received data on closed stream
Jan 24, 2018 6:44:16 PM io.grpc.internal.AbstractClientStream$TransportState inboundDataReceived
INFO: Received data on closed stream
Jan 24, 2018 6:44:16 PM io.grpc.internal.AbstractClientStream$TransportState inboundDataReceived
INFO: Received data on closed stream
Jan 24, 2018 6:44:16 PM io.grpc.examples.helloworld.HelloWorldClient greet
WARNING: RPC failed: Status{code=RESOURCE_EXHAUSTED, description=io.grpc.netty.NettyClientTransport$3: Frame size 59000005 exceeds maximum: 4194304. , cause=null}
Jan 24, 2018 6:44:16 PM io.grpc.internal.AbstractClientStream$TransportState inboundDataReceived
INFO: Received data on closed stream
Jan 24, 2018 6:44:16 PM io.grpc.internal.AbstractClientStream$TransportState inboundDataReceived
INFO: Received data on closed stream

When client request exceeds max message size

A valid exception, followed by an unexpected exception, in the server log:

WARNING: Exception processing message
io.grpc.StatusRuntimeException: RESOURCE_EXHAUSTED: io.grpc.netty.NettyServerStream$TransportState: Frame size 59000005 exceeds maximum: 4194304. 
	at io.grpc.Status.asRuntimeException(Status.java:517)
	at io.grpc.internal.MessageDeframer.processHeader(MessageDeframer.java:391)
	at io.grpc.internal.MessageDeframer.deliver(MessageDeframer.java:271)
	at io.grpc.internal.MessageDeframer.request(MessageDeframer.java:165)
	at io.grpc.internal.AbstractStream$TransportState.requestMessagesFromDeframer(AbstractStream.java:202)
	at io.grpc.netty.NettyServerStream$Sink$1.run(NettyServerStream.java:100)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
	at java.lang.Thread.run(Thread.java:745)

Jan 24, 2018 6:54:21 PM io.grpc.netty.NettyServerHandler onStreamError
WARNING: Stream Error
io.netty.handler.codec.http2.Http2Exception$StreamException: Received DATA frame for an unknown stream 3
	at io.netty.handler.codec.http2.Http2Exception.streamError(Http2Exception.java:129)
	at io.netty.handler.codec.http2.DefaultHttp2ConnectionDecoder$FrameReadListener.shouldIgnoreHeadersOrDataFrame(DefaultHttp2ConnectionDecoder.java:535)
	at io.netty.handler.codec.http2.DefaultHttp2ConnectionDecoder$FrameReadListener.onDataRead(DefaultHttp2ConnectionDecoder.java:187)
	at io.netty.handler.codec.http2.Http2InboundFrameLogger$1.onDataRead(Http2InboundFrameLogger.java:48)
	at io.netty.handler.codec.http2.DefaultHttp2FrameReader.readDataFrame(DefaultHttp2FrameReader.java:421)
	at io.netty.handler.codec.http2.DefaultHttp2FrameReader.processPayloadState(DefaultHttp2FrameReader.java:251)
	at io.netty.handler.codec.http2.DefaultHttp2FrameReader.readFrame(DefaultHttp2FrameReader.java:160)
	at io.netty.handler.codec.http2.Http2InboundFrameLogger.readFrame(Http2InboundFrameLogger.java:41)
	at io.netty.handler.codec.http2.DefaultHttp2ConnectionDecoder.decodeFrame(DefaultHttp2ConnectionDecoder.java:118)
	at io.netty.handler.codec.http2.Http2ConnectionHandler$FrameDecoder.decode(Http2ConnectionHandler.java:388)
	at io.netty.handler.codec.http2.Http2ConnectionHandler.decode(Http2ConnectionHandler.java:448)
	at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:489)
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:428)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:265)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:134)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
	at java.lang.Thread.run(Thread.java:745)

Client log:

Exception in thread "main" io.grpc.StatusRuntimeException: CANCELLED: HTTP/2 error code: CANCEL
Received Rst Stream
	at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:221)
	at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:202)
	at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:131)
	at io.grpc.examples.helloworld.GreeterGrpc$GreeterBlockingStub.sayHello(GreeterGrpc.java:160)
	at io.grpc.examples.helloworld.HelloWorldClient.greet(HelloWorldClient.java:60)
	at io.grpc.examples.helloworld.HelloWorldClient.main(HelloWorldClient.java:77)
@ejona86
Member

ejona86 commented Jan 25, 2018

There's a lot here. To confirm I understand, there are two problems:

  1. Log spam on client and server (in the various cases)
  2. A poor client-side error when the client exceeds the max size

Is that right?

@srujann
Author

srujann commented Jan 25, 2018

That's correct. In addition:
3. There is no useful error on the server side when the server response exceeds the max size.

@RiyaTyagi
Contributor

RiyaTyagi commented Jun 27, 2019

@srujann @ejona
I also tried to run the helloworld example with a request message larger than 4 MB, and I got only the valid exceptions in the server-side log (no unexpected exceptions). The client-side log also showed a reasonable message:

Jun 27, 2019 11:15:41 AM io.grpc.examples.helloworld.HelloWorldClient greet
INFO: Will try to greet ...  
Jun 27, 2019 11:15:41 AM io.grpc.examples.helloworld.HelloWorldClient greet
WARNING: RPC failed: Status{code=CANCELLED, description=HTTP/2 error code: CANCEL
Received Rst Stream, cause=null}

But when the response message is larger than 4 MB, I got the same exceptions as reported above.

@jclinn

jclinn commented Aug 31, 2021

@ejona86 Is this bug going to be fixed any time soon? I am using version 1.35.1 and still seeing the issue where the client does not receive an informative message when it sends a request exceeding 4 MB.

Server logs:
{"level":"WARNING","logNameSource":"io.grpc.netty.NettyServerStream","message":"Exception processing message","extendedMessage":"io.grpc.StatusRuntimeException: RESOURCE_EXHAUSTED: gRPC message exceeds maximum size 4194304: 6144592"}

Client logs:
Status{code=CANCELLED, description=RST_STREAM closed stream. HTTP/2 error code: CANCEL, cause=null}
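
Not a fix for the missing context, but as a workaround the limit itself can be raised on both sides so that legitimately large messages go through. A minimal sketch, assuming the Netty transport; the 64 MiB value and the port are illustrative choices, not values from this issue:

import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;
import io.grpc.Server;
import io.grpc.netty.NettyServerBuilder;

public class RaisedLimits {
  static final int MAX_MESSAGE_BYTES = 64 * 1024 * 1024; // 64 MiB, illustrative

  static Server buildServer() {
    return NettyServerBuilder.forPort(50051)
        .maxInboundMessageSize(MAX_MESSAGE_BYTES) // server accepts larger requests
        // .addService(new GreeterImpl())         // service registration elided
        .build();
  }

  static ManagedChannel buildChannel() {
    return ManagedChannelBuilder.forAddress("localhost", 50051)
        .usePlaintext()
        .maxInboundMessageSize(MAX_MESSAGE_BYTES) // client accepts larger responses
        .build();
  }
}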

@meoyawn

meoyawn commented Feb 3, 2022

Python server: fine

grpc-java v1.44.0
grpc-kotlin v1.2.1

OpenJDK 17.0.1 on macOS: all fine

OpenJDK 11.0.11 on Ubuntu:

io.grpc.StatusException: RESOURCE_EXHAUSTED: Received message larger than max (4516559 vs. 4194304)
    at io.grpc.Status.asException(Status.java:550)
    at io.grpc.kotlin.ClientCalls$rpcImpl$1$1$1.onClose(ClientCalls.kt:295)
    at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:562)
    at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:70)
    at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:743)
    at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:722)
    at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
    at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)

Config:

val maxMsgBytes = DecimalByteUnit.MEGABYTES.toBytes(100)

val serviceConfig = mapOf(
    "methodConfig" to listOf(
        mapOf(
            "name" to listOf(mapOf("service" to "yanic.Yanic")),
            "retryPolicy" to mapOf(
                "maxAttempts" to "20",
                "initialBackoff" to "0.1s",
                "maxBackoff" to "10s",
                "backoffMultiplier" to "2",
                "retryableStatusCodes" to listOf("UNAVAILABLE"),
            ),
            "waitForReady" to true,
            "maxRequestMessageBytes" to maxMsgBytes.toString(),
            "maxResponseMessageBytes" to maxMsgBytes.toString(),
        ),
    ),
)

val chan =
    Grpc.newChannelBuilderForAddress("localhost", port, InsecureChannelCredentials.create())
        .enableRetry()
        .defaultServiceConfig(serviceConfig)
        .retryBufferSize(maxMsgBytes)
        .maxInboundMessageSize(maxMsgBytes.toInt())
        .maxInboundMetadataSize(maxMsgBytes.toInt())
        .build()

val opts = CallOptions.DEFAULT
    .withWaitForReady()
    .withMaxInboundMessageSize(maxMsgBytes.toInt())
    .withMaxOutboundMessageSize(maxMsgBytes.toInt())

@ryanpbrewster
Contributor

I believe I have reproduced this issue on v1.64.0-SNAPSHOT (aka the tip of master) and understand the cause, at least in the Netty-based gRPC implementation:

  • the over-sized payload is detected inside the MessageDeframer and propagated as an exception
  • that exception is caught by an AbstractStream, which passes the underlying Status to a NettyServerStream.TransportState (specifically the deframeFailed method)
  • that enqueues a CancelServerStreamCommand, which closes the stream without notifying the peer of the status

So the client sees RST_STREAM with no context.

If we want the client to receive the status, I think we'll need to explicitly send trailers (which should also serve to close the stream). I think that ryanpbrewster#1 works, but I will clean it up and send it out for review.
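
To illustrate why the status code matters to the caller, a hedged sketch of client-side handling (the blocking stub and request are placeholders from the helloworld example; this is not code from the proposed fix):

import io.grpc.Status;
import io.grpc.StatusRuntimeException;
import io.grpc.examples.helloworld.GreeterGrpc;
import io.grpc.examples.helloworld.HelloRequest;

class StatusHandling {
  // Today an oversized request surfaces as CANCELLED ("RST_STREAM closed stream"),
  // indistinguishable from other cancellations; trailers would let the caller see
  // RESOURCE_EXHAUSTED with the size details instead.
  static String greet(GreeterGrpc.GreeterBlockingStub stub, HelloRequest request) {
    try {
      return stub.sayHello(request).getMessage();
    } catch (StatusRuntimeException e) {
      if (e.getStatus().getCode() == Status.Code.RESOURCE_EXHAUSTED) {
        return "too large: " + e.getStatus().getDescription(); // actionable for the caller
      }
      return "failed without context: " + e.getStatus(); // what happens today (CANCELLED)
    }
  }
}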

@ejona86
Member

ejona86 commented Apr 2, 2024

The difficulty with fixing this is that the deframer and framer run on different threads, and there's no way for us to enqueue work onto the framing thread. Cancellation/resets are already expected to be racy and are handled cleanly. But if the deframer closes with trailers, that 1) can trigger in the middle of sending a message, which the receiver won't like (but that's probably okay), and 2) will require the Netty handler to notice and throw away later data and trailers created by the framer.

To explain (1) a bit more: when serializing protobuf, the framer produces chunks of data containing zero or more messages. A large message is split into multiple chunks of data. Those are passed via a queue from the framing thread to the Netty handler, where they are sent. Any trailers triggered by the deframer will go into that same queue.

@ryanpbrewster
Contributor

If the deframer enqueues a command to send trailers + close, my understanding is that:

  • making the receiver unhappy is probably okay, because the stream is going to be closed one way or another, and at least this way they get some context about why
  • if the stream is closed, won't the handler automatically notice & discard any future data (because the underlying stream won't exist anymore)?

@ejona86
Member

ejona86 commented Apr 2, 2024

"making the receiver unhappy is probably okay, because the stream is going to be closed one way or another, and at least this way they get some context about why"

Well, my point was that right now you wouldn't, in certain cases. You will see an error about a truncated message. I still think that's okay, and it is a step in the right direction. We just may need future work in the receiver to expose "an error in the trailers if the trailers had an error, otherwise generate an error."

"if the stream is closed, won't the handler automatically notice & discard any future data"

Once you get to a certain level, yes. But plenty of this code has never needed to deal with this so 1) we need to audit/be prepared for broken code and 2) consider how errors are handled.

@ryanpbrewster
Contributor

Makes sense, and understood. I'm going to open a PR and link it to this issue. If you don't mind, I'll tag you there and we can discuss the specific implementation.

ryanpbrewster added a commit to ryanpbrewster/grpc-java that referenced this issue Apr 24, 2024
Today, deframer errors cancel the stream without communicating a status code
to the peer. This change causes deframer errors to trigger a best-effort
attempt to send trailers with a status code so that the peer understands
why the stream is being closed.

Fixes grpc#3996
@ejona86 ejona86 modified the milestones: Next, 1.64 Apr 24, 2024
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 24, 2024