Server-side gRPC flow control results in small window sizes and client message backlog #11723

Closed
jammyMarse opened this issue Dec 3, 2024 · 5 comments

@jammyMarse

We are experiencing an issue with gRPC where the server side stalls, leading to very small flow-control window sizes. This causes client messages to accumulate in Netty's DefaultHttp2RemoteFlowController.pendingWriteQueue.

In high-throughput scenarios with many concurrent streams, we have noticed that even when using stream.isReady() for flow control, off-heap memory usage grows significantly, ultimately leading to an out-of-memory (OOM) situation.
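
For context, by "flow control" we mean the standard onReady/isReady backpressure pattern, roughly like this minimal sketch (the helper name and the Iterator-based message source are illustrative, not our actual code):

```java
import io.grpc.stub.CallStreamObserver;
import java.util.Iterator;
import java.util.concurrent.atomic.AtomicBoolean;

final class ReadyAwareWriter {
  /**
   * Installs an onReady handler that drains {@code messages} only while the transport
   * reports it can accept more data, so each stream buffers at most roughly one
   * onReady threshold's worth of bytes (32 KiB by default) below the application.
   */
  static <T> void writeWhenReady(CallStreamObserver<T> stream, Iterator<T> messages) {
    AtomicBoolean completed = new AtomicBoolean();
    stream.setOnReadyHandler(() -> {
      // Write only while isReady() is true; stop as soon as the stream reports backpressure.
      while (stream.isReady() && messages.hasNext()) {
        stream.onNext(messages.next());
      }
      if (!messages.hasNext() && completed.compareAndSet(false, true)) {
        stream.onCompleted();
      }
    });
  }
}
```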

Currently, flow control operates primarily at the HTTP/2 layer. Would it be possible to expose some TCP-level signals, such as a TCP equivalent of isReady()? This could enable finer-grained resource management, especially in high-concurrency situations.

More granular control over traffic at the TCP level might help alleviate flow-control-related performance problems while also reducing the risk of OOM from excessive memory pressure.

Is there any plan from the gRPC team to consider such TCP-level improvements in future releases? Or, are there other recommended approaches to address such issues?

Thank you!

@kannanjgithub
Contributor

kannanjgithub commented Dec 6, 2024

Is it possible to scale up your server side so it does not stall?
As for the client side, I see a discussion Eric (@ejona86) had with the Netty team regarding flow control, and some fixes came out of it.

@ejona86
Member

ejona86 commented Dec 6, 2024

TCP is sorta neither here nor there, as we have per-connection flow control already. It is also very hard to use such signals while also avoiding stream starvation/unfairness.

Do you have a limit to the number of concurrent RPCs you're performing? Unary RPCs don't have flow control either, and if you create an unbounded number of them you'd experience the same problem.
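
One common way to create such a limit is to gate calls with a semaphore and release the permit when the call finishes; a sketch only, not something from this thread (the wrapper class, the Supplier-based call site, and the limit of 100 are illustrative):

```java
import com.google.common.util.concurrent.ListenableFuture;
import com.google.common.util.concurrent.MoreExecutors;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Bounds the number of in-flight RPCs started through it. */
final class BoundedCaller {
  private final Semaphore permits = new Semaphore(100); // illustrative limit

  <T> ListenableFuture<T> call(Supplier<ListenableFuture<T>> startRpc) throws InterruptedException {
    permits.acquire(); // blocks once 100 RPCs are already outstanding
    ListenableFuture<T> future;
    try {
      future = startRpc.get();
    } catch (RuntimeException e) {
      permits.release(); // don't leak a permit if the call fails to start
      throw e;
    }
    // Release the permit whenever the RPC completes: success, error, or cancellation.
    future.addListener(permits::release, MoreExecutors.directExecutor());
    return future;
  }
}
```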

@hlx502
Contributor

hlx502 commented Dec 10, 2024

> TCP is sorta neither here nor there, as we have per-connection flow control already. It is also very hard to use such signals while also avoiding stream starvation/unfairness.
>
> Do you have a limit to the number of concurrent RPCs you're performing? Unary RPCs don't have flow control either, and if you create an unbounded number of them you'd experience the same problem.

Currently we check onReady before calling onNext, but memory still overflows because there are too many streams. In our real business scenarios, however, we genuinely need streams at this magnitude. Is there a plan to expose the mapping between a Netty channel and its streams to the application layer?
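
For reference, the onReady-before-onNext wiring on a client streaming call looks roughly like this, reusing the ReadyAwareWriter helper from the earlier sketch (class and field names are illustrative):

```java
import io.grpc.stub.ClientCallStreamObserver;
import io.grpc.stub.ClientResponseObserver;
import java.util.Iterator;

/** Response observer that installs a ready-aware writer on the request stream. */
final class BackpressuredClientObserver<ReqT, RespT>
    implements ClientResponseObserver<ReqT, RespT> {
  private final Iterator<ReqT> requests; // illustrative request source

  BackpressuredClientObserver(Iterator<ReqT> requests) {
    this.requests = requests;
  }

  @Override
  public void beforeStart(ClientCallStreamObserver<ReqT> requestStream) {
    // Write only from inside the onReady handler; unconditional onNext() calls are
    // what pile up in DefaultHttp2RemoteFlowController.pendingWriteQueue.
    ReadyAwareWriter.writeWhenReady(requestStream, requests);
  }

  @Override public void onNext(RespT response) { /* handle response */ }
  @Override public void onError(Throwable t) { /* handle error */ }
  @Override public void onCompleted() { /* stream finished */ }
}
```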

@ejona86
Member

ejona86 commented Dec 10, 2024

We aren't going to expose TCP details to the streams. It wouldn't work as it is inherently unfair.

This issue doesn't really describe your problem; I would hope to see numbers. It asserts the problem and expects a particular solution, but there are other options. Is the memory use actually expected, or is the problem simply #11719? Or should the per-stream buffer be reduced in this case from its default of 32 KiB using CallOptions.withOnReadyThreshold()? Or is the memory use expected, and you should increase the JVM's -XX:MaxDirectMemorySize? Or are you simply trying to do too many RPCs concurrently for the machine size? To answer those sorts of questions we'd need to understand the memory impact you are seeing, the number of concurrent RPCs, and the approximate message size.
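
If lowering the threshold helps, one way to apply it to every call on a channel is a ClientInterceptor that rewrites the CallOptions; a sketch, assuming the CallOptions.withOnReadyThreshold(int) method mentioned above (the interceptor name and the 8 KiB value are illustrative):

```java
import io.grpc.CallOptions;
import io.grpc.Channel;
import io.grpc.ClientCall;
import io.grpc.ClientInterceptor;
import io.grpc.MethodDescriptor;

/** Lowers the per-stream onReady threshold for every call made through the channel. */
final class OnReadyThresholdInterceptor implements ClientInterceptor {
  private final int thresholdBytes;

  OnReadyThresholdInterceptor(int thresholdBytes) {
    this.thresholdBytes = thresholdBytes;
  }

  @Override
  public <ReqT, RespT> ClientCall<ReqT, RespT> interceptCall(
      MethodDescriptor<ReqT, RespT> method, CallOptions callOptions, Channel next) {
    // A lower threshold makes isReady() report false sooner, shrinking how much
    // each stream buffers before application-level backpressure kicks in.
    return next.newCall(method, callOptions.withOnReadyThreshold(thresholdBytes));
  }
}
```

Registering it with, for example, ManagedChannelBuilder.forTarget(...).intercept(new OnReadyThresholdInterceptor(8 * 1024)) would then apply the lower threshold channel-wide.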

@ejona86
Member

ejona86 commented Dec 26, 2024

Without knowing more, I don't see anything more that can be done here. Closing, but comment with more info and it can be reopened.

Note also that if direct memory is the main problem (the amount of memory is fine, just the type of memory is the problem), it is possible to make your own ByteBufAllocator instance that prefers heap memory and pass it to gRPC's builders with Netty's ChannelOption.ALLOCATOR. For an application, you can maybe use the system property -Dio.netty.noPreferDirect=true instead (it is easy to use, but you may not want it to apply process-wide).
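
A sketch of that allocator approach for a client channel (the class name and target are illustrative; a pooled allocator constructed with preferDirect=false hands out heap buffers from buffer()):

```java
import io.grpc.ManagedChannel;
import io.grpc.netty.NettyChannelBuilder;
import io.netty.buffer.PooledByteBufAllocator;
import io.netty.channel.ChannelOption;

final class HeapPreferringChannelFactory {
  static ManagedChannel create(String target) {
    // preferDirect=false means buffer() returns pooled heap buffers, so the memory
    // that queues up behind a slow peer is heap memory rather than direct memory.
    PooledByteBufAllocator heapPreferring = new PooledByteBufAllocator(/* preferDirect= */ false);
    return NettyChannelBuilder.forTarget(target)
        .withOption(ChannelOption.ALLOCATOR, heapPreferring)
        .build();
  }
}
```

The server-side Netty builder accepts the same ChannelOption through its own withOption/withChildOption methods.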

ejona86 closed this as not planned Dec 26, 2024