Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use directBuffer with 8m initial capacity #1105

Closed
wants to merge 1 commit into from

Conversation

ppalaga
Copy link
Contributor

@ppalaga ppalaga commented Nov 10, 2023

@ppalaga ppalaga marked this pull request as draft November 10, 2023 21:50
Copy link

@franz1981 franz1981 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be equally dangerous (especially if the write happen in the blocking thread pool executor), see quarkusio/quarkus#32546.

I strongly suggest to try reuse quarkusio/quarkus#32858 (AppendBuffer and maybe the stream itself?) and expose both min and buffer size, to both save number of packets creations AND to avoid duplicated code

wdyt?

@geoand
Copy link

geoand commented Nov 11, 2023

Definitely +1 to using the same approach as @franz1981 introduced in RESTEasy Reactive as if has worked out really well for us

@franz1981
Copy link

franz1981 commented Nov 11, 2023

Just fyi (to be verified): https://github.com/netty/netty/blob/4.1/buffer/src/main/java/io/netty/buffer/PooledByteBufAllocator.java#L122 32K is the max cacheable size for pooled allocations, which means that 8 mb will not be pooled AND could create other problems eg https://sourceware.org/bugzilla/show_bug.cgi?id=11261

Tldr performing malloc/free (used under the hood by Netty for unspooled big allocations) can cause RSS increases due to thread-local arenas of glibc malloc.

@franz1981
Copy link

franz1981 commented Nov 13, 2023

@ppalaga can you share here the stack trace which lead to VertxServletOutputStream's method call?

Maybe there's some interesting low hanging fruit which would avoid other intermediate big allocation(s) as well (eg an intermediate encoded byte[] out of a String), thanks to Netty's ability to directly encode char sequence(s) into the pooled buffers ie https://netty.io/4.1/api/io/netty/buffer/ByteBuf.html#writeCharSequence-java.lang.CharSequence-java.nio.charset.Charset-.
This last thing work well if we have a mechanism to know upfront the length of the encoded data, which depends by the charset (and UTF-8's in Netty requires O(n) at https://netty.io/4.1/api/io/netty/buffer/ByteBufUtil.html#utf8Bytes-java.lang.CharSequence-): luckly in Undertow, at undertow-io/undertow#1424. I've created a batchy method to quickly check if the String is ASCII, which will make both encoding and presizing it in Netty straightforward (encoded length := inputlength).

@ppalaga
Copy link
Contributor Author

ppalaga commented Nov 13, 2023

can you share here the stack trace which lead to VertxServletOutputStream's method call?

Yep, here it is:

java.lang.RuntimeException
     at io.quarkiverse.cxf.transport.VertxServletOutputStream.awaitWriteable(VertxServletOutputStream.java:185)
     at io.quarkiverse.cxf.transport.VertxServletOutputStream.write(VertxServletOutputStream.java:150)
     at io.quarkiverse.cxf.transport.VertxServletOutputStream.writeBlocking(VertxServletOutputStream.java:124)
     at io.quarkiverse.cxf.transport.VertxServletOutputStream.write(VertxServletOutputStream.java:111)
     at org.apache.cxf.io.AbstractWrappedOutputStream.write(AbstractWrappedOutputStream.java:51)
     at com.ctc.wstx.io.UTF8Writer.write(UTF8Writer.java:143)
     at com.ctc.wstx.sw.BufferingXmlWriter.flushBuffer(BufferingXmlWriter.java:1417)
     at com.ctc.wstx.sw.BufferingXmlWriter.writeRaw(BufferingXmlWriter.java:256)
     at com.ctc.wstx.sw.BufferingXmlWriter.writeCharacters(BufferingXmlWriter.java:600)
     at com.ctc.wstx.sw.BaseStreamWriter.writeCharacters(BaseStreamWriter.java:467)
     at org.apache.cxf.staxutils.StaxUtils.copy(StaxUtils.java:759)
     at org.apache.cxf.staxutils.StaxUtils.copy(StaxUtils.java:707)
     at org.apache.cxf.jaxws.handler.logical.LogicalHandlerOutInterceptor$LogicalHandlerOutEndingInterceptor.handleMessage(LogicalHandlerOutInterceptor.java:183)
     at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:307)
     at org.apache.cxf.interceptor.OutgoingChainInterceptor.handleMessage(OutgoingChainInterceptor.java:90)
     at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:307)
     at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121)
     at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:265)
     at org.apache.cxf.transport.servlet.ServletController.invokeDestination(ServletController.java:233)
     at org.apache.cxf.transport.servlet.ServletController.invoke(ServletController.java:207)
     at org.apache.cxf.transport.servlet.ServletController.invoke(ServletController.java:159)
     at io.quarkiverse.cxf.transport.CxfHandler.process(CxfHandler.java:219)
     at io.quarkiverse.cxf.transport.CxfHandler.handle(CxfHandler.java:156)
     at io.quarkiverse.cxf.transport.CxfHandler.handle(CxfHandler.java:40)
     at io.vertx.ext.web.impl.BlockingHandlerDecorator.lambda$handle$0(BlockingHandlerDecorator.java:48)
     at io.vertx.core.impl.ContextBase.lambda$executeBlocking$1(ContextBase.java:180)
     at io.vertx.core.impl.ContextInternal.dispatch(ContextInternal.java:277)
     at io.vertx.core.impl.ContextBase.lambda$internalExecuteBlocking$2(ContextBase.java:199)
     at io.quarkus.vertx.core.runtime.VertxCoreRecorder$14.runWith(VertxCoreRecorder.java:582)

@franz1981
Copy link

franz1981 commented Nov 13, 2023

Thanks @ppalaga so...quick analysis:
https://github.com/FasterXML/woodstox/blob/8b49745c1277ef75d880782ee92cf336a9c15395/src/main/java/com/ctc/wstx/io/UTF8Writer.java#L143 "seems" to suggest that we cannot save the Netty ByteBuf pooled allocation by "stealing" the byte[] in https://github.com/FasterXML/woodstox/blob/8b49745c1277ef75d880782ee92cf336a9c15395/src/main/java/com/ctc/wstx/io/UTF8Writer.java#L50 , given that's reused to perform UTF-8 encoding (it's the same for Json encoding in Jackson actually - sameol' sameol').

Due to this, I strongly recommend to try reuse the existing jackson approach mentioned earlier, because it's both fixing the issue and preventing further performance issues (till the next one :P).

@ppalaga
Copy link
Contributor Author

ppalaga commented Nov 13, 2023

Just fyi (to be verified): https://github.com/netty/netty/blob/4.1/buffer/src/main/java/io/netty/buffer/PooledByteBufAllocator.java#L122 32K is the max cacheable size for pooled allocations, which means that 8 mb will not be pooled AND could create other problems eg https://sourceware.org/bugzilla/show_bug.cgi?id=11261

Thanks for the hint, but the idea with 8m was not meant for merging. It was rather a quick fix to figure out whether the approach brings any improvement at all.

I strongly recommend to try reuse the existing jackson approach mentioned earlier

You mean the AppendBuffer? I created a new PR #1106 Let's continue the discussion there.

@ppalaga ppalaga closed this Nov 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants