WIP always above zero in SimpleDequePool #118
Hi @machao23, and thanks for the report. There have been a few bugs fixed in later versions of reactor-netty and reactor-pool; could you confirm you still see the issue with the latest versions? I'll look into what pieces of the …
@simonbasle Thank you for replying. This issue has only happened once in several months, so it will probably take a lot of time to check whether the latest version works well.
Can you let us know more about your netty/reactor-netty usage? For instance, are you setting up Netty handlers, codecs, etc.? What does the configuration look like? The stack trace you provided seems to indicate that something goes wrong during the encoding phase.
cc @violetagg
@simonbasle Our code related to this looks like the following:
return () -> httpClient.headers(h -> header.forEach(h::add))
.tcpConfiguration(tcpClient -> {
if (!Strings.isNullOrEmpty(searchContext.getProxyHost())) {
tcpClient = tcpClient.proxy((spec) ->
spec.type(ProxyProvider.Proxy.HTTP)
.host(searchContext.getProxyHost())
.port(searchContext.getProxyPort()));
NettyWebClient.LOGGER.info("tcp proxy ProxyHost is {}, ProxyPort is {}",
searchContext.getProxyHost(), searchContext.getProxyPort());
}
return tcpClient.option(ChannelOption.CONNECT_TIMEOUT_MILLIS,
(int) searchContext.getTimeout().getSeconds() * 1000)
.doOnConnected(connection -> connection.addHandlerLast(
new ReadTimeoutHandler(searchContext.getTimeout().toMillis(),
TimeUnit.MILLISECONDS))
.addHandlerLast(new WriteTimeoutHandler(
searchContext.getTimeout().toMillis(), TimeUnit.MILLISECONDS)));
})
.post()
.uri(searchContext.getSearchUrl())
.send(Mono.just(Unpooled.wrappedBuffer(bytes)))
.responseSingle((resp, buf) -> {
HttpResponseStatus httpResponseStatus = resp.status();
HttpHeaders httpHeaders = resp.responseHeaders();
response.setStatusCode(httpResponseStatus.code());
response.setContentLength(getContentLength(httpHeaders.get("Content-Length")));
return buf;
})
.map(ByteBuf::retain)
.map(byteBuf -> {
response.setResponseBodyStream(new ByteBufInputStream(byteBuf, true));
return response;
        });
Yes, it threw …
@machao23 I tried to reproduce the issue based on the example above, but with no success. Now some Netty code: it seems that the encoder here fails and releases the byte buffer (unfortunately I cannot say why). Exceptions in Netty are not just thrown but propagated via promises. If the promise was already marked as failed (remember, the encoder failed for some reason), Netty will just log that the new exception cannot be propagated because the promise was already marked as failed, and that is actually the WARNING that you observed.
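In plain Netty terms, this is roughly the model being described: a write failure is not thrown at the call site but reported through the returned ChannelFuture. A minimal sketch (the channel and message here are placeholders, not taken from this issue):

import io.netty.channel.Channel;
import io.netty.channel.ChannelFutureListener;

public class PromiseFailureSketch {

    // Writes a message and observes a failure through the promise rather than
    // through a thrown exception. An encoder failure ends up as the promise's
    // cause; nothing is thrown at the writeAndFlush call site.
    static void writeAndObserve(Channel channel, Object message) {
        channel.writeAndFlush(message).addListener((ChannelFutureListener) future -> {
            if (!future.isSuccess()) {
                future.cause().printStackTrace();
            }
        });
    }
}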
@violetagg Thank you for your clarification. I ask because I don't find any other new exceptions in our logs. Please confirm it, thanks.
How many of those do you observe? What's your connection pool configuration (especially max connections)?
If you run with the default configuration, you would need many exceptions in order to exhaust the connection pool, although the one above should not lead to a connection not being returned to the pool.
@violetagg Just a few; it has only occurred with …
Yes, we do run with the default connection pool configuration.
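For reference, with the default provider the maximum number of connections is implicit; a minimal sketch of making it explicit, assuming the reactor-netty 0.9.x API used in this issue (the pool name and size below are illustrative):

import reactor.netty.http.client.HttpClient;
import reactor.netty.resources.ConnectionProvider;

public class PoolConfigSketch {

    // Builds an HttpClient backed by a fixed-size connection pool so that the
    // "max connections" limit is stated explicitly instead of relying on the default.
    static HttpClient clientWithFixedPool() {
        ConnectionProvider provider = ConnectionProvider.fixed("search-pool", 100);
        return HttpClient.create(provider);
    }
}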
@machao23 I had the wrong impression that you were seeing connections always staying active and never being returned to the pool.
@violetagg Maybe I didn't express my meaning clearly before. Let me explain again:
Please see my comment above: exceptions in Netty are propagated via promises, i.e. the promise is marked as failed and you can obtain the exception from the promise; it is not just thrown.
@violetagg Oh, I see.
@simonbasle @machao23 I was able to reproduce the warning with this test case: reactor/reactor-netty@7b08d3e
@violetagg Thanks for your code, I will debug this case later. I noticed you mentioned …
Yes.
@violetagg Could you please tell me where the …
@violetagg The test failed on … The exception stack is:
java.lang.AssertionError: expectation "expectError(Class)" failed (expected error of type: EncoderException; actual type: java.io.IOException: 远程主机强迫关闭了一个现有的连接。 [An existing connection was forcibly closed by the remote host])
at reactor.test.MessageFormatter.assertionError(MessageFormatter.java:115)
at reactor.test.MessageFormatter.failPrefix(MessageFormatter.java:104)
at reactor.test.MessageFormatter.fail(MessageFormatter.java:73)
at reactor.test.MessageFormatter.failOptional(MessageFormatter.java:88)
at reactor.test.DefaultStepVerifierBuilder.lambda$expectError$6(DefaultStepVerifierBuilder.java:370)
at reactor.test.DefaultStepVerifierBuilder$SignalEvent.test(DefaultStepVerifierBuilder.java:2213)
at reactor.test.DefaultStepVerifierBuilder$DefaultVerifySubscriber.onSignal(DefaultStepVerifierBuilder.java:1485)
at reactor.test.DefaultStepVerifierBuilder$DefaultVerifySubscriber.onExpectation(DefaultStepVerifierBuilder.java:1433)
at reactor.test.DefaultStepVerifierBuilder$DefaultVerifySubscriber.onError(DefaultStepVerifierBuilder.java:1092)
at reactor.core.publisher.FluxMap$MapSubscriber.onError(FluxMap.java:126)
at reactor.core.publisher.MonoFlatMapMany$FlatMapManyMain.onError(MonoFlatMapMany.java:197)
at reactor.core.publisher.SerializedSubscriber.onError(SerializedSubscriber.java:124)
at reactor.core.publisher.FluxRetryWhen$RetryWhenMainSubscriber.whenError(FluxRetryWhen.java:213)
at reactor.core.publisher.FluxRetryWhen$RetryWhenOtherSubscriber.onError(FluxRetryWhen.java:255)
at reactor.core.publisher.FluxConcatMap$ConcatMapImmediate.drain(FluxConcatMap.java:406)
at reactor.core.publisher.FluxConcatMap$ConcatMapImmediate.onNext(FluxConcatMap.java:243)
at reactor.core.publisher.DirectProcessor$DirectInner.onNext(DirectProcessor.java:333)
at reactor.core.publisher.DirectProcessor.onNext(DirectProcessor.java:142)
at reactor.core.publisher.SerializedSubscriber.onNext(SerializedSubscriber.java:99)
at reactor.core.publisher.FluxRetryWhen$RetryWhenMainSubscriber.onError(FluxRetryWhen.java:180)
at reactor.core.publisher.MonoCreate$DefaultMonoSink.error(MonoCreate.java:185)
at reactor.netty.http.client.HttpClientConnect$HttpObserver.onUncaughtException(HttpClientConnect.java:407)
at reactor.netty.ReactorNetty$CompositeConnectionObserver.onUncaughtException(ReactorNetty.java:511)
at reactor.netty.resources.PooledConnectionProvider$DisposableAcquire.onUncaughtException(PooledConnectionProvider.java:549)
at reactor.netty.resources.PooledConnectionProvider$PooledConnection.onUncaughtException(PooledConnectionProvider.java:385)
at reactor.netty.channel.FluxReceive.drainReceiver(FluxReceive.java:230)
at reactor.netty.channel.FluxReceive.onInboundError(FluxReceive.java:435)
at reactor.netty.channel.ChannelOperations.onInboundError(ChannelOperations.java:442)
at reactor.netty.channel.ChannelOperationsHandler.exceptionCaught(ChannelOperationsHandler.java:129)
at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:302)
at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:281)
at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:273)
at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireExceptionCaught(CombinedChannelDuplexHandler.java:424)
at io.netty.channel.ChannelHandlerAdapter.exceptionCaught(ChannelHandlerAdapter.java:92)
at io.netty.channel.CombinedChannelDuplexHandler$1.fireExceptionCaught(CombinedChannelDuplexHandler.java:145)
at io.netty.channel.ChannelInboundHandlerAdapter.exceptionCaught(ChannelInboundHandlerAdapter.java:143)
at io.netty.channel.CombinedChannelDuplexHandler.exceptionCaught(CombinedChannelDuplexHandler.java:231)
at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:302)
at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:281)
at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:273)
at io.netty.handler.logging.LoggingHandler.exceptionCaught(LoggingHandler.java:205)
at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:302)
at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:281)
at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:273)
at io.netty.channel.DefaultChannelPipeline$HeadContext.exceptionCaught(DefaultChannelPipeline.java:1377)
at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:302)
at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:281)
at io.netty.channel.DefaultChannelPipeline.fireExceptionCaught(DefaultChannelPipeline.java:907)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.handleReadException(AbstractNioByteChannel.java:125)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:174)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:834)
Suppressed: java.io.IOException: 远程主机强迫关闭了一个现有的连接。 [An existing connection was forcibly closed by the remote host]
at java.base/sun.nio.ch.SocketDispatcher.read0(Native Method)
at java.base/sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:43)
at java.base/sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:276)
at java.base/sun.nio.ch.IOUtil.read(IOUtil.java:233)
at java.base/sun.nio.ch.IOUtil.read(IOUtil.java:223)
at java.base/sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:358)
at io.netty.buffer.PooledByteBuf.setBytes(PooledByteBuf.java:253)
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1133)
at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:350)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:148)
... 8 more
@machao23 What OS and JDK do you use?
@violetagg Windows 10 + OpenJDK 11+28
@machao23 This issue might be related to this one: reactor/reactor-netty#1622
@violetagg @Poolkey[ …
@tenghaoxiang Are you able to provide a reproducible example, or at least steps for how to try to reproduce this?
I'm using reactor-netty to make API calls and am facing an issue where reactor-netty is not able to create new connections for incoming requests. I debugged the code and found that the result of WIP.getAndIncrement(this) is always above zero.
Expected Behavior
Even if the drainLoop is broken by some requests, it shouldn't affect further requests.
Actual Behavior
The drainLoop is never called anymore after some exceptions occurred.
Steps to Reproduce
Unfortunately, I don't know how to reproduce this because I don't know the root cause.
If this helps in any manner, I checked the logs on prod and there were some exceptions that occurred during drainLoop:
Possible Solution
From what I understand of the source code of SimpleDequePool.java, WIP is incremented while checking the (WIP.getAndIncrement(this) == 0) condition, and drainLoop() is then responsible for decrementing WIP itself. If an exception occurs during drainLoop(), the decrement is skipped, so further requests to drain() will never pass that if condition. So I think it would be better to implement a fallback that decrements WIP in case something goes wrong during drainLoop().
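To make that concrete, here is a simplified sketch of the drain()/drainLoop() WIP pattern and the suggested fallback, using a plain AtomicInteger rather than SimpleDequePool's actual field updater (class and method names below are illustrative, not the real reactor-pool code):

import java.util.concurrent.atomic.AtomicInteger;

public class DrainLoopSketch {

    private final AtomicInteger wip = new AtomicInteger();

    // Only the caller that moves WIP from 0 to 1 enters drainLoop(); all later
    // callers rely on the loop that is already running to pick up their work.
    void drain() {
        if (wip.getAndIncrement() == 0) {
            drainLoop();
        }
    }

    private void drainLoop() {
        int missed = 1;
        for (;;) {
            try {
                deliverToPendingBorrowers();
            } catch (RuntimeException error) {
                // Suggested fallback: reset WIP so a failure inside the loop
                // cannot leave it stuck above zero and block every later drain().
                wip.set(0);
                throw error;
            }
            missed = wip.addAndGet(-missed);
            if (missed == 0) {
                break;
            }
        }
    }

    private void deliverToPendingBorrowers() {
        // Placeholder for the real work: hand pooled resources to waiting borrowers.
    }
}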
Your Environment
reactor-netty: 0.9.14.RELEASE
netty: 4.1.51.Final
JVM version: openjdk version "11.0.4"
OS and version: 4.19.118-1.el7.centos.x86_64