Error using Spring cloud gateway - IllegalStateException: channel not registered to an event loop - Client waits infinite for response #3559

Closed
FrankKlOS opened this issue Dec 22, 2024 · 4 comments · Fixed by #3581
Labels: type/bug (a general bug), warn/regression (a regression from a previous release)

Comments

FrankKlOS commented Dec 22, 2024

We occasionally see the following exception in our logs:

dmssidecar.7301           : 2024-12-22T15:34:54.038+01:00  WARN 30048 --- [dms] [ctor-http-nio-5] i.n.u.concurrent.AbstractEventExecutor   : A task raised an exception. Task: reactor.netty.channel.FluxReceive$$Lambda$2120/0x000001c60190c3c8@7dc94a20
dmssidecar.7301           : 
dmssidecar.7301           : java.lang.IllegalStateException: channel not registered to an event loop
dmssidecar.7301           : 	at io.netty.channel.AbstractChannel.eventLoop(AbstractChannel.java:163)
dmssidecar.7301           : 	at io.netty.channel.AbstractChannelHandlerContext.executor(AbstractChannelHandlerContext.java:132)
dmssidecar.7301           : 	at io.netty.channel.AbstractChannelHandlerContext.findContextOutbound(AbstractChannelHandlerContext.java:1079)
dmssidecar.7301           : 	at io.netty.channel.AbstractChannelHandlerContext.read(AbstractChannelHandlerContext.java:821)
dmssidecar.7301           : 	at io.netty.channel.DefaultChannelPipeline.read(DefaultChannelPipeline.java:953)
dmssidecar.7301           : 	at io.netty.channel.AbstractChannel.read(AbstractChannel.java:289)
dmssidecar.7301           : 	at io.netty.channel.DefaultChannelConfig.setAutoRead(DefaultChannelConfig.java:341)
dmssidecar.7301           : 	at reactor.netty.channel.FluxReceive.drainReceiver(FluxReceive.java:338)
dmssidecar.7301           : 	at reactor.netty.channel.FluxReceive.lambda$request$1(FluxReceive.java:136)
dmssidecar.7301           : 	at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
dmssidecar.7301           : 	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
dmssidecar.7301           : 	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
dmssidecar.7301           : 	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
dmssidecar.7301           : 	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
dmssidecar.7301           : 	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
dmssidecar.7301           : 	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
dmssidecar.7301           : 	at java.base/java.lang.Thread.run(Thread.java:840)

When this exception occurs, the client call stalls. If no socket timeout is set, the client waits indefinitely for a response. As a result, some of our services fail to start up or fail at runtime.
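
As a client-side stopgap, a request timeout turns a stalled call into an error instead of an endless wait. A minimal sketch using the JDK's `java.net.http` client; `TimeoutClient` and `getWithTimeout` are hypothetical names for illustration and are not part of the reporter's setup:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.net.http.HttpTimeoutException;
import java.time.Duration;

// Illustrative only: TimeoutClient and getWithTimeout are hypothetical names.
class TimeoutClient {

    /** Sends a GET; returns the status code, or -1 if no response arrived within the timeout. */
    static int getWithTimeout(URI uri, Duration timeout) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(timeout)   // bound the connect phase
                .build();
        HttpRequest request = HttpRequest.newBuilder(uri)
                .timeout(timeout)          // bound the wait for the response
                .GET()
                .build();
        try {
            return client.send(request, HttpResponse.BodyHandlers.discarding()).statusCode();
        } catch (HttpTimeoutException e) {
            return -1;                     // a stalled call surfaces as a timeout instead of hanging
        }
    }
}
```

This does not address the gateway-side exception; it only keeps callers from blocking forever while the problem is present.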

Expected Behavior

The call should succeed and Spring Cloud Gateway should forward the request.

Actual Behavior

The request stalls and the gateway never returns a response. This does not happen on every request, but around 2-3% of requests fail.

Steps to Reproduce

A multithreaded test HTTP client was used to reproduce the behaviour. It should be reproducible with any HTTP load-testing framework.
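
The reporter's test client is not included in the issue. A rough sketch of what such a multithreaded probe could look like, using only the JDK; `LoadProbe`, its parameters, and the use of plain GET requests are assumptions made for illustration:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative only: LoadProbe is a hypothetical helper, not the reporter's client.
class LoadProbe {

    /** Sends `requests` GETs from `threads` workers; returns {completed, failedOrTimedOut}. */
    static int[] run(URI target, int threads, int requests, Duration timeout)
            throws InterruptedException {
        HttpClient client = HttpClient.newBuilder().connectTimeout(timeout).build();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Integer>> calls = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                calls.add(() -> {
                    // The per-request timeout makes a stalled call fail instead of hanging the probe.
                    HttpRequest request = HttpRequest.newBuilder(target).timeout(timeout).GET().build();
                    return client.send(request, HttpResponse.BodyHandlers.discarding()).statusCode();
                });
            }
            int completed = 0;
            int failed = 0;
            for (Future<Integer> f : pool.invokeAll(calls)) {
                try {
                    f.get();
                    completed++;
                } catch (ExecutionException e) {
                    failed++;   // timeout or I/O error
                }
            }
            return new int[] {completed, failed};
        } finally {
            pool.shutdown();
        }
    }
}
```

Pointing `run` at a gateway route and comparing the failed count across reactor-netty versions would be one way to check for the stall; the thread count, request count, and timeout are free parameters.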

Possible Solution

Use version 1.1.23 of reactor-netty; the problem does not occur in this version. The error can be reproduced in 1.1.24 and 1.1.25. Only the reactor-netty-core library was replaced for the test; all other libraries in our app stayed the same. The reactor-netty version was enforced in the app's main pom. The 1.2.x branch was not tested.

Your Environment

  • Reactor version(s) used: reactor-netty-core 1.1.24 + 1.1.25. Version 1.1.23 is working.
  • Other relevant libraries versions (eg. netty, ...): Spring Cloud 2023.0.3 + Spring Boot 3.3.6.
  • JVM version (java -version): 17.0.1+12-LTS-39
  • OS and version (eg. uname -a): Windows Server 2019
@FrankKlOS FrankKlOS added status/need-triage A new issue that still need to be evaluated as a whole type/bug A general bug labels Dec 22, 2024
@FrankKlOS FrankKlOS changed the title Error using Spring cloud gateway - IllegalStateException: channel not registered to an event loop - Client wait infinite for response Error using Spring cloud gateway - IllegalStateException: channel not registered to an event loop - Client waits infinite for response Dec 22, 2024
lotmek commented Dec 23, 2024

We started facing the same issue in Spring Cloud Gateway when Renovate Bot upgraded Spring Boot version from 3.2.11 to 3.2.12. It seems to happen only with large or chunked requests.

I wonder if it was introduced with this PR #3459

@violetagg violetagg removed the status/need-triage A new issue that still need to be evaluated as a whole label Dec 30, 2024
@violetagg violetagg self-assigned this Dec 30, 2024
@violetagg violetagg added the warn/regression A regression from a previous release label Jan 6, 2025
@violetagg violetagg added this to the 1.1.26 milestone Jan 6, 2025
violetagg added a commit that referenced this issue Jan 6, 2025
…nfiguration

DisposedChannel is effective when request/response is terminated and replaces the actual channel.
At that point DisposedChannelConfig#setAutoRead must be non operational
as the inbound has already been read and the actual channel has already set auto-read to true.

The issue is also observed with the reproducible example provided by #3495

Fixes #3559
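
The commit message can be read against the stack trace in the report: enabling auto-read makes the channel config call `read()` on its channel, and `read()` needs the channel's event loop. The sketch below models that with hypothetical stand-in classes (`ModelChannel`, `ModelConfig`, `DisposedModelConfig`). It is not Netty or reactor-netty code and only illustrates why a no-op `setAutoRead` avoids the exception:

```java
// Illustrative model only: every class here is a hypothetical stand-in.
class AutoReadSketch {

    /** Stand-in for a channel that may not be registered to an event loop. */
    static class ModelChannel {
        final boolean registered;
        int readCalls = 0;

        ModelChannel(boolean registered) {
            this.registered = registered;
        }

        void read() {
            if (!registered) {
                throw new IllegalStateException("channel not registered to an event loop");
            }
            readCalls++;
        }
    }

    /** Default behaviour: switching auto-read from off to on triggers a read. */
    static class ModelConfig {
        final ModelChannel channel;
        private boolean autoRead = false;

        ModelConfig(ModelChannel channel) {
            this.channel = channel;
        }

        void setAutoRead(boolean value) {
            boolean old = autoRead;
            autoRead = value;
            if (value && !old) {
                channel.read();
            }
        }
    }

    /** Config for a disposed channel: there is nothing left to read, so do nothing. */
    static class DisposedModelConfig extends ModelConfig {
        DisposedModelConfig(ModelChannel channel) {
            super(channel);
        }

        @Override
        void setAutoRead(boolean value) {
            // Intentionally a no-op.
        }
    }
}
```

In this model, calling `setAutoRead(true)` through `ModelConfig` on an unregistered channel throws the same `IllegalStateException` message seen in the logs, while `DisposedModelConfig` never touches the channel.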
@violetagg
Member

@FrankKlOS Thanks for the detailed description. This is fixed with #3581. It would be great if you can test the snapshot version.

@FrankKlOS
Author

Hi - I have tested with a locally built 1.2.2-SNAPSHOT version of reactor-netty and can confirm that the error goes away. Thanks for your fix.

@violetagg
Member

Thanks
