
[BUG] tapir_request_active prometheus metric not going down when server responds after client connection has been closed #4131

Closed
meetic-thomasavril opened this issue Nov 5, 2024 · 0 comments · Fixed by #4156


meetic-thomasavril commented Nov 5, 2024

Tapir version: 1.11.8

Scala version: 2.13.15

Describe the bug

-- Using Netty Future server --
When an endpoint responds after the client connection has been closed (either by the client itself or by the Netty connection timeout), the associated Prometheus gauge metric tapir_request_active does not go down. I can pinpoint the issue to tapir version 1.9.4 and the removal of NettyConfig.defaultNoStreaming. I can only guess that switching from NettyConfig.defaultNoStreaming to NettyConfig.default caused the issue.
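
For context, a minimal sketch of how an explicit NettyConfig is passed to the Future server (builder method names assumed from the current NettyFutureServer API, not taken from this issue):

```scala
import sttp.tapir.server.netty.{NettyConfig, NettyFutureServer}
import scala.concurrent.ExecutionContext.Implicits.global

// Since 1.9.4 (and the removal of NettyConfig.defaultNoStreaming) this is
// effectively the configuration the server runs with by default.
val server = NettyFutureServer().config(NettyConfig.default)
```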

How to reproduce?

This issue can be reproduced by adding a Thread.sleep of 30 seconds inside an endpoint. After calling the endpoint, you will see that the tapir_request_active metric is set to 1 and never goes back down to 0 (even after 30 seconds, when the endpoint has finished its work). The following error is also logged:

java.util.NoSuchElementException: io.netty.handler.timeout.IdleStateHandler
	at io.netty.channel.DefaultChannelPipeline.getContextOrDie(DefaultChannelPipeline.java:1031)
	at io.netty.channel.DefaultChannelPipeline.remove(DefaultChannelPipeline.java:366)
	at sttp.tapir.server.netty.internal.NettyServerHandler$$anonfun$$nestedInanonfun$channelRead0$6$1.$anonfun$applyOrElse$1(NettyServerHandler.scala:151)
	at scala.Option.foreach(Option.scala:437)
	at sttp.tapir.server.netty.internal.NettyServerHandler$$anonfun$$nestedInanonfun$channelRead0$6$1.applyOrElse(NettyServerHandler.scala:151)
	at sttp.tapir.server.netty.internal.NettyServerHandler$$anonfun$$nestedInanonfun$channelRead0$6$1.applyOrElse(NettyServerHandler.scala:150)
	at scala.concurrent.Future.$anonfun$andThen$1(Future.scala:515)
	at scala.concurrent.impl.Promise$Transformation.run(Promise.scala:475)
	at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:405)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Thread.java:829)

Maybe you can provide code to reproduce the problem?

Use this example code: https://github.com/softwaremill/tapir/blob/master/examples/src/main/scala/sttp/tapir/examples/observability/prometheusMetricsExample.scala
Add a Thread.sleep(30000) inside the server logic, as in the sketch below.
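
A minimal self-contained sketch (Scala 2.13 and tapir 1.11.x assumed), based on the linked prometheusMetricsExample with a blocking sleep added to the server logic; the endpoint name, port, and timings are illustrative:

```scala
import sttp.tapir._
import sttp.tapir.server.metrics.prometheus.PrometheusMetrics
import sttp.tapir.server.netty.{NettyFutureServer, NettyFutureServerOptions}

import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration
import scala.concurrent.{Await, Future}

object SlowEndpointRepro extends App {
  val prometheusMetrics = PrometheusMetrics.default[Future]()

  // Endpoint that only responds after 30 seconds, so the client (or Netty's
  // connection timeout) closes the connection before the response is written.
  val slowEndpoint = endpoint.get
    .in("slow")
    .out(stringBody)
    .serverLogicSuccess[Future](_ => Future { Thread.sleep(30000); "done" })

  // Attach the Prometheus metrics interceptor, as in the linked example.
  val serverOptions: NettyFutureServerOptions =
    NettyFutureServerOptions.customiseInterceptors
      .metricsInterceptor(prometheusMetrics.metricsInterceptor())
      .options

  val binding = Await.result(
    NettyFutureServer(serverOptions)
      .port(8080)
      .addEndpoints(List(slowEndpoint, prometheusMetrics.metricsEndpoint))
      .start(),
    Duration.Inf
  )
  println(s"Server started on port ${binding.port}")
}
```

With this running, call GET /slow and abort the request before it completes: after the 30 seconds pass, tapir_request_active reported on /metrics stays at 1 and the NoSuchElementException shown above is logged.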

Thank you for any help :)
