HTTP pipelining causes resource leak #30801

Closed
danielmitterdorfer opened this issue May 23, 2018 · 4 comments
Labels: >bug, :Distributed Coordination/Network (Http and internode communication implementations), v7.0.0-beta1

Comments

@danielmitterdorfer
Member

Elasticsearch version: 7.0.0-alpha1-SNAPSHOT (distribution flavor OSS), commit 31251c9

Description of the problem including expected versus actual behavior:

Elasticsearch dies with an OutOfMemoryError (OOME) in our benchmarks. This is caused by a resource leak in the network layer.

Steps to reproduce:

Run the following benchmark with Rally (it builds the correct revision of Elasticsearch automatically):

esrally --revision=1918a30 --challenge=append-no-conflicts-index-only --on-error=abort

A few minutes into the benchmark (at roughly 17% progress), Elasticsearch dies with an OOME.

Provide logs (if relevant):

In the logs we see:

[2018-05-23T07:31:31,811][ERROR][i.n.u.ResourceLeakDetector] LEAK: ByteBuf.release() was not called before it's garbage-collected. See http://netty.io/wiki/reference-counted-objects.html for more information.

If we start Elasticsearch with -Dio.netty.leakDetection.level=advanced, we get more detailed leak records.
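One way to pass this system property, assuming a tarball install started from the command line (a hypothetical invocation; adjust for your setup), is via ES_JAVA_OPTS:

ES_JAVA_OPTS="-Dio.netty.leakDetection.level=advanced" ./bin/elasticsearch

With advanced leak detection enabled, the detector reports: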

[2018-05-23T07:31:31,811][ERROR][i.n.u.ResourceLeakDetector] LEAK: ByteBuf.release() was not called before it's garbage-collected. See http://netty.io/wiki/reference-counted-objects.html for more information.
WARNING: 7 leak records were discarded because the leak record count is limited to 4. Use system property io.netty.leakDetection.maxRecords to increase the limit.
Recent access records: 4
#4:
	Hint: 'read_timeout' will handle the message from this point.
	io.netty.channel.DefaultChannelPipeline.touch(DefaultChannelPipeline.java:116)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:345)
	io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935)
	io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:134)
	io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
	io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:545)
	io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:499)
	io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
	io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
	java.base/java.lang.Thread.run(Thread.java:844)
#3:
	Hint: 'openChannels' will handle the message from this point.
	io.netty.channel.DefaultChannelPipeline.touch(DefaultChannelPipeline.java:116)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:345)
	io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935)
	io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:134)
	io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
	io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:545)
	io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:499)
	io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
	io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
	java.base/java.lang.Thread.run(Thread.java:844)
#2:
	Hint: 'DefaultChannelPipeline$HeadContext#0' will handle the message from this point.
	io.netty.channel.DefaultChannelPipeline.touch(DefaultChannelPipeline.java:116)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:345)
	io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935)
	io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:134)
	io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
	io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:545)
	io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:499)
	io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
	io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
	java.base/java.lang.Thread.run(Thread.java:844)
#1:
	io.netty.buffer.AdvancedLeakAwareByteBuf.writeBytes(AdvancedLeakAwareByteBuf.java:630)
	io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:343)
	io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:123)
	io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
	io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:545)
	io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:499)
	io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
	io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
	java.base/java.lang.Thread.run(Thread.java:844)
Created at:
	io.netty.buffer.PooledByteBufAllocator.newHeapBuffer(PooledByteBufAllocator.java:314)
	io.netty.buffer.AbstractByteBufAllocator.heapBuffer(AbstractByteBufAllocator.java:162)
	io.netty.buffer.AbstractByteBufAllocator.heapBuffer(AbstractByteBufAllocator.java:153)
	io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:135)
	io.netty.channel.DefaultMaxMessagesRecvByteBufAllocator$MaxMessageHandle.allocate(DefaultMaxMessagesRecvByteBufAllocator.java:80)
	io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:122)
	io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
	io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:545)
	io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:499)
	io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
	io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
	java.base/java.lang.Thread.run(Thread.java:844)

Can you please have a look at this @tbrooks8?

@elasticmachine
Collaborator

Pinging @elastic/es-core-infra

Tim-Brooks added a commit to Tim-Brooks/elasticsearch that referenced this issue May 23, 2018
This is related to elastic#30801. When we call the http pipelining
aggregator on an inbound request, we retain the netty request. However,
this is unnecessary as the pipelining aggregator does not store the
request. This worked in the past because we released the request manually
and netty also released it automatically. At this point we do not
implement the ref-counted interface after the pipelining step, which
means that netty no longer automatically handles this second retain.

This commit removes that retain.
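For context, here is a minimal sketch of the Netty reference-counting contract the commit message refers to. The handler below is hypothetical and not the actual Elasticsearch pipelining code; it only illustrates why an unmatched retain() leaks.

```java
// Hypothetical pass-through handler illustrating Netty's reference-counting
// contract; not the actual Elasticsearch pipelining handler. Every retain()
// must be matched by a release(), otherwise the ResourceLeakDetector reports
// the backing ByteBuf once it is garbage-collected.
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInboundHandlerAdapter;
import io.netty.util.ReferenceCountUtil;

class PassThroughHandlerSketch extends ChannelInboundHandlerAdapter {
    @Override
    public void channelRead(ChannelHandlerContext ctx, Object msg) {
        // Leak-prone pattern: retaining a message that this handler neither
        // stores nor later releases leaves the extra reference with no owner,
        // so the matching release() never happens.
        // ReferenceCountUtil.retain(msg);

        // Pass-through pattern: forward the message unchanged and let the
        // downstream consumer perform the single, final release().
        ctx.fireChannelRead(msg);
    }
}
```

In the issue above, the extra retain happened before the pipelining step while the code after that step no longer released it automatically, so the aggregated request buffers leaked and eventually exhausted the heap.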
@vineelyalamarthy

Is this issue still open, or was it closed with the patches that were submitted?

@Tim-Brooks
Contributor

I believe it is fixed. But I think we should wait until the nightly benchmarks run to be sure.

@jasontedor
Member

Closed by #30820
