
Expose network memory pool stats #42912

Closed
Wants to merge 9 commits

Conversation

@Tim-Brooks (Contributor)

This is related to #36127. This PR exposes stats about the memory pools
that are used by the networking layer.

@elasticmachine (Collaborator)

Pinging @elastic/es-distributed

@Tim-Brooks (Contributor, Author)

I'm opening this up as a POC for some feedback. A few notes:

  1. This does not currently provide active Netty allocation metrics. That will require wrapping some Netty stuff and probably will happen in a follow-up.
  2. This does not provide an accurate pool size for the page cache recycler, only the max pool size (which the pool will theoretically reach eventually). That should probably be improved at some point.
  3. This does not have tests / rest APIs. But I will add those if we are good with this approach.

I think what I would like to do is expose this as an experimental API, and not backport the actual REST API to 7.x. As we continue to hash out improved network metrics, we can decide whether this will be rolled into the current transport stats API. Once the API is settled, we can backport to 7.x. But I am open to discussion on how to do this.

@Tim-Brooks Tim-Brooks added v7.3.0 and removed v7.2.0 labels Jun 6, 2019
@original-brownbear (Member) left a comment

This looks good to me in terms of the approach (there's hardly any overhead to it, and the APIs look just fine to me). This should already save inspecting heap dumps in a few Netty-related scenarios, imo.
=> I'd be all for adding a REST API and some tests to this and making progress on it :)

I left two comments (the Netty one might be irrelevant at this stage and can be ignored if so)

protected MemoryUsage memoryStats() {
    MemoryUsage baseMemoryUsage = super.memoryStats();

    ByteBufAllocator allocator = ByteBufAllocator.DEFAULT;
@original-brownbear (Member)

This is more of a detail, but don't we also have to grab the stats off of io.netty.buffer.UnpooledByteBufAllocator#DEFAULT if we have the pooled allocator running by default? Otherwise we miss all the heap usage our REST layer creates in the transport stats since it uses Unpooled.allocate right?

Contributor:

👍

@Tim-Brooks (Contributor, Author)

We never release the bytes used at the REST level, so those metrics would increase forever. In fact, those metrics would currently throw off the network-level metrics in this PR, so I will probably need to separate the allocators (if we are using unpooled). I will look into that.

@Tim-Brooks (Contributor, Author), Jun 18, 2019

Anyway, I guess what I'm saying is that I view application-level bytes (in the REST request and transport deserialized messages) as beyond the scope of this PR for now, as tracking them would involve adding a lifecycle to these messages.
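The thread above turns on combining usage from Netty's pooled and unpooled allocators so that unpooled REST-layer buffers aren't missed. A minimal, stdlib-only sketch of that aggregation idea follows; the `NettyAllocatorStats` class, the `AllocatorMetric` interface, and the stat names are hypothetical illustrations, with `AllocatorMetric` standing in for an allocator metric source such as Netty 4.1's `ByteBufAllocatorMetric` (which exposes `usedHeapMemory()`/`usedDirectMemory()`). This is a sketch of the discussed approach, not code from the PR.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: summing heap/direct usage across the pooled and
// unpooled allocators, as the review comment suggests, so stats reflect
// both the transport layer (pooled) and the REST layer (unpooled).
public class NettyAllocatorStats {

    // Stand-in for Netty's ByteBufAllocatorMetric.
    interface AllocatorMetric {
        long usedHeapMemory();
        long usedDirectMemory();
    }

    // Combines usage from both allocators into a single stats map.
    static Map<String, Long> combined(AllocatorMetric pooled, AllocatorMetric unpooled) {
        Map<String, Long> stats = new LinkedHashMap<>();
        stats.put("heap_used_bytes", pooled.usedHeapMemory() + unpooled.usedHeapMemory());
        stats.put("direct_used_bytes", pooled.usedDirectMemory() + unpooled.usedDirectMemory());
        return stats;
    }
}
```

In the real PR, the two metric sources would be `PooledByteBufAllocator.DEFAULT.metric()` and `UnpooledByteBufAllocator.DEFAULT.metric()`, and (per the replies above) they might need to stay separate rather than be summed, since unpooled REST-level bytes are never released.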


import java.io.IOException;

public class TransportStatsV2 implements Writeable, ToXContentFragment {
@original-brownbear (Member)

This class is probably just here by accident? It doesn't seem used or functional yet.

@ywelsch (Contributor) left a comment

I wonder if we can add some of this to the existing transport stats, and just mention that those fields are experimental and bound to change.

@@ -45,6 +45,12 @@ public DequeRecycler(C<T> c, Deque<T> queue, int maxSize) {
        return new DV(v, true);
    }

    @Override
    public int recycledCount() {
        // TODO: Does not reflect that the currently allocated number of pages might be less.
Contributor:

Why not add tracking for this in DequeRecycler? Is this difficult to do?

@Tim-Brooks (Contributor, Author)

It is just work, and I wanted to avoid making many changes before getting feedback (also, I think it needs to be synchronized for the non-concurrent Deques).
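A minimal sketch of the kind of tracking being discussed (the `CountingRecycler` name and shape are hypothetical, not from the PR): a deque-backed recycler that keeps its count of currently pooled entries under synchronization, so a `recycledCount()` can report the actual pool size rather than just the configured maximum.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Hypothetical sketch of a deque-backed object pool that tracks how many
// entries are actually pooled. Methods are synchronized because, as noted
// in the discussion, a plain ArrayDeque is not safe for concurrent access.
public class CountingRecycler<T> {

    private final Deque<T> queue = new ArrayDeque<>();
    private final Supplier<T> factory;
    private final int maxSize;

    public CountingRecycler(Supplier<T> factory, int maxSize) {
        this.factory = factory;
        this.maxSize = maxSize;
    }

    // Reuses a pooled entry if one exists, otherwise allocates a new one.
    public synchronized T obtain() {
        T pooled = queue.pollFirst();
        return pooled != null ? pooled : factory.get();
    }

    // Returns the entry to the pool unless the pool is already full.
    public synchronized boolean recycle(T value) {
        if (queue.size() >= maxSize) {
            return false; // pool full, drop the object
        }
        queue.addFirst(value);
        return true;
    }

    // Reflects the number of entries pooled right now, not the maximum.
    public synchronized int recycledCount() {
        return queue.size();
    }
}
```

The trade-off raised above is visible here: every stats read and every obtain/recycle now contends on the same monitor, which is the extra work (and potential overhead) the author wanted feedback on before committing to.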

protected MemoryUsage memoryStats() {
    MemoryUsage baseMemoryUsage = super.memoryStats();

    ByteBufAllocator allocator = ByteBufAllocator.DEFAULT;
Contributor:

👍

@Tim-Brooks (Contributor, Author)

I'll make some more updates to this PR later this week based on feedback.

@jpountz jpountz added v7.4.0 and removed v7.3.0 labels Jul 3, 2019
@colings86 colings86 added v7.5.0 and removed v7.4.0 labels Aug 30, 2019
@jimczi jimczi added the v7.6.0 label Nov 12, 2019
@jimczi jimczi removed the v7.5.0 label Nov 12, 2019
@Tim-Brooks (Contributor, Author)

Closing this as it was a POC and is well out of date. We can always bring these concepts back into an updated PR if necessary.

@Tim-Brooks Tim-Brooks closed this Dec 11, 2019
@Tim-Brooks Tim-Brooks deleted the expose_memory branch April 30, 2020 18:25