
Remove redundant response of empty result in AsyncShardFetch to avoid OOM issue #84010

Closed
wants to merge 2 commits

Conversation

maosuhan

During a full cluster restart, we often run into OOM problems. After analyzing a heap dump, we found the following problem.

[screenshot: heap dump analysis showing the objects dominating memory]

In the process of fetching data, the master asks each data node whether it holds the shard, and all of the nodes' responses are saved in the map<nodeId, nodeResponse> in AsyncShardFetch. In most cases a shard has data on only a few nodes, but the intermediate result map still stores the responses from all nodes.

In this PR, we only save valid intermediate node responses and discard responses from nodes that do not hold the shard at all.
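
Below is a minimal sketch of the idea; the class and method names are illustrative, not the actual AsyncShardFetch API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

// Hypothetical sketch: cache a node's response only when that node
// actually holds a copy of the shard.
final class ShardFetchCache<T> {
    private final Map<String, T> responsesByNode = new ConcurrentHashMap<>();
    private final Predicate<T> hasShardData; // e.g. response -> response.isEmpty() == false

    ShardFetchCache(Predicate<T> hasShardData) {
        this.hasShardData = hasShardData;
    }

    void onNodeResponse(String nodeId, T response) {
        if (hasShardData.test(response)) {
            responsesByNode.put(nodeId, response);
        }
        // Empty responses are dropped, so the cache scales with the number
        // of shard copies (usually a handful) rather than the node count.
    }
}
```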

This has proved successful in our company: the OOM issue is gone after the optimization.

@cla-checker-service

cla-checker-service bot commented Feb 16, 2022

❌ Author of the following commits did not sign a Contributor Agreement:
e37d3e1

Please read and sign the above-mentioned agreement if you want to contribute to this project.

@elasticsearchmachine elasticsearchmachine added v8.2.0 external-contributor Pull request authored by a developer outside the Elasticsearch team labels Feb 16, 2022
@DaveCTurner
Contributor

What version are you using @maosuhan? I think this excessive memory usage was addressed by #77991.

@maosuhan
Author

@DaveCTurner We have already backported #77991, but the OOM issue still exists.
We run 7.6.2 in production, with about 1 million shards and nearly 400 nodes.

The main consumer of memory is not DiscoveryNode but StoreFilesMetadata, which is the payload of a node response and is relatively large.

@DaveCTurner
Contributor

Can you help me understand why an empty StoreFilesMetadata is still "relatively large"? I get that there are a lot of them, but each one should be small. Is that not the case?

I think this will be addressed by some bigger changes related to #77466, particularly #80694 and friends.

@maosuhan maosuhan force-pushed the enhance_async_shard_fetch branch from b515a21 to e37d3e1 on February 16, 2022 09:31
@DaveCTurner DaveCTurner added the :Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) label Feb 16, 2022
@elasticmachine elasticmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Feb 16, 2022
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@DaveCTurner DaveCTurner self-assigned this Feb 16, 2022
@DaveCTurner
Contributor

Also, would you sign the CLA?

@maosuhan
Author

CLA Done @DaveCTurner

@maosuhan
Author

@DaveCTurner As you can see in the screenshot, there were 32,927,024 objects in total in the dump, and they occupied more than 15GB.
So each object is about 0.45KB.

@DaveCTurner
Contributor

So each object is about 0.45KB.

i.e. around 450 bytes. That's a lot more than I'd expect. Can you explain why it's so much?

@DaveCTurner
Contributor

Are you sure you signed the CLA with the address [email protected] attached to your commits? I don't see this address in the list.

@DaveCTurner
Contributor

On reflection I would rather avoid allocating these objects entirely than create them and then filter them out as proposed here. #84034 should do what we want.
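
For contrast, a hypothetical sketch of the "avoid allocating entirely" approach; the names here are illustrative and #84034 contains the actual change. The data node answers with one shared immutable instance instead of building a fresh response object per request:

```java
import java.util.Map;

// Hypothetical illustration, not the actual #84034 change: a node that holds
// no copy of the shard replies with a shared EMPTY singleton, so no new
// response object is allocated at all for the common empty case.
record StoreFilesResponse(Map<String, String> metadata, Map<String, String> commitUserData) {
    static final StoreFilesResponse EMPTY = new StoreFilesResponse(Map.of(), Map.of());

    static StoreFilesResponse forShard(boolean shardPresent,
                                       Map<String, String> metadata,
                                       Map<String, String> commitUserData) {
        return shardPresent ? new StoreFilesResponse(metadata, commitUserData) : EMPTY;
    }
}
```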

@maosuhan
Author

Thanks for your effort! I think it is more reasonable to do it that way.
I just debugged in my IDE; below is the screenshot.
[screenshot: debugger view of an empty StoreFilesMetadata instance]

metadata and commitUserData are both empty HashMaps read from the StreamInput, and each HashMap occupies 64 bytes, so each object is at least 128 bytes.
A non-empty result is definitely much larger than an empty one, but the accumulated wasted memory becomes very large when both the cluster size and the node count are large.
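
To make the overhead concrete, here is a small illustrative sketch; the compactIfEmpty helper is hypothetical and the ~64-byte figure assumes a typical 64-bit JVM:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class EmptyMapDemo {
    // Swap a freshly deserialized empty HashMap (~64 bytes each) for the
    // shared immutable singleton, avoiding the per-response allocation.
    static <K, V> Map<K, V> compactIfEmpty(Map<K, V> read) {
        return read.isEmpty() ? Collections.emptyMap() : read;
    }

    public static void main(String[] args) {
        Map<String, String> metadata = compactIfEmpty(new HashMap<>());
        Map<String, String> commitUserData = compactIfEmpty(new HashMap<>());
        // Both references now alias the same singleton: zero extra bytes
        // per empty response instead of roughly 128.
        System.out.println(metadata == commitUserData); // prints: true
    }
}
```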
Feel free to close this issue.

@DaveCTurner
Contributor

DaveCTurner commented Feb 16, 2022

Sounds good, thanks for raising the issue. Closing this, will proceed with #84034 instead.

@elasticsearchmachine
Collaborator

@maosuhan please enable the option "Allow edits and access to secrets by maintainers" on your PR. For more information, see the documentation.
