ProvideMany: high memory usage when providing tens of millions of CIDs #354

ajnavarro · 2022-10-11T16:53:23Z

When using BatchProviding, we are not really batching, but sending all the CIDs at the same time to the Router implementing ProvideMany.

To avoid collateral problems, we should actually batch the calls to ProvideMany.

This will help the Reframe Router implementation (https://github.com/ipfs/go-delegated-routing) avoid sending huge JSON payloads to the server.

We need to find good defaults so that the FullRT DHT implementation still keeps good performance numbers.

This accounts for roughly 1/10th of the memory spike observed; we are still searching for other possible causes.

welcome · 2022-10-11T16:53:25Z

Thank you for submitting your first issue to this repository! A maintainer will be here shortly to triage and review.
In the meantime, please double-check that you have provided all the necessary information to make this process easy! Any information that can help save additional round trips is useful! We currently aim to give initial feedback within two business days. If this does not happen, feel free to leave a comment.
Please keep an eye on how this issue will be labeled, as labels give an overview of priorities, assignments and additional actions requested by the maintainers:

"Priority" labels will show how urgent this is for the team.
"Status" labels will show if this is ready to be worked on, blocked, or in progress.
"Need" labels will indicate if additional input or analysis is required.

Finally, remember to use https://discuss.ipfs.io if you just need general support.

ajnavarro · 2022-10-11T16:57:34Z

CC @ischasny

BigLep · 2022-10-18T16:41:47Z

2022-10-18 conversation: we aren't aware of this being a blocker at the moment so not prioritizing currently, but feedback welcome if this needs to be moved up sooner.

ischasny · 2022-10-28T12:01:34Z

@ajnavarro as per our discussion, it might be a good idea to chunk up the CIDs snapshot into smaller pieces so that at least we don't squeeze all of them into a single HTTP request. That is problematic on larger nodes (like web3 storage): snapshots don't get reprovided even with high router timeouts. Maybe we can do that only for the reframe router initially? That should save us some memory on both the sending and receiving sides. Wdyt?

ajnavarro · 2022-10-28T12:05:44Z

Yeah, won't be the final solution, but will help in providing over HTTP.

ischasny · 2022-10-28T12:07:02Z

Great! Would you guys be up for taking it into the next release? Should be simple to do and would unblock us too.

ajnavarro · 2022-10-28T12:07:47Z

Related issue: ipfs/go-delegated-routing#55

ajnavarro added the need/triage label (Needs initial labeling and prioritization) Oct 11, 2022

ajnavarro self-assigned this Oct 11, 2022

ajnavarro removed their assignment Feb 5, 2023

hacdias transferred this issue from ipfs/go-ipfs-provider Jun 16, 2023

lidel mentioned this issue Sep 4, 2023: IPFS Services Memory Increases with Data Addition ipfs/kubo#9856 (closed)
