ProvideMany: high memory usage when providing tens of millions of CIDs #354

Open
ajnavarro opened this issue Oct 11, 2022 · 7 comments
Labels
need/triage Needs initial labeling and prioritization

Comments

@ajnavarro
Member

When using BatchProviding, we are not really batching; we send all the CIDs at once to the Router implementing ProvideMany.

To avoid collateral problems, we should actually batch the calls to ProvideMany.

This will help the Reframe Router implementation (https://github.com/ipfs/go-delegated-routing) avoid sending huge JSON payloads to the server.

We need to find good defaults so that the FullRT DHT implementation keeps good performance numbers.

That accounts for roughly 1/10th of the memory spike observed; we are still searching for other possible causes.

@ajnavarro ajnavarro added the need/triage Needs initial labeling and prioritization label Oct 11, 2022
@welcome

welcome bot commented Oct 11, 2022

Thank you for submitting your first issue to this repository! A maintainer will be here shortly to triage and review.
In the meantime, please double-check that you have provided all the necessary information to make this process easy! Any information that can help save additional round trips is useful! We currently aim to give initial feedback within two business days. If this does not happen, feel free to leave a comment.
Please keep an eye on how this issue will be labeled, as labels give an overview of priorities, assignments and additional actions requested by the maintainers:

  • "Priority" labels will show how urgent this is for the team.
  • "Status" labels will show if this is ready to be worked on, blocked, or in progress.
  • "Need" labels will indicate if additional input or analysis is required.

Finally, remember to use https://discuss.ipfs.io if you just need general support.

@ajnavarro ajnavarro self-assigned this Oct 11, 2022
@ajnavarro
Member Author

CC @ischasny

@BigLep
Contributor

BigLep commented Oct 18, 2022

2022-10-18 conversation: we aren't aware of this being a blocker at the moment so not prioritizing currently, but feedback welcome if this needs to be moved up sooner.

@ischasny
Contributor

@ajnavarro as per our discussion, it might be a good idea to chunk the CID snapshot into smaller pieces so that at least we don't squeeze all of them into a single HTTP request. That is problematic on larger nodes (like web3 storage): snapshots don't get reprovided even with high router timeouts. Maybe we can do that only for the reframe router initially? That should save us some memory on both the sending and receiving sides. Wdyt?

@ajnavarro
Member Author

Yeah, won't be the final solution, but will help in providing over HTTP.

@ischasny
Contributor

Great! Would you guys be up for taking it into the next release? Should be simple to do and would unblock us too.

@ajnavarro
Member Author

Related issue: ipfs/go-delegated-routing#55

@ajnavarro ajnavarro removed their assignment Feb 5, 2023
@hacdias hacdias transferred this issue from ipfs/go-ipfs-provider Jun 16, 2023
Projects
No open projects
Status: 🥞 Todo
Archived in project

3 participants