Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: improved fs uploads performance #2333

Merged
merged 7 commits into from
Oct 29, 2024

Conversation

grumbach
Copy link
Member

@grumbach grumbach commented Oct 28, 2024

  • Parallel file uploading with capped concurrency to avoid memory burst
  • Parallel chunk uploading with capped concurrency to avoid memory burst
  • Local testing reveals this PR improves upload performance of CLI directories upload by x50 FIFTY!!
  • Configurable batching for files and chunks through env vars

@grumbach grumbach enabled auto-merge October 28, 2024 08:49

use super::archive::{Archive, ArchiveAddr};
use super::data::{DataAddr, GetError, PutError};

pub(crate) const FILE_UPLOAD_BATCH_SIZE: usize = 128;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the rationale behind 128? Do we need batching at all?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried with unlimited and my memory useage spiked to the moon.
So it seems you were right about batching, and we probably need some. Local testing revealed that 128 was a sweet spot (at least on my machine), not too high for mem, not too slow.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do worry a little about the relative performance (e.g. PC vs Raspberry Pi). Can we relate this value to CPU core count?

https://doc.rust-lang.org/std/thread/fn.available_parallelism.html

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oooh excellent! Yes!

@mazzi
Copy link
Contributor

mazzi commented Oct 28, 2024

Good one! What about a couple of tests? To check file integrity and probably we can profile in the future.
What do you think?

- name: Run autonomi tests
timeout-minutes: 25
run: cargo test --release --package autonomi --lib --features="full,fs"

- name: Run autonomi tests
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think the same workflow is repeated twice! Run autonomi tests
Maybe some rebase conflicts?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! I'm actually sneaking this back in as it was removed as a side effect of:
#2334

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh cool!

@jacderida jacderida changed the base branch from main to rc-2024.10.4 October 28, 2024 22:43
@jacderida jacderida disabled auto-merge October 28, 2024 22:44
@grumbach grumbach merged commit 7203ffd into maidsafe:rc-2024.10.4 Oct 29, 2024
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants