Balance performance on small files #108
Comments
The tool mostly just walks the filesystem and calls rsync to copy files (while I could certainly recreate rsync's behavior, rsync is well trusted). What kind of system do you have? Small files will always cost more. You could add a file size filter, which may help (the paths all still have to be walked). A better solution, which is already planned, is to decide what to move where all at once, write some temp files, and then use
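The file size filter suggested above could look roughly like the following. This is a minimal sketch, not code from mergerfs.balance: the function name `walk_files_min_size` and the `min_size` parameter are hypothetical, and as noted, every path still has to be walked and stat'ed.

```python
import os
import stat
from typing import Iterator, Tuple


def walk_files_min_size(root: str, min_size: int) -> Iterator[Tuple[str, int]]:
    """Yield (path, size) for regular files under root of at least min_size bytes.

    Hypothetical helper: smaller files are skipped so they are never handed
    to the copy step, but they are still visited during the walk.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode):
                continue  # ignore symlinks, sockets, devices, etc.
            if st.st_size >= min_size:
                yield path, st.st_size
```

With a threshold such as `min_size=1024 * 1024`, only files of 1 MiB or more would be considered for moving; the small files stay where they are.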
The system is an Intel i7 2760QM (mobile chip) with 16GB of DDR3 RAM. The disks are 8TB and 12TB WD drives, low RPM but capable of 150MB/sec sequential read/write. They are all connected through a Dell PERC H310 flashed to IT mode.
It blocks on the execution of rsync, which should limit CPU usage... I can't even get close to 100% usage if I change rsync to "bash -c true", but maybe my system is just faster. Regardless, the change I described is basically a full rewrite.
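To compare machines, the cost of launching one process per file can be measured in isolation. The sketch below is only an illustration and is not part of the tool: `seconds_per_spawn` is a hypothetical helper that runs a no-op command repeatedly and reports the average wall-clock time per launch.

```python
import subprocess
import sys
import time
from typing import Sequence


def seconds_per_spawn(cmd: Sequence[str], runs: int = 50) -> float:
    """Return the average wall-clock seconds needed to run cmd once.

    Hypothetical helper: it approximates the fixed per-file overhead a tool
    pays when it starts a separate process for every file it copies.
    """
    start = time.perf_counter()
    for _ in range(runs):
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    return (time.perf_counter() - start) / runs
```

Calling `seconds_per_spawn(["bash", "-c", "true"])` mirrors the substitution described above. Multiplying the result by the number of small files gives a rough lower bound on the time spent purely on process startup.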
I am running mergerfs.balance on a filesystem that contains many big files and even more small ones (source code, potentially even compile artifacts).
The moment balance starts to move the small files, the whole process becomes unbelievably slow. CPU utilization jumps to 100% and disk I/O drops to almost 0%. It's been 2 days now and barely 10GB have been moved!
I switched from CPython to PyPy3 to see if that would improve things. I think it helped slightly, but not by a huge margin. Is there something I could do to help this process? Is there some logic that, if added to the script, would improve the performance of small file transfers?
E.g. use os or shutil to detect whether a folder consists of a large number of small files, then tar them all, move the tar, and extract it at the new target?
If I implemented something like that and filed a PR, would you be interested in accepting it?
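The tar-based idea above could be sketched as follows. Everything here is an assumption for illustration: `is_small_file_dir`, `move_dir_as_tar`, and their thresholds are hypothetical names, not part of mergerfs.balance. The archive is written straight onto the destination as one sequential stream, then unpacked there. Unlike rsync, this sketch makes no attempt to preserve ownership, extended attributes, or hard links.

```python
import os
import shutil
import tarfile


def is_small_file_dir(path: str, max_size: int, min_count: int) -> bool:
    """Return True if path holds at least min_count regular files of <= max_size bytes."""
    small = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                if os.path.isfile(full) and os.path.getsize(full) <= max_size:
                    small += 1
            except OSError:
                continue  # file vanished or is unreadable; skip it
    return small >= min_count


def move_dir_as_tar(src_dir: str, dst_parent: str) -> str:
    """Move src_dir under dst_parent via one uncompressed tar; return the new path."""
    name = os.path.basename(os.path.normpath(src_dir))
    tar_path = os.path.join(dst_parent, name + ".balance.tar")
    # One sequential write to the destination instead of many tiny copies.
    with tarfile.open(tar_path, "w") as tar:
        tar.add(src_dir, arcname=name)
    with tarfile.open(tar_path, "r") as tar:
        # Use the safer "data" extraction filter where this Python provides it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        tar.extractall(dst_parent, **kwargs)
    os.remove(tar_path)
    shutil.rmtree(src_dir)  # remove the source only after extraction succeeded
    return os.path.join(dst_parent, name)
```

A caller might check `is_small_file_dir(d, max_size=64 * 1024, min_count=1000)` and fall back to the normal per-file rsync path when it returns False. Note that extraction still creates every small file on the destination, so the gain, if any, comes from avoiding one process launch per file rather than from reducing metadata work.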