Iterators.partition gives suboptimal partitions for multithreading #37

Closed
nalimilan opened this issue Mar 14, 2021 · 3 comments
Comments

@nalimilan

I've noticed that you use a tforeach implementation inspired by ThreadTools. I did the same for DataFrames.jl, and we noticed that Iterators.partition doesn't give very good results since the last chunk will generally be smaller than others, which is bad for load balancing. See JuliaData/DataFrames.jl#2661 (comment) for details and a possible solution. I hope this is useful.
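To illustrate the imbalance described above, here is a minimal sketch of one common way to get evenly sized chunks. It is not necessarily the solution from the linked DataFrames.jl comment, and `balanced_ranges` is a hypothetical helper name, not part of Base or any package.

```julia
# Hypothetical helper (not part of Base or DataFrames.jl):
# split 1:n into at most `nchunks` contiguous ranges whose lengths
# differ by at most one element.
function balanced_ranges(n::Integer, nchunks::Integer)
    n >= 0 || throw(ArgumentError("n must be non-negative"))
    nchunks >= 1 || throw(ArgumentError("nchunks must be positive"))
    nchunks = min(nchunks, n)            # never create empty chunks
    nchunks == 0 && return UnitRange{Int}[]
    base, extra = divrem(n, nchunks)     # the first `extra` chunks get one more element
    ranges = Vector{UnitRange{Int}}(undef, nchunks)
    start = 1
    for i in 1:nchunks
        len = base + (i <= extra ? 1 : 0)
        ranges[i] = start:(start + len - 1)
        start += len
    end
    return ranges
end

# Fixed-size chunks via Iterators.partition: lengths 3, 3, 3, 1
fixed = collect(Iterators.partition(1:10, cld(10, 4)))

# Balanced chunks: lengths 3, 3, 2, 2
balanced = balanced_ranges(10, 4)
```

With 10 items and 4 tasks, the fixed-size split leaves the last task with a single item, while the balanced split keeps every chunk within one element of the others.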

@jondeuce
Owner

Thanks for the tip! I've actually gone through several iterations of tforeach-type abstractions, so I'm sure your findings will be helpful. I've found it particularly tricky to print progress updates to a log file and/or to the console during high-throughput multithreaded workloads without compromising performance. I don't think DataFrames.jl does progress printing, but if it does, I'd be very interested in any tips there, too!

@nalimilan
Author

Unfortunately I don't really have tips in that area. I guess I would have checked a counter to print progress only every N iterations, but of course that depends on the particular case.
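A minimal sketch of the counter idea, assuming a shared atomic counter is acceptable for the workload. `foreach_with_progress` and its `every` and `io` keywords are hypothetical names chosen for illustration, not an existing API.

```julia
using Base.Threads: Atomic, atomic_add!, @threads

# Hypothetical helper: apply `f` to each item across threads and print a
# progress line only every `every` completed items (and once at the end).
function foreach_with_progress(f, items::AbstractVector;
                               every::Integer = 1000, io::IO = stderr)
    done = Atomic{Int}(0)
    total = length(items)
    @threads for i in eachindex(items)
        f(items[i])
        # atomic_add! returns the previous value, so add 1 for the new count
        ndone = atomic_add!(done, 1) + 1
        if ndone % every == 0 || ndone == total
            println(io, "processed $ndone / $total")
        end
    end
    return done[]   # number of items processed
end
```

A call such as `foreach_with_progress(sqrt, collect(1.0:5000.0); every = 1000)` would print roughly one line per thousand items. The right value of `every` depends on how expensive each iteration is relative to the cost of printing.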

@jondeuce
Owner

jondeuce commented Mar 15, 2021

Yup, that's essentially what I've settled on. ProgressMeter.jl worked almost out of the box, but the "live updating" progress bar (which works by repeatedly printing over itself) would occasionally not show up. My current solution is a bit hacky but works quite well: strip the carriage returns from the message before printing it.

msg = replace(msg, "\r" => "")

But I digress ;)

2 participants