I've noticed that you use a `tforeach` implementation inspired by ThreadTools. I did the same for DataFrames.jl, and we noticed that `Iterators.partition` doesn't give very good results since the last chunk will generally be smaller than others, which is bad for load balancing. See JuliaData/DataFrames.jl#2661 (comment) for details and a possible solution. I hope this is useful.
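To make the load-balancing point concrete, here is a minimal sketch of one way to get evenly sized chunks. It is not the code from the linked DataFrames.jl comment; `balanced_ranges` and `tforeach_balanced` are hypothetical names invented for this illustration, and the sketch assumes a 1-based `AbstractVector`.

```julia
# Hypothetical helper (not from ThreadTools or DataFrames.jl): split 1:len into
# `nchunks` contiguous ranges whose lengths differ by at most one element.
function balanced_ranges(len::Integer, nchunks::Integer)
    nchunks = clamp(nchunks, 1, max(len, 1))  # never more chunks than elements
    return [(1 + ((i - 1) * len) ÷ nchunks):((i * len) ÷ nchunks) for i in 1:nchunks]
end

# Hypothetical tforeach-style wrapper: one task per balanced chunk.
# Assumes `xs` uses 1-based indexing.
function tforeach_balanced(f, xs::AbstractVector; nchunks::Integer = Threads.nthreads())
    @sync for r in balanced_ranges(length(xs), nchunks)
        Threads.@spawn foreach(f, view(xs, r))
    end
    return nothing
end

# Fixed-size partitioning of 10 items in chunks of 3 leaves a short last chunk:
length.(collect(Iterators.partition(1:10, 3)))  # [3, 3, 3, 1]
# Balanced splitting into 4 chunks spreads the remainder instead:
length.(balanced_ranges(10, 4))                 # [2, 3, 2, 3]
```

With fixed-size chunks, the task that receives the short last chunk finishes early and sits idle, whereas here no chunk is more than one element larger than any other.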
Thanks for the tip! I have actually gone through several iterations of tforeach-type abstractions, so I'm sure your findings will be helpful. I've found it particularly tricky to print progress updates to a log file and/or to the console during high-throughput multithreaded workloads without compromising performance. I don't think DataFrames.jl does progress printing, but if it does, I'd be very interested in any tips there, too!
Unfortunately I don't really have tips in that area. I guess I would have checked a counter to print progress only every N iterations, but of course that depends on the particular case.
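For illustration, the counter idea might look roughly like the sketch below. It is only a sketch under stated assumptions, not the approach used in either package: `process_with_progress`, the `every` keyword, and the message format are all made up for this example.

```julia
using Base.Threads: Atomic, atomic_add!, @threads

# Hypothetical helper: apply `f` to each element on multiple threads and print
# a progress line only every `every` completed items (plus once at the end).
function process_with_progress(f, xs::AbstractVector; every::Integer = 1000, io::IO = stderr)
    done = Atomic{Int}(0)   # shared counter of completed items
    total = length(xs)
    lk = ReentrantLock()    # serializes writes so lines do not interleave
    @threads for x in xs
        f(x)
        n = atomic_add!(done, 1) + 1   # atomic_add! returns the previous value
        if n % every == 0 || n == total
            lock(lk) do
                println(io, "processed $n / $total")
            end
        end
    end
    return done[]
end

# Example: 25 items, reporting every 10 completions.
process_with_progress(identity, collect(1:25); every = 10)
```

Because printing happens only once per `every` items, the lock is rarely contended, and the per-item overhead is a single atomic increment. The right value of `every` depends on how expensive each item is.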
Yup, that's essentially what I've settled on. ProgressMeter.jl worked almost out of the box, but the "live updating" progress bar (essentially, repeatedly printing over itself) would occasionally not show up. My current solution works quite well, if a bit hacky: