map should iterate #51
Comments
You are right, SCOOP generates futures from every element of the iterable first. The reason is to be able to distribute those tasks to remote workers. Batching the input explicitly would require the user to manually emit batches when desired, and I haven't found a way to provide a clean and simple API for this. Batching implicitly (on SCOOP's end) would delay the scheduling of later tasks, which could hinder load balancing. I am not formally against this, as long as the default value neither surprises the power user nor performs suboptimally for the beginner.
Perhaps a solution would be for the user to provide an argument.
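To make the idea of a user-provided argument concrete, here is a minimal sketch of explicit batching. It is not part of SCOOP: `batched`, `map_in_batches`, and the `batch_size` argument are hypothetical names chosen for illustration. The wrapper takes any map-like callable, so it can be tried with the built-in `map`; passing `scoop.futures.map` instead is an assumption about how it would be wired in.

```python
from itertools import islice


def batched(iterable, batch_size):
    """Yield lists of up to batch_size items, pulling from the input lazily."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def map_in_batches(map_func, func, iterable, batch_size=1000):
    """Hypothetical wrapper: hand map_func one batch at a time.

    map_func is any callable with the signature map(func, iterable),
    for example the built-in map (assumed: scoop.futures.map as well).
    """
    for batch in batched(iterable, batch_size):
        # Only batch_size inputs are materialized at any moment.
        for result in map_func(func, batch):
            yield result
```

For example, `list(map_in_batches(map, abs, iter([-1, -2, -3]), batch_size=2))` consumes the input two items at a time. The trade-off is the one described above: each batch must finish before the next one is scheduled, so a small `batch_size` limits memory use but can leave workers idle at batch boundaries.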
My system runs out of RAM when the iterator is consumed all at once 🤗🤓
Passing huge lists or iterables into `map` or `map_as_completed` first "registers" every element for computation, and only after the input is exhausted are the tasks computed in parallel.

Try running a script that maps over a huge iterable with `python -m scoop example.py` and notice how nothing is printed for a long time.

I think the reason is in https://github.com/soravux/scoop/blob/master/scoop/futures.py#L94. In Python 3, `map` returns an iterator. Even for Python 2 it would be cool if the internal function batched the input.
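One way to picture the requested behaviour is a map that keeps only a bounded window of tasks submitted at any time. The sketch below uses the standard library's `concurrent.futures.ThreadPoolExecutor` as a stand-in, not SCOOP's scheduler; `lazy_map` and `max_pending` are hypothetical names, and whether the same pattern fits SCOOP's internals is an assumption.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def lazy_map(executor, func, iterable, max_pending=4):
    """Yield func(item) in input order with at most max_pending tasks submitted."""
    if max_pending < 1:
        raise ValueError("max_pending must be at least 1")
    iterator = iter(iterable)
    # Prime the window with the first max_pending items only.
    pending = deque(
        executor.submit(func, item) for item in islice(iterator, max_pending)
    )
    while pending:
        # Wait for the oldest task so results keep their input order.
        result = pending.popleft().result()
        # Refill the window with one more item, if any remain.
        for item in islice(iterator, 1):
            pending.append(executor.submit(func, item))
        yield result


def square(x):
    return x * x


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=2) as executor:
        # Results start appearing before the whole input has been read.
        for value in lazy_map(executor, square, iter(range(10)), max_pending=3):
            print(value)
```

Unlike fixed batches, the window is refilled as soon as the oldest result is collected, so later tasks are not held back until a whole batch completes. A larger `max_pending` gives the scheduler more tasks to balance at the cost of more memory.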