Modifying a task via scheduler plugin #1384
Comments
Modifying tasks as they arrive would be tricky. You would have to rearrange […] Another option would be to do this on the client side by overriding […]
Given PR (dask/dask#2748) and PR (#930) add […] Edit: Maybe PR (dask/dask#3196) would make this possible.
This can be done pretty easily these days by adding to the […]
I have the exact same use case. @jakirkham, do you mind sharing an example on how this can be achieved on a simple custom graph (using […] Edit: In fact, I am wondering if nowadays @mrocklin's […]
I am trying to modify a task using a scheduler plugin, but I am unsure whether this is possible. @mrocklin, you mentioned in dask/dask#2119 (comment) that this could be done, although that was for the dask scheduler, not distributed.
My use case: I would like to combine the distributed scheduler with joblib.Memory to build a data pipeline that has some smart caching. Joblib does this well by saving a copy of the source code of the function as well as the inputs. I would like to extend this notion by invalidating any child nodes whose parent is to be recomputed.
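To make the invalidation idea concrete, here is a minimal sketch, assuming a graph in the classic dask dict format (key mapped to a tuple of a callable and its arguments) with string keys only. The helper names `dependents_of` and `to_invalidate` are hypothetical and are not part of dask or distributed.

```python
from collections import deque


def inc(x):
    return x + 1


def add(x, y):
    return x + y


# A small graph in the classic dask dict format: key -> (callable, *args).
# An argument that matches another key is a reference to that task's result.
graph = {
    "a": (inc, 1),
    "b": (inc, "a"),
    "c": (add, "a", "b"),
    "d": (inc, 10),
}


def dependents_of(graph):
    """Map each key to the set of keys that consume it directly."""
    children = {key: set() for key in graph}
    for key, task in graph.items():
        for arg in task[1:]:
            # Simplification: this sketch only treats string arguments
            # as references to other keys.
            if isinstance(arg, str) and arg in graph:
                children[arg].add(key)
    return children


def to_invalidate(graph, recomputed):
    """Return every task downstream of the keys in `recomputed`."""
    children = dependents_of(graph)
    stale = set()
    queue = deque(recomputed)
    while queue:
        key = queue.popleft()
        for child in children[key]:
            if child not in stale:
                stale.add(child)
                queue.append(child)
    return stale


print(sorted(to_invalidate(graph, {"a"})))  # ['b', 'c']
```

If `a` is recomputed, both `b` and `c` become stale, while `d` is untouched because it does not depend on `a`.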
Now, my first thought would be to do something like:
I would like to simply remove the cache file that joblib uses; however, at the `update_graph` stage the `task_args` are only references to other tasks, not the actual result values.
My next thought was that I could mark the function with some sort of `force_recompute` flag, but it appears (please correct me if I am wrong) that modifying the functions in `kwargs` has no effect on the tasks that are sent to the workers. Is this conclusion correct?
Edit: this was an issue in cloudpickle hard-coding the attributes to pickle.
What is the most suitable way for me to achieve the above?