Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Work Queue (& Task Vine?) vs complicated return values #2908

Closed
benclifford opened this issue Oct 13, 2023 · 7 comments
Closed

Work Queue (& Task Vine?) vs complicated return values #2908

benclifford opened this issue Oct 13, 2023 · 7 comments

Comments

@benclifford
Copy link
Collaborator

Describe the bug
if a python app returns an object that is too complex for pickle to serialize, this breaks.

i) pickle in exec_parsl_function.py cannot serialize the return value

ii) I think (but I haven't confirmed) that passing the result value along a multiprocessing.Queue in parsl/executors/workqueue/executor.py also uses pickle to get the object between processes, and so this is likely to fail too.

For comparison, the HighThroughputExecutor doesn't hit these problems:

i) it uses parsl.serialize, rather than pickle, to send back results
ii) deserialization happens in the main process, so that the deserialized object is never sent along a multiprocessing queue

cc @tphung3 @dthain

To Reproduce

In PRs #2904, #2906 , there is a test case that fails with Work Queue, when trying to return a dataclass definition.

Expected behavior
More serializable result handling

@dthain
Copy link
Contributor

dthain commented Oct 13, 2023

@tphung what do you think?

@benclifford
Copy link
Collaborator Author

(serialization is something I'm actively working on so I'm not expecting Cooperative Computing Lab people to do anything here - and it's also sufficiently obscure that this isn't super urgent)

@tphung3
Copy link
Contributor

tphung3 commented Oct 13, 2023

The issue makes sense to me. TaskVineExecutor seems to have the same problem. A quick and correct fix would be to use serialize library from Parsl as mentioned by @benclifford. I think this fix doesn't get in the way of your work in serialization right @benclifford?

@benclifford
Copy link
Collaborator Author

I think point i) will be addressed by using parsl.serialize. I think the 2nd point is a bit more complicated - I haven't fully debugged what is happening there so I don't know if it is even true or not.

@tphung3
Copy link
Contributor

tphung3 commented Oct 13, 2023

If mq.Queue uses pickle to transfer data (I'm pretty sure it does), then the unpickle operation can happen in the main process in the collector thread. The workqueue process then only sends back a notification to the collector thread so it knows to unpickle results from files. Unpickle is then replaced by deserialize method in parsl.

I think this would clear all corner cases right? @benclifford

@benclifford
Copy link
Collaborator Author

Yeah I think that will work.

In htex there's a thread that does the deserialisation but it lives in the same process, rather than being a separate process like in WQExecutor

benclifford pushed a commit that referenced this issue Oct 25, 2023
benclifford added a commit that referenced this issue Oct 26, 2023
benclifford added a commit that referenced this issue Nov 1, 2023
@benclifford
Copy link
Collaborator Author

This is fixed by PR #2908 and the test that discovered this is enabled by PR #2928

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants