Move Parsl away from multiprocessing Fork #3723

Open
benclifford opened this issue Dec 15, 2024 · 0 comments

benclifford commented Dec 15, 2024

Please comment if you have opinions on this

Issue #2343 brought up that using multiprocessing's fork start method is not safe in the presence of threads. More recently, the Python documentation has been pushing that line more aggressively too: https://docs.python.org/3/library/multiprocessing.html states that:

The default start method will change away from fork in Python 3.14. Code that requires fork should explicitly specify that[.]

Using fork in Parsl leads to practical race conditions resulting in mysterious, non-deterministic hangs - this is not a theoretical problem.

PR #3463 switches one use of multiprocessing fork (the HTEX interchange) to a separately launched process. This was quite a bit of effort, and the interchange was already mostly set up to be a separate process due to earlier remote-interchange work that @yadudoc did a long time ago.

The experience there suggests that it would be useful to keep using multiprocessing where it is currently used, but with a different start method (forkserver or spawn instead of fork).
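As a rough sketch of what choosing a start method explicitly can look like with only the standard library: `pick_start_method` is a hypothetical helper (not Parsl code), and the preference order of forkserver then spawn is an assumption made for illustration.

```python
import multiprocessing


def pick_start_method(preferred=("forkserver", "spawn")):
    """Return the first preferred start method this platform supports."""
    # get_all_start_methods() lists what the current platform offers;
    # for example, fork and forkserver are not available on Windows.
    available = multiprocessing.get_all_start_methods()
    for method in preferred:
        if method in available:
            return method
    raise RuntimeError(f"none of {preferred} is available here")


# get_context() returns an object with the same API as the multiprocessing
# module, bound to one start method, without changing the global default.
ctx = multiprocessing.get_context(pick_start_method())
print(ctx.get_start_method())
```

Using a context object rather than `multiprocessing.set_start_method()` keeps the choice local to the code that needs it, so it does not alter the default seen by a user's own code or by other libraries.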

The main user-facing problem I see there is that it would become mandatory to wrap your workflow scripts with this idiom, which https://docs.python.org/3/library/multiprocessing.html#multiprocessing-programming justifies:

if __name__ == "__main__":
    your_workflow_here()
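For illustration, a slightly fuller version of that idiom might look like the sketch below. It uses plain multiprocessing rather than Parsl, and `report_pid` and `main` are hypothetical names chosen for the example.

```python
import multiprocessing
import os


def report_pid(label):
    # Runs in the child. It lives at module level so that the child, which
    # re-imports this file under spawn, can find it by name.
    print(f"{label}: child pid {os.getpid()}")


def main():
    # Ask for spawn explicitly rather than relying on the platform default.
    ctx = multiprocessing.get_context("spawn")
    child = ctx.Process(target=report_pid, args=("hello",))
    child.start()
    child.join()
    return child.exitcode


if __name__ == "__main__":
    # Without this guard, the child's re-import of the script would try to
    # start another child, and multiprocessing would raise a RuntimeError.
    main()
```

Under fork the guard is optional, because the child is a copy of the parent and never re-imports the script, which is why existing unguarded scripts work today.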

This new requirement has previously discouraged me from pursuing that approach. However, I posted the following to the #parsl-help Slack channel, and the responses were largely positive, suggesting that this would not be a serious problem:

Here is a thing that is awkward and I would like user input:

  • the current way Parsl does multiple processes is fundamentally broken, and results in random-seeming deadlocks

  • there is a different Python-recommended way, but that will require all user workflow scripts to use the common Python idiom of protecting your main code with:

if __name__ == "__main__":
    your code here

otherwise it will break. That is the main reason we haven't made that change before.

  • there are other more invasive changes that could be made to Parsl to not use this recommended way but that is a lot of work

  • the payoff is fewer random-seeming deadlocks in Parsl code (that actually have an explanation)

I am wondering how much requiring this protection idiom will cause trouble for users. I know a lot of people already write their main code this way anyway.

In a brief technical test of using forkserver or spawn, I hit at least one problem related to object serialization, so this is not an entirely straightforward switch internally either.
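The issue does not say which object failed to serialize, so the following is only a generic illustration of the kind of problem involved. Under spawn and forkserver, whatever is handed to a child process has to be pickled, whereas under fork it is simply inherited. `is_picklable` is a hypothetical helper and the candidate objects are stand-ins, not objects taken from Parsl.

```python
import math
import pickle
import threading


def is_picklable(obj):
    """Return True if obj survives pickle.dumps, which spawn and forkserver
    require for anything passed to a child process."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# Hypothetical stand-ins for the kinds of object a process might be handed.
candidates = {
    "importable function": math.sqrt,
    "plain dict": {"port": 54321},
    "thread lock": threading.Lock(),
    "lambda": lambda x: x + 1,
}

for name, obj in candidates.items():
    print(f"{name}: picklable={is_picklable(obj)}")
```

A check like this could be run over the arguments of each existing process launch to get an early idea of which ones would need restructuring before a switch.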

I think this switch will make things nicer for people trying to run Parsl on other platforms: for example, see issue #1878 where some errors are related to the absence of the fork start method on Windows; and PR #2076 which introduces unpleasant global state changes when running on macOS.
