Fix to macos multiprocessing spawn and context issues #2076

Merged: 6 commits, Jun 29, 2021

@yongyanrao (Collaborator) commented May 26, 2021

As discussed in Issue #2070, htex did not work on macOS under Python 3.8 or Python 3.9. There are two reasons:

  1. Starting with Python 3.8, spawn is the default process start method on macOS. With spawn, resources are not handled properly.
  2. Forcing the fork method to work around the issue above does not pick up the context of the main thread properly, so we need to set that context explicitly.
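
To check which of the cases above applies on a given machine, a minimal sketch like the following reports the default start method. The helper name `describe_start_method` is hypothetical and not part of parsl; only standard-library calls are used.

```python
import multiprocessing
import platform
import sys


def describe_start_method():
    """Return (platform name, Python major.minor, default start method)."""
    return (
        platform.system(),
        "%d.%d" % sys.version_info[:2],
        # Note: calling get_start_method() fixes the default for this process
        # if no start method has been chosen yet.
        multiprocessing.get_start_method(),
    )


if __name__ == "__main__":
    system, version, method = describe_start_method()
    print("%s / Python %s: default start method is %r" % (system, version, method))
```

On macOS with Python 3.8 or later this is expected to report spawn unless something has already changed the start method.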

This PR attempts to fix both issues. The following script can be used to test on macOS: it should work with an installation from the yrao2-macos-fork branch and fail with an installation from master.

import os

import parsl
from parsl.app.app import python_app, bash_app
from parsl.channels import LocalChannel
from parsl.config import Config
from parsl.data_provider.files import File
from parsl.executors import HighThroughputExecutor
from parsl.executors.threads import ThreadPoolExecutor
from parsl.providers import LocalProvider

local_threads = Config(
    executors=[
        ThreadPoolExecutor(max_threads=8, label='local_threads')
    ]
)

local_htex = Config(
    executors=[
        HighThroughputExecutor(
            label="htex_Local",
            worker_debug=True,
            cores_per_worker=1,
            provider=LocalProvider(
                channel=LocalChannel(),
                init_blocks=1,
                max_blocks=1,
                worker_init=(
                             'export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES\n'
                            ),
            ),
        )
    ],
    strategy=None,
)    

@bash_app
def generate(outputs=[]):
    return "echo $(( RANDOM )) &> {}".format(outputs[0])

@bash_app
def concat(inputs=[], outputs=[]):
    return "cat {0} > {1}".format(" ".join(i.filepath for i in inputs), outputs[0])

@python_app
def total(inputs=[]):
    total = 0
    with open(inputs[0], 'r') as f:
        for l in f:
            total += int(l)
    return total


def main():

    parsl.clear()
    parsl.load(local_htex)

    # Create 5 files with semi-random numbers
    output_files = []
    for i in range(5):
        output_files.append(generate(outputs=[File(os.path.join(os.getcwd(), 'random-%s.txt' % i))]))

    # Concatenate the files into a single file
    cc = concat(inputs=[i.outputs[0] for i in output_files],
                outputs=[File(os.path.join(os.getcwd(), 'combined.txt'))])

    # Calculate the sum of the random numbers
    x = total(inputs=[cc.outputs[0]])

    print(x.result())

if __name__ == '__main__':
    main()

@yongyanrao yongyanrao linked an issue May 26, 2021 that may be closed by this pull request
@yongyanrao yongyanrao marked this pull request as draft June 3, 2021 19:32
package, to avoid any explicit handling from the end user side
@yongyanrao yongyanrao marked this pull request as ready for review June 4, 2021 15:37
@danielskatz (Member) left a comment

This works on my mac!

@benclifford (Collaborator)

This seems to make htex work for me on:

(parsl) benc@8740 parsl % sw_vers
ProductName:	macOS
ProductVersion:	11.2.1
BuildVersion:	20D75
(parsl) benc@8740 parsl % uname -a
Darwin 8740.local 20.3.0 Darwin Kernel Version 20.3.0: Thu Jan 21 00:07:06 PST 2021; root:xnu-7195.81.3~1/RELEASE_X86_64 x86_64

@@ -33,6 +33,10 @@

from parsl.dataflow.dflow import DataFlowKernel, DataFlowKernelLoader

import multiprocessing
if platform.system() == 'Darwin':
    multiprocessing.set_start_method('fork', force=True)
Collaborator

@yongyanrao I don't understand why this needs to be set globally. I'm concerned about how it interacts with other libraries which are equally entitled to set the global multiprocessing start method.
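
For illustration only, and not as a description of what this PR does, one way to keep the choice local is to request a fork context with `multiprocessing.get_context` and create processes and queues from it, which leaves the global start method untouched. The names `_square` and `run_in_forked_child` are hypothetical.

```python
import multiprocessing


def _square(x, queue):
    """Child-side work: put the square of x on the queue."""
    queue.put(x * x)


def run_in_forked_child(x):
    """Run _square in a forked child without changing the global start method."""
    # get_context raises ValueError on platforms where fork is unavailable.
    ctx = multiprocessing.get_context("fork")
    queue = ctx.Queue()
    proc = ctx.Process(target=_square, args=(x, queue))
    proc.start()
    result = queue.get(timeout=30)
    proc.join()
    return result


if __name__ == "__main__":
    print(run_in_forked_child(7))
```

Other libraries that call `multiprocessing.set_start_method` themselves would not be affected by this approach, since only objects created from `ctx` use fork.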

@benclifford benclifford merged commit e87a0aa into master Jun 29, 2021
@benclifford benclifford deleted the yrao2-macos-fork branch June 29, 2021 18:45
Successfully merging this pull request may close these issues.

htex error under macos
3 participants