Workers should inherit Pkg environment #28781

Closed
rened opened this issue Aug 20, 2018 · 13 comments · Fixed by #43270
Labels
packages (Package management and loading) · parallelism (Parallel or distributed computation)

Comments

@rened
Member

rened commented Aug 20, 2018

(Moved over from JuliaLang/Pkg.jl#675)

If I have a project where I installed MyPackage and I can say

julia --project -e "using MyPackage"

the following only works if I manually activate the environment:

using Distributed
using MyPackage   # works
addprocs(3)
@everywhere begin
  using Pkg; Pkg.activate(".")  # required
  using MyPackage
  ...
end

I would suggest that workers inherit the environment that is active when they are added with addprocs. Any calls to Pkg.activate made after addprocs would then have to be performed manually by the user on all workers.

@affans
Contributor

affans commented Aug 21, 2018

Can you explain a little more about what's happening? I have a large codebase that uses HPC/addprocs quite a bit and that I want to port to 1.0. If I understand the problem better, maybe I can submit a PR.

@JLTastet

I just got bitten by this issue while I was porting a small project from 0.6 to 1.0. As soon as I started using a separate environment for the project, worker processes started to throw Package ... is required but does not seem to be installed exceptions.

In addition to @rened's suggestion, I would also suggest that (MyProject) activate . activate the environment on the worker processes as well, or at least issue a warning instead of silently skipping them. This would be useful when starting Julia with the -p <N> option and then manually activating the environment.

@JLTastet

@affans It seems that (MyProject) activate . and Pkg.activate(".") only activate the environment on the master process. To activate it on all processes, you would need to use the following code:

using Distributed
@everywhere using Pkg
@everywhere Pkg.activate(".")

You can then use @everywhere using MyPackage as usual.

@StefanKarpinski
Member

This requires some careful thinking about how best to do it. One possibility is to send a manifest from the master node to the workers and insist that the manifest be usable on all nodes. That's fine for non-dev packages and even for dev packages with relative paths, but dev packages with absolute paths may be a bit of an issue. We could rewrite dev paths to use a commit instead and then send that over; of course, that requires that the workers know about the tree hash being used...
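For illustration, here is a rough, hypothetical sketch of this "ship the manifest" idea (not the design that was eventually merged): copy the master's Project.toml and Manifest.toml to a temporary directory on each worker and activate it there. It assumes both files exist in the active project directory and that every dependency resolves on the workers; as noted, dev'ed packages with absolute paths would still break.

using Distributed, Pkg
addprocs(3)

proj_dir = dirname(Base.active_project())
project  = read(joinpath(proj_dir, "Project.toml"), String)
manifest = read(joinpath(proj_dir, "Manifest.toml"), String)

@everywhere workers() begin
    using Pkg
    env_dir = mktempdir()                                # throwaway copy of the environment
    write(joinpath(env_dir, "Project.toml"), $project)
    write(joinpath(env_dir, "Manifest.toml"), $manifest)
    Pkg.activate(env_dir)
    Pkg.instantiate()                                    # still fails for dev'ed absolute paths
end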

@matteoacrossi

I ran into the same problem, and it took me a while to understand that the issue was with activate. So I believe there should at least be a warning when using activate with worker processes.

@simonbyrne
Contributor

Another workaround is to use the JULIA_PROJECT environment variable instead of --project, as it will be passed to subprocesses.
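For example (a minimal sketch; MyPackage stands in for the package from the original report), launching Julia with the variable set makes locally spawned workers pick it up, since they inherit the parent's environment variables:

# started as, e.g.:  JULIA_PROJECT=@. julia
using Distributed
addprocs(3)                   # local workers inherit ENV, including JULIA_PROJECT
@everywhere using MyPackage   # resolves in the same project on every worker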

@StefanKarpinski
Member

For now, that's the recommended way to do this.

@oxinabox
Contributor

oxinabox commented Jan 29, 2019

One thing to be clear on is which environment a worker should get. If a package is starting a worker, then the worker should probably have that package's environment, not the environment that the main process is running in.

As in: malmaud/TensorFlow.jl#493

@oxinabox
Contributor

Oh, the case of a package that owns a worker and wants it to use its own environment is even more complicated. Even if the package created the worker, the worker still exists globally, so when any using is run in the main process, the worker will attempt to load that module -- which will come from the main environment.

So such workers need both the package's environment and the main environment. I think this can be done with LOAD_PATH for nested environments, right?

@simonbyrne
Contributor

As an additional point, it turns out that setting JULIA_PROJECT doesn't work when starting with --machine-file.

@chriselrod
Contributor

chriselrod commented Sep 2, 2021

I always run something like:

addprocs(18, exeflags = "--project=$(Base.active_project())")

It would be nice if this could be the default, at least for local addprocs, where all workers should have access to the same manifest.

Until the GC issues with threading are fixed (e.g. #40644), local Distributed often performs much better than threading. E.g., I see very limited scaling beyond 8 threads for many workloads, while Distributed.pmap continues to scale linearly, or close to it.

While this can and should be fixed at the package level (i.e. by minimizing allocations), that will require a concerted effort, and in the meantime Distributed is an effective workaround for better performance.

@garrison
Member

garrison commented Sep 2, 2021

In addition to inheriting the project environment, I believe that local workers should use the same depot settings as well. (Technically, this is a separate issue, but I think it would be good to revisit and fix it at the same time as this one.) An example of me working around this is here.
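A hedged sketch of one way to do that (not the exact workaround linked above): export the depot path through the parent's ENV, which local workers inherit, and forward the active project via exeflags. The worker count here is arbitrary.

using Distributed

# Export the current depot path so locally spawned workers, which inherit the
# parent's ENV, see the same depots; forward the active project via exeflags.
ENV["JULIA_DEPOT_PATH"] = join(DEPOT_PATH, Sys.iswindows() ? ';' : ':')
addprocs(4; exeflags = "--project=$(Base.active_project())")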

@garrison
Member

garrison commented Sep 2, 2021

I think this can be done with LOAD_PATH for nested environments, right?

If the LOAD_PATH of each child were set to be the same as the result of Base.load_path(), I believe this would effectively set the "project environment" for all things except Pkg operations -- it would at least make the installed packages available to the workers.
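Something along those lines, as a rough sketch: broadcast the master's fully expanded load path to each worker. This makes the installed packages loadable there, but it does not change the workers' active Pkg project.

using Distributed
addprocs(3)

# Replace each worker's LOAD_PATH with the master's expanded load path.
lp = Base.load_path()
@everywhere append!(empty!(LOAD_PATH), $lp)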

fredrikekre added a commit that referenced this issue Feb 24, 2022
Local workers now inherit the package environment of the main process,
i.e. the active project, LOAD_PATH, and DEPOT_PATH. This behavior
can be overridden by passing the new `env` keyword argument, or by
passing `--project` in the `exeflags` keyword argument.

Fixes #28781, and closes #42089.
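With the merged behavior described above (a sketch assuming a Julia version that includes #43270; MyPackage is the placeholder package from the original report), no manual activation is needed, and the new keyword offers an explicit override:

using Distributed

# Default: local workers inherit the active project, LOAD_PATH, and DEPOT_PATH.
addprocs(3)
@everywhere using MyPackage

# Explicit overrides, if a different environment is wanted on the workers:
# addprocs(3; env = ["JULIA_PROJECT" => "@."])
# addprocs(3; exeflags = "--project=/path/to/other/Project.toml")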
fredrikekre added a commit that referenced this issue Feb 25, 2022
staticfloat pushed a commit to JuliaCI/julia-buildkite-testing that referenced this issue Mar 2, 2022
LilithHafner pushed a commit to LilithHafner/julia that referenced this issue Mar 8, 2022
Keno pushed a commit that referenced this issue Jun 5, 2024