LOAD_PATH in Pkg3 #82

Closed
StefanKarpinski opened this issue Dec 17, 2017 · 6 comments

@StefanKarpinski
Member

StefanKarpinski commented Dec 17, 2017

Ref #53, #62. I've figured out a design that integrates the old LOAD_PATH loading style and the new Pkg3 project environment concept. This stems from two observations:

  • A directory in LOAD_PATH is a kind of environment, if we consider the mapping from names to paths to be implicitly given by the file layout.

  • We could put Pkg3-style environments in LOAD_PATH as well.

Instead of the JULIA_ENV environment variable that I had introduced to specify one and only one environment to load packages from, we can keep LOAD_PATH but with some changes in behavior:

  1. If an entry in LOAD_PATH is a path to a directory not containing a Project.toml or JuliaProject.toml file, then it is considered an old-style implicit environment where package names are mapped to entry points based on the layout of the directory.

  2. If an entry in LOAD_PATH is a path to a directory containing a Project.toml or JuliaProject.toml file, then it is considered a new-style explicit environment where package names are mapped to entry points based on the contents of the project and manifest files.

  3. If an entry in LOAD_PATH is a path to a TOML file, then it is interpreted as the project file of a new-style explicit environment where package names are mapped to entry points based on the contents of the project file and the corresponding manifest file.

  4. You can continue to control the default contents of LOAD_PATH via the JULIA_LOAD_PATH environment variable.

  5. The julia --env=<env>... command line flag replaces the contents of LOAD_PATH with the given environment specifications, <env>....

  6. The julia --env+<env>... command line flag appends the given environment specifications, <env>..., to the contents of LOAD_PATH.
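
To make rules 1 to 3 concrete, here is a minimal sketch of how a single path entry could be classified. The function name classify_entry and the returned symbols are hypothetical and only for illustration; this is not the actual loader code.

```julia
# Hypothetical helper (not part of Julia or Pkg3): classify one LOAD_PATH
# path entry according to rules 1-3 above.
const PROJECT_NAMES = ("JuliaProject.toml", "Project.toml")

function classify_entry(entry::AbstractString)
    if isfile(entry) && endswith(entry, ".toml")
        # Rule 3: the entry is itself a project file.
        return :explicit_project_file
    elseif isdir(entry)
        has_project = any(name -> isfile(joinpath(entry, name)), PROJECT_NAMES)
        # Rule 2 if a project file is present, rule 1 otherwise.
        return has_project ? :explicit_directory : :implicit_directory
    else
        # The entry does not exist on disk.
        return :missing
    end
end
```

A directory without a project file comes back as :implicit_directory, and adding a Project.toml to it flips the result to :explicit_directory.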

Environment specification

Pkg3 has somewhat more complex needs for LOAD_PATH entries than just paths of directories. For example, one wants to be able to find the current project by looking for a git repo in the current working directory's parent directories. It's also common in Pkg3-style loading to want to consider only a single environment for the sake of having one consistent package mapping, rather than an overlay of several maps which may not be consistent with each other. For this, one may want to specify an exclusive alternation of possible environments and use only the first one that exists, rather than loading packages piecemeal from different LOAD_PATH entries. To that end, I propose introducing the following "rich" interpretations of --env=<env>, --env+<env> and JULIA_LOAD_PATH entries:

  1. If an entry starts and ends with [ and ] it is interpreted as a comma-separated list of sub-entries, which are considered as exclusive environment alternatives, resolving to the first one which exists. This is represented in Julia as an array of environment specifiers.

  2. If an entry starts with a valid Julia identifier followed by a ( and ends with a ) then it is interpreted as a custom environment specifier, and is represented in Julia as a corresponding type. There will be a whitelist of allowed identifiers and fixed corresponding types that they construct. If the syntax is used with a non-whitelisted identifier, the entry is invalid and will be ignored.

  3. In custom environment specifiers, in the contents of any string literal, string interpolation syntax will replace certain whitelisted identifier names with corresponding values. This will include major, minor and patch, which will be replaced with VERSION.major, VERSION.minor and VERSION.patch, respectively. Any custom environment specifier which uses an identifier name that is not whitelisted is invalid and will be ignored.

Some custom environment specifiers that we'll want to initially support include:

  • CurrentProject() to look for a project directory in the parents of the current working directory;
  • NamedEnv("name") to look for a named environment in joinpath(DEPOTS[1], "environments");
  • NamedEnv("name", create=true) to look for a named environment and create it if it doesn't exist.

These syntaxes are designed to mimic standard Julia function call syntax, but they are not general: only a very specific, limited subset of Julia syntax is allowed; you cannot put arbitrary Julia code in the JULIA_LOAD_PATH environment variable and have it be executed. The contents are parsed and mapped to specific expected behaviors, not evaled. Arbitrary code can be evaluated to construct the contents of LOAD_PATH in the ~/.juliarc.jl file, however.
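
As a rough illustration of "parsed, not evaled", here is a sketch of a parser for the restricted syntax. Everything in it is an assumption made for the example: the struct definitions, the function names (interpolate, parse_spec, split_top_level, parse_entry), and the choice to silently drop invalid sub-entries inside an alternation. Plain path entries are left out for brevity.

```julia
# Hypothetical stand-ins for the proposed specifier types; not a real API.
struct CurrentProject end
struct NamedEnv
    name::String
    create::Bool
end

# Replace whitelisted $names; return nothing if any name is not whitelisted.
function interpolate(s::AbstractString, v::VersionNumber)
    vars = Dict("major" => Int(v.major), "minor" => Int(v.minor), "patch" => Int(v.patch))
    for m in eachmatch(r"\$(\w+)", s)
        haskey(vars, String(m.captures[1])) || return nothing
    end
    # The replacement function receives the matched text, e.g. "$major".
    return replace(s, r"\$(\w+)" => m -> string(vars[String(m[2:end])]))
end

# Parse one whitelisted specifier; anything else is invalid (nothing).
function parse_spec(str::AbstractString, v::VersionNumber=VERSION)
    s = strip(str)
    s == "CurrentProject()" && return CurrentProject()
    m = match(r"""^NamedEnv\(\s*"([^"]*)"\s*(?:,\s*create\s*=\s*(true|false)\s*)?\)$""", s)
    m === nothing && return nothing
    name = interpolate(m.captures[1], v)
    name === nothing && return nothing
    return NamedEnv(name, m.captures[2] == "true")
end

# Split on commas that are outside parentheses and string literals.
function split_top_level(s::AbstractString)
    parts, buf = String[], IOBuffer()
    depth, in_string = 0, false
    for c in s
        if c == '"'
            in_string = !in_string
        elseif !in_string && c == '('
            depth += 1
        elseif !in_string && c == ')'
            depth -= 1
        end
        if c == ',' && depth == 0 && !in_string
            push!(parts, String(take!(buf)))
        else
            print(buf, c)
        end
    end
    push!(parts, String(take!(buf)))
    return parts
end

# "[a, b, ...]" becomes a vector of alternatives; otherwise a single specifier.
function parse_entry(str::AbstractString, v::VersionNumber=VERSION)
    s = strip(str)
    if startswith(s, "[") && endswith(s, "]")
        specs = Any[parse_spec(p, v) for p in split_top_level(chop(s, head=1, tail=1))]
        return filter(x -> x !== nothing, specs)
    end
    return parse_spec(s, v)
end
```

Note that nothing here calls eval or Meta.parse: an entry either matches one of the fixed patterns or is rejected, which is the property the proposal relies on.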

Default LOAD_PATH

The following is a possible good default value of LOAD_PATH, in Julia syntax:

LOAD_PATH = Any[
    [ CurrentProject(),
      NamedEnv("v$(VERSION.major).$(VERSION.minor).$(VERSION.patch)"),
      NamedEnv("v$(VERSION.major).$(VERSION.minor)"),
      NamedEnv("v$(VERSION.major)"),
      NamedEnv("default"),
      NamedEnv("v$(VERSION.major).$(VERSION.minor)", create=true),
    ],
]

This would be specified in JULIA_LOAD_PATH or --env=<env> with the following string:

[CurrentProject(), NamedEnv("v$major.$minor.$patch"), NamedEnv("v$major.$minor"), NamedEnv("v$major"), NamedEnv("default"), NamedEnv("v$major.$minor", create=true)]

The meaning of this LOAD_PATH is that only a single environment is used to load packages: the first of the following that exists, assuming Julia version 1.2.3:

  • the current project, found by searching the parent directories of the current working directory for a directory containing a Project.toml or a JuliaProject.toml file;
  • the directory ~/.julia/environments/v1.2.3 if it exists
  • the directory ~/.julia/environments/v1.2 if it exists
  • the directory ~/.julia/environments/v1 if it exists
  • the directory ~/.julia/environments/default if it exists
  • the directory ~/.julia/environments/v1.2, creating it if it does not exist
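
The "first one that exists" resolution of the list above could be sketched as follows. The types and the functions find_project, default_alternatives and resolve_alternatives are hypothetical names for this example, and the depot and working directory are passed in explicitly rather than read from DEPOTS[1] and pwd().

```julia
# Hypothetical stand-ins for the proposed specifier types; not a real API.
struct CurrentProject end
struct NamedEnv
    name::String
    create::Bool
end
NamedEnv(name::AbstractString; create::Bool=false) = NamedEnv(String(name), create)

# Walk from `dir` up through its parents looking for a project file.
function find_project(dir::AbstractString)
    dir = abspath(dir)
    while true
        for name in ("JuliaProject.toml", "Project.toml")
            isfile(joinpath(dir, name)) && return dir
        end
        parent = dirname(dir)
        parent == dir && return nothing   # reached the filesystem root
        dir = parent
    end
end

# The proposed default alternation, built for a given Julia version.
function default_alternatives(v::VersionNumber=VERSION)
    major, minor, patch = Int(v.major), Int(v.minor), Int(v.patch)
    return Any[
        CurrentProject(),
        NamedEnv("v$major.$minor.$patch"),
        NamedEnv("v$major.$minor"),
        NamedEnv("v$major"),
        NamedEnv("default"),
        NamedEnv("v$major.$minor", create=true),
    ]
end

# Return the directory of the first alternative that exists, or nothing.
function resolve_alternatives(alts, depot::AbstractString, cwd::AbstractString)
    for alt in alts
        if alt isa CurrentProject
            dir = find_project(cwd)
            dir === nothing || return dir
        elseif alt isa NamedEnv
            dir = joinpath(depot, "environments", alt.name)
            isdir(dir) && return dir
            if alt.create
                mkpath(dir)               # last resort: create the environment
                return dir
            end
        end
    end
    return nothing
end
```

With an empty depot and no enclosing project, only the final create=true alternative succeeds, so the call creates and returns the v1.2 environment directory.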

Name collisions

When loading a package from a given environment, all dependencies recursively loaded in the process are resolved within the same environment. This is a significant semantic change from the previous LOAD_PATH behavior, wherein if you had LOAD_PATH = [ "dir1", "dir2" ] you could, in the process of loading Foo from dir1 have Foo require Bar and load it from dir2. This behavior would be incompatible with explicit Pkg3 environments, and since the whole premise of this scheme is that directories are an implicit environment, they should work in the same way. Fortunately, it seems unlikely that this would be a common problem in practice since if Foo is installed in dir1 one would expect all of Foo's dependencies to also be installed there.

In the presence of multiple environments in the LOAD_PATH there is a possibility of load order becoming significant in the following way: if one loads A which loads dependency D, and one subsequently loads B which depends on D as well, then B will get the version of D provided by the environment that A was loaded from. If A and B come from different environments, this could be a different version of D than the environment B comes from would provide, and if B had been loaded first a different version of D would have been loaded. It seems like this is just an inherent problem of having multiple LOAD_PATH entries overlaying different, potentially incompatible environments. In production, the LOAD_PATH should only ever contain a single environment entry and it should probably be spelled out explicitly as an absolute or relative path. The utility of multiple environments in the LOAD_PATH is primarily so that one can work on a project and easily load tools that don't belong in that project, but which one has installed in some directory or in a named environment. This may lead to version incompatibilities, but that's acceptable for interactive debugging usage.
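
The load-order effect can be shown with a toy model. This is only a simulation under simplifying assumptions: environments are plain dictionaries, and PkgEntry, Env, load! and load_in_env! are hypothetical names, not the real loading machinery.

```julia
# Toy model: an environment maps package names to a version and its deps.
struct PkgEntry
    version::VersionNumber
    deps::Vector{String}
end
const Env = Dict{String,PkgEntry}

# Load `name` and, recursively, its dependencies from one environment only.
function load_in_env!(loaded, env, name)
    haskey(loaded, name) && return        # already loaded: reuse it, whatever its origin
    pkg = env[name]
    loaded[name] = pkg.version
    foreach(dep -> load_in_env!(loaded, env, dep), pkg.deps)
end

# Top-level load: pick the first environment that provides `name`.
function load!(loaded::Dict{String,VersionNumber}, envs::Vector{Env}, name::String)
    haskey(loaded, name) && return loaded
    i = findfirst(env -> haskey(env, name), envs)
    i === nothing && error("package $name not found in any environment")
    load_in_env!(loaded, envs[i], name)
    return loaded
end

env1 = Env("A" => PkgEntry(v"1.0.0", ["D"]), "D" => PkgEntry(v"1.0.0", String[]))
env2 = Env("B" => PkgEntry(v"1.0.0", ["D"]), "D" => PkgEntry(v"2.0.0", String[]))

a_first = load!(load!(Dict{String,VersionNumber}(), [env1, env2], "A"), [env1, env2], "B")
b_first = load!(load!(Dict{String,VersionNumber}(), [env1, env2], "B"), [env1, env2], "A")
```

Here a_first["D"] is the version from env1 and b_first["D"] is the version from env2, even though the LOAD_PATH order is the same in both runs.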

@StefanKarpinski
Member Author

StefanKarpinski commented Dec 17, 2017

Interaction with checked out packages

At first I thought that checked out packages were related to this, but that's actually a mostly independent matter. When you check out a package, it gets checked out in JULIA_DEVDIR which defaults to joinpath(DEPOTS[1], "dev") and a path to it is placed in the current environment's manifest. However, when the current environment is an old-style implicit one, there is no Manifest.toml file to put a path into, so what does one do? I can see two options:

  1. Check out the package into the implicit environment directory structure instead of in JULIA_DEVDIR.

  2. Check out the package into JULIA_DEVDIR as normal, but create a symlink from that location into the implicit environment directory.

I'm not sure which approach is better.

There's also the question of "What is the current environment?" I would say that the current environment should be the environment corresponding to the first entry in LOAD_PATH that exists.

@StefanKarpinski
Member Author

StefanKarpinski commented Dec 17, 2017

Some questions:

  • Should LOAD_PATH be renamed? E.g. to ENV_PATH or ENVIRONMENTS?
  • Or should the command-line options --env= and --env+ be --load-path= and --load-path+?

@StefanKarpinski
Member Author

StefanKarpinski commented Dec 17, 2017

Example usage: you might be working on a project and want to use a profiler and debugger, and so start julia using julia --env+NamedEnv("devtools"), which would have the effect of appending the devtools named environment to your LOAD_PATH, thereby giving you access to any of the packages installed in there, in addition to what's in your current project's environment.

This does kind of suggest that the NamedEnv("devtools") syntax is too verbose, so maybe we should have more concise syntaxes. Maybe --env+@devtools? And @@ for CurrentProject()? And !@devtools to create the devtools named environment if it doesn't already exist? Or maybe we don't need a syntax for that. The long syntax is fine for setting LOAD_PATH since that tends to be done once in a config file somewhere, but the command-line really wants concise syntax.
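
If the concise forms floated above were adopted, they could simply expand to the long syntax before parsing. A sketch, where expand_shorthand is a hypothetical name and the @name, @@ and !@name spellings are only the tentative ideas from this comment:

```julia
# Hypothetical expansion of the tentative short forms into the long syntax.
function expand_shorthand(s::AbstractString)
    s == "@@" && return "CurrentProject()"
    m = match(r"^(!?)@(\w+)$", s)
    m === nothing && return String(s)     # not a short form: leave unchanged
    name = m.captures[2]
    return m.captures[1] == "!" ? "NamedEnv(\"$name\", create=true)" : "NamedEnv(\"$name\")"
end
```

So --env+@devtools would be read as --env+NamedEnv("devtools"), while paths and long-form entries pass through untouched.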

@tpapp
Contributor

tpapp commented Jan 30, 2018

  1. Would LOAD_PATH then effectively implement what is described as DEPOT_PATH in the Julep?

  2. If I have projects scattered all over my home directory (e.g. because I keep them alongside LaTeX source and other misc stuff from coauthors relevant to a working paper, organized into directories, which may not follow the Julia package layout from their root), would it be sufficient to just make a single directory, put it in LOAD_PATH, and symlink in the project directories there?

@StefanKarpinski
Member Author

In short, no. The best description of the roles of LOAD_PATH and DEPOT_PATH in Pkg3 is now here: JuliaLang/julia#25709. This will obviously be properly documented before we release 1.0 final... we're still working on integration. In short, LOAD_PATH is used to determine what packages to load while DEPOT_PATH is used, among other things, to find installed versions of packages based on their UUID and SHA-1 git hash (also to look for registries and named environments).

@StefanKarpinski
Member Author

This is now implemented as of JuliaLang/julia#25455.
