Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Poetry install fails for nested local dependencies and develop = false #3368

Closed
3 tasks done
TrevorPace opened this issue Nov 16, 2020 · 21 comments
Closed
3 tasks done
Labels
kind/bug Something isn't working as expected

Comments

@TrevorPace
Copy link

TrevorPace commented Nov 16, 2020

  • I am on the latest Poetry version.
  • I have searched the issues of this repo and believe that this is not a duplicate.
  • If an exception occurs when executing a command, I executed it again in debug mode (-vvv option).
  • Standard Python:3.8 docker image
  • 1.1.4

Issue

I am attempting to install a poetry based application with a monorepo-like hierarchy. For simplicity sake the structure looks something like:

/lib
    /pkg1
        pyproject.toml
    /pkg2
        pyproject.toml
/apps
    /app1
        pyproject.toml

The apps/app1/pyproject.toml is completely empty other than a reference to one of the libraries: pkg1 = { path = "../../lib/pkg1", develop = false }

pkg1 has then a similar local dependency on pkg2: pkg2 = { path = "../pkg2", develop = false }

An example repo of this is here: https://github.com/TrevorPace/poetry-localdep-issue

When attempting to run poetry install in apps/app1 to create the initial poetry.lock file I get the following issues:

Updating dependencies
Resolving dependencies... (0.1s)

Writing lock file

Package operations: 2 installs, 0 updates, 0 removals

  • Installing pkg2 (0.1.0 /home/trevor/git/trevor.pace/poetry-localdep-issue/libs/pkg2)
  • Installing pkg1 (0.1.0 /home/trevor/git/trevor.pace/poetry-localdep-issue/libs/pkg1): Failed

  EnvCommandError

  Command ['/home/trevor/.cache/pypoetry/virtualenvs/app1-Jm016c7i-py3.8/bin/pip', 'install', '--no-deps', '-U', '/home/trevor/git/trevor.pace/poetry-localdep-issue/libs/pkg1'] errored with the following return code 1, and output:
  Processing /home/trevor/git/trevor.pace/poetry-localdep-issue/libs/pkg1
    Installing build dependencies: started
    Installing build dependencies: finished with status 'done'
    Getting requirements to build wheel: started
    Getting requirements to build wheel: finished with status 'done'
      Preparing wheel metadata: started
      Preparing wheel metadata: finished with status 'error'
      ERROR: Command errored out with exit status 1:
       command: /home/trevor/.cache/pypoetry/virtualenvs/app1-Jm016c7i-py3.8/bin/python /home/trevor/.cache/pypoetry/virtualenvs/app1-Jm016c7i-py3.8/lib/python3.8/site-packages/pip/_vendor/pep517/_in_process.py prepare_metadata_for_build_wheel /tmp/tmp8olvb_pd
           cwd: /tmp/pip-req-build-845k8blz
      Complete output (16 lines):
      Traceback (most recent call last):
        File "/home/trevor/.cache/pypoetry/virtualenvs/app1-Jm016c7i-py3.8/lib/python3.8/site-packages/pip/_vendor/pep517/_in_process.py", line 280, in <module>
          main()
        File "/home/trevor/.cache/pypoetry/virtualenvs/app1-Jm016c7i-py3.8/lib/python3.8/site-packages/pip/_vendor/pep517/_in_process.py", line 263, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
        File "/home/trevor/.cache/pypoetry/virtualenvs/app1-Jm016c7i-py3.8/lib/python3.8/site-packages/pip/_vendor/pep517/_in_process.py", line 133, in prepare_metadata_for_build_wheel
          return hook(metadata_directory, config_settings)
        File "/tmp/pip-build-env-q6c1rb7q/overlay/lib/python3.8/site-packages/poetry/core/masonry/api.py", line 34, in prepare_metadata_for_build_wheel
          poetry = Factory().create_poetry(Path(".").resolve())
        File "/tmp/pip-build-env-q6c1rb7q/overlay/lib/python3.8/site-packages/poetry/core/factory.py", line 91, in create_poetry
          self.create_dependency(name, constraint, root_dir=package.root_dir)
        File "/tmp/pip-build-env-q6c1rb7q/overlay/lib/python3.8/site-packages/poetry/core/factory.py", line 242, in create_dependency
          dependency = DirectoryDependency(
        File "/tmp/pip-build-env-q6c1rb7q/overlay/lib/python3.8/site-packages/poetry/core/packages/directory_dependency.py", line 36, in __init__
          raise ValueError("Directory {} does not exist".format(self._path))
      ValueError: Directory ../pkg2 does not exist
      ----------------------------------------
  ERROR: Command errored out with exit status 1: /home/trevor/.cache/pypoetry/virtualenvs/app1-Jm016c7i-py3.8/bin/python /home/trevor/.cache/pypoetry/virtualenvs/app1-Jm016c7i-py3.8/lib/python3.8/site-packages/pip/_vendor/pep517/_in_process.py prepare_metadata_for_build_wheel /tmp/tmp8olvb_pd Check the logs for full command output.


  at ~/.local/lib/python3.8/site-packages/poetry/utils/env.py:1074 in _run
      1070│                 output = subprocess.check_output(
      1071│                     cmd, stderr=subprocess.STDOUT, **kwargs
      1072│                 )
      1073│         except CalledProcessError as e:
    → 1074│             raise EnvCommandError(e, input=input_)
      1075│
      1076│         return decode(output)
      1077│
      1078│     def execute(self, bin, *args, **kwargs):

So, it appears that we are actually able to correctly traverse the package tree when determining what to build, but when pkg1 actually is being built it's generating something with a local reference in it still. My initial thought is that it is somehow related to #3148.

@TrevorPace TrevorPace added kind/bug Something isn't working as expected status/triage This issue needs to be triaged labels Nov 16, 2020
@sinoroc
Copy link

sinoroc commented Nov 17, 2020

Is it really related to poetry config virtualenvs.create false? How does it behave without it?

@TrevorPace
Copy link
Author

Is it really related to poetry config virtualenvs.create false? How does it behave without it?

I think you're right. I've gone and created an example repo showing the problem: https://github.com/TrevorPace/poetry-localdep-issue

The issue occurs actually when even generating the poetry.lock file. I'll update the description of the issue accordingly.

@TrevorPace TrevorPace changed the title Poetry install with 'virtualenvs.create false' breaks for nested local dependencies Poetry install fails for nested local dependencies and develop = false Nov 17, 2020
@sinoroc
Copy link

sinoroc commented Nov 17, 2020

OK. Then it looks to me like it is simply a use case that is not supported (yet). Looks like your best bet is to add all path dependencies, even the indirect ones, in the pyproject.toml. So if apps/app1/pyproject.toml:

pkg1 = { path = "../../lib/pkg1", develop = false }
pkg2 = { path = "../../lib/pkg2", develop = false }

Maybe related:

@TrevorPace
Copy link
Author

I'm not sure about that, because when using "develop = true" it's actually correctly computing the correct paths for pkg2 when creating the .pth files in the virtual environment. There must be actual code looking into the the pyproject.toml of pkg1, resolving the dependencies and installing them into the virtual environment (or pkg2 would be neither installed nor correctly resolved). It seems like when it comes to actually creating the wheel or whatever intermediary it is passing to pip it isn't correctly updating the dependency. Which is why it works fine for pkg1 (because it has no other local dependencies to resolve). It also works fine for the main application, because it doesn't actually create that intermediary (just the app1.egg-info/).

I'll take a look at the poetry source to see what's up.

@TrevorPace
Copy link
Author

So, after a bit of a deep dive in the code.

I believe what is happening is:

  1. During dependency solving of app1, the pkg1 dependency is evaluated with Provider().get_package_from_directory(). This makes a call to PackageInfo().from_directory(), which essentially identifies pkg2 as a poetry package and ultimately adds it's own dependencies into the full dependency solver. Pkg2 being defined as a non-develop entity by pkg1's pyproject.toml, and thus is installed with pip install --no-deps <pkg2-path>. As pkg2 contains a pyproject.toml this causes pip to basically invoke poetry again, and it is correctly installed, as it has no local dependencies.

  2. After pkg1's dependencies are installed it's time to install pkg2. Pkg2 (like pkg1) is identified as a directory package and is again installed with pip install --no-deps <pkg2-path>. This is where the problem is though. Because when poetry.masonry.api gets invoked it's constructing a wheel but the dependencies get evaluated again and despite us using --no-deps it's still evaluating if they are still valid when we are creating the wheel. Poetry can't evaluate that path to "../pkg2" because all of this is being done inside a temporary build directory. Ultimately, the Factory() created by poetry.masonry.api should be skipping checking of local dependencies again somehow and when creating wheels, and should be only containing package name, not the path

In the case of development dependencies installed locally though this problem is avoided because a custom Builder is used in the PipInstaller which avoids the invocation with pip (and thus the isolated wheel generation step).

@TrevorPace
Copy link
Author

Slightly off-topic, but I just want to add as well, that this whole develop = true/false thing seems like a bad idea. Are there actually development situations where we want to install local directories into our virtual environment and not just reference them?

Couldn't we just always do a local install with our custom Builder, except when providing a build-flag to poetry during install: poetry install --copy-local-deps

This way for development we do nothing, and for creating Docker files we would just add the flag and disable creation of virtual environments to have everything installed into the filesystem.

If we still wanted to decide to install packages or not we could use the optional flag accordingly.

@sinoroc
Copy link

sinoroc commented Nov 18, 2020

Thanks for sharing your debugging experience. I did not follow exactly, but I am sure it will be helpful to the maintainers when they get time to visit this ticket.


Personal opinion:

I am not fan of path dependencies at all. I see how it could be helpful in many cases, but I feel like actually it can work in very limited cases only. So as a general rule I would really recommend anyone to avoid them as much as possible.

This to me looks like the typical kind of features that are added for some very specific and straightforward use cases in mind, but for the creator of the feature it is hard to tell when it is going to break, so no explicit limitations are indicated in the documentation. But still there is a point where it is going to break (same story with editable installations).

In my mind, these features are here to help during development only, for example. But users somehow push it a bit too far out of the feature's comfort zone, and it shows its limitations and breaks. In particular I would not expect path dependencies to work on 2nd degree dependencies, to me it seems like expecting too much out of this.

Basically I would limit path to direct dependencies. And in particular to scenarios such as this:

[tool.poetry.dependencies]
# ...
alpha = '^1.2.3'
charlie = { path = '../charlie', develop = false }  # would not recommend, but acceptable if nothing else depends on the current project

[tool.poetry.dev-dependencies]
# ...
alpha = { path = '../alpha', develop = true }
bravo = { path = '../bravo', develop = true }  # alpha depends on bravo

@ghost
Copy link

ghost commented Dec 14, 2020

I think this is actually the issue, I described in another ticket: #2877. In our setup we are also using "develop = false" (which I don't recall why we use it, but there was a reason - I will check again).

I think path dependencies are very valuable, especially in monorepo setups (where relative paths are usually fixed). Moving the dependencies to the first level in such a setup defeats the purpose of dependency resolution and adds quite some manual overhead (we have ~ 20 poetry packages with approximately 3 to 4 levels of dependencies within the repository, this is already quite hard to keep up).

The current state is problematic as it is inconsistent: Relative path dependencies can be specified and there is no explicit warning/error and no documentation that they would not work transitively. Transitive dependency resolution is quite a strong guarantee I would expect from a solver & installation tool. Looking at the error that is shown right now it seems like there is a state that should be invalid, if the feature is not supported. But as a user of the tool, after reading the documentation, it is not clear to me why this should not be the case.

I understand that from an internal standpoint it might be a use case that is harder to support, but I think that should not be a reason to dismiss the use case (unless there is a way to reach the same result, but without the problem with explicit mentioning of transitive dependencies). Right now this is stopping the wider poetry adoption in my organisation and looking at the bug tracker this seems to be something that was already asked multiple times.

I would be happy to help out getting this to work with some advice on the implementation. Or help to make sure other people do not run into the same problem by enhancing the error message and providing documentation on the use case and workarounds.

@sinoroc
Copy link

sinoroc commented Dec 14, 2020

@malte-klemm-by

I'm not a maintainer, but I will try to look at this again, see if I can pin point more accurately what is going wrong here.


Digression:

I believe the potentially surprising and/or unreliable behaviour becomes quite clear with more knowledge of Python's packaging story in general (not just poetry). In my mind it is clear that things like path dependencies are not first class citizens at all. And from my point of view I also do not know why they should be. If I could I would probably get rid of them entirely.

I guess I see how path dependencies could be somewhat useful for things like "mono-repo", although I do not have much knowledge about those (seen from afar I do not see the advantages of mono-repos). Anyway, I do not understand why mono-repos want to bypass the distribution phase. Mono-repo or not, my impression is that it should still be possible to build the individual projects and upload them to an index (PyPI or private repository), and thus get rid of path dependencies.

Also I must admit I do not really understand why people bother with building individual projects in a mono-repo. This all seems quite counter-productive, a weird mix of things, that result in doing many things that are against best practices. Why do you need dependency resolution in a mono-repo? Aren't all dependencies supposed to be in the repo, even the external ones? All these things do not make sense to me. Why go for a radically different approach (mono-repo) and then complain that tools do not work with this approach? I always assumed that mono-repos used radically different tools. I am curious about these things.


There is a bit of work in progress that could potentially help with these kinds of project structure, maybe it is of interest to you:

@TrevorPace
Copy link
Author

TrevorPace commented Dec 14, 2020

@malte-klemm-by

I'm not a maintainer, but I will try to look at this again, see if I can pin point more accurately what is going wrong here.

Digression:

I believe the potentially surprising and/or unreliable behaviour becomes quite clear with more knowledge of Python's packaging story in general (not just poetry). In my mind it is clear that things like path dependencies are not first class citizens at all. And from my point of view I also do not know why they should be. If I could I would probably get rid of them entirely.

I guess I see how path dependencies could be somewhat useful for things like "mono-repo", although I do not have much knowledge about those (seen from afar I do not see the advantages of mono-repos). Anyway, I do not understand why mono-repos want to bypass the distribution phase. Mono-repo or not, my impression is that it should still be possible to build the individual projects and upload them to an index (PyPI or private repository), and thus get rid of path dependencies.

Also I must admit I do not really understand why people bother with building individual projects in a mono-repo. This all seems quite counter-productive, a weird mix of things, that result in doing many things that are against best practices. Why do you need dependency resolution in a mono-repo? Aren't all dependencies supposed to be in the repo, even the external ones? All these things do not make sense to me. Why go for a radically different approach (mono-repo) and then complain that tools do not work with this approach? I always assumed that mono-repos used radically different tools. I am curious about these things.

There is a bit of work in progress that could potentially help with these kinds of project structure, maybe it is of interest to you:

To address your comments on the mono-repo: There are significant advantages for a mono-repo when it comes to closed-source development of microservice-based projects. When it comes to development you are able to literally change files in the dependent libraries on the fly without having to first build/push to a private repo. By having all of your components in one repo it promotes quicker code sharing as well removes the need for any sort of "manual" versioning of the various components. As a result, you simply deal with one repo and merging the branches from it. A refactor, or split of services would not require removing or adding other repos, but instead be completely isolated to the branch. When it comes to deployment you can basically use git describe tags to get a unique version identifier for all the artifacts in that specific commit. Meaning that you guarantee all of the individual services are functionally aligned, there is no thinking about "What version does service X have to be with service Y?". Working with private repos is pretty painful when it comes to making CI/CD pieplines as well as docker images.

When it comes to python/poetry, ultimately what we are looking for is a way to do development locally (IE install into a virtual environment for a given service) in a way that local paths are used, but for an install (into something like a Docker container) we want to be creating actual packages for each component in the dependency tree and installing them into site-packages. NPM for example doesn't allow local development via linked resources, but instead when you install a local dependency it copies it (and all other dependencies) into your source tree. That makes it easy for deployment, but for development you need to rerun the install step after making any changes to the local dependency services or libraries.

My general opinion is that the whole use of the "develop" flag within the actual pyproject.toml file is counter-productive. If anything we should have a flag passed when running "poetry install" like "--local-dev" or similar. This way without the flag all packages would be copied into the virtual environment correctly with all local path dependencies resolved to just package names (similar to npm install or what we need for docker file packaging), but with the flag the package links (egg-link I think) would be installed into the virtual environment instead.

Ultimately, what we are trying to do is stack a hierarchy of dependencies and either let them be there for development, or flatten them into standalone packages based on what we actually need.

@ghost
Copy link

ghost commented Dec 14, 2020

@sinoroc: Thanks for the pointers, I looked into them earlier, but I wasn't sure they are fully overlapping.

@TrevorPace: We have exactly the same use case and use the mono repo (or rather mini mono repo, so splitting an application into smaller packages but in one repository) for the same advantage. To not further fuel the discussion: I think the value of such a setup depends heavily on your application/library and even more your infrastructure for CI/CD, deployment and even your organisation.

My general opinion is that the whole use of the "develop" flag within the actual pyproject.toml file is counter-productive. If anything we should have a flag passed when running "poetry install" like "--local-dev" or similar. This way without the flag all packages would be copied into the virtual environment correctly with all local path dependencies resolved to just package names (similar to npm install or what we need for docker file packaging), but with the flag the package links (egg-link I think) would be installed into the virtual environment instead.

For CI/CD we build the packages locally and install the packages using pip with a requirements.txt built from poetry. So we don't need the round trip via a (private) pypi just for tests. But for development we would like to have the nested path dependencies working (right now we track all transitive dependencies manually) so that we can do changes in multiple packages and have the changes instantly running or test them. I think what you proposed would be a perfect fit for our use case as well.

@sinoroc
Copy link

sinoroc commented Dec 14, 2020

Alright, I'll try to forget the mono-repo bit (I am still not sure what it implies or not).

So your issue is as described in #2877 (comment), isn't it? Somewhere poetry seems to lose track of what the relative paths are relative to. As a side note: maybe a flat hierarchy in the project's directory structure would work better (i.e. all apps and libs in the same directory instead of an apps directory and a libs directory).

The rest I don't understand. Particularly these bits:

If anything we should have a flag passed when running "poetry install" like "--local-dev" or similar. This way without the flag all packages would be copied into the virtual environment correctly with all local path dependencies resolved to just package names (similar to npm install or what we need for docker file packaging), but with the flag the package links (egg-link I think) would be installed into the virtual environment instead.

For CI/CD we build the packages locally and install the packages using pip with a requirements.txt built from poetry. So we don't need the round trip via a (private) pypi just for tests. But for development we would like to have the nested path dependencies working (right now we track all transitive dependencies manually) so that we can do changes in multiple packages and have the changes instantly running or test them. I think what you proposed would be a perfect fit for our use case as well.

Poetry can't just magically guess what dependencies should be installed as editable. It has to be specified for each dependency explicitly. I can not really imagine how a global flag would work. Or do you have only local path dependencies and strictly no other 3rd party dependencies (from PyPI for example)?

@dobbymoodge
Copy link

Resurrecting this discussion as this bug is still current, and I suspect it may be related to #4051, and @sinoroc expresses some preference here which I think needs to be repudiated:

I'm not a maintainer, but I will try to look at this again, see if I can pin point more accurately what is going wrong here.

Digression:

I believe the potentially surprising and/or unreliable behaviour becomes quite clear with more knowledge of Python's packaging story in general (not just poetry). In my mind it is clear that things like path dependencies are not first class citizens at all. And from my point of view I also do not know why they should be. If I could I would probably get rid of them entirely.

... etc.

Whether you like path dependencies or not is totally irrelevant to the bug in question. Poetry is supposed to be able to resolve dependency hierarchies recursively, and it should do so correctly for each kind of dependency it is designed to support.

In this instance, pkg2 is being installed correctly, but when pkg1 dependencies are being resolved at install time, poetry fails to synthesize a package name from the path dependency in pkg1's pyproject.toml. The develop = false flag should instruct poetry to express the package dependency as a name instead of a path when feeding pip for pkg1 - that way pip would find the matching package amongst the installed packages, mark the dependency as satisfied, and continue as normal. This is not an intractable problem.

Poetry should be able to handle these kinds of dependencies - it's central to the design goals of the tool. Breaking when develop = false isn't acceptable; why take a boolean if only one value will work? With the path and the pyproject.toml, Poetry has enough information to correctly resolve the dependency and provide pip with the information it needs to successfully install pkg1.

You don't understand path dependencies or don't think they're "nice", or whatever. That doesn't mean that this isn't a bug. Spending paragraphs of italicized text explaining that you don't understand something is distracting. You explain that you don't even understand the "monorepo" use case - if you don't understand the use case, and don't understand the failure, then don't make the issue difficult to approach by building walls of text on top of each other.

This bug affects Poetry 1.1.6.

@bstadlbauer
Copy link

Just as a note, while the workaround worked until poetry==1.1.6, the fix for #4202 also broke the workaround, and now dependencies can also not be installed when using develop = true

@martinka
Copy link

Just to add a different perspective on the path dependency issue.. I think a large use case, is those using a mono repo for a large system.. at least that is our use case and while I feel like the debate about mono vs separate repos is a war in and of itself. it doesn't seem to make sense to me for a general purpose tool like poetry to be opinionated about such things.

@martinka
Copy link

martinka commented Dec 21, 2021

I will also put out the following horrible work around we have in our make files.. this basically 'fixes' the lock file to have the correct directory references...
poetry lock sed -e 's/\.\.\/constants/\.\.\/\.\.\/libs\/constants/g' < poetry.lock > poetry.lock.new rm poetry.lock; mv poetry.lock.new poetry.lock

you have to do that for every second order dependency ... obviously libs/constants is particular to our repo..
its actually possible this #4246... fixed our problem.. looking into it.. sorry if this ended up being a fixed issue for us.

@jaklan
Copy link

jaklan commented Jan 12, 2022

Using Poetry version 1.1.12 the issue still happens when you specify path dependencies with develop = false, however with develop = true it seems to work properly.

@andrea-cassioli-maersk
Copy link

I am using Poetry (version 1.3.2) and the issue definitely appears with develop=true.

@dimbleby
Copy link
Contributor

OP provided a repro for the issue they were reporting, at https://github.com/TrevorPace/poetry-localdep-issue. And that repro works just fine with poetry 1.3.2

$ poetry update
Updating dependencies
Resolving dependencies... (0.1s)

Package operations: 2 installs, 0 updates, 0 removals

  • Installing pkg2 (0.1.0 /home/dch/bar/poetry-localdep-issue/libs/pkg2)
  • Installing pkg1 (0.1.0 /home/dch/bar/poetry-localdep-issue/libs/pkg1)

This issue should be closed: the problem that it reported has been fixed.

If others of you in this thread have different problems, please raise new issues.

@radoering
Copy link
Member

Resolved via #4245

Copy link

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 29, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
kind/bug Something isn't working as expected
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants