Support for conflicting dep extras, when they are exclusive of each other. #6419

Closed
2 tasks done
erezsh opened this issue Sep 6, 2022 · 12 comments · Fixed by #9553
Labels
area/solver Related to the dependency resolver kind/feature Feature requests/implementations

Comments

@erezsh

erezsh commented Sep 6, 2022

  • I have searched the issues of this repo and believe that this is not a duplicate.
  • I have searched the documentation and believe that my question is not covered.

Feature Request

Background

I am working on a project called data-diff (https://github.com/datafold/data-diff), which supports many optional third-party libraries, mostly connectors to different databases. We provide them as "extras", so that users can install them easily if they wish. For example, they might run pip install 'data-diff[mysql,snowflake,bigquery]'.

For the most part, these libraries play nice with each other, except that sometimes they don't. For example, adding the databricks connector (https://github.com/databricks/databricks-sql-python) is somehow causing a conflict with snowflake on numpy/pandas/etc., which can only be resolved, if at all, by taking all these libraries years back, which is bad for both features and security.

This kind of conflict resolution really isn't necessary for our use case, because it's very unlikely for someone to install both databricks and snowflake, and even when they do, we would much prefer to give them a special error message, over the alternative which is removing databricks from "extras" entirely.

It would be great if poetry offered some solution to this situation, where extras conflict, but don't have to co-exist.

Request (simple)

One way to do it would be to add an option to make some extras exclusive of one another, so they can't be installed together, but also don't affect each other's dependencies.

Maybe add the option to make dependency groups exclusive. Anything within a group can be installed together, but deps from separate groups cannot.

Request (ideal)

What I really want in this situation, is to be able to install dependencies in their own "virtual env", so that when they import their dependencies, they import them from their own private cache.

That would allow all these 3rd party libraries to play nice with each other, no matter what.

This might be beyond the scope of poetry, in more ways than one. But I also think it could be useful for a lot of other scenarios.

@erezsh erezsh added kind/feature Feature requests/implementations status/triage This issue needs to be triaged labels Sep 6, 2022
@LarsDu

LarsDu commented Sep 9, 2022

Having the exact same issue!

We would like to have mutually exclusive extras groups that do not trigger the solver collectively. This type of feature would be a great boon for teams that operate out of the same repo, but may require different packages.

@LarsDu

LarsDu commented Sep 9, 2022

Also if anyone could provide information on how extras propagate into the solver, that would be extremely helpful (for making a potential PR)

@LarsDu LarsDu mentioned this issue Sep 9, 2022
2 tasks
@neersighted neersighted added area/solver Related to the dependency resolver and removed status/triage This issue needs to be triaged labels Oct 26, 2022
@neersighted
Member

Related to (but not the same as) #1168

@martinitus

A good example of this is the databricks-connect Python dependency, which ships its own internal Spark and is hence incompatible with the pyspark dependency. As a Poetry user, it would be nice if I could offer my package's users the choice of using the package with either pyspark or databricks-connect.

If anyone has an alternative solution I am more than interested :)

@Gornoka

Gornoka commented Nov 17, 2023

The same problem arises in the computer vision / AI world when you decide to use opencv, as there are two completely incompatible versions of that package (opencv-python and opencv-python-headless). Having an easy way to specify both as potential dependencies while ensuring that both are NEVER installed together would be nice.

Maybe even with a replacement function, so that one could specify which dependencies shall be overwritten by the extra group.
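
Until something like this exists in the resolver, a package can at least detect the situation at runtime and fail with a clear message. The following is a minimal sketch of such a guard, assuming only the two distribution names mentioned above; the function names and the `is_installed` hook are hypothetical and exist only to keep the example testable.

```python
from importlib import metadata

# Distribution names taken from the comment above.
CONFLICTING = ("opencv-python", "opencv-python-headless")


def _default_is_installed(name):
    """Report whether a distribution is installed in the current environment."""
    try:
        metadata.version(name)
    except metadata.PackageNotFoundError:
        return False
    return True


def installed_conflicts(names=CONFLICTING, is_installed=_default_is_installed):
    """Return the subset of `names` that is currently installed."""
    return [name for name in names if is_installed(name)]


def assert_exclusive(names=CONFLICTING, is_installed=_default_is_installed):
    """Raise a descriptive error when more than one conflicting package is present."""
    found = installed_conflicts(names, is_installed)
    if len(found) > 1:
        raise RuntimeError(
            "Mutually exclusive packages installed together: " + ", ".join(found)
        )
    return found
```

Calling assert_exclusive() at import time would only report the clash after installation; it does not stop the solver from trying to reconcile both packages.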

@jamesowers-cohere
Contributor

jamesowers-cohere commented Feb 13, 2024

I'm keen to contribute to this and raise a PR: it's a missing feature which I require from poetry. At the moment, the only workaround for me is to spin up a different VM and use environment markers to optionally install different mutually exclusive dependencies: this is less than ideal!

My proposal would be to add a mutually_exclusive option to groups, like optional: https://python-poetry.org/docs/managing-dependencies/#optional-groups. It's different from optional because we would either:

  • require a check that the dependencies within a mutually_exclusive group are not contained outside, or
  • set default behaviour that, if a mutually_exclusive group is specified in the poetry install ... command, that version is used

I'd propose that poetry install just returns an error if more than one mutually_exclusive group is specified for installation.

Before beginning on a PR, I'd like to gauge how open Poetry maintainers would be to accepting such a feature i.e. that outlined in Request (simple) in the issue description. I'm not keen to work on it if it is outside the maintainers' scope for poetry.
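
The install-time check proposed above could look roughly like the sketch below. Everything here is hypothetical: Poetry has no mutually_exclusive flag in this discussion, and the function name and arguments are invented for illustration. It covers only the command-line validation, not the solver or lockfile side.

```python
def check_exclusive_groups(requested, exclusive):
    """Return the requested groups, or raise if more than one is exclusive.

    `requested` is the list of groups passed to the install command and
    `exclusive` is the set of groups flagged as mutually exclusive
    (both are hypothetical inputs for this sketch).
    """
    clashing = [group for group in requested if group in exclusive]
    if len(clashing) > 1:
        raise ValueError(
            "Groups are mutually exclusive and cannot be installed together: "
            + ", ".join(clashing)
        )
    return list(requested)
```

For example, requesting ["dev", "cpu"] with exclusive groups {"cpu", "gpu"} passes, while requesting ["cpu", "gpu"] raises.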

@radoering
Member

The reason why this is so difficult to implement is not the installer but the solver/locker. (Locking always comes before installing.) Currently, the solver searches for one (environment-independent) solution that satisfies all dependencies, and that solution is written into the lockfile. To install the dependencies into a specific environment, the lockfile is read and the solver runs again, considering only the locked dependencies, so that it can derive an environment-specific solution from the environment-independent one. With mutually exclusive groups, you have to do multiple solver runs to find the environment-independent solution(s) and think about how to store them in the lockfile. This might even require extending the lockfile format.
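
The two-phase flow described above can be pictured with a toy model. This is not Poetry's solver or lockfile format: the tuple layout, the function names, and the package names are all assumptions made for illustration, and markers are reduced to a single optional platform string.

```python
def lock(dependencies):
    """Phase 1: record every dependency, with its marker, for all environments.

    Each dependency is a (name, version, platform) tuple; a platform of
    None means the entry applies everywhere.
    """
    return sorted(dependencies, key=lambda dep: (dep[0], dep[1]))


def resolve_for(locked, environment):
    """Phase 2: keep only the locked entries that apply to this environment."""
    current = environment["sys_platform"]
    return [
        (name, version)
        for name, version, platform in locked
        if platform is None or platform == current
    ]


# Illustrative data only; names and versions are made up.
locked = lock([("linux-only-lib", "1.0", "linux"), ("common-lib", "1.0", None)])
linux_install = resolve_for(locked, {"sys_platform": "linux"})
macos_install = resolve_for(locked, {"sys_platform": "darwin"})
```

In this picture there is exactly one locked list, which is why groups that can never be solved together do not fit without either several lists or a richer format.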

> My proposal would be to add a mutually_exclusive option to groups

Not sure about the name. When reading it, I do not think about a Boolean flag but about a list of groups. Maybe, isolated or solitary fits better? Anyway, naming is a detail that can still be clarified later.

> Before beginning on a PR, I'd like to gauge how open Poetry maintainers would be to accepting such a feature i.e. that outlined in Request (simple) in the issue description.

I assume we are open to a good solution, but that's difficult. If you want to avoid unnecessary work, you should probably specify more exactly what the solution will look like (especially considering the solver and the lockfile). On the other hand, you probably have to "do some work" to develop an understanding of Poetry's internals, and even if we think that the concept is good, some issues might only be discovered after implementing it. To sum up, it's more complicated than you might think, and the risk that a solution will not be accepted cannot be avoided with such a complex issue.

@uvashisth

> I'm keen to contribute to this and raise a PR: it's a missing feature which I require from poetry. At the moment, the only workaround for me is to spin up a different VM and use environment markers to optionally install different mutually exclusive dependencies: this is less than ideal!
>
> My proposal would be to add a mutually_exclusive option to groups, like optional: https://python-poetry.org/docs/managing-dependencies/#optional-groups. It's different from optional because we would either:
>
>   • require a check that the dependencies within a mutually_exclusive group are not contained outside, or
>   • set default behaviour that, if a mutually_exclusive group is specified in the poetry install ... command, that version is used
>
> I'd propose that poetry install just returns an error if more than one mutually_exclusive group is specified for installation.
>
> Before beginning on a PR, I'd like to gauge how open Poetry maintainers would be to accepting such a feature i.e. that outlined in Request (simple) in the issue description. I'm not keen to work on it if it is outside the maintainers' scope for poetry.

Could you please show a quick example of how you use environment markers? For instance, how do you separate two libraries into different groups using, say, dev/uat groups?

@hugolytics

I also need this solution (or at least it would solve my problem). I've encountered a similar issue, described in #9522, where platform-specific dependencies and environment markers are not respected during resolution. In my case, vllm is required on Linux and incompatible with macOS, yet Poetry still attempts to resolve it for macOS, causing conflicts. Both issues highlight the need for better handling of mutually exclusive and platform-specific dependencies, to avoid unnecessary conflicts and ensure smoother dependency management across different environments.

@reesehyde
Contributor

I'm not sure how many cases it covers but I had this issue and was able to solve it by specifying excluded extras manually, e.g. per this comment on #6409:

torch = { version = "^2.2.1", source = "torch_cpu", markers = "extra=='cpu' and extra!='gpu'" }

Specifying both extra=='cpu' and extra!='gpu' allows locking/solving to succeed, whereas conflicting dependencies that specify only extra=='cpu' and extra=='gpu' would fail.
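
To see why the extra exclusion makes the two entries disjoint, here is a toy evaluator for markers of this shape. It is a sketch, not Poetry's marker implementation: it assumes that extra=='x' means "x is among the requested extras" and extra!='x' means "x is not among them", and it only understands such clauses joined by `and`. The function name is hypothetical.

```python
import re

# Matches a single clause such as extra=='cpu' or extra != 'gpu'.
_CLAUSE = re.compile(r"^extra\s*(==|!=)\s*'([^']+)'$")


def marker_matches(marker, requested_extras):
    """Evaluate `extra == / !=` clauses joined by `and` against a set of extras."""
    requested = set(requested_extras)
    for clause in marker.split(" and "):
        match = _CLAUSE.match(clause.strip())
        if match is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        operator, name = match.groups()
        present = name in requested
        # `==` needs the extra to be requested, `!=` needs it to be absent.
        if (operator == "==") != present:
            return False
    return True


cpu_marker = "extra=='cpu' and extra!='gpu'"
gpu_marker = "extra=='gpu' and extra!='cpu'"
```

Under these assumptions, no set of requested extras satisfies both cpu_marker and gpu_marker, whereas the plain extra=='cpu' and extra=='gpu' markers would both match when both extras are requested.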

@AlexanderNenninger

@reesehyde This kinda works, but it still downloads the CUDA library, even when you run poetry install -E cpu. You can switch between versions though.

Poetry (version 1.8.3)
[tool.poetry.dependencies]
python = "^3.11"
torch = [
    { version = "^2.5.0", source = "torch_cuda124", markers = "sys_platform == 'linux' and extra == 'gpu' and extra != 'cpu'" },
    { version = "^2.5.0", source = "torchcpu", markers = "sys_platform == 'linux' and extra != 'gpu' and extra == 'cpu'" },
]

[tool.poetry.extras]
cpu = ["torch"]
gpu = ["torch"]

[[tool.poetry.source]]
name = "torchcpu"
url = "https://download.pytorch.org/whl/cpu"
priority = "explicit"

[[tool.poetry.source]]
name = "torch_cuda124"
url = "https://download.pytorch.org/whl/cu124"
priority = "explicit"

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 31, 2024