uv torch installation breaking in cpu-gpu multiplatform scenario. #9310

Closed
EgonFerri opened this issue Nov 21, 2024 · 2 comments

@EgonFerri

First of all, thanks for the tool; uv simplifies package management immensely.

I was following the new documentation on using uv with PyTorch, and I think I may have found an edge case that breaks package resolution. Some recent issues (#9306 and #9307) are possibly related, but I am reporting my specific case since it is a "cleaner" situation and may give you a new data point.
I'm using the latest version (0.5.4).

I have this scenario:
I need a project that installs torch and torchvision ("torch==2.5.1", "torchvision==0.20.1") and covers three cases: Darwin (always without a GPU), Linux without a GPU, and Linux with a GPU.

As I understand the guide, uv sync --extra cpu and uv sync --extra gpu should each install the appropriate torch build with this pyproject.toml:

[project]
name = "test"
version = "0.1.0"
description = "Test pytorch dependencies"
readme = "README.md"
requires-python = ">=3.11"
dependencies = []

[project.optional-dependencies]
cpu = [
  "torch==2.5.1",
  "torchvision==0.20.1",
]
gpu = [
  "torch==2.5.1",
  "torchvision==0.20.1",
]

[tool]
[tool.uv]
conflicts = [
  [
    { extra = "cpu" },
    { extra = "gpu" },
  ],
]

[tool.uv.sources]
[[tool.uv.sources.torch]]
index = "pytorch-cpu"
marker = "platform_system != 'Darwin'"
extra = "cpu"

[[tool.uv.sources.torchvision]]
index = "pytorch-cpu"
marker = "platform_system != 'Darwin'"
extra = "cpu"

[[tool.uv.index]]
name = "pytorch-cpu"
url = "https://download.pytorch.org/whl/cpu"
explicit = true

This works locally on Darwin (with both --extra cpu and --extra gpu), but when I try to install on Linux with --extra cpu I get this error:

$ uv sync --extra cpu
Using CPython 3.11.9 interpreter at: /usr/local/bin/python3
Creating virtual environment at: .venv
Resolved 29 packages in 2ms
error: Distribution `torch==2.5.1 @ registry+https://download.pytorch.org/whl/cpu` can't be installed because it doesn't have a source distribution or wheel for the current platform

Note that --extra gpu works correctly. It has no explicit index, so it falls back to PyPI, and the GPU build of torch is installed together with the NVIDIA packages:

$ uv sync --extra gpu
Using CPython 3.11.9 interpreter at: /usr/local/bin/python3
Creating virtual environment at: .venv
Resolved 29 packages in 2ms
Prepared 25 packages in 58.28s
warning: Failed to hardlink files; falling back to full copy. This may lead to degraded performance.
         If the cache and target directories are on different filesystems, hardlinking may not be supported.
         If this is intentional, set `export UV_LINK_MODE=copy` or use `--link-mode=copy` to suppress this warning.
Installed 25 packages in 15.22s
 + filelock==3.16.1
 + fsspec==2024.10.0
 + jinja2==3.1.4
 + markupsafe==3.0.2
 + mpmath==1.3.0
 + networkx==3.4.2
 + numpy==2.1.3
 + nvidia-cublas-cu12==12.4.5.8
 + nvidia-cuda-cupti-cu12==12.4.127
 + nvidia-cuda-nvrtc-cu12==12.4.127
 + nvidia-cuda-runtime-cu12==12.4.127
 + nvidia-cudnn-cu12==9.1.0.70
 + nvidia-cufft-cu12==11.2.1.3
 + nvidia-curand-cu12==10.3.5.147
 + nvidia-cusolver-cu12==11.6.1.9
 + nvidia-cusparse-cu12==12.3.1.170
 + nvidia-nccl-cu12==2.21.5
 + nvidia-nvjitlink-cu12==12.4.127
 + nvidia-nvtx-cu12==12.4.127
 + pillow==11.0.0
 + sympy==1.13.1
 + torch==2.5.1
 + torchvision==0.20.1
 + triton==3.1.0
 + typing-extensions==4.12.2

The extra index itself is not the problem: if I settle for installing only the CPU build on Linux (as on Darwin), it works correctly with this pyproject.toml:

[project]
name = "test"
version = "0.1.0"
description = "Test pytorch dependencies"
readme = "README.md"
requires-python = ">=3.11"
dependencies = ["torch==2.5.1", "torchvision==0.20.1"]

[tool]
[tool.uv]
[tool.uv.sources]
[[tool.uv.sources.torch]]
index = "pytorch-cpu"
marker = "platform_system != 'Darwin'"

[[tool.uv.sources.torchvision]]
index = "pytorch-cpu"
marker = "platform_system != 'Darwin'"

[[tool.uv.index]]
name = "pytorch-cpu"
url = "https://download.pytorch.org/whl/cpu"
explicit = true

Note, however, that this pyproject.toml only works after I delete uv.lock and regenerate it with uv sync; otherwise I get the same error.
I think a diff of the two lock files shows the problem: the freshly generated lock has the correct x.y.z+cpu versions for torch and torchvision, while the old one does not.

image

Thanks in advance for the help (:

@eginhard

This is a duplicate of #9295 and should be fixed in the next release.

@EgonFerri
Author

Sorry, I missed that.
Thanks again for all the good work.
