Trying to install Torch for CUDA on AMD hardware #2228

Closed
morgan008 opened this issue Apr 8, 2024 · 7 comments
Labels
enhancement New feature or request

Comments

@morgan008

./gui.sh
Warning: LD_LIBRARY_PATH environment variable is not set.
Certain functionalities may not work correctly.
Please ensure that the required libraries are properly configured.

If you use WSL2 you may want to: export LD_LIBRARY_PATH=/usr/lib/wsl/lib/

INFO     Kohya_ss GUI version: v23.1.2
INFO     Submodule initialized and updated.
INFO     AMD toolkit detected
INFO     Torch 2.1.2+rocm5.6
INFO     Torch backend: AMD ROCm HIP 5.6.31061-8c743ae5d
INFO     Torch detected GPU: AMD Radeon RX 7800 XT VRAM 16368 Arch (11, 0) Cores 30
INFO     Python version is 3.11.8 (main, Feb 12 2024, 14:50:05) [GCC 13.2.1 20230801]
INFO     Verifying modules installation status from .../kohya_ss/requirements_linux.txt...
WARNING  Package wrong version: torch 2.1.2+rocm5.6 required 2.1.2+cu118
INFO     Installing package: torch==2.1.2+cu118 torchvision==0.16.2+cu118 xformers==0.0.23.post1+cu118 --extra-index-url https://download.pytorch.org/whl/cu118

P.S. In general, should gui.sh interfere with the installation at all? That is what setup.sh is for, after all. gui.sh should work on the principle of "take what's available", shouldn't it?
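
To make the warning in the log concrete: the installed and required Torch versions share the same release number and differ only in the build label after the `+`. The following is a minimal Python sketch of that comparison. The helper names are hypothetical and are not part of kohya_ss, and this is not a reproduction of how the project performs its check.

```python
def build_label(version: str) -> str:
    """Return the local build label of a version string (the part after '+')."""
    # PEP 440 separates the public version from the local label with '+'.
    _, _, local = version.partition("+")
    return local


def same_release(installed: str, required: str) -> bool:
    """True when both strings share the public version, ignoring the build label."""
    return installed.partition("+")[0] == required.partition("+")[0]


installed = "2.1.2+rocm5.6"  # version reported in the log
required = "2.1.2+cu118"     # version demanded by requirements_linux.txt

print(build_label(installed))             # rocm5.6
print(build_label(required))              # cu118
print(same_release(installed, required))  # True
```

An exact string match on the full version therefore rejects a ROCm build even though the Torch release is the one requested.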

@bmaltais bmaltais added the enhancement New feature or request label Apr 8, 2024
@bmaltais
Owner

bmaltais commented Apr 8, 2024

The project does not support AMD GPUs... If someone comes up with the proper requirements file to support them, I can integrate support into the GUI based on it.

@morgan008
Author

@bmaltais Then why detect which toolkit is used? Never mind.

All we need is:
torch==2.1.2+rocm5.6 torchvision==0.16.2+rocm5.6 --extra-index-url https://download.pytorch.org/whl/rocm5.6

Or just remove these checks from gui.sh, because they duplicate setup.sh functionality and break a properly configured manual environment. The same goes for the Python version check.
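
The ROCm pins proposed above and the CUDA pins from the log differ only in build label, index URL, and the xformers entry, so one way to express them is a per-backend table. This is a hedged sketch only: the pins and URLs are copied from this thread, while `BACKEND_PINS` and `pip_install_args` are hypothetical names, not anything kohya_ss provides.

```python
import sys

# Pins copied from this thread; the mapping itself is an illustration.
BACKEND_PINS = {
    "cu118": [
        "torch==2.1.2+cu118",
        "torchvision==0.16.2+cu118",
        "xformers==0.0.23.post1+cu118",
    ],
    "rocm5.6": [
        "torch==2.1.2+rocm5.6",
        "torchvision==0.16.2+rocm5.6",
    ],
}


def pip_install_args(backend: str) -> list[str]:
    """Build the argument list for a pip install of the given backend's pins."""
    pins = BACKEND_PINS[backend]  # raises KeyError for an unknown backend
    index_url = f"https://download.pytorch.org/whl/{backend}"
    return [sys.executable, "-m", "pip", "install", *pins,
            "--extra-index-url", index_url]


# Print the command without running it.
print(" ".join(pip_install_args("rocm5.6")))
```

The list can be passed to `subprocess.run` as is; printing it first makes it easy to confirm which wheels would be requested.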

@bmaltais
Owner

bmaltais commented Apr 8, 2024

OK... I see there is a requirements_linux_rocm.txt file that specifies some requirements... but I guess it does not work for your system? Would a new requirements file help with the installation for AMD cards that are not properly covered by the existing option?

I did not create those files, as I don't own an AMD card; other contributors submitted PRs to add them to the GUI...

If you can come up with the files required to make the install work for you, you might want to create a PR so that others can benefit from the solution.

@Disty0
Contributor

Disty0 commented Apr 9, 2024

Use ./setup.sh --use-rocm then ./gui.sh --use-rocm
#2167

Same case with Intel ARC / IPEX.
IPEX uses ./setup.sh --use-ipex then ./gui.sh --use-ipex

./setup.sh --help
Kohya_SS Installation Script for POSIX operating systems.

Usage:
  # Specifies custom branch, install directory, and git repo
  setup.sh -b dev -d /workspace/kohya_ss -g https://mycustom.repo.tld/custom_fork.git

  # Same as example 1, but uses long options
  setup.sh --branch=dev --dir=/workspace/kohya_ss --git-repo=https://mycustom.repo.tld/custom_fork.git

  # Maximum verbosity, fully automated installation in a runpod environment skipping the runpod env checks
  setup.sh -vvv --skip-space-check --runpod

Options:
  -b BRANCH, --branch=BRANCH    Select which branch of kohya to check out on new installs.
  -d DIR, --dir=DIR             The full path you want kohya_ss installed to.
  -g REPO, --git_repo=REPO      You can optionally provide a git repo to check out for runpod installation. Useful for custom forks.
  -h, --help                    Show this screen.
  -i, --interactive             Interactively configure accelerate instead of using default config file.
  -n, --no-git-update           Do not update kohya_ss repo. No git pull or clone operations.
  -p, --public                  Expose public URL in runpod mode. Won't have an effect in other modes.
  -r, --runpod                  Forces a runpod installation. Useful if detection fails for any reason.
  -s, --skip-space-check        Skip the 10Gb minimum storage space check.
  -u, --no-gui                  Skips launching the GUI.
  -v, --verbose                 Increase verbosity levels up to 3.
      --use-ipex                Use IPEX with Intel ARC GPUs.
      --use-rocm                Use ROCm with AMD GPUs.

> Or just remove these checks from gui.sh, because they duplicate setup.sh functionality and break a properly configured manual environment. The same goes for the Python version check.

Don't use gui.sh if you don't want environment checks. Just run kohya_gui.py directly.
gui.sh's entire purpose is setting up the environment and updating the environment.

> @bmaltais Then why detect which toolkit is used? Never mind.

The detected toolkits are not used for anything other than logging right now.
Setup happens in setup.sh and gui.sh with the --use-rocm or --use-ipex argument.

@morgan008
Author

@Disty0 First of all, it should have been added to the instructions.
Secondly, it is inconvenient. Why not implement automatic detection?
Thirdly, why did you specify Torch nightly builds and not the stable ones?

> gui.sh's entire purpose is setting up the environment and updating the environment.

Then why does setup.sh exist? The instructions clearly say that setup.sh is for setting up the environment, gui.sh is for running.

@Disty0
Contributor

Disty0 commented Apr 9, 2024

> @Disty0 First of all, it should have been added to the instructions.

I don't like writing documentation in general; community PRs are welcome.

> Secondly, it is inconvenient. Why not implement automatic detection?

This codebase implements the requirements check in setup.sh and gui.sh instead of in Python files and, honestly, I didn't want to rewrite it to support autodetection.
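
For readers wondering what autodetection might look like if it lived in Python, here is a rough sketch under stated assumptions. It guesses a flag from vendor command-line tools found on PATH (`rocminfo` from ROCm, `sycl-ls` from Intel oneAPI). The function name and the heuristic are hypothetical and are not part of kohya_ss; a tool being installed does not prove a matching GPU is present.

```python
import shutil
from typing import Callable, Optional


def detect_backend_flag(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Guess a setup flag from vendor tools on PATH (heuristic, assumption only)."""
    if which("rocminfo"):   # ROCm's device listing tool
        return "--use-rocm"
    if which("sycl-ls"):    # oneAPI's device listing tool
        return "--use-ipex"
    return ""               # no hint found: keep the default requirements


flag = detect_backend_flag()
print(flag or "(no flag, default requirements)")
```

The lookup function is a parameter so the heuristic can be exercised without the tools installed. An explicit `--use-rocm` or `--use-ipex` on the command line should still take precedence over any guess like this.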

> Thirdly, why did you specify Torch nightly builds and not the stable ones?

MES hangs and memory exceptions.
The stable builds are not stable.

> Then why does setup.sh exist? The instructions clearly say that setup.sh is for setting up the environment, gui.sh is for running.

You can't run anything without activating the environment first.

And let me add that using a manual environment is a rare use case.
People won't run any updates for years and will then complain when things get outdated and break.

@hqnicolas

@morgan008
https://github.com/hqnicolas/bmaltaisKohya_ssROCm/tree/main
