Trying to install Torch for CUDA on AMD hardware #2228

morgan008 · 2024-04-08T13:28:19Z

./gui.sh
Warning: LD_LIBRARY_PATH environment variable is not set.
Certain functionalities may not work correctly.
Please ensure that the required libraries are properly configured.

If you use WSL2 you may want to: export LD_LIBRARY_PATH=/usr/lib/wsl/lib/

INFO     Kohya_ss GUI version: v23.1.2
INFO     Submodule initialized and updated.
INFO     AMD toolkit detected
INFO     Torch 2.1.2+rocm5.6
INFO     Torch backend: AMD ROCm HIP 5.6.31061-8c743ae5d
INFO     Torch detected GPU: AMD Radeon RX 7800 XT VRAM 16368 Arch (11, 0) Cores 30
INFO     Python version is 3.11.8 (main, Feb 12 2024, 14:50:05) [GCC 13.2.1 20230801]
INFO     Verifying modules installation status from .../kohya_ss/requirements_linux.txt...
WARNING  Package wrong version: torch 2.1.2+rocm5.6 required 2.1.2+cu118
INFO     Installing package: torch==2.1.2+cu118 torchvision==0.16.2+cu118 xformers==0.0.23.post1+cu118 --extra-index-url https://download.pytorch.org/whl/cu118

P.S. In general, should gui.sh intervene in the installation at all? That is what setup.sh is for. gui.sh should work on the principle of "take what's available", shouldn't it?
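
For illustration: the mismatch in the log above comes down to the local version tag after the "+" (rocm5.6 versus cu118). Below is a minimal sketch of classifying a build by that tag. The helper names torch_backend and same_backend are hypothetical and are not taken from gui.sh or setup.sh, and the tag patterns are an assumption based only on the version strings shown in this thread.

```shell
#!/usr/bin/env bash
# Classify a torch version string by its local version tag.
# Hypothetical helper; the patterns mirror the tags seen in the log above.
torch_backend() {
  case "$1" in
    *+rocm*) echo "rocm" ;;
    *+cu*)   echo "cuda" ;;
    *+cpu)   echo "cpu" ;;
    *)       echo "unknown" ;;
  esac
}

# Succeeds (exit 0) when both version strings target the same backend.
same_backend() {
  [ "$(torch_backend "$1")" = "$(torch_backend "$2")" ]
}
```

A check built on this idea could compare the backends before reinstalling anything, for example: same_backend "2.1.2+rocm5.6" "2.1.2+cu118" || echo "backend mismatch, not reinstalling".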


bmaltais · 2024-04-08T21:40:59Z

The project does not have support for AMD GPUs... If someone comes up with a proper requirements file to support them, I can integrate support into the GUI based on it.

morgan008 · 2024-04-08T23:35:04Z

@bmaltais Then why detect which toolkit is in use? Never mind.

All we need is just
torch==2.1.2+rocm5.6 torchvision==0.16.2+rocm5.6 --extra-index-url https://download.pytorch.org/whl/rocm5.6

Or just remove these checks from gui.sh: they duplicate what setup.sh does and break an environment that was properly configured by hand. The same goes for the Python version check.
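
The requirement line proposed above can be parameterized so the torch, torchvision, and ROCm versions are stated once. This is only a sketch: rocm_pip_args is a hypothetical helper, it installs nothing, and it assumes the wheel index URL follows the rocm<version> pattern seen in this thread.

```shell
#!/usr/bin/env bash
# Print pip requirement arguments for a ROCm wheel set.
# Hypothetical helper: arguments are torch version, torchvision version, ROCm version.
rocm_pip_args() {
  local torch_ver="$1"
  local vision_ver="$2"
  local rocm_ver="$3"
  printf 'torch==%s+rocm%s torchvision==%s+rocm%s --extra-index-url https://download.pytorch.org/whl/rocm%s\n' \
    "$torch_ver" "$rocm_ver" "$vision_ver" "$rocm_ver" "$rocm_ver"
}
```

With the versions from this thread, rocm_pip_args 2.1.2 0.16.2 5.6 prints the same line as quoted above, which could then be passed to pip install inside the activated environment.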

bmaltais · 2024-04-08T23:49:43Z

OK... I see there is a requirements_linux_rocm.txt file that specifies some requirements... but I guess it does not work for your system? Would a new requirements file help with the installation for AMD cards that are not properly covered by the existing option?

I did not create those files, as I don't own an AMD card; other contributors submitted PRs to add them to the GUI...

If you can come up with the files required to make the install work for you, you might want to create a PR with the solution so others can benefit from it.

Disty0 · 2024-04-09T12:09:06Z

Use ./setup.sh --use-rocm then ./gui.sh --use-rocm
#2167

Same case with Intel ARC / IPEX.
IPEX uses ./setup.sh --use-ipex then ./gui.sh --use-ipex

./setup.sh --help

Kohya_SS Installation Script for POSIX operating systems.

Usage:
  # Specifies custom branch, install directory, and git repo
  setup.sh -b dev -d /workspace/kohya_ss -g https://mycustom.repo.tld/custom_fork.git

  # Same as example 1, but uses long options
  setup.sh --branch=dev --dir=/workspace/kohya_ss --git-repo=https://mycustom.repo.tld/custom_fork.git

  # Maximum verbosity, fully automated installation in a runpod environment skipping the runpod env checks
  setup.sh -vvv --skip-space-check --runpod

Options:
  -b BRANCH, --branch=BRANCH    Select which branch of kohya to check out on new installs.
  -d DIR, --dir=DIR             The full path you want kohya_ss installed to.
  -g REPO, --git_repo=REPO      You can optionally provide a git repo to check out for runpod installation. Useful for custom forks.
  -h, --help                    Show this screen.
  -i, --interactive             Interactively configure accelerate instead of using default config file.
  -n, --no-git-update           Do not update kohya_ss repo. No git pull or clone operations.
  -p, --public                  Expose public URL in runpod mode. Won't have an effect in other modes.
  -r, --runpod                  Forces a runpod installation. Useful if detection fails for any reason.
  -s, --skip-space-check        Skip the 10Gb minimum storage space check.
  -u, --no-gui                  Skips launching the GUI.
  -v, --verbose                 Increase verbosity levels up to 3.
      --use-ipex                Use IPEX with Intel ARC GPUs.
      --use-rocm                Use ROCm with AMD GPUs.

> Or just remove these checks from gui.sh: they duplicate what setup.sh does and break an environment that was properly configured by hand. The same goes for the Python version check.

Don't use gui.sh if you don't want environment checks. Just run kohya_gui.py directly.
gui.sh's entire purpose is setting up the environment and updating the environment.
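
A minimal sketch of running kohya_gui.py directly, skipping the checks in gui.sh. It assumes the virtual environment lives in a venv directory inside the repository, which this thread does not confirm; run_gui_direct is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Activate an existing virtual environment and start the GUI without gui.sh.
# Assumption: the environment is at <repo>/venv unless a second argument says otherwise.
run_gui_direct() {
  local repo_dir="${1:-.}"
  local venv_dir="${2:-$repo_dir/venv}"

  # Refuse to continue if there is no environment to activate.
  if [ ! -f "$venv_dir/bin/activate" ]; then
    echo "no virtual environment found at $venv_dir" >&2
    return 1
  fi

  # Activate the environment, then launch the GUI with its Python.
  . "$venv_dir/bin/activate"
  python "$repo_dir/kohya_gui.py"
}
```

Called as run_gui_direct /path/to/kohya_ss, this uses whatever packages are already installed and performs no version checks or updates.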

> @bmaltais Then why detect which toolkit is in use? Never mind.

The detection is not used for anything other than logging right now.
Setup happens in setup.sh and gui.sh with the --use-rocm or --use-ipex argument.

morgan008 · 2024-04-09T12:45:16Z

@Disty0 First of all, it should have been added to the instructions.
Secondly, it is inconvenient. Why not implement automatic detection?
Thirdly, why did you specify Torch nightly builds and not the stable ones?

> gui.sh's entire purpose is setting up the environment and updating the environment.

Then why does setup.sh exist? The instructions clearly say that setup.sh is for setting up the environment, gui.sh is for running.

Disty0 · 2024-04-09T12:51:24Z

> @Disty0 First of all, it should have been added to the instructions.

I don't like writing documentation in general; community PRs are welcome.

> Secondly, it is inconvenient. Why not implement automatic detection?

This codebase implements the requirements check in setup.sh and gui.sh instead of in Python files and, honestly, I didn't want to rewrite it to support autodetection.
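
For readers wondering what shell-level detection might look like, here is a rough sketch. It is not the implementation from any PR in this repository. It assumes that a rocminfo binary on PATH or a ROCm install directory (default /opt/rocm) indicates an AMD setup, and that nvidia-smi on PATH indicates an NVIDIA one; detect_backend is a hypothetical name.

```shell
#!/usr/bin/env bash
# Guess the GPU backend from tools and directories present on the system.
# Optional first argument overrides the ROCm install directory to look for.
detect_backend() {
  local rocm_root="${1:-/opt/rocm}"
  if command -v rocminfo >/dev/null 2>&1 || [ -d "$rocm_root" ]; then
    echo "rocm"
  elif command -v nvidia-smi >/dev/null 2>&1; then
    echo "cuda"
  else
    echo "unknown"
  fi
}
```

A launcher could then pick the flag itself, for example: [ "$(detect_backend)" = "rocm" ] && ./gui.sh --use-rocm. Checking ROCm first is a design choice of this sketch, and machines with both toolkits installed would need a manual override.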

> Thirdly, why did you specify Torch nightly builds and not the stable ones?

MES hangs and memory exceptions.
The stable builds are not actually stable.

> Then why does setup.sh exist? The instructions clearly say that setup.sh is for setting up the environment, gui.sh is for running.

You can't run anything without activating the environment first.

And let me add: using a manual environment is a rare use case.
People won't run any updates for years and will complain when things get outdated and break.

hqnicolas · 2024-11-02T16:38:44Z

@morgan008
https://github.com/hqnicolas/bmaltaisKohya_ssROCm/tree/main

bmaltais added the "enhancement" (New feature or request) label Apr 8, 2024

Disty0 mentioned this issue Apr 9, 2024 in "Autodetect for ROCm #2238" (merged)

morgan008 closed this as completed Apr 14, 2024
