Taichi GPU Functions Automatic Degradation onto CPUs on Supercomputers #7472

Closed
Ruoyu66666 opened this issue Mar 1, 2023 · 3 comments

Ruoyu66666 commented Mar 1, 2023

Describe the bug
I wrote a small program to compare Taichi and NumPy. The Taichi kernel is supposed to run on NVIDIA GPUs on TACC, the supercomputer maintained by the University of Texas System. However, the Taichi kernel silently falls back to the CPU. When I check the GPU status while the program runs, no job is running on the GPU. I confirmed my suspicion by running once with arch=ti.cuda and once with arch=ti.cpu: the run times are equal. Can someone with experience using GPUs on supercomputers help? Thank you in advance.
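
Equal run times are indirect evidence, so a more direct check is to ask Taichi which backend it selected after initialization. The following is a minimal sketch, not the code attached to this issue. It assumes that `ti.lang.impl.current_cfg().arch` reports the active backend, which should be verified against the installed Taichi version; the function name `cuda_backend_active` is hypothetical.

```python
# Sketch: confirm which backend Taichi selected after ti.init().
# Assumption: ti.lang.impl.current_cfg() exposes the active arch; check this
# against the Taichi version you have installed.


def cuda_backend_active():
    """Return True/False for whether CUDA is the active arch.

    Returns None if Taichi is not installed in this environment.
    """
    try:
        import taichi as ti
    except ImportError:
        return None
    # If CUDA cannot be initialized, Taichi logs a warning and falls back
    # to a CPU backend instead of raising an error.
    ti.init(arch=ti.cuda)
    return bool(ti.lang.impl.current_cfg().arch == ti.cuda)


if __name__ == "__main__":
    print("CUDA backend active:", cuda_backend_active())
```

A result of `False` on a GPU node would point to the same fallback described above, without relying on timing.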

Update: after I showed a TACC staff member my code and the error file, he said he thinks the Taichi developers may be able to help. He said the problem should not be the CUDA installation on the TACC side.

The simple code is shown here:

taichi_computation_trial.txt

I attached the errors here:

slurm-733695.txt

github-project-automation bot moved this to Untriaged in Taichi Lang Mar 1, 2023
erizmr moved this from Untriaged to Todo in Taichi Lang Mar 3, 2023
Contributor

ailzhang commented Mar 9, 2023

@Ruoyu66666 It looks like Taichi failed to find libcuda.so in the system path. Do you need to load additional modules for CUDA as well?

Currently Loaded Modules:
  1) intel/19.1.1   3) python3/3.9.7   5) pmix/3.2.3     7) TACC
  2) impi/19.0.9    4) cmake/3.24.2    6) xalt/2.10.32
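
One way to test this diagnosis independently of Taichi is to ask the dynamic loader for the CUDA driver library directly. The sketch below uses only the Python standard library. It approximates, rather than reproduces, how Taichi locates the driver, and the function name `find_cuda_driver` and the candidate file names are assumptions made for illustration.

```python
import ctypes


def find_cuda_driver(names=("libcuda.so.1", "libcuda.so")):
    """Return the first candidate name the dynamic loader can open, else None."""
    for name in names:
        try:
            # CDLL raises OSError when the loader cannot find or open the file.
            ctypes.CDLL(name)
            return name
        except OSError:
            continue
    return None


if __name__ == "__main__":
    found = find_cuda_driver()
    if found is None:
        print("CUDA driver library not found on the loader search path")
    else:
        print("Loader can open:", found)
```

Running this inside the same batch job and module environment as the Taichi program shows whether the driver library is visible there.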

@Ruoyu66666
Author

@ailzhang Many thanks for your reply! I am forwarding it to the TACC staff. ---Ruoyu

@Ruoyu66666
Author

@ailzhang Thanks again for taking a look! The issue was resolved by setting a different library path so that Taichi can find CUDA.
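
The thread does not say which path or variable was changed. As a hedged illustration only, the sketch below assumes the fix was made through `LD_LIBRARY_PATH` and builds an environment for a child process with an extra directory at the front of that variable. The helper name `env_with_library_dir` and the directory `/opt/cuda/lib64` are hypothetical. A child process is used because the loader reads `LD_LIBRARY_PATH` at process start, so changing it inside an already running Python process generally has no effect.

```python
import os


def env_with_library_dir(lib_dir, base_env=None):
    """Return a copy of the environment with lib_dir first in LD_LIBRARY_PATH."""
    env = dict(os.environ if base_env is None else base_env)
    current = env.get("LD_LIBRARY_PATH", "")
    # Drop empty entries and any existing copy of lib_dir to avoid duplicates.
    rest = [p for p in current.split(":") if p and p != lib_dir]
    env["LD_LIBRARY_PATH"] = ":".join([lib_dir] + rest)
    return env


if __name__ == "__main__":
    # Hypothetical directory; replace it with wherever libcuda.so actually lives.
    env = env_with_library_dir("/opt/cuda/lib64")
    print(env["LD_LIBRARY_PATH"])
    # The Taichi script could then be launched with this environment, e.g.
    # subprocess.run([sys.executable, "my_script.py"], env=env)
```

On a cluster, the same effect is often achieved by exporting the variable in the job script before the Python command runs.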

github-project-automation bot moved this from Todo to Done in Taichi Lang Mar 16, 2023