Segmentation fault when using GPU in supercomputer #6727

Open
mushroomfire opened this issue Nov 24, 2022 · 2 comments
Labels: question (Question on using Taichi)

Comments

@mushroomfire
Contributor

Environment:
Paratera supercomputer

Submit script:

#!/bin/bash
#SBATCH -N 1
#SBATCH -n 5
#SBATCH -p gpu 
#SBATCH --gres=gpu:1
#SBATCH --no-requeue

nvidia-smi

python test.py

Here is the test.py:

import taichi as ti
ti.init(ti.cuda)

The output file is as follows:

Thu Nov 24 21:48:39 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.73.08    Driver Version: 510.73.08    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  On   | 00000000:8A:00.0 Off |                    0 |
| N/A   23C    P0    42W / 300W |      0MiB / 32768MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
/tmp/slurmd/job604193/slurm_script: line 10: 143192 Segmentation fault      python test.py

I don't know how to solve this segmentation fault error. If you need more detailed information, please let me know. Thanks a lot.
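As a minimal diagnostic sketch (an assumption on my part, targeting Taichi 1.2.x with CUDA; faulthandler is from the Python standard library), enabling faulthandler and Taichi's debug mode can help show whether the crash happens inside ti.init or later:

import faulthandler
faulthandler.enable()  # dump a Python traceback if the process hits SIGSEGV

import taichi as ti
ti.init(arch=ti.cuda, debug=True)  # debug=True enables extra runtime checks

x = ti.field(dtype=ti.f32, shape=8)

@ti.kernel
def fill():
    for i in x:  # struct-for over the field, launched on the GPU
        x[i] = i

fill()
print(x.to_numpy())  # reaching this line means CUDA init and a trivial kernel both work

If the output file still shows only the segfault line with no Taichi banner, the crash most likely happens during CUDA/driver initialization rather than in user code.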

@mushroomfire added the question (Question on using Taichi) label Nov 24, 2022
@taichi-gardener moved this to Untriaged in Taichi Lang Nov 24, 2022
@ailzhang
Contributor

Hey @mushroomfire, out of curiosity, does this repro if you run it directly on a V100 without Slurm? Thanks!

@mushroomfire
Contributor (Author)

mushroomfire commented Nov 25, 2022

Hey @ailzhang, here are the results if I run the script directly in the shell:
python test.py

[Taichi] version 1.2.2, llvm 10.0.0, commit 608e4b57, linux, python 3.8.0
[Taichi] Starting on arch=cuda
Segmentation fault
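One way to get more detail here (a sketch, assuming Taichi's logging helpers ti.set_logging_level and ti.TRACE are available in this version) is to raise the log level before init, so the trace shows the last CUDA setup step reached before the crash:

import taichi as ti

# Diagnostic sketch: verbose logging should print each backend setup step,
# so the last message before "Segmentation fault" points at the failing stage.
ti.set_logging_level(ti.TRACE)
ti.init(arch=ti.cuda)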

@ailzhang moved this from Untriaged to Todo in Taichi Lang Nov 25, 2022
@turbo0628 moved this from Todo to Backlog in Taichi Lang Dec 9, 2022