CUBLAS_STATUS_EXECUTION_FAILED when calling cublasSgemm #29

Open
Lopa07 opened this issue Aug 16, 2023 · 4 comments

Lopa07 commented Aug 16, 2023

I get the error RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling 'cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)' when running the following training command:

python -m torch.distributed.launch \
        --nproc_per_node=1 \
        --use_env \
        main.py \
        --pretrained params/detr-r50-pre-2stage-q64.pth \
        --output_dir logs \
        --dataset_file hico \
        --hoi_path data/hico_20160224_det \
        --num_obj_classes 80 \
        --num_verb_classes 117 \
        --backbone resnet50 \
        --num_queries 64 \
        --dec_layers_hopd 3 \
        --dec_layers_interaction 3 \
        --epochs 90 \
        --lr_drop 60 \
        --use_nms_filter

I am using Python 3.7 and CUDA 10.1.
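For a report like this, it usually helps to include the versions that PyTorch itself sees, since the CUDA version PyTorch was built against can differ from the system toolkit. The following is a minimal sketch, not part of the original report; `collect_env` is a hypothetical helper name, and it only uses standard PyTorch attributes.

```python
import platform


def collect_env():
    """Return a dict describing the Python / PyTorch / CUDA setup.

    `collect_env` is a hypothetical helper name used for illustration.
    """
    info = {"python": platform.python_version()}
    try:
        import torch
    except ImportError:
        info["torch"] = None  # PyTorch is not installed in this environment
        return info

    info["torch"] = torch.__version__
    # CUDA version the PyTorch binary was built with (None for CPU-only builds)
    info["torch_cuda_build"] = torch.version.cuda
    # cuDNN version as seen by PyTorch (None if cuDNN is unavailable)
    info["cudnn"] = torch.backends.cudnn.version()
    info["cuda_available"] = torch.cuda.is_available()
    if info["cuda_available"]:
        info["gpu"] = torch.cuda.get_device_name(0)
        info["compute_capability"] = torch.cuda.get_device_capability(0)
    return info


if __name__ == "__main__":
    for key, value in collect_env().items():
        print(f"{key}: {value}")
```

Pasting this output alongside the GPU model makes it easier to spot a mismatch between the GPU, the CUDA build, and cuDNN.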

Lopa07 changed the title from "RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)" to "CUBLAS_STATUS_EXECUTION_FAILED when calling cublasSgemm" on Aug 16, 2023.

YueLiao (Owner) commented Aug 22, 2023

Could you provide more details, e.g., where/which line does this error occur in the project?

Lopa07 (Author) commented Aug 22, 2023

The error occurred in this line.
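Because CUDA calls run asynchronously, the line shown in a Python traceback may be later than the call that actually failed. Setting the standard `CUDA_LAUNCH_BLOCKING=1` environment variable makes kernel launches synchronous, so the traceback is more likely to point at the real source. A minimal sketch, assuming the training command is launched from Python; `launch_blocking_env` and `run_with_launch_blocking` are hypothetical helper names, not part of this project.

```python
import os
import subprocess


def launch_blocking_env(base=None):
    """Return a copy of the environment with CUDA_LAUNCH_BLOCKING=1 set."""
    env = dict(os.environ if base is None else base)
    env["CUDA_LAUNCH_BLOCKING"] = "1"  # make CUDA kernel launches synchronous
    return env


def run_with_launch_blocking(cmd):
    """Run `cmd` (a list of arguments) with synchronous CUDA launches.

    Returns the exit code of the child process.
    """
    return subprocess.run(cmd, env=launch_blocking_env()).returncode


# Example usage (the arguments are placeholders for the real training command):
# run_with_launch_blocking(["python", "main.py", "--dataset_file", "hico"])
```

Synchronous launches slow training down, so this is meant for debugging runs only.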

keshara2032 commented

I am getting the same issue. I tried different CUDA versions with no luck. @Lopa07, were you able to fix it? Thanks!

keshara2032 commented Feb 19, 2024

I was able to fix this by using CUDA Toolkit 10.2 with cuDNN 8.7 for CUDA 10.2 (https://developer.nvidia.com/rdp/cudnn-archive#a-collapse870-102). Hope this helps, @Lopa07.

The following versions work as well, as long as the appropriate cuDNN version is installed:
conda install pytorch==1.7.0 torchvision==0.8.0 torchaudio==0.7.0 cudatoolkit=11.0 -c pytorch

This guide is helpful for cuDNN installation.
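After switching CUDA, cuDNN, or PyTorch versions, a quick way to check whether GEMM calls work at all is to run a small float32 matrix multiplication on the GPU before starting a full training run. This is a minimal sketch under the assumption that PyTorch routes a float32 matmul to a cuBLAS GEMM routine (which the error message above suggests); `check_gpu_matmul` is a hypothetical helper name.

```python
def check_gpu_matmul(size=64):
    """Try a small float32 matmul on the GPU and return a short status string."""
    try:
        import torch
    except ImportError:
        return "torch not installed"
    if not torch.cuda.is_available():
        return "cuda not available"
    try:
        a = torch.randn(size, size, device="cuda")
        b = torch.randn(size, size, device="cuda")
        c = a @ b  # float32 matmul on the GPU
        torch.cuda.synchronize()  # force any asynchronous CUDA error to surface here
    except RuntimeError as err:
        return f"failed: {err}"
    if not bool(torch.isfinite(c).all()):
        return "non-finite result"
    return "ok"


if __name__ == "__main__":
    print(check_gpu_matmul())
```

If this prints a `failed: ...` message containing CUBLAS_STATUS_EXECUTION_FAILED, the problem is likely in the installation rather than in the training code.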
