cuSolver fails with nvfortran >= 23.11 #76
Comments
MODIFIED * configure include/version/version.m4 modules/mod_X.F pol_function/X_redux.F Bugs: - [yambo] Fix for issue #76 Patch sent by: Davide Sangalli <[email protected]>
MODIFIED * include/version/version.m4 modules/mod_X.F pol_function/X_redux.F Bugs: - [yambo] Fix for issue #76 imported in branch 5.2 Patch sent by: Davide Sangalli <[email protected]>
Bug fixed by moving the contained subroutine in X_redux.F into an independent subroutine.
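For readers unfamiliar with this kind of refactor, here is a minimal sketch of what moving a contained (internal) subroutine into an independent one can look like. It is plain Fortran with no CUDA, and every name in it (`redux_sketch`, `scale_before`, `scale_after`, `do_scale_standalone`) is hypothetical: none of it is taken from X_redux.F, and it does not reproduce the actual patch.

```fortran
! Hypothetical illustration only: these names do not come from X_redux.F.
module redux_sketch
  implicit none
contains

  ! Before: the helper is an internal (contained) procedure. It reads
  ! n, a and factor from the host scope through host association.
  subroutine scale_before(n, a, factor)
    integer, intent(in)    :: n
    real,    intent(inout) :: a(n)
    real,    intent(in)    :: factor
    call do_scale()
  contains
    subroutine do_scale()
      integer :: i
      do i = 1, n
        a(i) = factor * a(i)
      end do
    end subroutine do_scale
  end subroutine scale_before

  ! After: the caller passes everything explicitly to an independent routine.
  subroutine scale_after(n, a, factor)
    integer, intent(in)    :: n
    real,    intent(inout) :: a(n)
    real,    intent(in)    :: factor
    call do_scale_standalone(n, a, factor)
  end subroutine scale_after

  ! Independent routine: no host association, explicit argument list.
  subroutine do_scale_standalone(n, a, factor)
    integer, intent(in)    :: n
    real,    intent(inout) :: a(n)
    real,    intent(in)    :: factor
    integer :: i
    do i = 1, n
      a(i) = factor * a(i)
    end do
  end subroutine do_scale_standalone

end module redux_sketch

program demo
  use redux_sketch
  implicit none
  real :: x(3), y(3)
  x = [1.0, 2.0, 3.0]
  y = x
  call scale_before(3, x, 2.0)
  call scale_after(3, y, 2.0)
  ! Both variants perform the same arithmetic, so the results must match.
  if (any(x /= y)) error stop "variants disagree"
  print *, "variants agree"
end program demo
```

The only structural difference is that the independent routine receives its data through an explicit argument list instead of host association. Whether that is what triggers the cuSolver failure with nvfortran >= 23.11 is not established here.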
I am having the same problem. In which branch did you split X_redux?
The original branch is https://github.com/yambo-code/yambo-devel/tree/tech/devel-gpu This is the commit:
I realized that all past runs on eliud and mo with CUDA failed not because of a buggy compilation but precisely because of a cuSolver crash. https://media.yambo-code.eu/robots/develop/eliud.kipchoge.2_develop_1_error.php If these failures are connected to this bug, then the fix should be introduced ASAP in the bug-fixes.
The cuSolver error does not affect tests like
Here the failures were likely due to cuSolver: As you can see, for
The bug happens when running with GPU support (CUDAF)
Detected on my desktop (nvfortran 24.3, CUDA 12.3) and on Leonardo (nvfortran 23.11, CUDA 11.8 and 12.3)
Error message
Error code is
CUSOLVER_STATUS_EXECUTION_FAILED
https://docs.nvidia.com/cuda/cusolver/index.html
(Sometimes it also fails earlier, at cusolverDnCreate)