
Compiling Samples using NVidia HPC SDK Fails. Permission issue? #324

Open
nbowling opened this issue Feb 15, 2025 · 6 comments

nbowling commented Feb 15, 2025

Hello,
Please could someone help?

I cannot compile the CUDA samples using the HPC SDK. The HPC SDK has a different directory structure than the normal CUDA installation, resulting in all sorts of errors when executing cmake ..

I need to use the HPC SDK because I need the NVFORTRAN compiler. At the moment I'm feeling my way with CUDA and Linux, so any guidance would be appreciated.

Many thanks,
Nick


mcolg commented Feb 18, 2025

Hi Nick,

I'm with the NVHPC compiler team and was asked to step in and help. That said, I just cloned the repo and didn't see any issue building the samples with the simple "cmake ../ ; make" commands, using the nvcc that ships with the NVHPC SDK.
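For reference, this is roughly the flow I used (a sketch only; it assumes the NVHPC SDK's nvcc is first on your PATH):

# minimal sketch of the build, assuming the NVHPC SDK environment is loaded
which nvcc                  # should resolve somewhere under the hpc_sdk install tree
cd cuda-samples             # top of the cloned repo, or an individual sample directory
mkdir build && cd build
cmake ../
make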

Can you give any more details on any additional settings you gave to cmake or other relevant environment variables you've set? Also, do you have any examples of the errors you're seeing?

Finally, which version of NVHPC do you have installed?

Thanks,
Mat

rwarmstr added the question (Further information is requested) label on Feb 18, 2025

nbowling commented Feb 19, 2025

Hi Mat

Thanks for picking this up; hopefully this will be quickly solved.

A bit of background: in earlier versions of CUDA and the HPC SDK I was able to install them separately, which meant that CUDA sat under /usr/local/.
Now the HPC SDK sits in /opt, and I kept getting permission errors when running make. Then the whole structure of the CUDA samples changed, so I thought the problem was solved.

Sadly not the case, well at least for me.

I just re-downloaded the samples clean and followed the new process to compile the deviceQuery sample. This time all was good, so for the time being it looks as though my problem has gone away.

I would, if possible, like some advice. In order to get things working I have flailed about in .bashrc, ending up with the following.

# --- Enable NVIDIA HPC SDK ---

NVARCH=`uname -s`_`uname -m`; export NVARCH
NVCOMPILERS=/opt/nvidia/hpc_sdk; export NVCOMPILERS
MANPATH=$MANPATH:$NVCOMPILERS/$NVARCH/25.1/compilers/man; export MANPATH
PATH=$NVCOMPILERS/$NVARCH/25.1/compilers/bin:$PATH; export PATH
NVHPC_CUDA=$NVCOMPILERS/Linux_x86_64/25.1/cuda

export PATH=$NVCOMPILERS/$NVARCH/25.1/comm_libs/mpi/bin:$PATH
export MANPATH=$MANPATH:$NVCOMPILERS/$NVARCH/25.1/comm_libs/mpi/man
export PATH=$NVHPC_CUDA/bin:$PATH
export CUDA_HOME=/opt/nvidia/hpc_sdk/Linux_x86_64/25.1/cuda

export PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/25.1/cuda/12.6/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/25.1/cuda/12.6/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

My gut feeling says this is a mess and probably largely unnecessary. Any advice on how to clean it up would be great.

Once again - many thanks for taking this on.

Cheers
Nick

Okay spoke too soon...

Trying to compile the nbody and bandwidthTest examples and I am getting:

nick@freyja:~/Desktop/cuda-samples/Samples/1_Utilities/bandwidthTest/build$ cmake ..
-- Configuring done (0.1s)
-- Generating done (0.0s)
-- Build files have been written to: /home/nick/Desktop/cuda-samples/Samples/1_Utilities/bandwidthTest/build
nick@freyja:~/Desktop/cuda-samples/Samples/1_Utilities/bandwidthTest/build$ make -j$(nproc)
[ 33%] Building CUDA object CMakeFiles/bandwidthTest.dir/bandwidthTest.cu.o
nvcc fatal   : Unsupported gpu architecture 'compute_100'
make[2]: *** [CMakeFiles/bandwidthTest.dir/build.make:77: CMakeFiles/bandwidthTest.dir/bandwidthTest.cu.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:83: CMakeFiles/bandwidthTest.dir/all] Error 2
make: *** [Makefile:91: all] Error 2
nick@freyja:~/Desktop/cuda-samples/Samples/1_Utilities/bandwidthTest/build$ 

Output from deviceQuery confirms a capable GPU:

nick@freyja:~/Desktop/cuda-samples/Samples/1_Utilities/deviceQuery/build$ ./deviceQuery
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "NVIDIA GeForce RTX 2080 Ti"
  CUDA Driver Version / Runtime Version          12.4 / 12.6
  CUDA Capability Major/Minor version number:    7.5
  Total amount of global memory:                 10793 MBytes (11316887552 bytes)
  (068) Multiprocessors, (064) CUDA Cores/MP:    4352 CUDA Cores
  GPU Max Clock rate:                            1755 MHz (1.75 GHz)
  Memory Clock rate:                             7000 Mhz
  Memory Bus Width:                              352-bit
  L2 Cache Size:                                 5767168 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total shared memory per multiprocessor:        65536 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  1024
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 3 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device supports Managed Memory:                Yes
  Device supports Compute Preemption:            Yes
  Supports Cooperative Kernel Launch:            Yes
  Supports MultiDevice Co-op Kernel Launch:      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 12.4, CUDA Runtime Version = 12.6, NumDevs = 1
Result = PASS

Any ideas?

N

rwarmstr (Collaborator) commented

I can jump in here with a quick note that SM_100 is Blackwell, which is only supported from CUDA 12.8+. If you edit the CMakeLists.txt in that sample's directory to remove everything greater than SM_90 as a target, does that work?

nbowling (Author) commented

Hi,
As I understand it, the RTX 2080 Ti is Turing. However, you may be zeroing in on something. I noticed that deviceQuery is reporting CUDA driver 12.4 and CUDA runtime 12.6. This can't be good, but I can't for the life of me work out how this has occurred.

System is Kubuntu 24.04
NVIDIA driver 550 (open variant)
HPC SDK 25.1 with CUDA 12.6

I’m thinking the HPC SDK has set up one CUDA in /opt/NVidia and the driver has set up another version wherever that lives.

My problem is this: I want to use the HPC SDK to work on a hydrological model in Fortran, and I want to use CUDA to improve performance in matrix calculations. Previously I would install CUDA in /usr/local and then install the HPC SDK on top, and it would all work. Now the HPC SDK goes in /opt and carries its own version of CUDA, which is not the current version.

I’m not a proper programmer just an enthusiastic amateur, working on developing a course called “Cycles in the Biosphere”, largely self taught so to be honest all the Make, CMake stuff is sort of new to me. I’ve always had success working with CUDA in Windows using VS2022 it just works with a little tweaking. Previously CUDA / HPC worked just as seamlessly in Linux so I’m flumuxed now and feeling a little inadequate 😐

Unless there's a simple solution, I may just rebuild the machine and start from scratch using the latest CUDA and C/C++, or even Python.

With grateful thanks for your input.
Nick


mcolg commented Feb 19, 2025

I noticed that deviceQuery is reporting CUDA driver 12.4 and CUDA runtime 12.6. This can't be good, but I can't for the life of me work out how this has occurred.

This shouldn't matter. Since it would be a huge package if we shipped all versions of CUDA with the NVHPC SDK, we instead ship the latest at the time of release (with 25.1 that was 12.6, but in 25.3 we'll move to 12.8). We have a separate package which also includes the last release from the previous major CUDA version (i.e., 11.8) and the base release of that major version (11.0).

Within a major release, the driver and CUDA runtime are compatible, so having a mismatch is OK. You just may not have access to newer features, such as Blackwell support in this case.
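If you want to see what each side reports, a quick check looks something like this (both commands should already be available on your system):

nvidia-smi | head -n 4      # banner shows the installed driver version and the newest CUDA version that driver supports
nvcc --version              # the CUDA toolkit/compiler version you are actually building with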

It sounds like when they updated the samples, they did so with CUDA 12.8 in mind, so you just need to edit the CMake config file, "CMakeLists.txt", to remove the "100 101 120" from the "CMAKE_CUDA_ARCHITECTURES" list.
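As a rough illustration only (the exact architecture list in your copy of the samples may differ from what I show here), the edit and rebuild would look something like this:

# in Samples/1_Utilities/bandwidthTest/CMakeLists.txt, trim the architecture list, e.g.
#   before: set(CMAKE_CUDA_ARCHITECTURES 50 52 60 61 70 72 75 80 86 87 89 90 100 101 120)
#   after:  set(CMAKE_CUDA_ARCHITECTURES 50 52 60 61 70 72 75 80 86 87 89 90)
# then reconfigure and rebuild from a clean build directory
rm -rf build && mkdir build && cd build
cmake ..
make -j$(nproc)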

As for your bashrc settings, these are fine. Yes, you could clean it up a bit since you have repeated directories listed that are already covered by the variables defined earlier, but there's nothing functionally wrong with it (at least not that I see). If you use modules at all, we do include module files, which makes setting your environment easier. But if you don't, then it's probably more work to set up modules. It's really up to you what makes the best sense.
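For example, here is a rough, untested sketch of a trimmed-down version; it just folds the repeated version and architecture pieces into one helper variable and keeps your existing 25.1 paths:

# --- NVIDIA HPC SDK 25.1 (sketch; adjust the version to match your install) ---
NVARCH=$(uname -s)_$(uname -m); export NVARCH
NVCOMPILERS=/opt/nvidia/hpc_sdk; export NVCOMPILERS
NVHPC_ROOT=$NVCOMPILERS/$NVARCH/25.1          # helper variable, not something the SDK requires
export PATH=$NVHPC_ROOT/compilers/bin:$NVHPC_ROOT/comm_libs/mpi/bin:$NVHPC_ROOT/cuda/bin:$PATH
export MANPATH=$MANPATH:$NVHPC_ROOT/compilers/man:$NVHPC_ROOT/comm_libs/mpi/man
export CUDA_HOME=$NVHPC_ROOT/cuda
export LD_LIBRARY_PATH=$NVHPC_ROOT/cuda/12.6/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}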

For the permission issue, I'm not sure. Presuming you installed the compilers, you should have permissions on those directories, so I'd need more info to help there. Also, "/opt/nvidia/hpc_sdk" is just the default location; you can override where the base install directory is located. So if using /opt is a problem for you, rerun the "install.sh" script, giving the new directory at the prompt.
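For example (a sketch; the exact prompt wording may differ between SDK versions):

# from the directory where the NVHPC SDK installer was unpacked
./install.sh
# when asked for the installation directory, give a path you own, e.g. $HOME/nvhpc,
# then update the NVCOMPILERS line in your .bashrc to point at that new location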

nbowling (Author) commented

Thanks - I'll work through your suggestions, fingers crossed! I'll feed back when done.

Cheers
N
