[Bugfix] CudaDeviceAPI::GetAttr may check kExist when GPUs absent (#16903)

This commit resolves a bug that was introduced in #16377. If no
CUDA-capable GPUs are present, the call to `cudaGetDeviceCount` returns
an error, which the `CUDA_CALL` macro raises as an exception. However,
checking the `kExist` flag is valid even if no GPUs are present.

This commit removes the use of `CUDA_CALL` from the `kExist` check. If
`cudaGetDeviceCount` returns an error, the check now reports that the
device does not exist instead of raising.
Lunderberg authored Apr 18, 2024
1 parent de91c5c commit 7dc0472
Showing 2 changed files with 56 additions and 3 deletions.
7 changes: 4 additions & 3 deletions src/runtime/cuda/cuda_device_api.cc
@@ -41,11 +41,12 @@ class CUDADeviceAPI final : public DeviceAPI {
   void GetAttr(Device dev, DeviceAttrKind kind, TVMRetValue* rv) final {
     int value = 0;
     switch (kind) {
-      case kExist:
+      case kExist: {
         int count;
-        CUDA_CALL(cudaGetDeviceCount(&count));
-        value = static_cast<int>(dev.device_id < count);
+        auto err = cudaGetDeviceCount(&count);
+        value = (err == cudaSuccess && static_cast<int>(dev.device_id < count));
         break;
+      }
       case kMaxThreadsPerBlock: {
         CUDA_CALL(cudaDeviceGetAttribute(&value, cudaDevAttrMaxThreadsPerBlock, dev.device_id));
         break;
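The behavioural difference in the `kExist` branch can be modelled outside of CUDA. The following is a minimal sketch in Python, not TVM code: `fake_get_device_count`, `exists_throwing`, `exists_non_throwing`, and the two error-code constants are hypothetical stand-ins introduced only for illustration. It assumes the count query reports failure through a returned status code, as `cudaGetDeviceCount` does in the hunk.

```python
# Illustrative model only: none of these names exist in TVM or the CUDA runtime.
CUDA_SUCCESS = 0
CUDA_ERROR_NO_DEVICE = 100  # stand-in status for "no CUDA-capable device"


def fake_get_device_count(visible_devices):
    """Stand-in for the count query: returns (status, count)."""
    if visible_devices == 0:
        return CUDA_ERROR_NO_DEVICE, 0
    return CUDA_SUCCESS, visible_devices


def exists_throwing(device_id, visible_devices):
    """Models the old branch: any failing status is raised as an exception."""
    err, count = fake_get_device_count(visible_devices)
    if err != CUDA_SUCCESS:
        raise RuntimeError(f"CUDA error {err}")
    return int(device_id < count)


def exists_non_throwing(device_id, visible_devices):
    """Models the new branch: a failing status means the device does not exist."""
    err, count = fake_get_device_count(visible_devices)
    return int(err == CUDA_SUCCESS and device_id < count)
```

With no visible devices, `exists_throwing(0, 0)` raises, while `exists_non_throwing(0, 0)` returns 0. When devices are present, both functions agree.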
52 changes: 52 additions & 0 deletions tests/python/runtime/test_runtime_device_api.py
@@ -0,0 +1,52 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

import os
import subprocess
import sys

import tvm
import tvm.testing


def test_check_if_device_exists():
    """kExist can be checked when no devices are present

    This test uses `CUDA_VISIBLE_DEVICES` to disable any CUDA-capable
    GPUs from being accessed by the subprocess.  Within the
    subprocess, the CUDA driver cannot be initialized.  While most
    functionality of CUDADeviceAPI would raise an exception, the
    `kExist` property can still be checked.
    """

    cmd = [
        sys.executable,
        "-c",
        "import tvm; tvm.device('cuda').exist",
    ]
    subprocess.check_call(
        cmd,
        env={
            **os.environ,
            "CUDA_VISIBLE_DEVICES": "",
        },
    )


if __name__ == "__main__":
    tvm.testing.main()
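The test relies on overriding one environment variable for a child interpreter while leaving the parent process untouched. A minimal sketch of that technique, without any dependency on TVM or CUDA, might look like the following. The helper name `run_with_hidden_gpus` is hypothetical and used only for illustration; the child merely echoes the variable back so the override can be observed.

```python
import os
import subprocess
import sys


def run_with_hidden_gpus(code):
    """Run `code` in a fresh interpreter with CUDA_VISIBLE_DEVICES set to ""."""
    # Copy the parent environment and override a single variable for the child.
    env = {**os.environ, "CUDA_VISIBLE_DEVICES": ""}
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,  # a non-zero exit raises, like subprocess.check_call
    )
    return result.stdout.strip()


# The child prints the repr of the variable as it sees it.
child_code = "import os; print(repr(os.environ['CUDA_VISIBLE_DEVICES']))"
child_view = run_with_hidden_gpus(child_code)
```

Running the check in a subprocess matters because the variable is read when the CUDA runtime initializes, so changing it inside an already-initialized parent process would likely have no effect.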
