Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Guard CUDA runtime APIs with error checking #266

Merged
merged 1 commit into from
Jan 31, 2023

Conversation

PointKernel
Copy link
Member

Similar to rapidsai/cudf#12531

This PR adds error checking for CUDA runtime APIs.

@PointKernel PointKernel added type: feature request New feature request Needs Review Awaiting reviews before merging labels Jan 30, 2023
@@ -66,7 +66,7 @@ TEMPLATE_TEST_CASE_SIG("Parallel insert-or-update",
static constexpr int Blocks = 1024;
static constexpr int Threads = 128;
parallel_sum<<<Blocks, Threads>>>(m.get_device_mutable_view());
cudaDeviceSynchronize();
CUCO_CUDA_TRY(cudaDeviceSynchronize());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does catch2 handle exceptions or failing assertions? Will the test case just fail or will this nuke the whole test suite? As a workaround, we could wrap CUDA calls with a catch2 clause, e.g., REQUIRE(cudaSuccess == cudaDeviceSynchronize())

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

REQUIRE would return if a test case fails while CHECK won't stop at the first failure and will complete the whole test suite.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can also wrap CUCO_CUDA_TRY(...) with REQUIRE_NOTHROW(CUCO_CUDA_TRY(...))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does catch2 handle exceptions or failing assertions? Will the test case just fail or will this nuke the whole test suite?

Just did a quick test, throwing an error will fail the current test case but other tests in the test suite will continue. So the behavior is similar to CHECK_NOTHROW. In that sense, wrapping those calls with catch2 macros is not necessary.

@PointKernel PointKernel merged commit 8523ab1 into NVIDIA:dev Jan 31, 2023
@PointKernel PointKernel deleted the guard-cuda-apis branch January 31, 2023 17:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Needs Review Awaiting reviews before merging type: feature request New feature request
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

2 participants