[BUG] libnvcomp is not compiled with PTDS #9534

jbrennan333 · 2021-10-27T15:53:39Z

CUDF does not define CUDA_API_PER_THREAD_DEFAULT_STREAM when building libnvcomp.a, even when -DPER_THREAD_DEFAULT_STREAM=ON is defined.

We found this when comparing nsys traces taken while running tpc-ds benchmarks on spark-rapids with and without LIBCUDF_USE_NVCOMP=1 defined. The cudf snappy kernels use per thread default streams, but nvcomp was using only 1 stream (7). As a result, most queries were slower with LIBCUDF_USE_NVCOMP=1, which is not what we expect.

Expected behavior
When you compile CUDF with -DPER_THREAD_DEFAULT_STREAM=ON, libnvcomp.a should be compiled with CUDA_API_PER_THREAD_DEFAULT_STREAM. We expect that with this change, most queries will be at least as fast if not faster when run with LIBCUDF_USE_NVCOMP=1.

jbrennan333 · 2021-10-27T15:59:23Z

Some relevant discussion from an internal slack channel:
@abellina - Given that nvcomp is a separate compilation unit in cuDF, do we need to define CUDA_API_PER_THREAD_DEFAULT_STREAM for it? @jbrennan333 shared some nsys traces he obtained using the "use nvcomp" switch for unsnappy, and those kernels ended up in the Default Stream (7), rather than in our other per task streams.

@jrhemstad - that or we need to pass cudaStreamPerThread for any stream argument to nvcomp functions. I'm almost certain we're not passing in cudaStreamPerThread to nvcomp APIs. In libcudf we'd just be passing in the stream argument from the detail API which defaults to 0. The meaning of 0 == PTDS gets lost when crossing precompiled library boundaries.

@abellina - so the jni side changes the value of "default" to 2 if we detect PTDS. But if we are not passing that to the C++ side, then yeah this could happen.

@jrhemstad - So we don't have a great fix for this right now as libcudf public APIs don't take streams. The easiest thing would be to use the detail:: API from cuIO and specify cudaStreamPerThread. Alternatively, we could make it so that when libcudf is compiled with PTDS that all internal stream arguments are defaulted to cudaStreamPerThread instead of 0.

jbrennan333 · 2021-10-27T16:01:20Z

Attaching a screeenshot from one of the nsys traces.

jbrennan333 · 2021-10-27T16:04:15Z

As a workaround, I was able to build libnvcomp.a by adding:
target_compile_definitions(nvcomp PUBLIC CUDA_API_PER_THREAD_DEFAULT_STREAM)
to find_and_configure_nvcomp in get_nvcomp.cmake.

@jrhemstad, @abellina should I put up a patch with this change? It should be sufficient as long as we are building nvcomp during the cudf build.

closes #9534 Change get_nvcomp.cmake to compile with CUDA_API_PER_THREAD_DEFAULT_STREAM is PER_THREAD_DEFAULT_STREAM is defined. I did not add unit tests for this, but I tested it manually by building CUDF and then running a query to verify that PTDS was being used. Authors: - Jim Brennan (https://github.com/jbrennan333) Approvers: - Robert Maynard (https://github.com/robertmaynard) - Vyas Ramasubramani (https://github.com/vyasr) URL: #9540

jbrennan333 added bug Something isn't working Needs Triage Need team to review and classify labels Oct 27, 2021

jbrennan333 added the Performance Performance related issue label Oct 27, 2021

jbrennan333 mentioned this issue Oct 27, 2021

compile libnvcomp with PTDS if requested #9540

Merged

devavret added cuIO cuIO issue and removed Needs Triage Need team to review and classify labels Oct 27, 2021

rapids-bot bot closed this as completed in #9540 Nov 1, 2021

jlowe mentioned this issue Nov 4, 2021

[FEA] libcudf stream parameters should default to cudaStreamPerThread when compiled for PTDS #9614

Closed

jbrennan333 mentioned this issue May 12, 2022

Add support for nvcomp-2.3 in java build #10837

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BUG] libnvcomp is not compiled with PTDS #9534

[BUG] libnvcomp is not compiled with PTDS #9534

jbrennan333 commented Oct 27, 2021

jbrennan333 commented Oct 27, 2021

jbrennan333 commented Oct 27, 2021

jbrennan333 commented Oct 27, 2021

[BUG] libnvcomp is not compiled with PTDS #9534

[BUG] libnvcomp is not compiled with PTDS #9534

Comments

jbrennan333 commented Oct 27, 2021

jbrennan333 commented Oct 27, 2021

jbrennan333 commented Oct 27, 2021

jbrennan333 commented Oct 27, 2021