Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use pynvjitlink for CUDA 12+ MVC #13650

Merged
merged 18 commits into from
Nov 20, 2023
Merged
Show file tree
Hide file tree
Changes from 8 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 3 additions & 2 deletions python/cudf/cudf/tests/test_numba_import.py
Original file line number Diff line number Diff line change
Expand Up @@ -19,10 +19,10 @@
TEST_NUMBA_MVC_ENABLED = """
import numba.cuda
import cudf
from cudf.utils._numba import _CUDFNumbaConfig, _patch_numba_mvc
from cudf.utils._numba import _CUDFNumbaConfig, patch_numba_linker_cuda_11


_patch_numba_mvc()
patch_numba_linker_cuda_11()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be unconditionally using CUDA 11 patching logic in tests. We should use CUDA 12 patching logic when on CUDA 12, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this whole test skips in a CUDA 12 environment based on the below

@pytest.mark.skipif(
    not IS_CUDA_11, reason="Minor Version Compatibility test for CUDA 11"
)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably add a test for CUDA 12 MVC eventually but I don't think any of our CI jobs would cover that case yet sadly.

Copy link
Contributor

@bdice bdice Nov 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The MVC tests should be enabled for CUDA 12, even if we don't have an MVC-triggering environment in CI with the drivers/runtimes we use. We want to be able to run pytest on a system requiring MVC and know that it's being covered by cudf.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's okay to have them be skipped if on CUDA 12 and MVC is needed and pynvjitlink isn't available.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, that makes sense. I'm thinking I'll rename this file test_mvc and have a separate test for cuda 11 and cuda 12. updates to follow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did some refactoring in 35b7e59


@numba.cuda.jit
def test_kernel(x):
Expand All @@ -45,4 +45,5 @@ def test_numba_mvc_enabled_cuda_11():
capture_output=True,
cwd="/",
)

assert cp.returncode == 0
52 changes: 28 additions & 24 deletions python/cudf/cudf/utils/_numba.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,19 @@

from numba import config as numba_config

try:
from pynvjitlink.patch import (
patch_numba_linker as patch_numba_linker_cuda_12,
)
except ModuleNotFoundError:
brandon-b-miller marked this conversation as resolved.
Show resolved Hide resolved

def patch_numba_linker_cuda_12():
brandon-b-miller marked this conversation as resolved.
Show resolved Hide resolved
warnings.warn(
"Minor version compatibility requires pynvjitlink. "
"Please pip install pynvjitlink to proceed using cuDF."
Copy link
Contributor

@bdice bdice Nov 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depending on the timing of pynvjitlink being shipped, we'll want to do one of the following:

  • If pynvjitlink isn't shipped by the time we release, we should remove this note and say something like before (explain the MVC problem and direct users to match their driver and toolkit)
  • If pynvjitlink is available in time, we should give the full install command, like pip install pynvjitlink --extra-index-url=https://pypi.nvidia.com

We don't want to leave this in a halfway state where we tell users to install pynvjitlink but don't specify the index, especially if the package isn't yet available.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you feel about keeping the original warning to be safe then. The changes will still allow pynvjitlink to work once it's out, but we won't document it or direct users to it in any official way until then.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, do that. We can change the warning during code freeze if we ship pynvjitlink in time.

)


CC_60_PTX_FILE = os.path.join(
os.path.dirname(__file__), "../core/udf/shim_60.ptx"
)
Expand Down Expand Up @@ -65,7 +78,7 @@ def _get_ptx_file(path, prefix):
return regular_result[1]


def _patch_numba_mvc():
def patch_numba_linker_cuda_11():
brandon-b-miller marked this conversation as resolved.
Show resolved Hide resolved
# Enable the config option for minor version compatibility
numba_config.CUDA_ENABLE_MINOR_VERSION_COMPATIBILITY = 1

Expand Down Expand Up @@ -106,29 +119,19 @@ def _setup_numba():
versions = safe_get_versions()
if versions != NO_DRIVER:
driver_version, runtime_version = versions
if driver_version >= (12, 0) and runtime_version > driver_version:
warnings.warn(
f"Using CUDA toolkit version {runtime_version} with CUDA "
f"driver version {driver_version} requires minor version "
"compatibility, which is not yet supported for CUDA "
"driver versions 12.0 and above. It is likely that many "
"cuDF operations will not work in this state. Please "
f"install CUDA toolkit version {driver_version} to "
"continue using cuDF."
)
else:
# Support MVC for all CUDA versions in the 11.x range
ptx_toolkit_version = _get_cuda_version_from_ptx_file(
CC_60_PTX_FILE
)
# Numba thinks cubinlinker is only needed if the driver is older
# than the CUDA runtime, but when PTX files are present, it might
# also need to patch because those PTX files may be compiled by
# a CUDA version that is newer than the driver as well
if (driver_version < ptx_toolkit_version) or (
driver_version < runtime_version
):
_patch_numba_mvc()
ptx_toolkit_version = _get_cuda_version_from_ptx_file(CC_60_PTX_FILE)

# MVC is required whenever any PTX is newer than the driver
# This could be the shipped PTX file or the PTX emitted by
# the version of NVVM on the user system, the latter aligning
# with the runtime version
if (driver_version < ptx_toolkit_version) or (
driver_version < runtime_version
):
if driver_version < (12, 0):
patch_numba_linker_cuda_11()
else:
patch_numba_linker_cuda_12()


def _get_cuda_version_from_ptx_file(path):
Expand Down Expand Up @@ -171,6 +174,7 @@ def _get_cuda_version_from_ptx_file(path):
"7.8": (11, 8),
"8.0": (12, 0),
"8.1": (12, 1),
"8.2": (12, 2),
brandon-b-miller marked this conversation as resolved.
Show resolved Hide resolved
}

cuda_ver = ver_map.get(version)
Expand Down