Use pynvjitlink for MVC #23

Closed

brandon-b-miller
Collaborator

This PR attempts to move some of the logic in pynvjitlink's patch.py behind config.CUDA_ENABLE_MINOR_VERSION_COMPATIBILITY, so that numba itself, rather than pynvjitlink, can perform the patch when necessary.
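
As a rough illustration of the idea (not the code in this PR), the gating could look something like the sketch below. The `Config` stand-in, the `choose_linker` helper, and the backend names `"pynvjitlink"` and `"driver"` are all hypothetical; only the flag name comes from the description above.

```python
from dataclasses import dataclass


@dataclass
class Config:
    # Stand-in for numba's config module; only the flag name is taken
    # from the PR description, everything else here is hypothetical.
    CUDA_ENABLE_MINOR_VERSION_COMPATIBILITY: bool = False


def choose_linker(config, nvjitlink_available):
    """Return a label for the linker backend to use (labels are made up)."""
    if config.CUDA_ENABLE_MINOR_VERSION_COMPATIBILITY:
        if not nvjitlink_available:
            # Fail early instead of silently falling back to another linker.
            raise RuntimeError(
                "Minor version compatibility was requested but "
                "pynvjitlink is not available"
            )
        return "pynvjitlink"
    return "driver"
```

The point of the sketch is only that the decision lives on the numba side and is driven by the config flag, rather than being applied by pynvjitlink as a monkey patch.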

copy-pr-bot bot commented Jul 29, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Collaborator

@gmarkall gmarkall left a comment

Just had a quick skim and left some initial comments / thoughts.

This would also need the Numba tests from pynvjitlink porting in.

numba_cuda/numba/cuda/cudadrv/driver.py: 4 review threads (outdated, resolved)
@brandon-b-miller changed the base branch from main to develop July 29, 2024 15:23
@gmarkall added the label "4 - Waiting on reviewer" (Waiting for reviewer to respond to author) Jul 31, 2024
Collaborator

@gmarkall gmarkall left a comment

Thanks for the updates - I just had another pass over this and added some comments on the diff.

numba_cuda/numba/cuda/cudadrv/driver.py: 4 review threads (3 outdated, all resolved)
numba_cuda/numba/cuda/runtime/nrt.cu: 1 review thread (outdated, resolved)
numba_cuda/numba/cuda/target.py: 1 review thread (outdated, resolved)
@gmarkall added the label "4 - Waiting on author" (Waiting for author to respond to review) and removed the label "4 - Waiting on reviewer" (Waiting for reviewer to respond to author) Aug 6, 2024
Collaborator

@gmarkall gmarkall left a comment

Thanks for the updates - I added a couple more comments on the diff.

@brandon-b-miller
Collaborator Author

> Thanks for the updates - I added a couple more comments on the diff.

Thank you for the review :) I've made some changes that hopefully address your comments.

Collaborator

@gmarkall gmarkall left a comment

Thanks for the changes - I've gone through another round and have some more thoughts on the diff. CI is failing because the logic for setting config.ENABLE_PYNVJITLINK is still not quite right.

We also need to port in the tests from pynvjitlink - I suspect there's at least one logic error (to do with adding LTOIR from a file) which the added tests might catch.

numba_cuda/numba/cuda/cudadrv/driver.py: 3 review threads (outdated, resolved)
f".{ext}")
self.add_file(path, kind)
if isinstance(path_or_code, str):
ext = pathlib.Path(path_or_code).suffix
Collaborator

Are we missing a bit of logic for handling .ltoir here? Or has it gone somewhere else? The original I'm looking at is https://github.com/rapidsai/pynvjitlink/blob/a2f23b7c3c237f2cdde3093c845e0453572503eb/pynvjitlink/patch.py#L170-L171

Collaborator

I realise now that this is handled by having LTOIR in the file extension map. See the comments below (I'll have to link them after posting the review, because the links don't exist before I post it).
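
To make that concrete, here is a minimal sketch of dispatching on the file suffix through a map that includes an LTOIR entry. The map contents, the string labels, and the `kind_for_path` helper are hypothetical; the real table in driver.py may use different keys and enum values.

```python
import pathlib

# Hypothetical suffix-to-kind table. The real one may differ; the labels
# here are plain strings chosen for illustration only.
FILE_EXTENSION_MAP = {
    ".o": "object",
    ".ptx": "ptx",
    ".a": "library",
    ".cubin": "cubin",
    ".fatbin": "fatbin",
    ".ltoir": "ltoir",
}


def kind_for_path(path):
    """Look up the input kind for a file from its suffix."""
    ext = pathlib.Path(path).suffix
    try:
        return FILE_EXTENSION_MAP[ext]
    except KeyError:
        raise ValueError(f"Unrecognised file extension: {ext!r}") from None
```

With `.ltoir` present in the map, no separate branch for LTOIR files is needed: `kind_for_path("kernel.ltoir")` goes through the same lookup as every other input.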

numba_cuda/numba/cuda/cudadrv/driver.py: 4 review threads (outdated, resolved)
numba_cuda/numba/cuda/cudadrv/enums.py: 1 review thread (outdated, resolved)
@brandon-b-miller
Collaborator Author

The tests now pass, modulo what appears to be an intermittent error on the conda pynvjitlink test job. I do not seem to have permissions to rerun this job.

@gmarkall
Collaborator

Running the testsuite on my local system seems to be deadlocking at:

test_lock (numba.cuda.tests.cudapy.test_dispatcher.TestDispatcher.test_lock)
Test that (lazy) compiling from several threads at once doesn't ...

when pynvjitlink is enabled:

ENABLE_PYNVJITLINK=1 python -m numba.runtests numba.cuda.tests -v -m

@gmarkall
Collaborator

Threads:

(gdb) info threads
  Id   Target Id                                            Frame 
* 1    Thread 0x70df86a90740 (LWP 128088) "python"          __futex_abstimed_wait_common64 (private=<optimised out>, 
    cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x70de98395330) at ./nptl/futex-internal.c:57
  2    Thread 0x70df83600640 (LWP 128089) "python"          __futex_abstimed_wait_common64 (private=0, cancel=true, 
    abstime=0x0, op=393, expected=0, futex_word=0x70df859ceb60 <thread_status+96>) at ./nptl/futex-internal.c:57
  3    Thread 0x70df82c00640 (LWP 128090) "python"          __futex_abstimed_wait_common64 (private=0, cancel=true, 
    abstime=0x0, op=393, expected=0, futex_word=0x70df859cebe0 <thread_status+224>) at ./nptl/futex-internal.c:57
  4    Thread 0x70df82200640 (LWP 128091) "python"          __futex_abstimed_wait_common64 (private=0, cancel=true, 
    abstime=0x0, op=393, expected=0, futex_word=0x70df859cec60 <thread_status+352>) at ./nptl/futex-internal.c:57
  5    Thread 0x70df81800640 (LWP 128092) "python"          __futex_abstimed_wait_common64 (private=0, cancel=true, 
    abstime=0x0, op=393, expected=0, futex_word=0x70df859cece0 <thread_status+480>) at ./nptl/futex-internal.c:57
  6    Thread 0x70df80e00640 (LWP 128093) "python"          __futex_abstimed_wait_common64 (private=0, cancel=true, 
    abstime=0x0, op=393, expected=0, futex_word=0x70df859ced60 <thread_status+608>) at ./nptl/futex-internal.c:57
  7    Thread 0x70df78400640 (LWP 128094) "python"          __futex_abstimed_wait_common64 (private=0, cancel=true, 
    abstime=0x0, op=393, expected=0, futex_word=0x70df859cede0 <thread_status+736>) at ./nptl/futex-internal.c:57
  8    Thread 0x70df77a00640 (LWP 128095) "python"          __futex_abstimed_wait_common64 (private=0, cancel=true, 
    abstime=0x0, op=393, expected=0, futex_word=0x70df859cee60 <thread_status+864>) at ./nptl/futex-internal.c:57
  9    Thread 0x70df67000640 (LWP 128096) "python"          __futex_abstimed_wait_common64 (private=0, cancel=true, 
    abstime=0x0, op=393, expected=0, futex_word=0x70df859ceee0 <thread_status+992>) at ./nptl/futex-internal.c:57
  10   Thread 0x70df66600640 (LWP 128097) "python"          __futex_abstimed_wait_common64 (private=0, cancel=true, 
    abstime=0x0, op=393, expected=0, futex_word=0x70df859cef60 <thread_status+1120>) at ./nptl/futex-internal.c:57
  11   Thread 0x70df45c00640 (LWP 128098) "python"          __futex_abstimed_wait_common64 (private=0, cancel=true, 
    abstime=0x0, op=393, expected=0, futex_word=0x70df859cefe0 <thread_status+1248>) at ./nptl/futex-internal.c:57
  12   Thread 0x70df45200640 (LWP 128099) "python"          __futex_abstimed_wait_common64 (private=0, cancel=true, 
    abstime=0x0, op=393, expected=0, futex_word=0x70df859cf060 <thread_status+1376>) at ./nptl/futex-internal.c:57
  13   Thread 0x70df34800640 (LWP 128100) "python"          __futex_abstimed_wait_common64 (private=0, cancel=true, 
    abstime=0x0, op=393, expected=0, futex_word=0x70df859cf0e0 <thread_status+1504>) at ./nptl/futex-internal.c:57
  14   Thread 0x70df33e00640 (LWP 128101) "python"          __futex_abstimed_wait_common64 (private=0, cancel=true, 
    abstime=0x0, op=393, expected=0, futex_word=0x70df859cf160 <thread_status+1632>) at ./nptl/futex-internal.c:57
  15   Thread 0x70df23400640 (LWP 128102) "python"          __futex_abstimed_wait_common64 (private=0, cancel=true, 
    abstime=0x0, op=393, expected=0, futex_word=0x70df859cf1e0 <thread_status+1760>) at ./nptl/futex-internal.c:57
  16   Thread 0x70df12a00640 (LWP 128103) "python"          __futex_abstimed_wait_common64 (private=0, cancel=true, 
    abstime=0x0, op=393, expected=0, futex_word=0x70df859cf260 <thread_status+1888>) at ./nptl/futex-internal.c:57
  17   Thread 0x70df0a000640 (LWP 128104) "python"          __futex_abstimed_wait_common64 (private=0, cancel=true, 
    abstime=0x0, op=393, expected=0, futex_word=0x70df859cf2e0 <thread_status+2016>) at ./nptl/futex-internal.c:57
  18   Thread 0x70df01600640 (LWP 128105) "python"          __futex_abstimed_wait_common64 (private=0, cancel=true, 
    abstime=0x0, op=393, expected=0, futex_word=0x70df859cf360 <thread_status+2144>) at ./nptl/futex-internal.c:57
  19   Thread 0x70def8c00640 (LWP 128106) "python"          __futex_abstimed_wait_common64 (private=0, cancel=true, 
    abstime=0x0, op=393, expected=0, futex_word=0x70df859cf3e0 <thread_status+2272>) at ./nptl/futex-internal.c:57
  20   Thread 0x70def0200640 (LWP 128107) "python"          __futex_abstimed_wait_common64 (private=0, cancel=true, 
    abstime=0x0, op=393, expected=0, futex_word=0x70df859cf460 <thread_status+2400>) at ./nptl/futex-internal.c:57
  21   Thread 0x70ded2200640 (LWP 128117) "cuda00001800007" 0x000070df86918bcf in __GI___poll (fds=0x60aeacfb6e10, 
    nfds=3, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
  22   Thread 0x70dec3800640 (LWP 129038) "python"          __futex_abstimed_wait_common64 (private=0, cancel=true, 
    abstime=0x0, op=393, expected=0, futex_word=0x60aead59e718) at ./nptl/futex-internal.c:57
  23   Thread 0x70debaa00640 (LWP 129039) "cuda-EvtHandlr"  0x000070df86918bcf in __GI___poll (fds=0x70debc001cf0, 
    nfds=10, timeout=100) at ../sysdeps/unix/sysv/linux/poll.c:29
  24   Thread 0x70dec4200640 (LWP 129045) "python"          futex_wait (private=0, expected=2, 
    futex_word=0x60aeaccc5d58) at ../sysdeps/nptl/futex-internal.h:146
  25   Thread 0x70debb400640 (LWP 129046) "python"          __futex_abstimed_wait_common64 (private=-1153439408, 
    cancel=true, abstime=0x70debb3fea70, op=137, expected=0, futex_word=0x60aeaaeac194 <_PyRuntime+436>)
    at ./nptl/futex-internal.c:57
  26   Thread 0x70deaec00640 (LWP 129047) "python"          __futex_abstimed_wait_common64 (private=<optimised out>, 
    cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x60aeacd6aea0) at ./nptl/futex-internal.c:57
  27   Thread 0x70dead600640 (LWP 129048) "python"          __futex_abstimed_wait_common64 (private=<optimised out>, 
    cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x60aeacd6aea0) at ./nptl/futex-internal.c:57
  28   Thread 0x70deacc00640 (LWP 129049) "python"          __futex_abstimed_wait_common64 (private=<optimised out>, 
    cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x60aeacd6aea0) at ./nptl/futex-internal.c:57
  29   Thread 0x70de9fe00640 (LWP 129050) "python"          __futex_abstimed_wait_common64 (private=<optimised out>, 
    cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x60aeacd6aea0) at ./nptl/futex-internal.c:57
  30   Thread 0x70de9f400640 (LWP 129051) "python"          __futex_abstimed_wait_common64 (private=<optimised out>, 
    cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x60aeacd6aea0) at ./nptl/futex-internal.c:57
  31   Thread 0x70de9ea00640 (LWP 129052) "python"          __futex_abstimed_wait_common64 (private=<optimised out>, 
    cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x60aeacd6aea0) at ./nptl/futex-internal.c:57
  32   Thread 0x70de9e000640 (LWP 129053) "python"          __futex_abstimed_wait_common64 (private=<optimised out>, 
    cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x60aeacd6aea0) at ./nptl/futex-internal.c:57
  33   Thread 0x70de9d600640 (LWP 129054) "python"          __futex_abstimed_wait_common64 (private=<optimised out>, 
    cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x60aeacd6aea0) at ./nptl/futex-internal.c:57
  34   Thread 0x70de9cc00640 (LWP 129055) "python"          __futex_abstimed_wait_common64 (private=<optimised out>, 
    cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x60aeacd6aea0) at ./nptl/futex-internal.c:57
  35   Thread 0x70de97e00640 (LWP 129056) "python"          __futex_abstimed_wait_common64 (private=<optimised out>, 
    cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x60aeacd6aea0) at ./nptl/futex-internal.c:57
  36   Thread 0x70de97400640 (LWP 129057) "python"          __futex_abstimed_wait_common64 (private=<optimised out>, 
    cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x60aeacd6aea0) at ./nptl/futex-internal.c:57
  37   Thread 0x70de96a00640 (LWP 129058) "python"          __futex_abstimed_wait_common64 (private=<optimised out>, 
    cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x60aeacd6aea0) at ./nptl/futex-internal.c:57
  38   Thread 0x70de93e00640 (LWP 129059) "python"          __futex_abstimed_wait_common64 (private=<optimised out>, 
    cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x60aeacd6aea0) at ./nptl/futex-internal.c:57
  39   Thread 0x70de93400640 (LWP 129060) "python"          __futex_abstimed_wait_common64 (private=<optimised out>, 
    cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x60aeacd6aea0) at ./nptl/futex-internal.c:57

@brandon-b-miller
Collaborator Author

Connected offline with @gmarkall, we concluded the above is probably an issue in nvJitLink/pynvjitlink rather than here.

@gmarkall
Collaborator

> Connected offline with @gmarkall, we concluded the above is probably an issue in nvJitLink/pynvjitlink rather than here.

It was a combination of mismatched versions (cudadevrt from 12.6 with nvJitLink from 12.5) and nvJitLink not handling this situation gracefully.
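
One way to fail early on this kind of mismatch, rather than hanging, would be a version guard along the lines of the sketch below. The helper names are hypothetical, and treating an exact major.minor match as the requirement is an assumption drawn only from this one report, not a documented nvJitLink rule.

```python
def parse_version(text):
    """Turn a string such as "12.6" or "12.6.20" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def check_matching_minor(cudadevrt_version, nvjitlink_version):
    """Raise if the two components differ in major.minor version.

    Assumption: a major.minor match is what is needed; this is inferred
    from the 12.6 / 12.5 case described above.
    """
    devrt = parse_version(cudadevrt_version)[:2]
    jitlink = parse_version(nvjitlink_version)[:2]
    if devrt != jitlink:
        raise RuntimeError(
            f"cudadevrt {cudadevrt_version} does not match "
            f"nvJitLink {nvjitlink_version}"
        )
```

Calling `check_matching_minor("12.6", "12.5")` raises a `RuntimeError`, which is easier to diagnose than a deadlocked test run.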

Collaborator

@gmarkall gmarkall left a comment

I think this is mergeable. There are some small fixups that I'd like, but I'd rather push them into a new PR than fiddle any more with this one, because it's been a heroic effort over a long period of time.

It should also be merged into main rather than develop, so I've rebased it and it's in #56 - over there I'm just waiting for CI, then will merge and follow up with the fixes.

@gmarkall added the labels "DO NOT MERGE" (Hold off on merging; see PR for details) and "develop" (A PR targeted at the develop branch that will need moving to main), and removed the label "4 - Waiting on reviewer" (Waiting for reviewer to respond to author) Oct 21, 2024
@gmarkall added this to the v0.0.18 milestone Oct 21, 2024
@gmarkall
Collaborator

Closing as #56 has just been merged.

@gmarkall gmarkall closed this Oct 21, 2024
gmarkall added a commit to gmarkall/numba-cuda that referenced this pull request Oct 22, 2024
- Update the codegen class docstring for LTO.
- Simplify / correct some logic in `_readenv()` (`value.lower()` could
  never be `"True"`, only `"true"`).
- Simplify additional flags and linker checks.
- Setting `self._linker.complete` in `complete()` is unnecessary, as
  calling `get_linked_cubin()` sets the link as complete already.
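
The `_readenv()` point in the commit message above can be illustrated with a small sketch of boolean environment parsing. The `read_bool_env` helper and its accepted values are hypothetical and are not the code from the commit; the sketch only shows why comparing a lowercased value against `"True"` can never succeed.

```python
import os


def read_bool_env(name, default=False, environ=None):
    """Parse an environment variable as a boolean (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    if value is None:
        return default
    # After .lower() the text can only ever be "true", never "True",
    # so the comparison is written against lowercase literals.
    return value.strip().lower() in ("1", "true")
```

Passing `environ` explicitly keeps the helper easy to exercise without touching the real process environment, for example `read_bool_env("ENABLE_PYNVJITLINK", environ={"ENABLE_PYNVJITLINK": "True"})`.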
gmarkall added a commit that referenced this pull request Oct 23, 2024
isVoid pushed a commit to isVoid/numba-cuda that referenced this pull request Oct 28, 2024