Bug in TestCuda_Other.cpp: most likely assembly inserted into Device code #515
This is the offending commit:
Reproduce on Kokkos-Dev:
|
@crtrott this looks like an x86 instruction and it doesn't appear there are any
Yeah, I know; this looks to me like a compiler issue ...
Also, if you leave SNB off the architecture setting, it works.
…evice Protecting the macros by __CUDA_ARCH__ makes them only come in when compiling the host phase. This fixes bug #515 which was caused by load_fence and store_fence introducing ASM code into the device phase.
So this was caused by insufficient protection of assembly code in load_fence and store_fence.
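For anyone hitting something similar, here is a minimal sketch of the kind of guard described above. It is not the actual Kokkos code: the names `EXAMPLE_HOST_X86_ASM`, `example_load_fence`, and `example_store_fence` are hypothetical, and the sketch assumes a GCC- or Clang-compatible host compiler on x86-64. The idea is that `__CUDA_ARCH__` is only defined while nvcc compiles the device phase, so checking for it keeps the x86 inline assembly out of device code.

```cpp
// Hypothetical illustration only; these are not the Kokkos fence functions.
// __CUDA_ARCH__ is defined only during nvcc's device compilation phase,
// so x86 inline assembly is enabled only when it is absent.
#if !defined(__CUDA_ARCH__) && defined(__x86_64__) && \
    (defined(__GNUC__) || defined(__clang__))
#define EXAMPLE_HOST_X86_ASM 1
#else
#define EXAMPLE_HOST_X86_ASM 0
#endif

// Returns true if the x86 fence instruction was emitted, false otherwise.
inline bool example_load_fence() {
#if EXAMPLE_HOST_X86_ASM
  // Host phase on x86-64: emit the load fence.
  __asm__ __volatile__("lfence" ::: "memory");
  return true;
#else
  // Device phase or non-x86 host: no x86 assembly is emitted here.
  // A real implementation would supply a suitable fence for that target.
  return false;
#endif
}

inline bool example_store_fence() {
#if EXAMPLE_HOST_X86_ASM
  // Host phase on x86-64: emit the store fence.
  __asm__ __volatile__("sfence" ::: "memory");
  return true;
#else
  return false;
#endif
}
```

With a guard like this, a function compiled for both host and device can call the fences without the device phase ever seeing `lfence` or `sfence`.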
This is from nightly testing with CUDA 8 on my machine. Reproduce:
SHA c11e14c is still OK. My guess is that it's the memory fence stuff in memory_pool.