GHC-8 problems #39
Comments
Reproduced on my machine (Mac OS X 10.11.5, CUDA 7.5.26, GHC-8.0.1). Seems to be fine with GHC-7.10.3 though. Hmm... |
Worked fine for me on an Ubuntu 12.04 box with GHC 8.0.1, so possibly confined to OS X. |
@mchakravarty @robeverest if you have a different configuration could you try this on your machine? |
Here is an interesting ticket which discusses This is just a hypothesis however, which I'm not yet sure how to test. |
So my OSX configuration is unfortunately the same as yours, but I can confirm I'm seeing the same bug. I did also try it out on Ubuntu 14.04 and it works as expected. |
nvidia-device-query dies on CUDA.initialise under (Ubuntu 16.04, GHC 8.0.1, cuda-7.5, nvidia-361) and Programs written on top of accelerate worked fine under |
@niobium0 Okay, thanks for the confirmation that this is not limited to macOS. |
See the description in 'init.c' for details of the problem. This trick works for compiled programs, but we still have problems with running under ghci. towards: #39
On initialisation, just reserve the memory block that will be required by the CUDA driver, and release it only once the user calls 'cuInit'. This still doesn't work with ghci, but feels like it is moving in the right direction. (Now, 'cuInit' crashes with 'SIGBUS' (macOS) or 'SIGSEGV' (Ubuntu), rather than giving the same "out of memory" error, even if we had already called 'cuInit' by the previous method via LD_PRELOAD/DYLD_INSERT_LIBRARIES before the RTS initialised.) towards: #39
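The reserve-then-release idea in the commit message above can be sketched in C roughly as follows. This is only an illustration under assumptions, not the contents of 'init.c': the names `reserve_region` and `release_region` are hypothetical, the address hint and size that the CUDA driver actually needs are not stated in this thread, and how the reservation gets to run before the GHC RTS starts is not shown.

```c
#define _DEFAULT_SOURCE          /* expose MAP_ANON on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical bookkeeping for the placeholder mapping. */
static void  *reserved_base = NULL;
static size_t reserved_size = 0;

/* Reserve `size` bytes of address space without committing any memory.
 * `hint` is a preferred start address (may be NULL); the kernel is free
 * to place the mapping elsewhere. Returns 0 on success, -1 on failure
 * or if a reservation is already held. */
int reserve_region(void *hint, size_t size)
{
    assert(size > 0);
    if (reserved_base != NULL)
        return -1;

    /* PROT_NONE: the range is claimed but cannot be read or written,
     * so other allocators in the process will not be handed it. */
    void *p = mmap(hint, size, PROT_NONE, MAP_PRIVATE | MAP_ANON, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    reserved_base = p;
    reserved_size = size;
    return 0;
}

/* Give the range back; intended to be called immediately before the
 * real 'cuInit' so the driver can claim the addresses it wants.
 * Returns 0 on success (or if nothing was reserved), -1 on failure. */
int release_region(void)
{
    if (reserved_base == NULL)
        return 0;

    int rc = munmap(reserved_base, reserved_size);
    reserved_base = NULL;
    reserved_size = 0;
    return rc;
}
```

In this sketch the caller would invoke `reserve_region` as early as possible in process start-up and `release_region` just before calling 'cuInit'. A `PROT_NONE` mapping is used because it holds address space without consuming physical memory.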
I will note that these workarounds probably aren't going to work on windows... : |
Trevor, thank you for the swift fix. Unfortunately I don't have a Windows machine at hand, but can verify that everything works as expected in my setup (Ubuntu 16.04, GHC 8.0.1, cuda-7.5, nvidia-367). |
Thanks for working on this, Trevor. Not sure if it's meant to be in a sufficiently stable state to build yet, so ignore this if premature. When I tried building on OSX 10.11 with GHC 8.0.1 and gcc Apple LLVM version 7.3.0 (clang-703.0.31), I ran into problems apparently due to dynamic linking: "ld: -rpath can only be used when creating a dynamic final linked image", for modules Foreign.CUDA.Analysis.Device and Foreign.CUDA.Types. It's unclear whether this is an issue with the modified Setup.hs, OSX generally, or just my particular toolset. |
Hello (again) Trevor :) I ran into the same issue under archlinux, both against CUDA 7 and 8. It seems this fix hasn't been released yet, any reason for that? It's keeping me from using ghc 8 which isn't that big of a deal but still a bit annoying =) |
@alpmestan sorry, just got back from conference travel and am catching up with things. The main problem is that I didn't yet get this to work under ghci. I guess having compiled programs working at least is a big plus, so I'll finalise and throw it up on hackage shortly. |
Thanks! It's indeed annoying that it doesn't work in ghci but is still OK. Does the patch going in ghc 8.0.2 fix the ghci issue or is that one not fixed at all? |
Thanks Trevor. I'm still getting a build problem on OSX, however:
[29 of 37] Compiling Foreign.CUDA.Runtime ( Foreign/CUDA/Runtime.hs, dist/build/Foreign/CUDA/Runtime.o )
Foreign/CUDA/Internal/C2HS.hs:202:3: warning: [-Winline-rule-shadowing]
Foreign/CUDA/Internal/C2HS.hs:204:3: warning: [-Winline-rule-shadowing]
This was from a fresh clone of the cuda repo. Are you able to build under OSX? If so, could you confirm your compiler version? I'm using the following:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Best regards, David Duke |
@alpmestan As far as I know a fix has been merged, so hopefully that will be in 8.0.2. If it doesn't make it to that release (or 8.0.2 doesn't come out for a while) I'll have another crack at trying to make it. |
@djduke working for me on OS X on both 7.10.3 and 8.0.1. I haven't upgraded to Sierra yet, I am still on El Capitan (10.11.6).
What is the
|
Hi Trevor, As far as I can see, my tool configuration matches yours. My c2hs was older (2015), I updated c2hs and tried again but the problem persists. Here is a log of building cuda from a fresh clone of your repo, along with version info for the tools. I also ran cabal build with verbose=3, and looked at the output of the final set of commands (output at the end). Regards,
Here is the most obviously relevant chunk of the output from cabal --verbose=3:
David Duke |
Following up on my previous mail: I wonder if the problem is related to this issue with Cabal: where dynamic linking was incorrectly turned on when executable profiling was selected. The issue was closed and the change was committed, but it is possibly masking a deeper inconsistency? I'm using Cabal-1.24.0.0, and suspect that, as you've been using ghc-8.0.1, you will be on the same version? David.
David Duke |
I just tried this with GHC HEAD and everything appears to work as expected in ghci. The RTS automatically avoids the region needed by CUDA, no need to specify that through RTS flags or otherwise (although, I'm not sure how large a region it avoids... if your total GPU+system RAM is very high maybe you will still need to specify the offset manually.)
|
That's good news — thanks for checking! |
Since GHC-8.0.2 is out now this is probably safe to close. |
Originally reported by David Duke.
While working with the Haskell Cuda library on OSX 10.11 I started getting a strange set of behaviours, and wondered if you had come across anything similar? I recently updated both my GHC installation (to 8.0.1) and my CUDA toolkit (to 7.5). I therefore wanted to update Accelerate etc., but noted that your Cuda package was only noted up to 7.0. As I don't believe there are substantial changes from 7.0 -> 7.5 I thought it should still work (and I need to have the later CUDA for work not involving Haskell).
However I found that Haskell code that called the Cuda library was aborting, and tracked the failure down to the call to cuInit (made through "initialise" in your library) returning error code 2 (CUDA_ERROR_OUT_OF_MEMORY). It's not clear why this should be happening, and to explore further I:
Given the simplicity of the two programs, I'm scratching my head for possible causes: when called from C, the wrapper is showing the correct arg and result; when called from Haskell it shows the correct arg but the wrong result! Here are the compiler invocations and runtime results (programs are attached):
I haven't had a chance to regress to ghc-7.10.3, and was also planning to try the code on linux once Cuda is reinstalled next week. Wondered if you had come across anything similar - or could check what happens on a different configuration?
Attachments: https://gist.github.com/tmcdonell/ee7c5183633a3687dafd15023f15a914