GHC-8 problems #39

Closed
tmcdonell opened this issue Jul 4, 2016 · 24 comments

@tmcdonell
Owner

Originally reported by David Duke.


While working with the Haskell Cuda library on OS X 10.11 I started getting a strange set of behaviours, and wondered if you had come across anything similar? I recently updated both my GHC installation (to 8.0.1) and my CUDA toolkit (to 7.5). I therefore wanted to update Accelerate etc., but noticed that your Cuda package was only listed as supporting up to 7.0. As I don't believe there are substantial changes from 7.0 -> 7.5 I thought it should still work (and I need to have the later CUDA for work not involving Haskell).

However, I found that Haskell code that called the Cuda library was aborting, and tracked the failure down to the call to cuInit (made through "initialise" in your library) returning error code 2 (CUDA_ERROR_OUT_OF_MEMORY). It's not clear why this should be happening, and to explore further I:

  1. created my own simple C wrapper function around cuInit, which displays the arg and result.
  2. wrote a C driver to call the wrapper; when executed cuInit is called and returns error code 0.
  3. wrote a Haskell driver to call the simple wrapper directly via FFI: now when the wrapper is executed cuInit returns error code 2.

Given the simplicity of the two programs, I'm scratching my head for possible causes: when called from C, the wrapper is showing the correct arg and result; when called from Haskell it shows the correct arg but the wrong result! Here are the compiler invocations and runtime results (programs are attached):

~> gcc -c -I /usr/local/cuda/include  cuwrap.c
~> ghc callFromHs.hs  cuwrap.o -L /usr/local/cuda/lib/ -lcuda
~> gcc -o callFromC callFromC.c cuwrap.o -L /usr/local/cuda/lib/ -lcuda
~> ./callFromC
Running main.
cuInit called: arg 0, result 0
Main completed, result 0

~> ./callFromHs
Running Main
cuInit called: arg 0, result 2
Main completed, result 2

I haven't had a chance to regress to ghc-7.10.3, and was also planning to try the code on linux once Cuda is reinstalled next week. Wondered if you had come across anything similar - or could check what happens on a different configuration?

Attachments: https://gist.github.com/tmcdonell/ee7c5183633a3687dafd15023f15a914
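
For readers who cannot open the gist, the wrapper from step 1 might look roughly like the sketch below. It is not the attached cuwrap.c. To keep it compilable without the CUDA toolkit, the initialiser is passed in as a function pointer, and the names init_fn, wrap_init and fake_init_ok are all hypothetical. With the toolkit installed, a small adapter around cuInit from <cuda.h> would presumably be passed instead.

```c
#include <assert.h>
#include <stdio.h>

/* Signature shared by the real initialiser and any stand-in used for testing.
 * Assumption: with the CUDA toolkit installed, a thin adapter around cuInit
 * (declared in <cuda.h>) would be passed here. */
typedef int (*init_fn)(unsigned int flags);

/* Call the initialiser, log the argument and the result, and hand the
 * result back unchanged so that C and Haskell callers see the same value. */
int wrap_init(init_fn init, unsigned int flags)
{
    assert(init != NULL);
    int result = init(flags);
    printf("cuInit called: arg %u, result %d\n", flags, result);
    fflush(stdout);  /* keep the line ordered relative to the caller's output */
    return result;
}

/* Hypothetical stand-in that always reports success (0). */
int fake_init_ok(unsigned int flags)
{
    (void)flags;
    return 0;
}
```

Calling wrap_init(fake_init_ok, 0) prints "cuInit called: arg 0, result 0" and returns 0, which matches the shape of the C driver's output above.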

@tmcdonell
Owner Author

Reproduced on my machine: Mac OS X 10.11.5, CUDA 7.5.26, GHC-8.0.1. Seems to be fine with GHC-7.10.3 though. Hmm...

@tmcdonell
Owner Author

Worked fine for me on an Ubuntu 12.04 box with GHC 8.0.1, so possibly confined to OS X.

@tmcdonell
Owner Author

@mchakravarty @robeverest if you have a different configuration could you try this on your machine?

@tmcdonell
Owner Author

Here is an interesting ticket which discusses cuInit failing due to trying to mmap a specific region. GHC-8's new memory allocator may be interfering with this.

This is just a hypothesis however, which I'm not yet sure how to test.
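
One way to probe this, sketched here under assumptions: ask mmap for a mapping at a given address without MAP_FIXED and check whether the kernel actually places it there. The function name range_is_free is hypothetical, and the address the CUDA driver wants is not stated in this thread, so the caller has to supply it. A result of 0 only suggests the range is occupied, because the kernel is free to ignore the hint for other reasons.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE  /* exposes MAP_ANONYMOUS under strict -std=c99 on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Returns 1 if a fresh mapping was placed exactly at `hint`,
 * 0 if the kernel chose a different address (the range is probably in use),
 * and -1 if mmap itself failed. `hint` should be page-aligned. */
int range_is_free(uintptr_t hint, size_t len)
{
    assert(len > 0);
    /* PROT_NONE + MAP_NORESERVE: claim address space only, no usable memory. */
    void *p = mmap((void *)hint, len, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    int honoured = ((uintptr_t)p == hint);
    munmap(p, len);  /* the probe must not leave the mapping behind */
    return honoured;
}
```

Running the probe from a plain C program and from a Haskell program (through the FFI) for the same address would show whether the Haskell runtime has already claimed that part of the address space.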

@robeverest

So my OSX configuration is unfortunately the same as yours, but I can confirm I'm seeing the same bug. I did also try it out on Ubuntu 14.04 and it works as expected.

@bakhtiyarneyman

bakhtiyarneyman commented Aug 3, 2016

nvidia-device-query dies on CUDA.initialise under

(Ubuntu 16.04, GHC 8.0.1, cuda-7.5, nvidia-361) and
(Ubuntu 16.04, GHC 8.0.1, cuda-8, nvidia-367).

Programs written on top of accelerate worked fine under
(Ubuntu 16.04, GHC 7.10.3, cuda-7.5, nvidia-361).

@tmcdonell
Owner Author

@niobium0 Okay, thanks for the confirmation that this is not limited to macOS.

@tmcdonell tmcdonell added the linux label Aug 3, 2016
tmcdonell added a commit that referenced this issue Aug 5, 2016
See the description in 'init.c' for details of the problem. This trick works for compiled programs, but we still have problems with running under ghci.

towards: #39
tmcdonell added a commit that referenced this issue Aug 9, 2016
On initialisation just reserve the memory block that will be required by the CUDA driver, and release it only once the user calls 'cuInit'.

This still doesn't work with ghci, but feels like it is moving in the right direction. (Now, 'cuInit' crashes with 'SIGBUS' (macos) or 'SIGSEGV' (ubuntu), rather than giving the same "out of memory error" even if we had already called 'cuInit' by the previous method via LD_PRELOAD/DYLD_INSERT_LIBRARIES before the RTS initialised.)

towards: #39
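
The reserve-then-release idea in this commit message can be sketched as below. This is not the code from the commit: the names reserve_early, release_then_init and stub_init are hypothetical, the range to reserve is left to the caller, and the initialiser is passed as a function pointer so that the sketch compiles without the CUDA toolkit.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE  /* exposes MAP_ANONYMOUS under strict -std=c99 on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* The range currently held on behalf of the driver (none at start-up). */
static void  *reserved_base = NULL;
static size_t reserved_len  = 0;

/* Step 1: claim the address range early so nothing else can map over it.
 * No memory is committed. Returns 0 on success, -1 on failure. */
int reserve_early(void *hint, size_t len)
{
    assert(len > 0);
    void *p = mmap(hint, len, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    reserved_base = p;
    reserved_len  = len;
    return 0;
}

/* Step 2: hand the range back immediately before initialising, so the
 * initialiser finds it unmapped, then return the initialiser's result. */
int release_then_init(int (*init)(unsigned int), unsigned int flags)
{
    if (reserved_base != NULL) {
        munmap(reserved_base, reserved_len);
        reserved_base = NULL;
        reserved_len  = 0;
    }
    return init(flags);
}

/* Hypothetical stand-in for the real initialiser; always reports success. */
int stub_init(unsigned int flags)
{
    (void)flags;
    return 0;
}
```

The design point is the ordering: the reservation has to happen before the other allocator claims the range, and the release has to happen as late as possible, just before the initialiser runs.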
@tmcdonell
Owner Author

tmcdonell commented Aug 9, 2016

I will note that these workarounds probably aren't going to work on Windows...

@bakhtiyarneyman

Trevor, thank you for the swift fix. Unfortunately I don't have a Windows machine at hand, but can verify that everything works as expected in my setup (Ubuntu 16.04, GHC 8.0.1, cuda-7.5, nvidia-367).

@djduke

djduke commented Aug 24, 2016

Thanks for working on this, Trevor. Not sure if it's meant to be in a sufficiently stable state to build yet, so ignore this if premature, but when I tried building on OS X 10.11 with GHC 8.0.1 and gcc (Apple LLVM version 7.3.0, clang-703.0.31) I ran into problems, apparently due to dynamic linking:

ld: -rpath can only be used when creating a dynamic final linked image

for modules Foreign.CUDA.Analysis.Device and Foreign.CUDA.Types. Unclear if it's an issue with the modified Setup.hs, OS X generally, or just my particular toolset.

@tmcdonell
Owner Author

@alpmestan

Hello (again) Trevor :)

I ran into the same issue under archlinux, both against CUDA 7 and 8. It seems this fix hasn't been released yet, any reason for that? It's keeping me from using ghc 8 which isn't that big of a deal but still a bit annoying =)

@tmcdonell
Owner Author

@alpmestan sorry, just got back from conference travel and am catching up with things. The main problem is that I didn't yet get this to work under ghci. I guess having compiled programs working at least is a big plus, so I'll finalise and throw it up on hackage shortly.

@tmcdonell
Owner Author

@alpmestan

alpmestan commented Oct 7, 2016

Thanks! It's indeed annoying that it doesn't work in ghci, but it's still OK. Does the patch going into ghc 8.0.2 fix the ghci issue, or is that one not fixed at all?

@djduke

djduke commented Oct 7, 2016

On 7 Oct 2016, at 09:33, Trevor L. McDonell [email protected] wrote:

https://hackage.haskell.org/package/cuda-0.7.5.0



Thanks Trevor.

I'm still getting a build problem on OSX, however:

[29 of 37] Compiling Foreign.CUDA.Runtime ( Foreign/CUDA/Runtime.hs, dist/build/Foreign/CUDA/Runtime.o )
[30 of 37] Compiling Foreign.CUDA.Runtime.Texture ( dist/build/Foreign/CUDA/Runtime/Texture.hs, dist/build/Foreign/CUDA/Runtime/Texture.o )
[31 of 37] Compiling Foreign.CUDA.Driver.Marshal ( dist/build/Foreign/CUDA/Driver/Marshal.hs, dist/build/Foreign/CUDA/Driver/Marshal.o )
[32 of 37] Compiling Foreign.CUDA.Driver.IPC.Marshal ( dist/build/Foreign/CUDA/Driver/IPC/Marshal.hs, dist/build/Foreign/CUDA/Driver/IPC/Marshal.o )
[33 of 37] Compiling Foreign.CUDA.Driver.Texture ( dist/build/Foreign/CUDA/Driver/Texture.hs, dist/build/Foreign/CUDA/Driver/Texture.o )
[34 of 37] Compiling Foreign.CUDA.Driver.Module.Query ( dist/build/Foreign/CUDA/Driver/Module/Query.hs, dist/build/Foreign/CUDA/Driver/Module/Query.o )
[35 of 37] Compiling Foreign.CUDA.Driver.Module ( Foreign/CUDA/Driver/Module.hs, dist/build/Foreign/CUDA/Driver/Module.o )
[36 of 37] Compiling Foreign.CUDA.Driver ( Foreign/CUDA/Driver.hs, dist/build/Foreign/CUDA/Driver.o )
[37 of 37] Compiling Foreign.CUDA ( Foreign/CUDA.hs, dist/build/Foreign/CUDA.o )
[ 1 of 37] Compiling Foreign.CUDA.Internal.C2HS ( Foreign/CUDA/Internal/C2HS.hs, dist/build/Foreign/CUDA/Internal/C2HS.p_o )

Foreign/CUDA/Internal/C2HS.hs:202:3: warning: [-Winline-rule-shadowing]
Rule "cFloatConv/CFloat->Float" may never fire
because ‘Foreign.C.Types.CFloat’ might inline first
Probable fix: add an INLINE[n] or NOINLINE[n] pragma for ‘Foreign.C.Types.CFloat’

Foreign/CUDA/Internal/C2HS.hs:204:3: warning: [-Winline-rule-shadowing]
Rule "cFloatConv/CDouble->Double" may never fire
because ‘Foreign.C.Types.CDouble’ might inline first
Probable fix: add an INLINE[n] or NOINLINE[n] pragma for ‘Foreign.C.Types.CDouble’
ld: -rpath can only be used when creating a dynamic final linked image
clang: error: linker command failed with exit code 1 (use -v to see invocation)
`gcc' failed in phase `Linker'. (Exit code: 1)
cabal: Leaving directory '.'
cabal: Error: some packages failed to install:
cuda-0.7.5.0 failed during the building phase. The exception was:
ExitFailure 1

This was from a fresh clone of the cuda repo. Are you able to build under OS X? If so, could you confirm your compiler version? I'm using the following:

Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.3.0 (clang-703.0.31)
Target: x86_64-apple-darwin15.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

Best regards,
David.



@tmcdonell
Owner Author

@alpmestan As far as I know a fix has been merged, so hopefully that will be in 8.0.2. If it doesn't make it to that release (or 8.0.2 doesn't come out for a while) I'll have another crack at trying to make it.

@tmcdonell
Owner Author

@djduke working for me on OS X on both 7.10.3 and 8.0.1. I haven't upgraded to Sierra yet, I am still on El Capitan (10.11.6).

> gcc --version
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.3.0 (clang-703.0.31)
Target: x86_64-apple-darwin15.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

> clang --version
Apple LLVM version 7.3.0 (clang-703.0.31)
Target: x86_64-apple-darwin15.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

> c2hs --version
C->Haskell Compiler, version 0.28.1 Switcheroo, 1 April 2016
  build platform is "x86_64-darwin" <1, True, True, 1>

What is in your cuda.buildinfo[.generated] file? I have -rpath options in there with no problem:

buildable: True
cc-options: -I/usr/local/cuda/include
ld-options: -L/usr/local/cuda/lib
frameworks: CUDA
extra-libraries:
    cudadevrt
    cudart_static
extra-ghci-libraries: cudart
extra-lib-dirs: /usr/local/cuda/lib
ghc-options: -optc-I/usr/local/cuda/include -optl-L/usr/local/cuda/lib -optl-Wl,-rpath,/usr/local/cuda/lib
x-extra-c2hs-options: --cppopts=-E --cppopts=-m64 --cppopts=-DUSE_EMPTY_CASE --cppopts=-U__BLOCKS__

@djduke

djduke commented Oct 8, 2016

Hi Trevor,

As far as I can see, my tool configuration matches yours. My c2hs was older (2015); I updated it and tried again, but the problem persists. Here is a log of building cuda from a fresh clone of your repo, along with version info for the tools. I also ran cabal build with --verbose=3, and looked at the output of the final set of commands (output at the end).

Regards,
David.

scsdjd:GitRepos> git clone https://github.com/tmcdonell/cuda
Cloning into 'cuda'...
remote: Counting objects: 3661, done.
remote: Compressing objects: 100% (50/50), done.
remote: Total 3661 (delta 22), reused 0 (delta 0), pack-reused 3611
Receiving objects: 100% (3661/3661), 1.68 MiB | 249.00 KiB/s, done.
Resolving deltas: 100% (1954/1954), done.
Checking connectivity... done.

scsdjd:GitRepos> ghc --version
The Glorious Glasgow Haskell Compilation System, version 8.0.1

scsdjd:GitRepos> gcc --version
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.3.0 (clang-703.0.31)
Target: x86_64-apple-darwin15.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

scsdjd:GitRepos> clang --version
Apple LLVM version 7.3.0 (clang-703.0.31)
Target: x86_64-apple-darwin15.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

scsdjd:GitRepos> c2hs --version
C->Haskell Compiler, version 0.28.1 Switcheroo, 1 April 2016
  build platform is "x86_64-darwin" <1, True, True, 1>

scsdjd:cuda> cabal configure
Resolving dependencies...
[1 of 1] Compiling Main             ( dist/setup/setup.hs, dist/setup/Main.o )
Linking ./dist/setup/setup ...
Configuring cuda-0.7.5.0...
Found CUDA toolkit at: /usr/local/cuda
Storing parameters to cuda.buildinfo.generated
Using build information from 'cuda.buildinfo.generated'.
Provide a 'cuda.buildinfo' file to override this behaviour.

scsdjd:cuda> cabal build
Using build information from 'cuda.buildinfo.generated'.
Provide a 'cuda.buildinfo' file to override this behaviour.
Building cuda-0.7.5.0...
Preprocessing library cuda-0.7.5.0...
[ 1 of 37] Compiling Foreign.CUDA.Internal.C2HS ( Foreign/CUDA/Internal/C2HS.hs, dist/build/Foreign/CUDA/Internal/C2HS.o )

Foreign/CUDA/Internal/C2HS.hs:202:3: warning: [-Winline-rule-shadowing]
    Rule "cFloatConv/CFloat->Float" may never fire
      because ‘Foreign.C.Types.CFloat’ might inline first
    Probable fix: add an INLINE[n] or NOINLINE[n] pragma for ‘Foreign.C.Types.CFloat’

Foreign/CUDA/Internal/C2HS.hs:204:3: warning: [-Winline-rule-shadowing]
    Rule "cFloatConv/CDouble->Double" may never fire
      because ‘Foreign.C.Types.CDouble’ might inline first
    Probable fix: add an INLINE[n] or NOINLINE[n] pragma for ‘Foreign.C.Types.CDouble’
[ 2 of 37] Compiling Foreign.CUDA.Driver.Error ( dist/build/Foreign/CUDA/Driver/Error.hs, dist/build/Foreign/CUDA/Driver/Error.o )
[ 3 of 37] Compiling Foreign.CUDA.Driver.Profiler ( dist/build/Foreign/CUDA/Driver/Profiler.hs, dist/build/Foreign/CUDA/Driver/Profiler.o )
[ 4 of 37] Compiling Foreign.CUDA.Driver.Utils ( dist/build/Foreign/CUDA/Driver/Utils.hs, dist/build/Foreign/CUDA/Driver/Utils.o )
[ 5 of 37] Compiling Foreign.CUDA.Runtime.Error ( dist/build/Foreign/CUDA/Runtime/Error.hs, dist/build/Foreign/CUDA/Runtime/Error.o )
[ 6 of 37] Compiling Foreign.CUDA.Runtime.Utils ( dist/build/Foreign/CUDA/Runtime/Utils.hs, dist/build/Foreign/CUDA/Runtime/Utils.o )
[ 7 of 37] Compiling Foreign.CUDA.Analysis.Device ( dist/build/Foreign/CUDA/Analysis/Device.hs, dist/build/Foreign/CUDA/Analysis/Device.o )
[ 8 of 37] Compiling Foreign.CUDA.Analysis.Occupancy ( Foreign/CUDA/Analysis/Occupancy.hs, dist/build/Foreign/CUDA/Analysis/Occupancy.o )
[ 9 of 37] Compiling Foreign.CUDA.Runtime.Device ( dist/build/Foreign/CUDA/Runtime/Device.hs, dist/build/Foreign/CUDA/Runtime/Device.o )
[10 of 37] Compiling Foreign.CUDA.Driver.Device ( dist/build/Foreign/CUDA/Driver/Device.hs, dist/build/Foreign/CUDA/Driver/Device.o )
[11 of 37] Compiling Foreign.CUDA.Driver.Context.Base ( dist/build/Foreign/CUDA/Driver/Context/Base.hs, dist/build/Foreign/CUDA/Driver/Context/Base.o )
[12 of 37] Compiling Foreign.CUDA.Driver.Context.Peer ( dist/build/Foreign/CUDA/Driver/Context/Peer.hs, dist/build/Foreign/CUDA/Driver/Context/Peer.o )
[13 of 37] Compiling Foreign.CUDA.Driver.Context.Primary ( dist/build/Foreign/CUDA/Driver/Context/Primary.hs, dist/build/Foreign/CUDA/Driver/Context/Primary.o )
[14 of 37] Compiling Foreign.CUDA.Driver.Module.Base ( dist/build/Foreign/CUDA/Driver/Module/Base.hs, dist/build/Foreign/CUDA/Driver/Module/Base.o )
[15 of 37] Compiling Foreign.CUDA.Driver.Module.Link ( dist/build/Foreign/CUDA/Driver/Module/Link.hs, dist/build/Foreign/CUDA/Driver/Module/Link.o )
[16 of 37] Compiling Foreign.CUDA.Analysis ( Foreign/CUDA/Analysis.hs, dist/build/Foreign/CUDA/Analysis.o )
[17 of 37] Compiling Foreign.CUDA.Types ( dist/build/Foreign/CUDA/Types.hs, dist/build/Foreign/CUDA/Types.o )
[18 of 37] Compiling Foreign.CUDA.Runtime.Event ( dist/build/Foreign/CUDA/Runtime/Event.hs, dist/build/Foreign/CUDA/Runtime/Event.o )
[19 of 37] Compiling Foreign.CUDA.Runtime.Stream ( dist/build/Foreign/CUDA/Runtime/Stream.hs, dist/build/Foreign/CUDA/Runtime/Stream.o )
[20 of 37] Compiling Foreign.CUDA.Runtime.Exec ( dist/build/Foreign/CUDA/Runtime/Exec.hs, dist/build/Foreign/CUDA/Runtime/Exec.o )
[21 of 37] Compiling Foreign.CUDA.Driver.Context.Config ( dist/build/Foreign/CUDA/Driver/Context/Config.hs, dist/build/Foreign/CUDA/Driver/Context/Config.o )
[22 of 37] Compiling Foreign.CUDA.Driver.Context ( Foreign/CUDA/Driver/Context.hs, dist/build/Foreign/CUDA/Driver/Context.o )
[23 of 37] Compiling Foreign.CUDA.Driver.Event ( dist/build/Foreign/CUDA/Driver/Event.hs, dist/build/Foreign/CUDA/Driver/Event.o )
[24 of 37] Compiling Foreign.CUDA.Driver.IPC.Event ( dist/build/Foreign/CUDA/Driver/IPC/Event.hs, dist/build/Foreign/CUDA/Driver/IPC/Event.o )
[25 of 37] Compiling Foreign.CUDA.Driver.Stream ( dist/build/Foreign/CUDA/Driver/Stream.hs, dist/build/Foreign/CUDA/Driver/Stream.o )
[26 of 37] Compiling Foreign.CUDA.Driver.Exec ( dist/build/Foreign/CUDA/Driver/Exec.hs, dist/build/Foreign/CUDA/Driver/Exec.o )

Foreign/CUDA/Driver/Exec.chs:373:1: warning: [-Wredundant-constraints]
    • Redundant constraint: Storable a
    • In the type signature for:
           cuParamSetv :: Storable a =>
                          Fun -> Int -> Ptr a -> Int -> IO Status
[27 of 37] Compiling Foreign.CUDA.Ptr ( Foreign/CUDA/Ptr.hs, dist/build/Foreign/CUDA/Ptr.o )
[28 of 37] Compiling Foreign.CUDA.Runtime.Marshal ( dist/build/Foreign/CUDA/Runtime/Marshal.hs, dist/build/Foreign/CUDA/Runtime/Marshal.o )
[29 of 37] Compiling Foreign.CUDA.Runtime ( Foreign/CUDA/Runtime.hs, dist/build/Foreign/CUDA/Runtime.o )
[30 of 37] Compiling Foreign.CUDA.Runtime.Texture ( dist/build/Foreign/CUDA/Runtime/Texture.hs, dist/build/Foreign/CUDA/Runtime/Texture.o )
[31 of 37] Compiling Foreign.CUDA.Driver.Marshal ( dist/build/Foreign/CUDA/Driver/Marshal.hs, dist/build/Foreign/CUDA/Driver/Marshal.o )
[32 of 37] Compiling Foreign.CUDA.Driver.IPC.Marshal ( dist/build/Foreign/CUDA/Driver/IPC/Marshal.hs, dist/build/Foreign/CUDA/Driver/IPC/Marshal.o )
[33 of 37] Compiling Foreign.CUDA.Driver.Texture ( dist/build/Foreign/CUDA/Driver/Texture.hs, dist/build/Foreign/CUDA/Driver/Texture.o )
[34 of 37] Compiling Foreign.CUDA.Driver.Module.Query ( dist/build/Foreign/CUDA/Driver/Module/Query.hs, dist/build/Foreign/CUDA/Driver/Module/Query.o )
[35 of 37] Compiling Foreign.CUDA.Driver.Module ( Foreign/CUDA/Driver/Module.hs, dist/build/Foreign/CUDA/Driver/Module.o )
[36 of 37] Compiling Foreign.CUDA.Driver ( Foreign/CUDA/Driver.hs, dist/build/Foreign/CUDA/Driver.o )
[37 of 37] Compiling Foreign.CUDA     ( Foreign/CUDA.hs, dist/build/Foreign/CUDA.o )
[ 1 of 37] Compiling Foreign.CUDA.Internal.C2HS ( Foreign/CUDA/Internal/C2HS.hs, dist/build/Foreign/CUDA/Internal/C2HS.p_o )

Foreign/CUDA/Internal/C2HS.hs:202:3: warning: [-Winline-rule-shadowing]
    Rule "cFloatConv/CFloat->Float" may never fire
      because ‘Foreign.C.Types.CFloat’ might inline first
    Probable fix: add an INLINE[n] or NOINLINE[n] pragma for ‘Foreign.C.Types.CFloat’

Foreign/CUDA/Internal/C2HS.hs:204:3: warning: [-Winline-rule-shadowing]
    Rule "cFloatConv/CDouble->Double" may never fire
      because ‘Foreign.C.Types.CDouble’ might inline first
    Probable fix: add an INLINE[n] or NOINLINE[n] pragma for ‘Foreign.C.Types.CDouble’
ld: -rpath can only be used when creating a dynamic final linked image
clang: error: linker command failed with exit code 1 (use -v to see invocation)

<no location info>: error:
    `gcc' failed in phase `Linker'. (Exit code: 1)
[ 7 of 37] Compiling Foreign.CUDA.Analysis.Device ( dist/build/Foreign/CUDA/Analysis/Device.hs, dist/build/Foreign/CUDA/Analysis/Device.p_o )
ld: -rpath can only be used when creating a dynamic final linked image
clang: error: linker command failed with exit code 1 (use -v to see invocation)

<no location info>: error:
    `gcc' failed in phase `Linker'. (Exit code: 1)
[17 of 37] Compiling Foreign.CUDA.Types ( dist/build/Foreign/CUDA/Types.hs, dist/build/Foreign/CUDA/Types.p_o )
ld: -rpath can only be used when creating a dynamic final linked image
clang: error: linker command failed with exit code 1 (use -v to see invocation)

<no location info>: error:
    `gcc' failed in phase `Linker'. (Exit code: 1)

scsdjd:cuda> which ld
/usr/bin//ld

scsdjd:cuda> ld -v
@(#)PROGRAM:ld  PROJECT:ld64-264.3.102
configured to support archs: armv6 armv7 armv7s arm64 i386 x86_64 x86_64h armv6m armv7k armv7m armv7em (tvOS)
LTO support using: LLVM version 7.3.0

scsdjd:cuda> cat cuda.buildinfo.generated 
buildable: True
cc-options: -I/usr/local/cuda/include
ld-options: -L/usr/local/cuda/lib
frameworks: CUDA
extra-libraries:
    cudadevrt
    cudart_static
extra-ghci-libraries: cudart
extra-lib-dirs: /usr/local/cuda/lib
ghc-options: -optc-I/usr/local/cuda/include -optl-L/usr/local/cuda/lib -optl-Wl,-rpath,/usr/local/cuda/lib
x-extra-c2hs-options: --cppopts=-E --cppopts=-m64 --cppopts=-DUSE_EMPTY_CASE --cppopts=-U__BLOCKS__

Here is the most obviously relevant chunk of the output from cabal --verbose=3:

*** C Compiler:
/usr/bin//gcc -m64 -fno-stack-protector -DTABLES_NEXT_TO_CODE -I/usr/local/cuda/include -DPROFILING -x c /var/folders/pm/f2z9xj_s6bg5nvdlfp8_dqg00000gr/T/ghc89576_0/ghc_49.c -o /var/folders/pm/f2z9xj_s6bg5nvdlfp8_dqg00000gr/T/ghc89576_0/ghc_54.s -fno-common -U__PIC__ -D__PIC__ -Wimplicit -S -O2 '-D__GLASGOW_HASKELL__=800' -include /usr/local/lib/ghc-8.0.1/include/ghcversion.h -Idist/build/Foreign/CUDA/Analysis -Idist/build -Idist/build -Idist/build/autogen -Idist/build -I. -I/usr/local/lib/ghc-8.0.1/bytestring-0.10.8.1/include -I/opt/local/include/ -I/usr/local/lib/ghc-8.0.1/base-4.9.0.0/include -I/usr/local/lib/ghc-8.0.1/integer-gmp-1.0.0.1/include -I/usr/local/lib/ghc-8.0.1/include
*** Assembler:
/usr/bin//gcc -m64 -fno-stack-protector -DTABLES_NEXT_TO_CODE -Idist/build/Foreign/CUDA/Analysis -Idist/build -Idist/build -Idist/build/autogen -Idist/build -I. -fno-common -U__PIC__ -D__PIC__ -Qunused-arguments -x assembler -c /var/folders/pm/f2z9xj_s6bg5nvdlfp8_dqg00000gr/T/ghc89576_0/ghc_54.s -o /var/folders/pm/f2z9xj_s6bg5nvdlfp8_dqg00000gr/T/ghc89576_0/ghc_56.p_o
*** Assembler:
/usr/bin//gcc -m64 -fno-stack-protector -DTABLES_NEXT_TO_CODE -Idist/build/Foreign/CUDA/Analysis -Idist/build -Idist/build -Idist/build/autogen -Idist/build -I. -fno-common -U__PIC__ -D__PIC__ -Qunused-arguments -x assembler -c /var/folders/pm/f2z9xj_s6bg5nvdlfp8_dqg00000gr/T/ghc89576_0/ghc_48.s -o /var/folders/pm/f2z9xj_s6bg5nvdlfp8_dqg00000gr/T/ghc89576_0/ghc_57.p_o
*** Linker:
/usr/bin//gcc -m64 -fno-stack-protector -DTABLES_NEXT_TO_CODE -m64 -L/usr/local/cuda/lib -Wl,-rpath,/usr/local/cuda/lib -nostdlib -Wl,-r -o dist/build/Foreign/CUDA/Analysis/Device.p_o -Wl,-filelist -Wl,/var/folders/pm/f2z9xj_s6bg5nvdlfp8_dqg00000gr/T/ghc89576_0/ghc_58.filelist
ld: -rpath can only be used when creating a dynamic final linked image
clang: error: linker command failed with exit code 1 (use -v to see invocation)

<no location info>: error:
    `gcc' failed in phase `Linker'. (Exit code: 1)

@djduke

djduke commented Oct 8, 2016

Following up on my previous mail: I wonder if the problem is related to this issue with Cabal:

haskell/cabal#2766

where dynamic linking was incorrectly turned on when executable profiling was selected. The issue was closed and the change was committed, but it is possibly masking a deeper inconsistency? I'm using Cabal-1.24.0.0, and I suspect that, as you've been using ghc-8.0.1, you will be on the same version?

David.

On 8 Oct 2016, at 20:38, David Duke [email protected] wrote:

Hi Trevor,

As far as I can see, my tool configuration matches yours. My c2hs was older (2015), I updated c2hs and tried again but the problem persists. Here is a log of building cuda from a fresh clone of your repo, along with version info for the tools. I also ran cabal build with verbose=3, and looked at the output of the final set of commands (output at the end).

Regards,
David.

scsdjd:GitRepos> git clone https://github.com/tmcdonell/cuda
Cloning into 'cuda'...
remote: Counting objects: 3661, done.
remote: Compressing objects: 100% (50/50), done.
remote: Total 3661 (delta 22), reused 0 (delta 0), pack-reused 3611
Receiving objects: 100% (3661/3661), 1.68 MiB | 249.00 KiB/s, done.
Resolving deltas: 100% (1954/1954), done.
Checking connectivity... done.

scsdjd:GitRepos> ghc --version
The Glorious Glasgow Haskell Compilation System, version 8.0.1

scsdjd:GitRepos> gcc --version
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.3.0 (clang-703.0.31)
Target: x86_64-apple-darwin15.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

scsdjd:GitRepos> clang --version
Apple LLVM version 7.3.0 (clang-703.0.31)
Target: x86_64-apple-darwin15.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

scsdjd:GitRepos> c2hs --version
C->Haskell Compiler, version 0.28.1 Switcheroo, 1 April 2016
build platform is "x86_64-darwin" <1, True, True, 1>

scsdjd:cuda> cabal configure
Resolving dependencies...
[1 of 1] Compiling Main ( dist/setup/setup.hs, dist/setup/Main.o )
Linking ./dist/setup/setup ...
Configuring cuda-0.7.5.0...
Found CUDA toolkit at: /usr/local/cuda
Storing parameters to cuda.buildinfo.generated
Using build information from 'cuda.buildinfo.generated'.
Provide a 'cuda.buildinfo' file to override this behaviour.

scsdjd:cuda> cabal build
Using build information from 'cuda.buildinfo.generated'.
Provide a 'cuda.buildinfo' file to override this behaviour.
Building cuda-0.7.5.0...
Preprocessing library cuda-0.7.5.0...
[ 1 of 37] Compiling Foreign.CUDA.Internal.C2HS ( Foreign/CUDA/Internal/C2HS.hs, dist/build/Foreign/CUDA/Internal/C2HS.o )

Foreign/CUDA/Internal/C2HS.hs:202:3: warning: [-Winline-rule-shadowing]
Rule "cFloatConv/CFloat->Float" may never fire
because ‘Foreign.C.Types.CFloat’ might inline first
Probable fix: add an INLINE[n] or NOINLINE[n] pragma for ‘Foreign.C.Types.CFloat’

Foreign/CUDA/Internal/C2HS.hs:204:3: warning: [-Winline-rule-shadowing]
Rule "cFloatConv/CDouble->Double" may never fire
because ‘Foreign.C.Types.CDouble’ might inline first
Probable fix: add an INLINE[n] or NOINLINE[n] pragma for ‘Foreign.C.Types.CDouble’
[ 2 of 37] Compiling Foreign.CUDA.Driver.Error ( dist/build/Foreign/CUDA/Driver/Error.hs, dist/build/Foreign/CUDA/Driver/Error.o )
[ 3 of 37] Compiling Foreign.CUDA.Driver.Profiler ( dist/build/Foreign/CUDA/Driver/Profiler.hs, dist/build/Foreign/CUDA/Driver/Profiler.o )
[ 4 of 37] Compiling Foreign.CUDA.Driver.Utils ( dist/build/Foreign/CUDA/Driver/Utils.hs, dist/build/Foreign/CUDA/Driver/Utils.o )
[ 5 of 37] Compiling Foreign.CUDA.Runtime.Error ( dist/build/Foreign/CUDA/Runtime/Error.hs, dist/build/Foreign/CUDA/Runtime/Error.o )
[ 6 of 37] Compiling Foreign.CUDA.Runtime.Utils ( dist/build/Foreign/CUDA/Runtime/Utils.hs, dist/build/Foreign/CUDA/Runtime/Utils.o )
[ 7 of 37] Compiling Foreign.CUDA.Analysis.Device ( dist/build/Foreign/CUDA/Analysis/Device.hs, dist/build/Foreign/CUDA/Analysis/Device.o )
[ 8 of 37] Compiling Foreign.CUDA.Analysis.Occupancy ( Foreign/CUDA/Analysis/Occupancy.hs, dist/build/Foreign/CUDA/Analysis/Occupancy.o )
[ 9 of 37] Compiling Foreign.CUDA.Runtime.Device ( dist/build/Foreign/CUDA/Runtime/Device.hs, dist/build/Foreign/CUDA/Runtime/Device.o )
[10 of 37] Compiling Foreign.CUDA.Driver.Device ( dist/build/Foreign/CUDA/Driver/Device.hs, dist/build/Foreign/CUDA/Driver/Device.o )
[11 of 37] Compiling Foreign.CUDA.Driver.Context.Base ( dist/build/Foreign/CUDA/Driver/Context/Base.hs, dist/build/Foreign/CUDA/Driver/Context/Base.o )
[12 of 37] Compiling Foreign.CUDA.Driver.Context.Peer ( dist/build/Foreign/CUDA/Driver/Context/Peer.hs, dist/build/Foreign/CUDA/Driver/Context/Peer.o )
[13 of 37] Compiling Foreign.CUDA.Driver.Context.Primary ( dist/build/Foreign/CUDA/Driver/Context/Primary.hs, dist/build/Foreign/CUDA/Driver/Context/Primary.o )
[14 of 37] Compiling Foreign.CUDA.Driver.Module.Base ( dist/build/Foreign/CUDA/Driver/Module/Base.hs, dist/build/Foreign/CUDA/Driver/Module/Base.o )
[15 of 37] Compiling Foreign.CUDA.Driver.Module.Link ( dist/build/Foreign/CUDA/Driver/Module/Link.hs, dist/build/Foreign/CUDA/Driver/Module/Link.o )
[16 of 37] Compiling Foreign.CUDA.Analysis ( Foreign/CUDA/Analysis.hs, dist/build/Foreign/CUDA/Analysis.o )
[17 of 37] Compiling Foreign.CUDA.Types ( dist/build/Foreign/CUDA/Types.hs, dist/build/Foreign/CUDA/Types.o )
[18 of 37] Compiling Foreign.CUDA.Runtime.Event ( dist/build/Foreign/CUDA/Runtime/Event.hs, dist/build/Foreign/CUDA/Runtime/Event.o )
[19 of 37] Compiling Foreign.CUDA.Runtime.Stream ( dist/build/Foreign/CUDA/Runtime/Stream.hs, dist/build/Foreign/CUDA/Runtime/Stream.o )
[20 of 37] Compiling Foreign.CUDA.Runtime.Exec ( dist/build/Foreign/CUDA/Runtime/Exec.hs, dist/build/Foreign/CUDA/Runtime/Exec.o )
[21 of 37] Compiling Foreign.CUDA.Driver.Context.Config ( dist/build/Foreign/CUDA/Driver/Context/Config.hs, dist/build/Foreign/CUDA/Driver/Context/Config.o )
[22 of 37] Compiling Foreign.CUDA.Driver.Context ( Foreign/CUDA/Driver/Context.hs, dist/build/Foreign/CUDA/Driver/Context.o )
[23 of 37] Compiling Foreign.CUDA.Driver.Event ( dist/build/Foreign/CUDA/Driver/Event.hs, dist/build/Foreign/CUDA/Driver/Event.o )
[24 of 37] Compiling Foreign.CUDA.Driver.IPC.Event ( dist/build/Foreign/CUDA/Driver/IPC/Event.hs, dist/build/Foreign/CUDA/Driver/IPC/Event.o )
[25 of 37] Compiling Foreign.CUDA.Driver.Stream ( dist/build/Foreign/CUDA/Driver/Stream.hs, dist/build/Foreign/CUDA/Driver/Stream.o )
[26 of 37] Compiling Foreign.CUDA.Driver.Exec ( dist/build/Foreign/CUDA/Driver/Exec.hs, dist/build/Foreign/CUDA/Driver/Exec.o )

Foreign/CUDA/Driver/Exec.chs:373:1: warning: [-Wredundant-constraints]
    • Redundant constraint: Storable a
    • In the type signature for:
           cuParamSetv :: Storable a =>
                          Fun -> Int -> Ptr a -> Int -> IO Status
[27 of 37] Compiling Foreign.CUDA.Ptr ( Foreign/CUDA/Ptr.hs, dist/build/Foreign/CUDA/Ptr.o )
[28 of 37] Compiling Foreign.CUDA.Runtime.Marshal ( dist/build/Foreign/CUDA/Runtime/Marshal.hs, dist/build/Foreign/CUDA/Runtime/Marshal.o )
[29 of 37] Compiling Foreign.CUDA.Runtime ( Foreign/CUDA/Runtime.hs, dist/build/Foreign/CUDA/Runtime.o )
[30 of 37] Compiling Foreign.CUDA.Runtime.Texture ( dist/build/Foreign/CUDA/Runtime/Texture.hs, dist/build/Foreign/CUDA/Runtime/Texture.o )
[31 of 37] Compiling Foreign.CUDA.Driver.Marshal ( dist/build/Foreign/CUDA/Driver/Marshal.hs, dist/build/Foreign/CUDA/Driver/Marshal.o )
[32 of 37] Compiling Foreign.CUDA.Driver.IPC.Marshal ( dist/build/Foreign/CUDA/Driver/IPC/Marshal.hs, dist/build/Foreign/CUDA/Driver/IPC/Marshal.o )
[33 of 37] Compiling Foreign.CUDA.Driver.Texture ( dist/build/Foreign/CUDA/Driver/Texture.hs, dist/build/Foreign/CUDA/Driver/Texture.o )
[34 of 37] Compiling Foreign.CUDA.Driver.Module.Query ( dist/build/Foreign/CUDA/Driver/Module/Query.hs, dist/build/Foreign/CUDA/Driver/Module/Query.o )
[35 of 37] Compiling Foreign.CUDA.Driver.Module ( Foreign/CUDA/Driver/Module.hs, dist/build/Foreign/CUDA/Driver/Module.o )
[36 of 37] Compiling Foreign.CUDA.Driver ( Foreign/CUDA/Driver.hs, dist/build/Foreign/CUDA/Driver.o )
[37 of 37] Compiling Foreign.CUDA ( Foreign/CUDA.hs, dist/build/Foreign/CUDA.o )
[ 1 of 37] Compiling Foreign.CUDA.Internal.C2HS ( Foreign/CUDA/Internal/C2HS.hs, dist/build/Foreign/CUDA/Internal/C2HS.p_o )

Foreign/CUDA/Internal/C2HS.hs:202:3: warning: [-Winline-rule-shadowing]
    Rule "cFloatConv/CFloat->Float" may never fire
      because ‘Foreign.C.Types.CFloat’ might inline first
    Probable fix: add an INLINE[n] or NOINLINE[n] pragma for ‘Foreign.C.Types.CFloat’

Foreign/CUDA/Internal/C2HS.hs:204:3: warning: [-Winline-rule-shadowing]
    Rule "cFloatConv/CDouble->Double" may never fire
      because ‘Foreign.C.Types.CDouble’ might inline first
    Probable fix: add an INLINE[n] or NOINLINE[n] pragma for ‘Foreign.C.Types.CDouble’
ld: -rpath can only be used when creating a dynamic final linked image
clang: error: linker command failed with exit code 1 (use -v to see invocation)

: error:
    `gcc' failed in phase `Linker'. (Exit code: 1)
[ 7 of 37] Compiling Foreign.CUDA.Analysis.Device ( dist/build/Foreign/CUDA/Analysis/Device.hs, dist/build/Foreign/CUDA/Analysis/Device.p_o )
ld: -rpath can only be used when creating a dynamic final linked image
clang: error: linker command failed with exit code 1 (use -v to see invocation)

: error:
    `gcc' failed in phase `Linker'. (Exit code: 1)
[17 of 37] Compiling Foreign.CUDA.Types ( dist/build/Foreign/CUDA/Types.hs, dist/build/Foreign/CUDA/Types.p_o )
ld: -rpath can only be used when creating a dynamic final linked image
clang: error: linker command failed with exit code 1 (use -v to see invocation)

: error:
    `gcc' failed in phase `Linker'. (Exit code: 1)

scsdjd:cuda> which ld
/usr/bin//ld

scsdjd:cuda> ld -v
@(#)PROGRAM:ld PROJECT:ld64-264.3.102
configured to support archs: armv6 armv7 armv7s arm64 i386 x86_64 x86_64h armv6m armv7k armv7m armv7em (tvOS)
LTO support using: LLVM version 7.3.0

scsdjd:cuda> cat cuda.buildinfo.generated
buildable: True
cc-options: -I/usr/local/cuda/include
ld-options: -L/usr/local/cuda/lib
frameworks: CUDA
extra-libraries:
    cudadevrt
    cudart_static
extra-ghci-libraries: cudart
extra-lib-dirs: /usr/local/cuda/lib
ghc-options: -optc-I/usr/local/cuda/include -optl-L/usr/local/cuda/lib -optl-Wl,-rpath,/usr/local/cuda/lib
x-extra-c2hs-options: --cppopts=-E --cppopts=-m64 --cppopts=-DUSE_EMPTY_CASE --cppopts=-U__BLOCKS__
scsdjd:cuda>
scsdjd:cuda>

Here is the most obviously relevant chunk of the output from cabal --verbose=3:

*** C Compiler:
/usr/bin//gcc -m64 -fno-stack-protector -DTABLES_NEXT_TO_CODE -I/usr/local/cuda/include -DPROFILING -x c /var/folders/pm/f2z9xj_s6bg5nvdlfp8_dqg00000gr/T/ghc89576_0/ghc_49.c -o /var/folders/pm/f2z9xj_s6bg5nvdlfp8_dqg00000gr/T/ghc89576_0/ghc_54.s -fno-common -U__PIC__ -D__PIC__ -Wimplicit -S -O2 '-D__GLASGOW_HASKELL__=800' -include /usr/local/lib/ghc-8.0.1/include/ghcversion.h -Idist/build/Foreign/CUDA/Analysis -Idist/build -Idist/build -Idist/build/autogen -Idist/build -I. -I/usr/local/lib/ghc-8.0.1/bytestring-0.10.8.1/include -I/opt/local/include/ -I/usr/local/lib/ghc-8.0.1/base-4.9.0.0/include -I/usr/local/lib/ghc-8.0.1/integer-gmp-1.0.0.1/include -I/usr/local/lib/ghc-8.0.1/include
*** Assembler:
/usr/bin//gcc -m64 -fno-stack-protector -DTABLES_NEXT_TO_CODE -Idist/build/Foreign/CUDA/Analysis -Idist/build -Idist/build -Idist/build/autogen -Idist/build -I. -fno-common -U__PIC__ -D__PIC__ -Qunused-arguments -x assembler -c /var/folders/pm/f2z9xj_s6bg5nvdlfp8_dqg00000gr/T/ghc89576_0/ghc_54.s -o /var/folders/pm/f2z9xj_s6bg5nvdlfp8_dqg00000gr/T/ghc89576_0/ghc_56.p_o
*** Assembler:
/usr/bin//gcc -m64 -fno-stack-protector -DTABLES_NEXT_TO_CODE -Idist/build/Foreign/CUDA/Analysis -Idist/build -Idist/build -Idist/build/autogen -Idist/build -I. -fno-common -U__PIC__ -D__PIC__ -Qunused-arguments -x assembler -c /var/folders/pm/f2z9xj_s6bg5nvdlfp8_dqg00000gr/T/ghc89576_0/ghc_48.s -o /var/folders/pm/f2z9xj_s6bg5nvdlfp8_dqg00000gr/T/ghc89576_0/ghc_57.p_o
*** Linker:
/usr/bin//gcc -m64 -fno-stack-protector -DTABLES_NEXT_TO_CODE -m64 -L/usr/local/cuda/lib -Wl,-rpath,/usr/local/cuda/lib -nostdlib -Wl,-r -o dist/build/Foreign/CUDA/Analysis/Device.p_o -Wl,-filelist -Wl,/var/folders/pm/f2z9xj_s6bg5nvdlfp8_dqg00000gr/T/ghc89576_0/ghc_58.filelist
ld: -rpath can only be used when creating a dynamic final linked image
clang: error: linker command failed with exit code 1 (use -v to see invocation)

: error:
    `gcc' failed in phase `Linker'. (Exit code: 1)
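
One possible reading of the linker line above: it passes both `-Wl,-r` (a relocatable link that produces another object file) and `-Wl,-rpath,/usr/local/cuda/lib`, and ld only accepts `-rpath` when creating a final dynamically linked image. The `-rpath` appears to come from the `-optl-Wl,-rpath,...` entry in `ghc-options` of cuda.buildinfo.generated. As a minimal sketch for isolating that option (for example, to try a build without it), assuming the options are available as a list of strings; `splitRpath` is a hypothetical helper and not part of Cabal or the cuda package:

```haskell
import Data.List (isPrefixOf, partition)

-- | Hypothetical helper (not part of Cabal or the cuda package): separate the
-- GHC options that forward an -rpath to the linker from all other options.
splitRpath :: [String] -> ([String], [String])
splitRpath = partition ("-optl-Wl,-rpath" `isPrefixOf`)

main :: IO ()
main = do
  -- ghc-options value copied from cuda.buildinfo.generated shown earlier
  let opts = words "-optc-I/usr/local/cuda/include -optl-L/usr/local/cuda/lib -optl-Wl,-rpath,/usr/local/cuda/lib"
      (rpathOpts, otherOpts) = splitRpath opts
  putStrLn ("rpath options: " ++ unwords rpathOpts)
  putStrLn ("other options: " ++ unwords otherOpts)
```

Whether dropping the option is an acceptable workaround is a separate question, since the resulting binaries may then need another way to locate the CUDA libraries at run time.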

On 8 Oct 2016, at 00:33, Trevor L. McDonell [email protected] wrote:

@djduke working for me on OS X on both 7.10.3 and 8.0.1. I haven't upgraded to Sierra yet, I am still on El Capitan (10.11.6).

gcc --version
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.3.0 (clang-703.0.31)
Target: x86_64-apple-darwin15.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

clang --version
Apple LLVM version 7.3.0 (clang-703.0.31)
Target: x86_64-apple-darwin15.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

c2hs --version
C->Haskell Compiler, version 0.28.1 Switcheroo, 1 April 2016
build platform is "x86_64-darwin" <1, True, True, 1>

What is in your cuda.buildinfo[.generated] file? I have the -rpath options in mine with no problem:

buildable: True
cc-options: -I/usr/local/cuda/include
ld-options: -L/usr/local/cuda/lib
frameworks: CUDA
extra-libraries:
    cudadevrt
    cudart_static
extra-ghci-libraries: cudart
extra-lib-dirs: /usr/local/cuda/lib
ghc-options: -optc-I/usr/local/cuda/include -optl-L/usr/local/cuda/lib -optl-Wl,-rpath,/usr/local/cuda/lib
x-extra-c2hs-options: --cppopts=-E --cppopts=-m64 --cppopts=-DUSE_EMPTY_CASE --cppopts=-U__BLOCKS__




David Duke T: +44 113 3436800
Professor of Computer Science E: [email protected]
Head, School of Computing W: www.comp.leeds.ac.uk/scsdjd/
PA: Gaynor Butterwick, E: [email protected] T: +44 113 3435434



@tmcdonell
Owner Author

@djduke migrating to #43

@tmcdonell
Owner Author

I just tried this with GHC HEAD and everything appears to work as expected in ghci.

The RTS automatically avoids the region needed by CUDA, so there is no need to specify it through RTS flags or otherwise. (I'm not sure how large a region it avoids, though; if your total GPU + system RAM is very high, you may still need to specify the offset manually.)

$ ghci
GHCi, version 8.1.20161011: http://www.haskell.org/ghc/  :? for help

> import Foreign.CUDA.Driver
> initialise []
> props =<< device 0
DeviceProperties {deviceName = "GeForce GT 650M", computeCapability = 3.0, ...
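
The same check can be run as a compiled program rather than in ghci. This is only a sketch: it assumes the cuda package is installed and a CUDA device is present, and it uses only the three calls from the session above (`initialise`, `device`, `props`).

```haskell
import Foreign.CUDA.Driver

main :: IO ()
main = do
  -- Initialise the driver API; this is the call the original report
  -- traced the cuInit failure to.
  initialise []
  -- Select the first device and print its properties, as in the ghci session.
  dev <- device 0
  print =<< props dev
```

If initialisation still fails on a given setup, the failure should show up at the `initialise []` line before any device is queried.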

@mchakravarty
Contributor

That's good news — thanks for checking!

@tmcdonell
Owner Author

Since GHC-8.0.2 is out now this is probably safe to close.
