
Library won't utilize all gpu cores. [BUG] #92

Open
noahmartinwilliams opened this issue Jan 2, 2023 · 2 comments

Comments

noahmartinwilliams commented Jan 2, 2023

Description
When I run a function on the GPU that should occupy all available cores, it instead uses only about 10% of them.
I'm not sure whether this is a misconfiguration or a bug.

Steps to reproduce

Compile the following code:

module Main where

import Data.Array.Accelerate          as A
import Data.Array.Accelerate.LLVM.PTX as PTX
import Prelude                        as P

-- Logistic sigmoid, applied element-wise on the device.
func :: Exp Double -> Exp Double
func x = 1 / (1 + exp (-x))

func2 :: Acc (Vector Int) -> Acc (Vector Double) -> Acc (Vector (Int, Double))
func2 a b = A.zip a b

-- One loop iteration: decrement the counters, apply the sigmoid.
func3 :: Acc (Vector (Int, Double)) -> Acc (Vector (Int, Double))
func3 x =
    let (a, b) = A.unzip x
        b2 = A.map func b
        a2 = A.map (\z -> z - 1) a
    in  A.zip a2 b2

-- Loop condition: keep iterating while the first counter is not 1.
test :: Acc (Vector (Int, Double)) -> Acc (Scalar Bool)
test x =
    let (a, _) = A.unzip x
    in  A.unit (constant 1 A./= (a A.!! constant 0))

numCudaCores :: Int
numCudaCores = 2048 -- I have an NVIDIA GeForce RTX 3050 Laptop GPU; I'm pretty sure it has 2048 cores.

main :: IO ()
main = do
    let arr    = A.fill (constant (Z :. numCudaCores)) 1.0      :: Acc (Vector Double)
        arr2   = A.fill (constant (Z :. numCudaCores)) 10000000 :: Acc (Vector Int)
        func4  = A.awhile test func3 (func2 arr2 arr)
        result = PTX.run func4
    print result

and then run nvtop on a Linux machine while the program is running: it shows GPU utilization of only around 8%.
Expected behaviour
The code should be using all CUDA cores.
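For context, the per-element work in the loop body is tiny: one logistic sigmoid and one integer decrement. A plain-Haskell model of a single `func3` step, just for checking values on the CPU (no Accelerate; `sigmoidRef` and `stepRef` are my own illustrative names, not from the library):

```haskell
-- Plain-Haskell model of one loop iteration. `sigmoidRef` and
-- `stepRef` are hypothetical names, used only for this sketch.
sigmoidRef :: Double -> Double
sigmoidRef x = 1 / (1 + exp (-x))

stepRef :: [(Int, Double)] -> [(Int, Double)]
stepRef xs =
    let (as, bs) = unzip xs
    in  zip (map (subtract 1) as) (map sigmoidRef bs)

-- e.g. stepRef [(3, 0.0)] == [(2, 0.5)]
```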

Your environment
OS: Arch Linux (up to date as of January 2, 2023).
GPU: NVIDIA GeForce RTX 3050 Laptop GPU
GHC: 9.0.2

accelerate-llvm-ptx commit: 6f3b6ca
accelerate commit: 5971c5d8e4dbba28d2017e7ce422cf46a20197cb

I can't post the output of nvidia-device-query because that program isn't installed on my machine, and I can't find any information about which Arch package provides it.

Additional context

ivogabe (Contributor) commented Apr 12, 2023

You may need to use a larger input to use all GPU cores. Can you try that?
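A back-of-envelope launch calculation may make the suggestion concrete (the 128 threads per block and 16 SMs below are my own assumptions for an RTX 3050 Laptop GPU, not values reported by the library):

```haskell
-- Rough launch arithmetic; threadsPerBlock and numSMs are assumed
-- typical values, not queried from the device.
threadsPerBlock, numSMs :: Int
threadsPerBlock = 128
numSMs          = 16

-- Number of thread blocks a kernel over n elements would launch.
blocksFor :: Int -> Int
blocksFor n = (n + threadsPerBlock - 1) `div` threadsPerBlock

-- blocksFor 2048   == 16   (one block per SM)
-- blocksFor (2^20) == 8192 (hundreds of blocks per SM)
```

With only one block per SM there is little work available to hide memory latency, so low utilization would be expected at n = 2048 even with many `awhile` iterations; a much larger n gives each SM many resident blocks.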

noahmartinwilliams (Author) commented Apr 17, 2023

Currently I'm not able to get the llvm backends to compile, so no.

Edit: I got it to compile, and using larger inputs doesn't increase core utilization at all.

Edit 2: The native backend doesn't use all CPU cores either.
