Investigate compilation to GPUs #1330
Just want to comment saying that this would be awesome. I think the actor paradigm with capabilities maps really neatly onto GPUs and the compile-time types of memory (const, val, etc.).
I'd really like to do some work on this, but I don't think I'm particularly qualified. If any tips or pointers could be given on where to start, I'd be happy to give it a go. Though I do wonder: given that most GPU-based code (from what little research I've done) works by having a host program send processing kernels to the GPU and then fetch the results later, how should this behaviour be written in Pony code?
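As a purely illustrative sketch (the `GpuDevice` actor, its `map` behaviour, and `done` callback are all invented here; nothing like this exists in Pony today), the host/kernel split described above maps fairly naturally onto behaviours Pony already has: the kernel launch becomes a message to a device actor, and fetching results becomes a message coming back.

```pony
// Hypothetical pseudocode only: no GPU API exists in Pony today.
// The idea is that the host stays an ordinary actor, a kernel
// launch is just a message to a device actor, and the results
// return later as an asynchronous message, the same way
// behaviours already decouple request from response.
actor Host
  be run() =>
    let input: Array[U32] val = recover val [as U32: 1; 2; 3] end
    // Imagined call: ship the data to the device actor, which
    // calls back our `done` behaviour when the GPU finishes.
    GpuDevice.map(input, this)

  be done(result: Array[U32] val) =>
    // Results arrive like any other message; nothing blocks.
    None
```

The appeal of this framing is that the "send kernel, fetch results later" shape of GPU programming is already the shape of every actor interaction in Pony, so no new control-flow concept would be needed on the host side.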
Marked as "needs discussion during sync" to make sure that folks weigh in on @mpbagot's question.
We discussed this on today's sync call. I didn't take notes on all of what was said, and I personally don't know a lot about it, but if you listen to the last five minutes of the call, you can hear @sylvanc's comments on it. One place to start would be to introduce an annotation (
It might be worth considering trying to target SPIR-V, either directly or through the LLVM<->SPIR-V converter ( https://github.com/KhronosGroup/SPIRV-LLVM ), because SPIR-V is an intermediate representation that can express OpenCL kernels, which lets vendor drivers take over and compile it down to native code.
There are quite a few issues I'm finding whilst trying to conceptualise how this could be done. Assuming only functions can be parallelised (for simplicity's sake), you have to consider the following:

```pony
/gpufunc/
fun func_a(a: U32, b: U32): U32 =>
  a + b
```

Should it be the same as all function calls, with array/iterable inputs?

```pony
result: Array[U32] = func_a([1; 2; 3], [4; 5; 6])
```

Or should some other form of call be used, like the @ prefix on FFI calls, to ensure a noticeable distinction between GPU function calls and CPU function calls?
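For concreteness, the explicit-marker alternative mentioned above might look something like this; the `@gpu` spelling is purely invented here, by analogy with the `@` prefix Pony already uses for FFI calls:

```pony
// Invented syntax sketch: an explicit marker at the call site,
// so GPU dispatch is as visible as an FFI call is with @.
let result: Array[U32] = @gpu func_a([1; 2; 3], [4; 5; 6])
```

The trade-off being weighed is transparency (GPU calls look like any other call) versus visibility (an expensive device round-trip is always obvious in the source).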
LLVM has experimental support for the NVPTX and AMDGPU backends. Getting Pony running on these as a proof of concept would be great.