Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WebGPU/Emu MSM/FFT #79

Open
jon-chuang opened this issue Apr 3, 2020 · 3 comments
Open

WebGPU/Emu MSM/FFT #79

jon-chuang opened this issue Apr 3, 2020 · 3 comments
Labels
T-feature Type: new features T-performance Type: performance improvements

Comments

@jon-chuang
Copy link
Contributor

I hope to either use webgpu or emu to parallelise MSM/FFTs. I am not sure what the current implementation of MSM is, but we need to load-balance the different buckets. My target is a 2-10x speedup on a powerful GPU like GTX1060-6GB.

An idea I have which may or may not make sense is to perform the FFT in parallel on the CPU first, then when the butterflies are sufficiently small, move them into small kernels on the GPU. (can't remember which way around it is). Due to lack of data dependencies, we should be able to get very high streaming throughput.

Other problems to think about is how to package the library so that GPU can be compiled for if desired/detected. Shouldn't be too hard utilising some WebGPU/#[cfg] functionality.

@Pratyush
Copy link
Member

Pratyush commented Apr 3, 2020

This would be great! I think starting with FFTs might be easier in the short term, just because it only requires field arithmetic. There is some prior work in the snarky library: o1-labs/snarky#356

It could serve as a good starting point.

@jon-chuang
Copy link
Contributor Author

jon-chuang commented Apr 3, 2020

I think we should add some auto-optimisation to decide how to split the problem between GPU and CPU/multicore or even distributed. Distributed is of value as it could lead to truly fast latencies - maybe 0.1s for offloading proving. This should be a separate issue. Can collab with RISELab maybe. I should look for some bayes opt in rust. I think probably something like Gaussian process optimisation might be fine. Should be relatively low-dimensional optimisation problem. As an alternative, the Simple(x) optimisation (rust crate) algorithm may be good.

@Pratyush
Copy link
Member

Pratyush commented Apr 3, 2020

Heh, I'm also part of the RISElab, so that front could be somewhat easier to explore

@Pratyush Pratyush transferred this issue from arkworks-rs/snark Nov 20, 2020
@Pratyush Pratyush added T-feature Type: new features T-performance Type: performance improvements labels Nov 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-feature Type: new features T-performance Type: performance improvements
Projects
None yet
Development

No branches or pull requests

2 participants