chore: overhead free type conversion for load_poly_to_gpu
#516
Conversation
This is a pretty impressive speed-up, but you pushed the instance size to 2^22 to make the gap meaningful. For much smaller instances (which is our case), the gap is very small, correct?
wdyt? @chancharles92 @mrain
@@ -1401,6 +1407,26 @@ mod tests {
test_gpu_e2e_template::<Bn254>().unwrap();
}

fn test_gpu_ark_conversion_template<E: Pairing>()
I like this test. But can we do more? An earlier code comment says:
> We assume that two types use the same underline repr.
In that case, we should be able to assert that the bytes are exactly the same. That way, if an incompatibility is introduced then the test will detect it immediately, no?
This is a bit annoying but doable: c82e230
Thank you! I don't fully understand this new test but here's what I see:
- make a slice of random new ark scalars in `scalars`
- convert them into an IC scalars slice `ic_scalars` via `from_ark`
- copy the slice into a new GPU slice called `d_scalars`
- transform the GPU slice to Montgomery form on the GPU
- copy the GPU scalars out of the GPU and back into `ic_scalars`
- these `ic_scalars` (in Montgomery form) should be byte-for-byte identical to the original `scalars`. Check this via `align_to`
Not sure why you need to copy to/from the GPU, but AFAICT this test does what it's supposed to.
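For readers following the thread, the byte-for-byte check described in the steps above can be sketched without a GPU. This is only an illustration under stated assumptions: `ArkScalar`, `IcScalar`, `from_ark`, `as_bytes`, and `same_bytes` are hypothetical stand-ins, not the arkworks or ICICLE types used in the PR, and the sketch skips the GPU round trip and the Montgomery transform. It also assumes a little-endian target.

```rust
/// Hypothetical stand-in for an arkworks scalar: four 64-bit limbs.
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(C)]
struct ArkScalar([u64; 4]);

/// Hypothetical stand-in for an ICICLE scalar: eight 32-bit limbs.
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(C)]
struct IcScalar([u32; 8]);

/// Hypothetical conversion: split every 64-bit limb into a low and a high
/// 32-bit limb. On a little-endian target this keeps the byte order intact.
fn from_ark(s: &ArkScalar) -> IcScalar {
    let mut limbs = [0u32; 8];
    for (i, limb) in s.0.iter().enumerate() {
        limbs[2 * i] = *limb as u32; // low half
        limbs[2 * i + 1] = (*limb >> 32) as u32; // high half
    }
    IcScalar(limbs)
}

/// View a slice of padding-free values as raw bytes using `align_to`.
fn as_bytes<T: Copy>(values: &[T]) -> &[u8] {
    // SAFETY: every byte pattern is a valid `u8`, and the types passed in
    // here are plain integer arrays without padding bytes.
    let (prefix, bytes, suffix) = unsafe { values.align_to::<u8>() };
    // `u8` has alignment 1, so nothing should be left outside the middle part.
    assert!(prefix.is_empty() && suffix.is_empty());
    bytes
}

/// True when both slices occupy exactly the same bytes in memory.
fn same_bytes<A: Copy, B: Copy>(a: &[A], b: &[B]) -> bool {
    as_bytes(a) == as_bytes(b)
}
```

A test built on `same_bytes` fails as soon as either type changes its limb width, limb order, or field layout, which is the kind of incompatibility the stricter assertion is meant to catch.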
We could add a test to enforce this, no? See #516 (comment)
I think I disagree. An archived PR will experience bit rot and get forgotten. Is your concern only about the unsafe code? Perhaps that concern could be mitigated with the stricter test described above.
Description
Update: this also fixes the API changes when `gpu_vid` is enabled.

This PR avoids a memory copy in `load_poly_to_gpu`. However, it contains unsafe Rust code.

Before:

After:

It doesn't apply to affine point conversion because the two types have different underlying struct reprs.
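As a rough illustration of the trade-off described here, the sketch below contrasts a copying conversion with an in-place reinterpretation of a slice. It is not the code from this PR: `HostScalar`, `DeviceScalar`, `convert_with_copy`, and `reinterpret` are hypothetical names, and the two types are assumed to wrap the same limb array.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical host-side scalar (stand-in for the arkworks type).
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(transparent)]
struct HostScalar([u64; 4]);

/// Hypothetical device-side scalar (stand-in for the GPU library type).
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(transparent)]
struct DeviceScalar([u64; 4]);

/// Copying conversion: allocates a new Vec and converts element by element.
fn convert_with_copy(src: &[HostScalar]) -> Vec<DeviceScalar> {
    src.iter().map(|s| DeviceScalar(s.0)).collect()
}

/// Copy-free conversion: reinterpret the same memory as the other type.
fn reinterpret(src: &[HostScalar]) -> &[DeviceScalar] {
    // Compile-time guards: the cast below is only sound if both types have
    // the same size and alignment.
    const _: () = assert!(size_of::<HostScalar>() == size_of::<DeviceScalar>());
    const _: () = assert!(align_of::<HostScalar>() == align_of::<DeviceScalar>());
    // SAFETY: both types are `repr(transparent)` wrappers around `[u64; 4]`,
    // so they share one layout and every bit pattern is valid for both.
    // The returned slice borrows `src`, so it cannot outlive the source data.
    unsafe { std::slice::from_raw_parts(src.as_ptr() as *const DeviceScalar, src.len()) }
}
```

When two types do not share a layout, as the description says is the case for affine points, a cast like this is not sound and a converting copy is still needed.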
Before we can merge this PR, please make sure that all the following items have been
checked off. If any of the checklist items are not applicable, please leave them but
write a little note why.
`Pending` section in `CHANGELOG.md`
`Files changed` in the GitHub PR explorer