Feat/Msm with gpu (metal shader language) on laptop #150
Deploying mopro with Cloudflare Pages
Thank you for this PR.
Please rebase the branch and fix these warnings:
Warnings
warning: unused import: `G1Affine as GAffine`
--> mopro-core/src/middleware/gpu_explorations/arkworks_pippenger.rs:1:36
|
1 | use ark_bn254::{Fr as ScalarField, G1Affine as GAffine, G1Projective as G};
| ^^^^^^^^^^^^^^^^^^^
|
= note: `#[warn(unused_imports)]` on by default
warning: unused import: `AffineRepr`
--> mopro-core/src/middleware/gpu_explorations/arkworks_pippenger.rs:2:14
|
2 | use ark_ec::{AffineRepr, VariableBaseMSM};
| ^^^^^^^^^^
warning: unused import: `ark_ff::BigInt`
--> mopro-core/src/middleware/gpu_explorations/arkworks_pippenger.rs:3:5
|
3 | use ark_ff::BigInt;
| ^^^^^^^^^^^^^^
warning: unused import: `ark_serialize::CanonicalDeserialize`
--> mopro-core/src/middleware/gpu_explorations/arkworks_pippenger.rs:4:5
|
4 | use ark_serialize::CanonicalDeserialize;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
warning: unused import: `CurveGroup`
--> mopro-core/src/middleware/gpu_explorations/metal/msm.rs:2:26
|
2 | use ark_ec::{AffineRepr, CurveGroup, Group, VariableBaseMSM};
| ^^^^^^^^^^
warning: unused imports: `BigInteger256`, `BigInteger`, `UniformRand`
--> mopro-core/src/middleware/gpu_explorations/metal/msm.rs:4:18
|
4 | biginteger::{BigInteger, BigInteger256},
| ^^^^^^^^^^ ^^^^^^^^^^^^^
5 | PrimeField, UniformRand,
| ^^^^^^^^^^^
warning: unused imports: `One`, `rand`
--> mopro-core/src/middleware/gpu_explorations/metal/msm.rs:7:30
|
7 | use ark_std::{cfg_into_iter, rand, vec::Vec, One, Zero};
| ^^^^ ^^^
warning: unused import: `ark_serialize::CanonicalDeserialize`
--> mopro-core/src/middleware/gpu_explorations/metal/msm.rs:10:5
|
10 | use ark_serialize::CanonicalDeserialize;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
warning: unused imports: `BigInteger256`, `BigInteger`
--> mopro-core/src/middleware/gpu_explorations/metal/tests/test_bn254.rs:10:22
|
10 | biginteger::{BigInteger, BigInteger256},
| ^^^^^^^^^^ ^^^^^^^^^^^^^
warning: unused import: `FqConfig`
--> mopro-core/src/middleware/gpu_explorations/metal/tests/test_msm.rs:3:25
|
3 | use ark_bn254::{Fq, FqConfig, Fr as ScalarField, G1Affine as GAffine, G1Projective as G};
| ^^^^^^^^
warning: unused imports: `BigInteger256`, `BigInteger`
--> mopro-core/src/middleware/gpu_explorations/metal/tests/test_msm.rs:6:22
|
6 | biginteger::{BigInteger, BigInteger256},
| ^^^^^^^^^^ ^^^^^^^^^^^^^
warning: unused import: `One`
--> mopro-core/src/middleware/gpu_explorations/metal/tests/test_msm.rs:9:50
|
9 | use ark_std::{cfg_into_iter, rand, vec::Vec, One, Zero};
| ^^^
warning: unused import: `G1Projective as G`
--> mopro-core/src/middleware/gpu_explorations/utils/preprocess.rs:1:57
|
1 | use ark_bn254::{Fr as ScalarField, G1Affine as GAffine, G1Projective as G};
| ^^^^^^^^^^^^^^^^^
warning: unused import: `Field`
--> mopro-core/src/middleware/gpu_explorations/utils/preprocess.rs:2:14
|
2 | use ark_ff::{Field, PrimeField};
| ^^^^^
warning: unused import: `Rng`
--> mopro-core/src/middleware/gpu_explorations/utils/preprocess.rs:5:12
|
5 | rand::{Rng, RngCore},
| ^^^
warning: unused variable: `n`
--> mopro-core/src/middleware/gpu_explorations/metal/tests/test_bn254.rs:347:29
|
347 | fn rand_point()(n in any::<u8>()) -> G {
| ^ help: if this is intentional, prefix it with an underscore: `_n`
|
= note: `#[warn(unused_variables)]` on by default
warning: unused `Result` that must be used
--> mopro-core/src/middleware/gpu_explorations/arkworks_pippenger.rs:151:9
|
151 | writeln!(output_file, "msm_size,num_msm,avg_processing_time(ms)");
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: this `Result` may be an `Err` variant, which should be handled
= note: `#[warn(unused_must_use)]` on by default
= note: this warning originates in the macro `writeln` (in Nightly builds, run with -Z macro-backtrace for more info)
warning: unused `Result` that must be used
--> mopro-core/src/middleware/gpu_explorations/arkworks_pippenger.rs:166:17
|
166 | / writeln!(
167 | | output_file,
168 | | "{},{},{}",
169 | | result.instance_size, result.num_instance, result.avg_processing_time
170 | | );
| |_________________^
|
= note: this `Result` may be an `Err` variant, which should be handled
= note: this warning originates in the macro `writeln` (in Nightly builds, run with -Z macro-backtrace for more info)
warning: call to `.clone()` on a reference in this situation does nothing
--> mopro-core/src/middleware/gpu_explorations/metal/msm.rs:47:47
|
47 | let scalars_limbs = cfg_into_iter!(scalars.clone())
| ^^^^^^^^ help: remove this redundant call
|
= note: the type `[ark_ff::Fp<MontBackend<FrConfig, 4>, 4>]` does not implement `Clone`, so calling `clone` on `&[ark_ff::Fp<MontBackend<FrConfig, 4>, 4>]` copies the reference, which does not do anything and can be removed
= note: `#[warn(noop_method_call)]` on by default
warning: unused `Result` that must be used
--> mopro-core/src/middleware/gpu_explorations/metal/msm.rs:320:9
|
320 | writeln!(output_file, "msm_size,num_msm,avg_processing_time(ms)");
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: this `Result` may be an `Err` variant, which should be handled
= note: this warning originates in the macro `writeln` (in Nightly builds, run with -Z macro-backtrace for more info)
warning: unused `Result` that must be used
--> mopro-core/src/middleware/gpu_explorations/metal/msm.rs:335:17
|
335 | / writeln!(
336 | | output_file,
337 | | "{},{},{}",
338 | | result.instance_size, result.num_instance, result.avg_processing_time
339 | | );
| |_________________^
|
= note: this `Result` may be an `Err` variant, which should be handled
= note: this warning originates in the macro `writeln` (in Nightly builds, run with -Z macro-backtrace for more info)
warning: `mopro-core` (lib test) generated 21 warnings (run `cargo fix --lib -p mopro-core --tests` to apply 17 suggestions)
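For the `unused Result that must be used` warnings above, one option is to propagate the error with `?` rather than discarding it. This is a minimal sketch, not the PR's code: `write_benchmark_csv` and its tuple row type are hypothetical names, and only the CSV header and column order are taken from the warnings.

```rust
use std::io::{self, Write};

/// Hypothetical helper: writes the benchmark CSV and propagates any I/O error
/// instead of silently dropping the `Result` returned by `writeln!`.
fn write_benchmark_csv<W: Write>(mut out: W, rows: &[(u32, u32, f64)]) -> io::Result<()> {
    // Header taken from the warning text above.
    writeln!(out, "msm_size,num_msm,avg_processing_time(ms)")?;
    for (instance_size, num_instance, avg_ms) in rows {
        // `?` returns early on failure, which satisfies `unused_must_use`.
        writeln!(out, "{},{},{}", instance_size, num_instance, avg_ms)?;
    }
    Ok(())
}
```

Because the helper accepts any `Write`, it can be tested against an in-memory `Vec<u8>` instead of a real file.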
1. cd to the `mopro/` directory.
2. run `./scripts/build_ios.sh config-example.toml` (remember to set your ios_device_type to `simulator` or `device`) to build and update the bindings.
3. open `mopro-ios/MoproKit/Example/MoproKit.xcworkspace` in Xcode.
4. choose your simulator/mobile device and build the project (you can also use `cmd + R` as a hot key).
5. choose `MSMBenchmark`, select the algorithms you want to test, and click the button below to start the benchmark.
We are going to remove the mopro-ios directory,
so you can move the result to research/gpu-exploration-app.
@FoodChain1028 would you like to help with this change?
To run the benchmarks for an instance size of $2^{16}$ on the BLS12_377 curve in `mopro-core`, replace `<algorithm_you_want_to_test>` with the algorithm name listed above.

```bash
cargo test --release --features gpu-benchmarks --package mopro-core --lib -- middleware::gpu_explorations::<algorithm_you_want_to_test>::tests::test_run_benchmark --exact --nocapture
```
Is the trapdoortech_zprize_msm algo commented out now?
Since trapdoortech_zprize_msm is not compatible with the BN254 curve, we commented it out for now. Do you think we should remove the unused MSMs, such as trapdoor's and halo2's?
Sorry, I just saw the message.
If you will use the structure to develop for BN254, you can keep it here.
Do you expect switching curves to require a lot of changes?
@@ -0,0 +1,7 @@
msm_size,num_msm,avg_processing_time(ms) |
I am curious why halo2curve is benchmarked.
Haha, TL;DR: Carlos suggested that Halo2curve's MSM is state-of-the-art, so we benchmarked it.
As you can see below, Halo2curve does perform well with the "asm" feature. However, this acceleration feature is only compatible with the x86_64 architecture, so we didn't take it as a reference for our future work.
Will they support the acceleration feature in the future, or is it not possible on ARM?
So will it be a legacy benchmark?
We've talked to the team: the "asm" feature is actually possible on ARM, but they are not planning to support it at the moment.
mopro-core/src/middleware/gpu_explorations/arkworks_pippenger.rs
Complete chores for the reviews
df4f48f to 002fcfb
002fcfb to 8eed163
I've listed a few directions for future work.
// flatten scalar and base to Vec<u32> for GPU usage
let scalars_limbs = cfg_into_iter!(scalars)
    .map(|s| s.into_bigint().to_u32_limbs())
    .flatten()
    .collect::<Vec<u32>>();
let bases_limbs = cfg_into_iter!(points)
    .map(|b| {
        let b = b.into_group();
        b.x.to_u32_limbs()
            .into_iter()
            .chain(b.y.to_u32_limbs())
            .chain(b.z.to_u32_limbs())
            .collect::<Vec<_>>()
    })
    .flatten()
    .collect::<Vec<u32>>();
let buckets_matrix_limbs = {
    // buckets_size * num_windows is for parallelism on windows (variable-based MSM)
    let matrix = vec![zero; buckets_size * window_starts.len()];
    cfg_into_iter!(matrix)
        .map(|b| {
            b.x.to_u32_limbs()
                .into_iter()
                .chain(b.y.to_u32_limbs())
                .chain(b.z.to_u32_limbs())
                .collect::<Vec<_>>()
        })
        .flatten()
        .collect::<Vec<u32>>()
};
Possible optimization direction: enhance speed for preparing inputs for GPU
can you open a new issue?
I think it can be outside of this PR
(and because this PR is too big)
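As a starting point for that follow-up issue, here is a rough sketch of one way the input preparation could avoid the intermediate `Vec` per element: preallocate the output once and push limbs straight into it. It is an assumption-laden illustration, not the PR's code: `push_u32_limbs` and `flatten_limbs` are hypothetical names, field elements are modelled as plain `[u64; 4]` arrays, and the low-half-first limb order is assumed rather than taken from the PR's `to_u32_limbs`.

```rust
/// Hypothetical helper: splits four u64 limbs into eight u32 limbs and appends
/// them to `out`. Assumes low 32 bits first; the PR's layout may differ.
fn push_u32_limbs(limbs: &[u64; 4], out: &mut Vec<u32>) {
    for &l in limbs {
        out.push(l as u32); // low 32 bits
        out.push((l >> 32) as u32); // high 32 bits
    }
}

/// Flattens all elements into one buffer with a single allocation,
/// instead of collecting a temporary Vec per element and flattening afterwards.
fn flatten_limbs(elements: &[[u64; 4]]) -> Vec<u32> {
    let mut out = Vec::with_capacity(elements.len() * 8);
    for e in elements {
        push_u32_limbs(e, &mut out);
    }
    out
}
```

Whether this is actually faster than the `map(...).flatten()` version would need to be measured with the existing benchmarks.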
let buckets_matrix = {
    let raw_limbs = MetalState::retrieve_contents::<u32>(&buckets_matrix_buffer);
    cfg_into_iter!(raw_limbs)
        .chunks(24)
        .map(|chunk| {
            G::new_unchecked(
                Fq::from_u32_limbs(&chunk[0..8]),
                Fq::from_u32_limbs(&chunk[8..16]),
                Fq::from_u32_limbs(&chunk[16..24]),
            )
        })
        .collect::<Vec<_>>()
};

// include the last window idx
let bucket_starts: Vec<usize> = (0..buckets_matrix.len() + buckets_size)
    .step_by(buckets_size)
    .collect();
let window_sums: Vec<_> = cfg_into_iter!(window_starts.clone())
    .enumerate()
    .map(|(idx, w_start)| {
        // only process unit scalars once in the first window.
        let mut res = zero;
        if w_start == 0 {
            for i in 0..instances_size {
                if scalars[i] == one {
                    res += points[i];
                }
            }
        }

        let buckets = buckets_matrix[bucket_starts[idx]..bucket_starts[idx + 1]].to_vec();
        let mut running_sum = zero;
        buckets.into_iter().rev().for_each(|b| {
            running_sum += &b;
            res += &running_sum;
        });
        res
    })
    .collect();

let lowest = *window_sums.first().unwrap();

Ok(lowest
    + &window_sums[1..]
        .iter()
        .rev()
        .fold(zero, |mut total, sum_i| {
            total += sum_i;
            for _ in 0..window_size {
                total = total.double();
            }
            total
        }))
}
Possible optimization direction: move all of this msm computation to GPU using metal
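To make clear which steps would have to move to the GPU, here is a toy, CPU-only sketch of the same bucket reduction and window combination, using plain `i128` integers in place of curve points. It is an illustration under stated assumptions, not the PR's implementation: `toy_msm` is a hypothetical name, scalars are `u64` instead of field elements, the accumulation phase runs on the CPU instead of in Metal, and the unit-scalar special case from the diff is omitted.

```rust
/// Toy Pippenger-style MSM over the additive group of integers.
/// Computes sum(scalars[i] * points[i]) using buckets and windows.
fn toy_msm(points: &[i128], scalars: &[u64], window_size: usize) -> i128 {
    assert_eq!(points.len(), scalars.len());
    assert!(window_size > 0 && window_size < 64);
    // Digit 0 contributes nothing, so each window needs 2^c - 1 buckets.
    let buckets_size = (1usize << window_size) - 1;
    let window_starts: Vec<usize> = (0..64).step_by(window_size).collect();
    let mask = (1u64 << window_size) - 1;

    // Accumulation phase (the part the PR runs on the GPU):
    // add each point into the bucket selected by its scalar digit.
    let mut buckets_matrix = vec![0i128; buckets_size * window_starts.len()];
    for (w, &w_start) in window_starts.iter().enumerate() {
        for (&p, &s) in points.iter().zip(scalars.iter()) {
            let digit = ((s >> w_start) & mask) as usize;
            if digit != 0 {
                buckets_matrix[w * buckets_size + digit - 1] += p;
            }
        }
    }

    // Bucket reduction: the running sum yields sum((j + 1) * bucket[j]) per window.
    let window_sums: Vec<i128> = buckets_matrix
        .chunks(buckets_size)
        .map(|buckets| {
            let mut running_sum = 0i128;
            let mut res = 0i128;
            for b in buckets.iter().rev() {
                running_sum += b;
                res += running_sum;
            }
            res
        })
        .collect();

    // Window combination: fold from the highest window down,
    // doubling `window_size` times between windows.
    let lowest = window_sums[0];
    lowest
        + window_sums[1..].iter().rev().fold(0i128, |mut total, sum_i| {
            total += sum_i;
            for _ in 0..window_size {
                total *= 2;
            }
            total
        })
}
```

The per-window reductions are independent of each other, while the final fold is sequential across windows; that split is presumably where a Metal port would need the most care.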
Hi @vivianjeng, I've rebased onto main and applied the changes from the code reviews above. After the review from @FoodChain1028, we can consider merging it into main.
Can you also add some benchmark tests to CI?
(They can be very simple, as long as they work.)
You can add a new .yml file if you think it should be separate from mopro.
@@ -62,7 +62,6 @@ where
    println!(
        "Average time to execute MSM with {} points and {} scalars in {} iterations is: {:?}",
        points.len(),
        scalars.len(),
should be added?
5652d53 to e99f439
* Fixed (untested) issue with IOS debug mode not able to build
* Script `install_deps.sh` typo
* If a path with Mopro IOS App contains a space the `mopro build --platforms ios` fails as spaces in address confuse the `cd` command.
* If a path PROJECT_DIR contains a space, it no longer bugs on a `cd` command due to escaping with columns
* (feat: core) added support for proving / verifying the sample circuit in morpo-core
* (feat: ffi) implemented call interfaces for Halo2 proofs (single input argument)
* (feat: template) added Halo2CircuitView for IOS
* (feat: example) Renamed Halo2 example module to `halo2-example`
* (feat: example) Renamed example crate to `halo2-circuit` to be used by all
* (feat: core) Circom is only compiled when halo2 flag is absent
* (feat: core) Major rework to build script:
  1. Seperated Circom and Halo2 build dependencies.
  2. Added Halo2 build dependancies
     - Check that all keys have been generated.
     - Replace the placeholder `example` project with the user specified project by using ``paths`` override in ``.cargo/config.toml`.
     - Check that the user provided project meets requirements
       - Is a valid cargo crate
       - Is named as `halo2-circuit`
* (feat: cfg) an example of config for Halo2 as well as updated other configs.
* (fix: core) issue of build.rs not compiling due to function declaration hidden with conditional compilation. Accidentally commited script changes. Reverted later.
* (feat: core) added the autogenerated `.cargo/config.toml` to gitignore
* (feat: ffi) added support for halo2 feature and conditional compilation for all functions (with a stud in case a different proof system is used)
* (fix: script) reverted changes to script
* (feat: tmpl) added kind to template
* (feat: tmpl) added Halo2 View to IOS template
* (feat: core) added debug prints for build script to know where the SRS, PK and VK are read from
* (feat: script) updated the scripts to support Halo2 circuits
* (feat: script) added a sample Halo2 circuit to the template
* (feat: script) added a sample Halo2 circuit config to the template config
* (feat: lock) stable Halo2 version
* (feat: example) added conditional compilation for circom example
* (feat: core+example) moved circuit specific logic to halo2-circuit crate, making middleware generic
* (feat: core) added deserialisation for circuit inputs
* (feat: ffi) updated ffi to be compatible with core changes
* (fix: core) warning cleanup
* (fix: core) build.rs warning cleanup
* (fix: ffi) fixed circom compilation issue
* (feat: example) created a folder for halo2 example circuits and moved example fibonacci circuit there
* (fix: example) remove workspace option as it fails build
* (fix: ios-template) update IOS template files to work with new Halo2 circuit inputs
* (feat: halo2-template) updated halo2 template with up-to-date halo2 example
* (fix: lock) updated lock file
* (feat: docs) added documentation on how to use Halo2 circuits
* fix: fix toolchain version
* fix: fix mopro test
* chore: remove unused config file, simplify prepare script
* feat: enable parallel in gpu-benchmarks
* fix: fix println
* style nav and footer
* add intro elements and text
* import features svgs and text
* features layout
* build with yarn
* remove package-lock.json
* fix: fix yarn.lock file
* change link colors
* set yarn version 3.6.3
* fix arrow image on Docs pages
* restore h1 class to default, update px to rem
* remove comments
* fix mobile header
* fix landing page for mobile
* fix: remove yarnrc
* fix: fix ci error
* fix: fix mopro test
* (fix: general) Changed "kind" parameter in config to "adapter" to be consistent with [docs](https://zkmopro.org/docs/intro).
* (fix: example) Commited the Halo2 Fibonacci circuit to `mopro-core/examples`, as before it was missing.
* (fix: general) Changed missed "kind" parameter in config to "adapter" to be consistent with [docs](https://zkmopro.org/docs/intro).
* (fix: linter) Fixed all linter errors for `cargo fmt --all`
* (fix: ffi) Circom tests are only to be run when circom flag is enabled.
* (fix: examples) Removed multiply example (to be added as a different circuit example).
* (fix: ffi) Resolved issue with indent.
* (refactor: common) Renamed `verify_halo2_proof2` and `prove_halo2_proof2`to `..._halo2_proof`
* Feat/Msm with gpu (metal shader language) on laptop (#150)
* feat(msm_benchmark): integrate zprize2022 TrapdoorTech msm algo on Rust
* refactor(msm_benchmark): separate arkworks_pippenger as baseline from others
* refactor(benchmark): rewrite scalars and points gen to preprocess
* refactor(baseline): rewrite benchmark method to 2^10 x 2^16 instance size
* refactor: modify benchmark standard to match zprize works
* feat(baseline): adopt zprize benchmarking method and enable multi benchmarking at once
* feat(ffi): integrate trapdoor tech msm in mopro-ffi
* feat(ffi): add test of trapdoor tech msm
* fix(ffi): modify input for trapdoor msm
* fix: update arkowrks pippenger input/ ouput
* fix: update trapdoor tech zprize msm input/ ouput
* fix: add the feature flag back
* fix: modify msm functions input
* fix: lint
* feat(benchmark data): accelerate the benchmark data generation. 2^20 x 10 benchmark data can be gen in 5 min
* feat: add a README file for gpu-exploration
* feat(gpu-explorations): benchmark msm on BN254 curve, which leads to approx. 30% faster
* feat(gpu-explorations): integrated halo2curve's msm and benchmarks
* refactor(msm): disable other msm's except the arkworks 0.4 msm for offloading dependencies requirement
* feat(metal): provide basic structure of metal backend and rust wrapper for msm
* fix: compile pathway error
* chore: fix shader path config and identify parallel part of arkworks' pippenger as metal wrapper
* feat(metal): draft msm wrapper in Rust for metal backend
* doc: added reference for bls-12-381 and bn254
* chore(metal): add python helper to compute BN254 params
* feat(metal): introduce u256 type implementation and bn254 params
* docs(metal): generate abstract addition chain instructions for further implementation
* feat: add instruction for bn254 addchain
* test(metal): add fixed-params tests for bn254 operations
* test(metal): update u256 type and focus on add test
* test(metal): fix To and From BigInt format, provide better view on add_test result
* fix: make the path root-compatible
* chore: added error test
* feat: compiled metal lib
* chore(metal): update test log for better view on the bug
* fix(metal): correct logic of {to, from}_u32_limbs and addition logic in metal of [low, high, 0, 0]
* fix(metal): correct data repr logic between metal and arkworks and complete uint test
* fix(metal): update bn254 tests and fix logic
* fix(metal): update bn254 neg test
* fix(metal): use larger result arrays
* test(metal): add & sub fuzzing test for Fq_bn254
* fix(metal): correct the Montgomery Mul. Constant and complete mul test
* test(metal): add pow test for bn254 base field
* fix(metal): fix logic for msm usage on bn254
* fix(metal): correct the metal buffer index
* refactor(metal): utils module for data format between metal (GPU) and arkworks (CPU)
* fix(metal): correct encode/decode logic
* test(metal): add test for msm accumulation phase to ensure correctness of metal result
* test(metal): add test to bn254 points arithmetics
* fix: modified double_in_place
* refactor(metal): add limbs_conversion module for to/from metal
* feat(metal): implement arkworks msm accumulation logic in metal
* refactor(metal): add Fq conversion to/from limbs for metal
* feat(metal): compute msm bucket in window-wise fashion
* test(metal): add msm wrapper test on metal implementation
* feat(metal): implement msm with enabling GPU computation on accumulation phase
* refactor(metal): update paths for metal shader files
* feat: integrate metal msm into mopro-ffi
* feat(metal): add mont_reduction module for gpu result conversion
* test(metal): enable latest limb conversion and remove unused module
* feat(metal): optimize msm bucket computation with window-wise accumulation
* feat(metal): Rust wrapper for latest metal msm accumulation
* chore: update the instanceSize and numInstance in metal to make consistency
* chore: remove commented code for bls12_377 curve parsing
* fix: correct warning for GPU explorations code
* chore(gpu-benchmarks): correct minor changes
---------
Co-authored-by: moven0831
* (feat: ci) Added an option to run CI on when a PR is ready for review, opened, re-opened and synchronized.
* (merge: ffi) Merged changes from GPU explorations to the FFI file, moving them from `lib.rs` to `circom.rs`
* (fix: ffi) Fix a compilation error in `circom.rs`
* chore: update cargo.lock
* (feat: web) Added a Halo2 page to the docs.
* (fix: conf) Fixed a GitHub conflict issue in config
* (chore: halo2) optimised use statement conditional compilation by grouping them together
* (fix: examples) fixed the halo2 fibonacci example README to be up-to-date with the content of the example crate
* (fix: template) moved updated fibonacci halo2 example to the template
---------
Co-authored-by: Ya-wen, Jeng
Co-authored-by: CJ Rose
Co-authored-by: FoodChain
Co-authored-by: moven0831
`mopro-core/src/middleware/gpu_explorations/metal/`. `BLS12381` curve was integrated and we converted them into `BN254`.
The result of `metal_msm` that runs 2^10 (1024) randomly-chosen points and scalars: