feat: stabilize alt_bn128 familiy of host functions #6824

matklad · 2022-05-18T12:07:52Z

Feature to stabilize

This PR stabilizes three host functions: alt_bn128_g1_multiexp, alt_bn128_g1_sum, alt_bn128_pairing_check. They implement addition, scalar multiplication, and pairing check for a specific elliptic curve used in the Ethereum ecosystem (EIP-196, https://github.com/ethereum/EIPs/blob/master/EIPS/eip-196.md).
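
For orientation, here is a minimal sketch of how an input buffer for alt_bn128_g1_multiexp might be assembled. It assumes the layout suggested by the tests in this PR: a sequence of (x, y, scalar) items, each field a 32-byte little-endian integer. The helpers le32_from_be_hex and encode_multiexp_input are hypothetical names for illustration and are not part of nearcore.

```rust
/// Converts a big-endian hex string (no `0x` prefix) into a 32-byte
/// little-endian array. Hypothetical helper, not part of nearcore.
fn le32_from_be_hex(hex: &str) -> Option<[u8; 32]> {
    if hex.is_empty() || hex.len() > 64 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    // Left-pad to 64 hex digits so every value occupies exactly 32 bytes.
    let padded = format!("{:0>64}", hex);
    let mut out = [0u8; 32];
    for i in 0..32 {
        let byte = u8::from_str_radix(&padded[2 * i..2 * i + 2], 16).ok()?;
        // Big-endian byte `i` lands at little-endian position `31 - i`.
        out[31 - i] = byte;
    }
    Some(out)
}

/// Concatenates (x, y, scalar) triples into one buffer, 96 bytes per item.
/// The layout is an assumption based on the test data in this PR.
fn encode_multiexp_input(items: &[(&str, &str, &str)]) -> Option<Vec<u8>> {
    let mut buf = Vec::with_capacity(items.len() * 96);
    for (x, y, scalar) in items {
        buf.extend_from_slice(&le32_from_be_hex(x)?);
        buf.extend_from_slice(&le32_from_be_hex(y)?);
        buf.extend_from_slice(&le32_from_be_hex(scalar)?);
    }
    Some(buf)
}
```

For example, encode_multiexp_input(&[("1", "2", "1")]) yields a 96-byte buffer describing the point (1, 2) multiplied by the scalar 1.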

Testing and QA

This feature underwent extensive testing:

we had several audits
Aurora implements Ethereum precompiles on top of these host functions, and those precompiles pass the Ethereum tests
this PR adds a couple more tests, generated using the implementation in go-ethereum
we verified our costs against the costs in Ethereum; they are roughly comparable in terms of wall-clock time

Pre-mortem

The biggest risk I see is that we are not experts in elliptic curve crypto, so it's hard to judge if the API overall makes sense. Maybe it could be more general, maybe there are better curves, etc. However, it does fit the Aurora use case and, given that the impl here is rather straightforward, even if we change something in the future, keeping the current functions won't be too onerous.

Checklist

Link to nightly nayduck run: https://nayduck.near.org/#/run/2510
Update CHANGELOG.md to include this protocol feature in the Unreleased section.

core/primitives-core/src/config.rs

jakmeier · 2022-05-19T02:37:15Z

runtime/near-test-contracts/src/lib.rs

@@ -166,7 +166,7 @@ pub fn arbitrary_contract(seed: u64) -> Vec<u8> {
    config.exceptions_enabled = false;
    config.saturating_float_to_int_enabled = false;
    config.sign_extension_enabled = false;
-    config.available_imports = Some(rs_contract().to_vec());
+    config.available_imports = Some(base_rs_contract().to_vec());


I have some problems understanding why this is changed. Can you explain it to me please?
For one, it seems counter-intuitive that arbitrary_contract now returns not the standard test contract. I would have thought that a caller that specifically asks for "arbitrary" is able to handle any contract, so the standard contract should be good enough.
Further, I don't really see how this relates to the feature being stabilized here. Is it to avoid testing the change in test_wasmer2_artifact_output_stability? Why wouldn't we want to test that?

We rely on arbitrary_contract being deterministic, as we use it in our artifact stability test here:

nearcore/runtime/near-vm-runner/src/tests/cache.rs

Line 100 in b7e844a

fn test_wasmer2_artifact_output_stability() {

rs_contract may grow new imports over time.

That explains why we are doing it. But doesn't it change the semantics of arbitrary_contract to an extent that we should rename the function and change the comment on it? (I think we only call it from this test, which relies on contract properties making it non-arbitrary)

I’m confused as well. What does ‘arbitrary_contract being deterministic,’ mean here? There is no guarantee that base_rs_contract will not be changed in the future.

The only thing that matters here is the imports of the base_rs_contract, and those are unlikely to change (b/c adding import is a protocol change).

We could rename it to arbitrary_deterministic_contract, or add a comment explaining how we rely on it being deterministic, but I'd rather not do this. Today, we have a single call-site for this function, and it seems too early to enshrine specific semantics here. Maybe we'd want to just move this function from the common library to that test!

Ok, thanks for explaining. Moving it to the test would probably make sense, yeah. But it doesn't make a big difference. The signature of arbitrary_contract, to which I count the name itself, too, is still awkward to me.
But I feel it does not matter that much. I was only worried that we are adding a tiny bit of technical debt here but IMO it's not worth the time spent on further discussions. :)
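
To make the determinism point in this thread concrete, here is a toy sketch, not the real generator: the output of a seeded generator depends on the seed and on the list of imports it may pick from, so a stability test that pins hashes breaks as soon as that list grows. toy_arbitrary_contract and the import names are illustrative assumptions; FNV-1a is used only because its output is fixed by definition.

```rust
/// FNV-1a 64-bit hash. Chosen for this sketch because its result is fully
/// specified, unlike `std`'s default hasher, whose algorithm is unspecified.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for b in bytes {
        hash ^= u64::from(*b);
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

/// Hypothetical stand-in for `arbitrary_contract`: cycles through the
/// available imports, starting at an offset derived from `seed`.
fn toy_arbitrary_contract(seed: u64, available_imports: &[&str]) -> Vec<u8> {
    let mut out = Vec::new();
    if available_imports.is_empty() {
        return out;
    }
    for i in 0..4u64 {
        let pick = (seed.wrapping_add(i) % available_imports.len() as u64) as usize;
        out.extend_from_slice(available_imports[pick].as_bytes());
        out.push(0); // separator between import names
    }
    out
}
```

With the same seed and import list the bytes (and therefore the hash) are identical on every run. Adding one more entry to the import list changes the bytes for the same seed, which is why the test pins the import set to a base contract.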

PandaRR007 · 2022-05-19T05:51:27Z

Hi guys, will this feature be included in release 1.27.0? Thanks.

matklad · 2022-05-19T09:22:21Z

@PandaRR007 I think it will be!

PandaRR007 · 2022-05-19T10:38:13Z

> @PandaRR007 I think it will be!

Good news. I'm looking forward to it.

mina86 · 2022-05-21T19:32:41Z

> Hi guys, will this feature be included in release 1.27.0? Thanks.

It most likely won’t. The current policy is that we cut a release a week before the testnet release, which happens next Wednesday. In other words, only things which were in master on the 18th will be included in 1.27.0-rc.1, and no new features come in during the -rc cycle.

mina86 · 2022-05-21T19:39:03Z

core/primitives/src/version.rs

@@ -233,10 +232,9 @@ impl ProtocolFeature {
            | ProtocolFeature::LimitContractLocals
            | ProtocolFeature::ChunkNodesCache
            | ProtocolFeature::LowerStorageKeyLimit => 53,
+            ProtocolFeature::AltBn128 => 54,


55, and you’ll also need to increase STABLE_PROTOCOL_VERSION.

Good catch! I see that now we "jump" over versions 52 and 54, in the sense that these versions won't have any protocol features associated with them. How does this happen? Intuitively it seems that, to make a protocol change, we should have a ProtocolFeature, so that we can apply old logic for old protocol versions.

54 is a network layer protocol change when @pompon0 added protobuf support so it doesn’t affect the chain itself. We probably should decouple the two versions at some point. Though with protobufs it might be easier not to worry about network layer protocol version that much.
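
The version gating discussed here can be sketched as follows. This is a simplified, hypothetical mirror of the mapping in core/primitives/src/version.rs with only two variants; the is_enabled helper is an assumption for illustration, not a nearcore function.

```rust
/// Simplified stand-in for nearcore's `ProtocolFeature`; the real enum
/// has many more variants.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ProtocolFeature {
    LowerStorageKeyLimit,
    AltBn128,
}

impl ProtocolFeature {
    /// First protocol version in which the feature is active.
    const fn protocol_version(self) -> u32 {
        match self {
            ProtocolFeature::LowerStorageKeyLimit => 53,
            // 54 is used by the network-layer change mentioned in the
            // review, so the feature takes 55 as suggested there.
            ProtocolFeature::AltBn128 => 55,
        }
    }
}

/// Hypothetical helper: old logic applies below the feature's version.
fn is_enabled(feature: ProtocolFeature, current_version: u32) -> bool {
    current_version >= feature.protocol_version()
}
```

Under this sketch a node running protocol version 54 still applies the old logic for alt_bn128, while version 55 switches the host functions on.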

mina86 · 2022-05-21T19:43:35Z

runtime/near-test-contracts/src/lib.rs

@@ -166,7 +166,7 @@ pub fn arbitrary_contract(seed: u64) -> Vec<u8> {
    config.exceptions_enabled = false;
    config.saturating_float_to_int_enabled = false;
    config.sign_extension_enabled = false;
-    config.available_imports = Some(rs_contract().to_vec());
+    config.available_imports = Some(base_rs_contract().to_vec());


I’m confused as well. What does ‘arbitrary_contract being deterministic,’ mean here? There is no guarantee that base_rs_contract will not be changed in the future.

mina86 · 2022-05-21T19:50:40Z

runtime/runtime-params-estimator/test-contract/build.sh

+# FIXME(#6822): we should just remove the payload logic. 10Kib variant is
+# broken, because the baseline contract is >10KiB (data for alt_bn estimatons).
+
 # 10KiB
-dd if=/dev/urandom of=./res/payload bs=$(expr 10240 - ${bare_wasm}) count=1
+dd if=/dev/urandom of=./res/payload bs=1 count=1
 cargo build --target wasm32-unknown-unknown --release  --features "payload$features_with_comma"
 cp target/wasm32-unknown-unknown/release/test_contract.wasm ./res/stable_small_contract.wasm


Why FIXME? Can’t we just do it now? If bare_wasm is ≥ 10240 then just cp -- test_contract.wasm stable_small_contract.wasm and we can go on with our lives.

> If bare_wasm is ≥ 10240 then just cp -- test_contract.wasm stable_small_contract.wasm and we can go on with our lives.

The contract will then be bigger than 10KiB, but the current code is written as if it being exactly 10KiB matters.

Ultimately, I suspect that this whole file is mostly dead code at this point, but I'd rather not deal with it during a stabilization PR.

Yeah, please do not change it in this PR. The estimator uses the sizes. Even though it doesn't rely on them being exact, I would still want to check that, and a stabilization PR is not the right place for such a change anyway.
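
The arithmetic behind the broken 10KiB variant can be sketched like this. The original script computes 10240 - bare_wasm as the padding size, which stops making sense once the baseline contract is larger than the target. payload_size is a hypothetical helper, written in Rust only for illustration; the real logic lives in build.sh.

```rust
/// Returns how many padding bytes are needed to reach `target_size`,
/// or `None` when the bare contract is already larger than the target.
fn payload_size(target_size: u64, bare_wasm_size: u64) -> Option<u64> {
    target_size.checked_sub(bare_wasm_size)
}
```

A `None` result corresponds to the situation described in the FIXME: no payload can bring the contract down to exactly 10KiB.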

jakmeier

Approving to stabilise this, LGTM. I am happy to see this moving forward! Too bad we will have to wait for another cycle, I did not have the 1 week on my radar...

(Second approval is also required although not enforced by GH.)

mina86 · 2022-05-23T16:17:36Z

> I did not have the 1 week on my radar...

Yeah, it’s a new policy. What has been happening so far is that on the day of the release we would scramble to make the cut and then test things before pushing testnet release which wasn’t ideal.

matklad · 2022-05-24T14:41:00Z

Test failure is interesting:

[2022-05-24 12:35:35] INFO: Got protocol 53 in mainnet release 1.26.0.
[2022-05-24 12:35:35] INFO: Got protocol 53 in testnet release 1.26.0-rc.3.
[2022-05-24 12:35:35] INFO: Got protocol 55 on master branch.

This is probably a side-effect of our time-based protocol upgrade process?

mina86 · 2022-05-24T16:09:51Z

I was afraid upgradable.py might be an issue. The time-based upgrades aren’t an issue here. The test compares versions that --version outputs. The problem in this instance is that 1.27.0-rc.1 with version 54 hasn’t been released yet and the test doesn’t understand that upcoming 1.27.0-rc.1 will use protocol version 54. I think at this point the easiest solution is to wait till Wednesday evening or Thursday once 1.27.0 rolls out and then the test will see 53 on mainnet, 54 on testnet and will allow 55 on master.
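
The check described in this comment can be approximated as below. This is an assumption inferred from the log output and the explanation above, namely that each stage may be at most one protocol version ahead of the previous one; the actual logic in upgradable.py may differ, and versions_consistent is a hypothetical name.

```rust
/// Returns true when testnet is at most one protocol version ahead of
/// mainnet and master is at most one version ahead of testnet.
fn versions_consistent(mainnet: u32, testnet: u32, master: u32) -> bool {
    let at_most_one_ahead = |newer: u32, older: u32| newer >= older && newer - older <= 1;
    at_most_one_ahead(testnet, mainnet) && at_most_one_ahead(master, testnet)
}
```

With the values from the failing run (53, 53, 55) the check fails; once testnet reports 54, the combination (53, 54, 55) passes.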

matklad · 2022-05-24T16:55:02Z

sgtm!

matklad · 2022-06-01T18:10:22Z

@mina86 PTAL!

mina86 · 2022-06-01T18:46:13Z

runtime/near-vm-logic/src/tests/alt_bn128.rs

@@ -109,12 +109,31 @@ fn test_alt_bn128_g1_multiexp() {
    }

    check_ok(&le_bytes![], &le_bytes![0x0 0x0]);
+    check_ok(
+        &le_bytes![
+            0x2d6b17489d86fcd5f91e8e92eb55081d8cb4413e408047249ef4fb5baa1b518b 0x1e4d0a30dbadd9dad40f7847c7013754ded8d0371c052d19f01453f4ae1506d7 0x1,


I’m confused by the formatting in this file. Sometimes the data is on a separate line, sometimes the whole thing is a single line. Commas also seem to be used arbitrarily. Furthermore, I’d wrap all the buffers at the space. It’s probably too much noise to change it all though, so whatever.

mina86 · 2022-06-01T18:52:32Z

runtime/near-vm-runner/src/tests/cache.rs

    let prepared_hashes = [
-        5920482302426237644,
-        4305202105567340810,
-        5775536517394665889,
-        6282866610476321669,
-        9987754974020503265,
-        2522443647498253022,
-        1434775828544411571,
+        12248437801724644735,
+        2647244875869025389,
+        892153519407678490,
+        8592050243596620350,
+        2309330154575012917,
+        9323529151210819831,
+        11488755771702465226,
    ];
    let mut got_prepared_hashes = Vec::with_capacity(seeds.len());
    let compiled_hashes = [
-        4678798493694903297,
-        4722680261811640693,
-        7795642610370765019,
-        15143423944524767029,
-        7504125870827587271,
-        3662584175683490815,
-        13449186496170384379,
+        5827744486935367002,
+        3163481497450515654,
+        12932669301919595047,
+        4509630115775888919,
+        5285162149441033812,
+        15892844827657184765,
+        7871022777077203514,
    ];


Looks like we want to change this to use insta as well at some point?

Yeah, we probably should, though this shouldn't be changing all that often (this change in particular is because I adjusted the infra to be more stable, not because this is a genuine change).


frol · 2022-07-28T09:04:23Z

DISCLAIMER: I just ask it out of curiosity, so feel free to ignore it if you don't have the time to answer.

@matklad I am quite late to the party, but I am curious whether we measured the performance of these host functions vs Wasm implementation. It sounds quite unfortunate that we need to have host functions to optimize number crunching performance as Wasm by design supposedly should have covered us here.

P.S. It would be helpful to have link(s) to the PRs that implemented this as a nightly feature to see potential discussions there

matklad · 2022-07-28T10:15:22Z

> P.S. It would be helpful to have link(s) to the PRs that implemented this as a nightly feature to see potential discussions there

Good call, #7288

> I am quite late to the party, but I am curious whether we measured the performance of these host functions vs Wasm implementation.

There are two questions here:

1. what is the perf gap between native and our particular WebAssembly runtime (optimized for reliability)
2. what is the perf gap between native and a WebAssembly runtime optimized for performance

For 1, I can't recall super-specific numbers, but I think we did measure a massive cost reduction. To get a specific number, you want to play with this test before/after the commit locally:

birchmd/aurora-engine@fd4243b#diff-ed8b4fc612dfece459decfe0a47cf4079f5b3e3b7c29cc1f7c4e3be2b42d9b87

The test proves that the host fn brought the cost under 200TGas. I don't know the exact difference, but my vague recollection is that it was huge.

For 2, we didn't do any measurements, though I'd still expect a significant perf difference there.

> Wasm by design supposedly should have covered us here.

My current gut feeling is that our wasm runtime provides non-horrible number crunching perf, but that it is expected to be significantly worse than what you get from a host function

Reasons specific to our WebAssembly runtime (reliability over perf):

we have a simple non-optimizing single-pass compiler
gas-counting is non-trivial overhead
we don't support certain perf-oriented wasm extensions, e.g. SIMD. Note that even if we did support SIMD, there would be a perf penalty for more complicated gas accounting.
with a wasm impl, the cost is pessimistic -- even if a particular hash function runs fast in WebAssembly, we have to pessimistically estimate it (we use the worst-case cost for a WebAssembly instruction). For a host function, we estimate a fixed computation, so we don't need to be pessimistic across this dimension.

Reasons general to WebAssembly:

At the moment, WebAssembly generally doesn't expose performance-oriented CPU instructions like fused multiply-add or add-with-carry. My intuition here is that WebAssembly is 2X slower for run-of-the-mill code (pointer chasing, conditions) and 20X slower for really hot code, something you traditionally write in asm (codecs, crypto, interpreters).
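
As an illustration of the add-with-carry point, here is a sketch of 256-bit addition on 64-bit limbs, the kind of operation elliptic-curve field arithmetic performs constantly. Without a carry flag, the carry has to be tracked explicitly with two overflow-checked additions per limb. The limb layout and the name add_u256 are assumptions for the example and are not taken from the alt_bn128 implementation.

```rust
/// Adds two 256-bit integers stored as four little-endian 64-bit limbs.
/// Returns the wrapped 256-bit sum and the final carry-out.
fn add_u256(a: [u64; 4], b: [u64; 4]) -> ([u64; 4], bool) {
    let mut out = [0u64; 4];
    let mut carry = false;
    for i in 0..4 {
        // Two overflowing additions emulate one add-with-carry step.
        let (sum, c1) = a[i].overflowing_add(b[i]);
        let (sum, c2) = sum.overflowing_add(carry as u64);
        out[i] = sum;
        // At most one of the two additions can overflow for a given limb.
        carry = c1 || c2;
    }
    (out, carry)
}
```

A native compiler is free to lower this loop to a hardware carry chain where one exists, whereas code compiled to WebAssembly has to carry out the explicit comparisons.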

akhi3030 · 2022-07-29T07:55:23Z

As a related data point, in https://gov.near.org/t/near-polkadot-using-ibc-trustless-bridging-requests/22807/5, @blasrodri benchmarked the performance difference between wasm and native execution for some signature verification. More specifically they showed that native execution can be much faster.

robert-zaremba · 2022-11-30T23:34:47Z

alt_bn128 is far from being fast (in 2022). And there are some security concerns.

In the meantime, most projects have opted for BLS12-381. Now, I think the most exciting is Pasta (halo2). It would be great to consider them as well.

matklad requested a review from a team as a code owner May 18, 2022 12:07

matklad requested a review from mm-near May 18, 2022 12:07

matklad mentioned this pull request May 18, 2022

feat: stabilize alt_bn128 familiy of host functions #6813

Closed

2 tasks

matklad requested review from akhi3030 and jakmeier May 18, 2022 12:10

jakmeier reviewed May 19, 2022

matklad requested a review from mina86 May 19, 2022 09:22

mina86 reviewed May 21, 2022

jakmeier approved these changes May 23, 2022

matklad force-pushed the m/stabilize-altbn-128 branch 5 times, most recently from 7bf52a3 to 02a4655 Compare May 24, 2022 12:32

matklad force-pushed the m/stabilize-altbn-128 branch 3 times, most recently from 7f310fc to 27e6560 Compare June 1, 2022 17:56

matklad requested a review from mina86 June 1, 2022 18:10

mina86 approved these changes Jun 1, 2022

matklad added the S-automerge label Jun 2, 2022

test: add more positive tests for alt_bn128

770f153

Extra test cases were generated using go-ethereum implementation of the curves.

matklad and others added 5 commits June 2, 2022 11:13

feat: stabilize alt_bn128 familiy of host functions

ed832ae

hack: work-around broken estimations

5b029aa

doc: changelog

52efb41

Update core/primitives-core/src/config.rs

05abb6c

Co-authored-by: Jakob Meier <[email protected]>

fix: use correct protocol version

115fd00

matklad force-pushed the m/stabilize-altbn-128 branch from 2f5e850 to 115fd00 Compare June 2, 2022 10:13

near-bulldozer bot merged commit 60d9f4b into master Jun 2, 2022

near-bulldozer bot deleted the m/stabilize-altbn-128 branch June 2, 2022 10:27

This was referenced Nov 8, 2022

[Proposal] alt_bn128 curve math near/NEPs#98

Closed

Update spec to add support for alt_bn128_g1_multiexp, alt_bn128_g1_sum, and alt_bn128_pairing_check functions near/NEPs#426

Open
