-
Notifications
You must be signed in to change notification settings - Fork 638
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: stabilize alt_bn128 familiy of host functions #6824
Conversation
@@ -166,7 +166,7 @@ pub fn arbitrary_contract(seed: u64) -> Vec<u8> { | |||
config.exceptions_enabled = false; | |||
config.saturating_float_to_int_enabled = false; | |||
config.sign_extension_enabled = false; | |||
config.available_imports = Some(rs_contract().to_vec()); | |||
config.available_imports = Some(base_rs_contract().to_vec()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some problems understanding why this is changed. Can you explain it to me please?
For one, it seems counter-intuitive that arbitrary_contract
now returns not the standard test contract. I would have thought that a caller that specifically asks for "arbitrary" is able to handle any contract, so the standard contract should be good enough.
Further, I don't really see how this relates to the feature being stabilized here. Is it to avoid testing the change in test_wasmer2_artifact_output_stability
? Why wouldn't we want to test that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We rely on arbitrary_contact
being deterministic, as we use it our artifact stability test here:
fn test_wasmer2_artifact_output_stability() { |
rs_contract may grow new imports over time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That explains why we are doing it. But doesn't it change the semantics of arbitrary_contract
to an extent that we should rename the function and change the comment on it? (I think we only call it from this test, which relies on contract properties making it non-arbitrary)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m confused as well. What does ‘arbitrary_contact being deterministic,’ mean here? There is no guarantee that base_rs_contract will not be changed in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only thing that matters here is the imports of the base_rs_contract, and those are unlikely to change (b/c adding import is a protocol change).
We could rename it to arbitrary_deterministic_contract
, or add a comment explaining how we rely on it being deterministic, but I'd rather not do this. Today, we have a single call-site for this function, and its seems to early to enshrine a specific semantics here. Maybe we'd want to just move this function from common library to that test!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, thanks for explaining. Moving it to the test would probably make sense, yeah. But it doesn't make a big difference. The signature of arbitrary_contract
, to which I count the name itself, too, is still awkward to me.
But I feel it does not matter that much. I was only worried that we are adding a tiny bit of technical debt here but IMO it's not worth the time spent on further discussions. :)
Hi guys, will this feature be included in release 1.27.0? Thanks. |
@PandaRR007 I think it will be! |
Good news. I'm looking forward to it. |
It most likely won’t. The current policy is that we cut a release a week before the testnet release which happens next Wednesday. In other words only things which were in master on 18th will be included in 1.27.0-rc.1 and no new futures come in during -rc cycle. |
core/primitives/src/version.rs
Outdated
@@ -233,10 +232,9 @@ impl ProtocolFeature { | |||
| ProtocolFeature::LimitContractLocals | |||
| ProtocolFeature::ChunkNodesCache | |||
| ProtocolFeature::LowerStorageKeyLimit => 53, | |||
ProtocolFeature::AltBn128 => 54, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
55, and you’ll also need to increase STABLE_PROTOCOL_VERSION.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! I see that now we "jump" over versions 52 and 54, in a sense that these versions won't have any protocol features associated with them. How does this happen? Intuitively it seems that, to make a protocol change, we should have ProtocolFeature
, so that we can apply old logic for old protocol versions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
54 is a network layer protocol change when @pompon0 added protobuf support so it doesn’t affect the chain itself. We probably should decouple the two versions at some point. Though with protobufs it might be easier not to worry about network layer protocol version that much.
@@ -166,7 +166,7 @@ pub fn arbitrary_contract(seed: u64) -> Vec<u8> { | |||
config.exceptions_enabled = false; | |||
config.saturating_float_to_int_enabled = false; | |||
config.sign_extension_enabled = false; | |||
config.available_imports = Some(rs_contract().to_vec()); | |||
config.available_imports = Some(base_rs_contract().to_vec()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m confused as well. What does ‘arbitrary_contact being deterministic,’ mean here? There is no guarantee that base_rs_contract will not be changed in the future.
# FIXME(#6822): we should just remove the payload logic. 10Kib variant is | ||
# broken, because the baseline contract is >10KiB (data for alt_bn estimatons). | ||
|
||
# 10KiB | ||
dd if=/dev/urandom of=./res/payload bs=$(expr 10240 - ${bare_wasm}) count=1 | ||
dd if=/dev/urandom of=./res/payload bs=1 count=1 | ||
cargo build --target wasm32-unknown-unknown --release --features "payload$features_with_comma" | ||
cp target/wasm32-unknown-unknown/release/test_contract.wasm ./res/stable_small_contract.wasm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why FIXME? Can’t we just do it now? If bare_wasm is ≥ 10240 than just cp -- test_contract.wasm stable_small_cotract.wasm
and we can go on with our lives.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If bare_wasm is ≥ 10240 than just cp -- test_contract.wasm stable_small_cotract.wasm and we can go on with our lives.
The contract will then be bigger than 10KiB, but the current code is written as if it being exactly 10KiB matters.
Ultimately, I suspect that this whole file is mostly dead code at this point, but I'd rather not deal with it during stabilization PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah please do not change it in this PR. The estimator uses the sizes. Eventhough it doesn't rely on it being exact, I would still want to check that and a stabilization PR is not the right place for such a change anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving to stabilise this, LGTM. I am happy to see this moving forward! Too bad we will have to wait for another cycle, I did not have the 1 week on my radar...
(Second approval is also required although not enforced by GH.)
Yeah, it’s a new policy. What has been happening so far is that on the day of the release we would scramble to make the cut and then test things before pushing testnet release which wasn’t ideal. |
7bf52a3
to
02a4655
Compare
Test failure is interesting:
This is probably a side-effect of our time-based protocol upgrade process? |
I was afraid upgradable.py might be an issue. The time-based upgrades aren’t an issue here. The test compares versions that |
sgtm! |
7f310fc
to
27e6560
Compare
@mina86 PTAL! |
@@ -109,12 +109,31 @@ fn test_alt_bn128_g1_multiexp() { | |||
} | |||
|
|||
check_ok(&le_bytes![], &le_bytes![0x0 0x0]); | |||
check_ok( | |||
&le_bytes![ | |||
0x2d6b17489d86fcd5f91e8e92eb55081d8cb4413e408047249ef4fb5baa1b518b 0x1e4d0a30dbadd9dad40f7847c7013754ded8d0371c052d19f01453f4ae1506d7 0x1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m confused by formatting in this file. Sometimes the data is on separate line, sometimes the whole thing is a single line. Commas also seem to be used arbitrarily. Furthermore, I’d wrap all the buffers at the space. It’s probably too much noise to change it all though so whatever.
let prepared_hashes = [ | ||
5920482302426237644, | ||
4305202105567340810, | ||
5775536517394665889, | ||
6282866610476321669, | ||
9987754974020503265, | ||
2522443647498253022, | ||
1434775828544411571, | ||
12248437801724644735, | ||
2647244875869025389, | ||
892153519407678490, | ||
8592050243596620350, | ||
2309330154575012917, | ||
9323529151210819831, | ||
11488755771702465226, | ||
]; | ||
let mut got_prepared_hashes = Vec::with_capacity(seeds.len()); | ||
let compiled_hashes = [ | ||
4678798493694903297, | ||
4722680261811640693, | ||
7795642610370765019, | ||
15143423944524767029, | ||
7504125870827587271, | ||
3662584175683490815, | ||
13449186496170384379, | ||
5827744486935367002, | ||
3163481497450515654, | ||
12932669301919595047, | ||
4509630115775888919, | ||
5285162149441033812, | ||
15892844827657184765, | ||
7871022777077203514, | ||
]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like we want to change this to use insta as well at some point?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we probably should, though, this shouldn't be changing all that often (this change is particular is because I adjusted the infra to be more stable, not because this is a genuine change).
Extra test cases were generated using go-ethereum implementation of the curves.
2f5e850
to
115fd00
Compare
# Feature to stabilize This PR stabilizes three host functions: `alt_bn128_g1_multiexp`, `alt_bn128_g1_sum`, `alt_bn128_pairing_check`. They implement addition, scalar multiplication, and pairing check for a specific elliptic curve used in the ethereum ecosystem ([eip-196](https://github.com/ethereum/EIPs/blob/master/EIPS/eip-196.md)). # Testing and QA This feature underwent extensive testing: * we had several audits * aurora impements ethereum precompiles on top of these host functions, and those precompiles pass ethereum tests * this PR adds a couple more tests generating using the implementation used in go-ethereum. * we verified our costs against costs in ethereum, they are roughly comparable in terms of wall-clock time # Pre-mortem The biggest risk I see is that we are not experts in elliptic curve crypto, so it's hard to judge if the API overall makes sense. Maybe it could be more general, maybe there are better curves, etc. However, it does fit aurora use-case and, given that the impl here is rather straightforward, even if we change something in the future, keeping the current functions won't be too onerous. # Checklist - [x] Link to nightly nayduck run: https://nayduck.near.org/#/run/2510 - [x] Update CHANGELOG.md to include this protocol feature in the `Unreleased` section.
DISCLAIMER: I just ask it out of curiosity, so feel free to ignore it if you don't have the time to answer. @matklad I am quite late to the party, but I am curious whether we measured the performance of these host functions vs Wasm implementation. It sounds quite unfortunate that we need to have host functions to optimize number crunching performance as Wasm by design supposedly should have covered us here. P.S. It would be helpful to have link(s) to the PRs that implemented this as a nightly feature to see potential discussions there |
Good call, #7288
There are two questions here:
For 1, I can't recall super-specific numbers, but I think we did measure a massive cost reduction. To get specific number, you want to play with this test before/after commit locally: birchmd/aurora-engine@fd4243b#diff-ed8b4fc612dfece459decfe0a47cf4079f5b3e3b7c29cc1f7c4e3be2b42d9b87 The test proves that host fn brought the cost under 200TGas. I don't know the exact difference, but my vague recollection is that was huge. For 2, we didn't do any measurements, though I'd still expect a significant perf difference there.
My current gut feeling is that our wasm runtime provides non-horrible number crunching perf, but that it is expected to be significantly worse than what you get from a host function Reasons specific to our WebAssembly runtime (reliability over perf):
Reasons general to WebAssembly:
|
As a related data point, in https://gov.near.org/t/near-polkadot-using-ibc-trustless-bridging-requests/22807/5, @blasrodri benchmarked the performance difference between wasm and native execution for some signature verification. More specifically they showed that native execution can be much faster. |
In the meantime, many most of the projects opted in |
Feature to stabilize
This PR stabilizes three host functions:
alt_bn128_g1_multiexp
,alt_bn128_g1_sum
,alt_bn128_pairing_check
. They implement addition, scalar multiplication, and pairing check for a specific elliptic curve used in the ethereum ecosystem (eip-196).Testing and QA
This feature underwent extensive testing:
Pre-mortem
The biggest risk I see is that we are not experts in elliptic curve crypto, so it's hard to judge if the API overall makes sense. Maybe it could be more general, maybe there are better curves, etc. However, it does fit aurora use-case and, given that the impl here is rather straightforward, even if we change something in the future, keeping the current functions won't be too onerous.
Checklist
Unreleased
section.