Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reworked message dispatch logic #1017

Conversation

xgreenx
Copy link
Collaborator

@xgreenx xgreenx commented Nov 15, 2021

In this PR I moved common parts of each message to execute_dispatchable(pull of the contract). In the future, we can optimize it to do one push of the contract.

Introduced an additional static buffer to store input. Now the user has access to the input buffer at any time during the execution of the contract(it is loaded only one time). This buffer is needed to rework the decoding process.

Moved the decode logic to the CALLABLE part. It allows the compiler to use value after decoding and do some optimization.

Call unwrap instead of passing error upper during execute dispatchable and decoding. It allows the compiler to optimize the code better.

Optimized return_value function to simply use buffer for encoding the result. We don't need complex methods from ScoppedBuffer because after the call of that method the program will exit.

All these optimizations allow reducing the code of optimized contracts well.

Erc20 takes 7862 bytes instead of 10282(saved 2420).

The more methods in the contract, the stronger the effect of the change.
That contract contains a lot of methods and the chagne saved 6489 bytes:
Original wasm size: 71.8K, Optimized: 38.8K(38844) -> Original wasm size: 81.7K, Optimized: 32.4K(32355)

let input_bindings = generator::input_bindings(constructor.inputs());
let input_bindings = constructor.inputs()
.map(|_| quote! {
::scale::Decode::decode(&mut _input).map_err(|_| ::ink_lang::reflect::DispatchError::CouldNotReadInput)?
Copy link
Collaborator Author

@xgreenx xgreenx Nov 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also removing of .map_err and usage of .unwrap() reduce size for ~500 bytes for each contract in the description. So we can create two analogs of CALLABLE(for debug and release mode).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Erc20: Original wasm size: 31.5K, Optimized: 9.1K(9067) - saved 571
Erc1155: Original wasm size: 83.9K, Optimized: 46.2K(46168) - saved 600
multisig: Original wasm size: 103.8K, Optimized: 46.8K(46798) - saved 359

@cmichi
Copy link
Collaborator

cmichi commented Nov 16, 2021

@xgreenx You can get the ink-waterfall stage to run by removing the failing CI checks altogether from .gitlab-ci.yml. So just remove all failing stages (clippy-std, clippy-wasm, etc.) and push these changes to this PR.

@xgreenx
Copy link
Collaborator Author

xgreenx commented Nov 16, 2021

@xgreenx You can get the ink-waterfall stage to run by removing the failing CI checks altogether from .gitlab-ci.yml. So just remove all failing stages (clippy-std, clippy-wasm, etc.) and push these changes to this PR.

ink-waterfall failed=( tests::delegator::delegator_works. I tried it on local node and it works(manually, didn't try to run a test). Maybe it is not enough gas?

@xgreenx
Copy link
Collaborator Author

xgreenx commented Nov 16, 2021

@xgreenx You can get the ink-waterfall stage to run by removing the failing CI checks altogether from .gitlab-ci.yml. So just remove all failing stages (clippy-std, clippy-wasm, etc.) and push these changes to this PR.

ink-waterfall failed=( tests::delegator::delegator_works. I tried it on local node and it works(manually, didn't try to run a test). Maybe it is not enough gas?

Oh, after my change the contract requires 53k additional gas=D Because I'm allocating 16MB in stack=D But I think we can decide in this PR how to optimize it(for example use static buffer form instance)

@paritytech-ci
Copy link

paritytech-ci commented Nov 16, 2021

🦑 📈 ink! Example Contracts ‒ Changes Report 📉 🦑

These are the results when building the examples/* contracts from this branch with cargo-contract 0.17.0-06c5598 and comparing them to ink! master:

Δ Optimized Size Δ Used Gas Total Optimized Size Total Used Gas
accumulator +1.62 K 1.62 K
adder +2.82 K 2.82 K
contract-terminate -0.12 K -1 1.19 K 200_777
contract-transfer -0.05 K +16 7.94 K 1_415
delegator -0.86 K -1_063 9.20 K 8_447
dns -1.14 K -1_816 9.33 K 12_472
erc1155 -2.88 K -3_718 24.62 K 44_093
erc20 -2.06 K -1_929 8.16 K 7_828
erc721 -2.44 K +1_264 11.26 K 35_966
flipper +0.04 K +21 1.81 K 368
incrementer +0.05 K +23 1.69 K 363
multisig -1.20 K -6_313 27.36 K 48_583
proxy -0.23 K -80 3.58 K 2_549
rand-extension -1.07 K -92 4.04 K 6_562
subber +2.83 K 2.83 K
trait-erc20 -2.08 K -1_928 8.44 K 8_716
trait-flipper +1.62 K +22 1.62 K 364
trait-incrementer +1.73 K +47 1.73 K 727

Link to the run | Last update: Thu Jan 27 10:40:02 CET 2022

Copy link
Collaborator

@Robbepop Robbepop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not yet reviewed the whole thing but I am very skeptical of the introduction of the local static buffer. Imo we should make sure to use the already existing global static buffer in those cases.

pub trait ContractMessageDecoder {
/// The ink! smart contract message decoder type.
type Type: scale::Decode + ExecuteDispatchable;
pub trait ContractMessageExecutor {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why remove all the documentation and usage examples? Please re-add or write new.

pub trait ContractConstructorDecoder {
/// The ink! smart contract constructor decoder type.
type Type: DecodeDispatch + ExecuteDispatchable;
pub trait ContractConstructorExecutor {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please re-add or write new docs with usage examples.

);
}
}
// use ink_lang as ink;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please re-enable this UI test if possible. We should always try to have testable code.

/// # Note
///
/// This function stops the execution of the contract immediately.
pub fn return_value_scoped(return_flags: ReturnFlags, return_value: &[u8]) -> ! {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very confusing name imo. The scoped buffer concept is an internal low-level concept that should not be exposed to the user. I'd rename this to return_bytes since that's what it is.

/// # Note
///
/// This function stops the execution of the contract immediately.
pub fn input_scoped(buffer: &mut [u8]) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very confusing name for the reason already described in another comment. I would rename this to input_bytes since that is what is actually is.

@@ -220,6 +222,17 @@ pub trait EnvBackend {
where
R: scale::Encode;

/// Returns the buffer to the caller of the executed contract.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Returns the buffer to the caller of the executed contract.
/// Returns the bytes to the caller of the executed contract.

///
/// The `flags` parameter can be used to revert the state changes of the
/// entire execution if necessary.
fn return_value_scoped(&mut self, flags: ReturnFlags, return_value: &[u8]) -> !;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fn return_value_scoped(&mut self, flags: ReturnFlags, return_value: &[u8]) -> !;
fn return_value_scoped(&mut self, flags: ReturnFlags, returned_bytes: &[u8]) -> !;

/// # Note
///
/// This function stops the execution of the contract immediately.
pub fn return_value_scoped(return_flags: ReturnFlags, return_value: &[u8]) -> ! {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub fn return_value_scoped(return_flags: ReturnFlags, return_value: &[u8]) -> ! {
pub fn return_bytes(return_flags: ReturnFlags, returned_bytes: &[u8]) -> ! {

/// # Note
///
/// This function stops the execution of the contract immediately.
pub fn input_scoped(buffer: &mut [u8]) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub fn input_scoped(buffer: &mut [u8]) {
pub fn input_bytes(buffer: &mut [u8]) -> Result<()> {

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another problem I have with this API is that it does not cover the erraneous case where buffer is too small to hold the entire given input.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But we used the same schema in the previous implementation. We decoded the whole ink's message variant

Comment on lines 489 to 490
const CAPACITY: usize = 1 << 14;
let mut local_buffer: [u8; CAPACITY] = [0; CAPACITY];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use the already existing static buffer that each ink! smart contract already has? I do not see a technical reason why using the already existing static buffer should be less efficient than using a local one.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use a static buffer because it allows us to not allocate the memory in runtime -> less gas.

I will try to rework it to use a static buffer. The problem is that off_chain and off_chain_experemental don't use a static buffer. So I can't return &[u8]. But if you are okey with that idea I will add a static buffer to that engines.

The local buffer is used to fastly create that PR and check the impact=)

@xgreenx
Copy link
Collaborator Author

xgreenx commented Nov 19, 2021

I want to clarify, that it is draft PR to show the idea=) I will add documentation and tests when you will agree with the idea(=

.gitlab-ci.yml Outdated
@@ -6,11 +6,7 @@


stages:
- check
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this PR change the CI?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will revert it back after merging with master

Comment on lines 499 to 360
pub fn input_bytes<'a>() -> &'a [u8] {
<EnvInstance as OnInstance<'a>>::on_instance(|instance: &'a mut EnvInstance| {
EnvBackend::input_bytes(instance)
})
Copy link
Collaborator

@Robbepop Robbepop Nov 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This API as stated is extremely unsafe to use by design. This is exactly why the decode_input API exists as is. Just notice how with this you could safely (!!) do the following:

let input = ink_env::input_bytes(); // Returns a slice refering to ink!'s static buffer
let caller = ink_env::set_contract_storage(key, &[0x1; 1]);
// The above line mutates ink!'s static buffer to hold an AccountId.
// Note that any other API that interferes with ink!'s static buffer could have been taken as an example here.

println!("{:?}", input);
// At this point the above input binding no longer holds the input but the AccountId.

I think I do not have to point out how extremely unsafe this API design is and therefore I would never accept a PR introducing it, even if it was marked unsafe since we do not want to introduce unsafe APIs if we can do better.

Basically with ink!'s static buffer the rule number 1 is to never return a reference to it.
So far we have been successful in designing APIs that are efficient and work around this fact.

Copy link
Collaborator Author

@xgreenx xgreenx Nov 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea was that borrow checker will prevent you from the usage of ink_env::caller() if you will use input after that. But I tested it now and... It allows that=D So yes, it should be changed.

Do you have idea how better to provide the usage of static buffer to the user in safety way?

Copy link
Collaborator

@Robbepop Robbepop Nov 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have idea how better to provide the usage of static buffer to the user in safety way?

Well I designed the decode_input API exactly this way because of exactly this issue. So my proposal is to keep decode_input API as is.
Even with a closure approach such as

fn input_bytes<'a, F, T>(f: F) -> Result<T, Error>
where
    F: FnOnce(&'a [u8]) -> T;

This cannot work since then you encounter the same problem within the closure body.

Anything else would require expensive runtime checks such as with RefCell.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the idea of the change is to decode directly in the place where we want to use arguments. Instead of decoding them in another place and then moving them to the place where we want to use them=)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a better idea is to still use decode_input but not move the arguments and instead use references where the user wants to use references. Let me refer to this issue for this use case.

Copy link
Collaborator

@Robbepop Robbepop Nov 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant that we can change code-generation and optimize it do one ::ink_storage::traits::push_spread_root at the end of execute_dispatchable instead of the end of each message section.

Yes, there is a lot of optimization potential in the way we currently dispatch ink! constructors and messages but none of which requires changes in the ink_env APIs imo. That's what I meant.

BTW, It should have the same performance. But in the case of a local buffer you need to allocate that memory in the stack - it is additional operations in Runtime.

Yes, so theoretically a static buffer should be even more efficient, right? ;)

Copy link
Collaborator Author

@xgreenx xgreenx Nov 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding artificial methods to ink! trait definitions is a no-go. We must make sure that ink! modifies trait definitions (and others) as minimally as possible.

I agree with that. But I think that this is the only possible solution at the moment if we want to support all the features of traits in rust. Because if the user defines type alias, or constant, or super trait, you will not able to do a trick as you did for __ink_TraitInfo=)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should not discuss all at once. Please let us focus on the improvements to the dispatch logic in this PR.

Copy link
Collaborator Author

@xgreenx xgreenx Nov 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mentioned trait because it requires the next logic:
First, we need to get 4 bytes to find a selector. After we know which method we need to call. After that method will decode the remaining bytes into arguments.

So it is why I'm trying to decode them agnoticly=)

What do you think about two static buffers? One will be always read-only and will contain input bytes. Another will be used for encoding/decoding. In this case, the user will have the access to the input at any time. And he can decode what he wants by himself.

Copy link
Collaborator Author

@xgreenx xgreenx Nov 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We always get the whole input for the user(even if he wants only 4 bytes we will read all 16 KB). And if he wants to have a simple signature of method(with 0 arguments) but later he wants to decode some parts of the input, he again should get the whole data. And it can be expensive to call ext::seal_input each time. So I think the idea with a second array only for input is not so bad(=

@xgreenx xgreenx force-pushed the optimizations/upgraded-version-of-common-part branch from d2b7a52 to b2d86fa Compare November 27, 2021 19:52
@codecov-commenter
Copy link

codecov-commenter commented Nov 28, 2021

Codecov Report

Merging #1017 (2b808cc) into master (2785a90) will increase coverage by 2.74%.
The diff coverage is 88.48%.

@@            Coverage Diff             @@
##           master    #1017      +/-   ##
==========================================
+ Coverage   75.69%   78.44%   +2.74%     
==========================================
  Files         252      253       +1     
  Lines        9395     9487      +92     
==========================================
+ Hits         7112     7442     +330     
+ Misses       2283     2045     -238     
Impacted Files Coverage Δ
crates/env/src/api.rs 40.90% <0.00%> (+4.54%) ⬆️
crates/env/src/backend.rs 83.87% <ø> (+3.22%) ⬆️
...tes/env/src/engine/experimental_off_chain/impls.rs 53.15% <0.00%> (-0.98%) ⬇️
crates/lang/codegen/src/generator/arg_list.rs 100.00% <ø> (ø)
crates/lang/codegen/src/generator/mod.rs 100.00% <ø> (ø)
crates/lang/src/reflect/dispatch.rs 70.00% <ø> (+70.00%) ⬆️
crates/env/src/engine/off_chain/impls.rs 46.55% <20.00%> (+4.17%) ⬆️
.../tests/ui/contract/pass/dispatch-executor-works.rs 94.59% <94.59%> (ø)
crates/lang/codegen/src/generator/dispatch.rs 98.98% <95.12%> (+4.41%) ⬆️
...rates/env/src/engine/experimental_off_chain/mod.rs 100.00% <100.00%> (ø)
... and 68 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@xgreenx
Copy link
Collaborator Author

xgreenx commented Nov 28, 2021

The change reduces the size of each contract except subcontracts of the delegator example. I will check why and how we can improve it too.

The open question is: Are we okay to have a separate static buffer to store input of the contract?

@xgreenx
Copy link
Collaborator Author

xgreenx commented Nov 28, 2021

I found that all contracts from workspace contain debug information(stuff related to Debug and Display). It increases the size of all contracts there.

Removing [workspace] decreases the size of each contract. I think it is better to remove it to see the actual size of the contracts.

@xgreenx xgreenx changed the title Upgraded version of common parts Reworked message dispatch logic Nov 29, 2021
@xgreenx
Copy link
Collaborator Author

xgreenx commented Nov 29, 2021

The change is ready, so I'm waiting for your comments=)

The change affects delegator sub-contract in a positive direction too is saves ~300 bytes.

@HCastano
Copy link
Contributor

@xgreenx can you update this for the latest master? I'm curious as to how the size improvements look. Since this is a big change to the dispatch logic I think we should only move forward with it if the savings look good

@xgreenx xgreenx force-pushed the optimizations/upgraded-version-of-common-part branch from 4551a13 to 55bbdae Compare January 26, 2022 23:37
@xgreenx
Copy link
Collaborator Author

xgreenx commented Jan 27, 2022

ink-waterfall didn't update the sizes=)

@HCastano
Copy link
Contributor

ink-waterfall didn't update the sizes=)

Yep, looks like there's a problem (paritytech/ink-waterfall#20).

The job did succeed though, and here are the results (link to run):

|                      | Δ Optimized Size | Δ Used Gas | Total Optimized Size | Total Used Gas |
|----------------------|------------------|------------|----------------------|----------------|
| `accumulator`        | +0.19 K          |            | 1.62 K               |                |
| `adder`              | -0.12 K          |            | 2.82 K               |                |
| `contract-terminate` | -0.12 K          | -1         | 1.19 K               | 200_777        |
| `contract-transfer`  | -0.05 K          | +16        | 7.94 K               | 1_415          |
| `delegator`          | -0.86 K          | -1_063     | 9.20 K               | 8_447          |
| `dns`                | -1.14 K          | -1_816     | 9.33 K               | 12_472         |
| `erc1155`            | -2.88 K          | -3_718     | 24.62 K              | 44_093         |
| `erc20`              | -2.06 K          | -1_929     | 8.16 K               | 7_828          |
| `erc721`             | -2.44 K          | +1_264     | 11.26 K              | 35_966         |
| `flipper`            | +0.04 K          | +21        | 1.81 K               | 368            |
| `incrementer`        | +0.05 K          | +23        | 1.69 K               | 363            |
| `multisig`           | -1.20 K          | -6_313     | 27.36 K              | 48_583         |
| `proxy`              | -0.23 K          | -80        | 3.58 K               | 2_549          |
| `rand-extension`     | -1.07 K          | -92        | 4.04 K               | 6_562          |
| `subber`             | -0.13 K          |            | 2.83 K               |                |
| `trait-erc20`        | -2.08 K          | -1_928     | 8.44 K               | 8_716          |
| `trait-flipper`      | +0.09 K          | +22        | 1.62 K               | 364            |
| `trait-incrementer`  | +0.11 K          | +47        | 1.73 K               | 727            |

So the savings are alright, but they're not mindblowing either. I'm on the fence about this given the nature of the changes, but give me some time to try and better understand the new dispatch logic you're proposing before we decide what to do

@xgreenx
Copy link
Collaborator Author

xgreenx commented Jan 27, 2022

So the savings are alright, but they're not mindblowing either

The more methods in the contract, the stronger the effect of the change.
That contract contains a lot of methods and the change saved 6489 bytes

@HCastano
Copy link
Contributor

@xgreenx This has been neglected for a while (I guess mostly by us, sorry).

If you want to revive it please open a new PR so we can a fresh perspective and set of sizes

@HCastano HCastano closed this Feb 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants