Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multi-value emulation: Use globals, not allocated tuples #3566

Merged
merged 4 commits into from
Nov 12, 2022

Conversation

nomeata
Copy link
Collaborator

@nomeata nomeata commented Nov 10, 2022

So far, the IC does not support wasm using multi-value returns from functions and blocks (see #1459). The backend was originally written with that feature in mind, and later we added a work-around (module FakeMultiVal).

For functions we work around it by storing multiple values in fixed globals, and reading them after the call. No allocations needed.

For blocks with multiple values, however, we allocate a tuple (array) on the heap, write the values there, and read them later. This adds unnecessary heap churn and fills up memory faster.

So this PR uses the mechanism from the functions also for blocks.

Goes well with #3556

Here is a (particularly bad) example:

In

func multi_value_blocks(n : Nat) : (Nat, Nat) {
  return (
    if (n == 0) {
      returns_tuple()
    } else {
      if (n == 1) {
        returns_tuple()
      } else {
        returns_tuple()
      }
   }
 );
};

Now no allocation happens (and all the reads and writes to the global variables are eliminated due to #3556):

   (func $multi_value_blocks (type 1) (param $clos i32) (param $n i32)
     local.get $n
     i32.const 0
     call $B_eq
-    if (result i32)  ;; label = @1
+    if  ;; label = @1
       i32.const 0
       call $returns_tuple
-      global.get 4
-      global.get 5
-      call $to_2_tuple
     else
       local.get $n
       i32.const 2
       call $B_eq
-      if (result i32)  ;; label = @2
+      if  ;; label = @2
         i32.const 0
         call $returns_tuple
-        global.get 4
-        global.get 5
-        call $to_2_tuple
       else
         i32.const 0
         call $returns_tuple
-        global.get 4
-        global.get 5
-        call $to_2_tuple
       end
     end
-    call $from_2_tuple
     return)

So far, the IC does not support wasm using multi-value returns from
functions and blocks (see #1459). The backend was originally written
with that feature in mind, and later we added a work-around (module
`FakeMultiVal`).

For functions we work around it by storing multiple values in fixed
globals, and reading them after the call. No allocations needed.

For blocks with multiple values, however, we allocate a tuple (array) on the
heap, write the values there, and read them later. This adds unnecessary
heap churn and fills up memory faster.

So this PR uses the mechanism from the functions also for blocks.

Goes well with #3556
@ghost ghost added the cla:agreed label Nov 10, 2022
@github-actions
Copy link

github-actions bot commented Nov 10, 2022

Comparing from 480a3bb to 5285ed1:
In terms of gas, 1 tests improved and the mean change is -0.0%.
In terms of size, 1 tests regressed and the mean change is +0.0%.

@nomeata nomeata marked this pull request as ready for review November 11, 2022 10:44
@ggreif ggreif requested review from kentosugama and ggreif November 11, 2022 23:45
Copy link
Contributor

@crusso crusso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but @ggreif should take a look too.

Will this automatically kick in for switches too, assuming we reduce those to if_?

@ggreif
Copy link
Contributor

ggreif commented Nov 12, 2022

Will this automatically kick in for switches too, assuming we reduce those to if_?

Last time I looked the switch result was communicated back via Wasm locals (set by assignment).

@nomeata
Copy link
Collaborator Author

nomeata commented Nov 12, 2022

Last time I looked the switch result was communicated back via Wasm locals (set by assignment).

Correct:

  | SwitchE (e, cs) ->
    SR.Vanilla,
    let code1 = compile_exp_vanilla env ae e in
    let (set_i, get_i) = new_local env "switch_in" in
    let (set_j, get_j) = new_local env "switch_out" in

    let rec go env cs = match cs with
      | [] -> CanFail (fun k -> k)
      | {it={pat; exp=e}; _}::cs ->
          let (ae1, code) = compile_pat_local env ae pat in
          orElse ( CannotFail get_i ^^^ code ^^^
                   CannotFail (compile_exp_vanilla env ae1 e) ^^^ CannotFail set_j)
                 (go env cs)
          in
      let code2 = go env cs in
      code1 ^^ set_i ^^ orTrap env code2 ^^ get_j

I don’t remember why it was hard to return the result on the stack. Maybe the pattern matching code DSL (CannotFail etc.) simply wasn’t set up for it. No need to change that now.

@crusso
Copy link
Contributor

crusso commented Nov 12, 2022

Is there any chance of this code going wrong if you are already in the midst of producing a flattened tuple on the stack?

@nomeata
Copy link
Collaborator Author

nomeata commented Nov 12, 2022

You mean the Switch code or the IF code?

But I don’t think so; whatever is below the stack is untouched by the generated code.

@crusso
Copy link
Contributor

crusso commented Nov 12, 2022

You mean the Switch code or the IF code?

But I don’t think so; whatever is below the stack is untouched by the generated code.

I meant this PR's code, but I guess these are just extra registers and never live long after a call but always loaded onto the stack

1 similar comment
@crusso
Copy link
Contributor

crusso commented Nov 12, 2022

You mean the Switch code or the IF code?

But I don’t think so; whatever is below the stack is untouched by the generated code.

I meant this PR's code, but I guess these are just extra registers and never live long after a call but always loaded onto the stack

@nomeata
Copy link
Collaborator Author

nomeata commented Nov 12, 2022

Right, they are always short-lived, from the end of a block or function to directly after the block (or after the call).

Copy link
Contributor

@crusso crusso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice one!

@nomeata nomeata added the automerge-squash When ready, merge (using squash) label Nov 12, 2022
@mergify mergify bot merged commit 1fde46a into master Nov 12, 2022
@mergify mergify bot deleted the joachim/block-multi-value branch November 12, 2022 22:06
@mergify mergify bot removed the automerge-squash When ready, merge (using squash) label Nov 12, 2022
mergify bot pushed a commit that referenced this pull request Mar 10, 2023
so that `switch` behaves like `if` does since #3566.

Only kicks in if _all_ branches produce a manifest tuple or call a function that returns
tuples; if only one branch has an expression that is a first-class tuple, all will return a
first-class tuple (could be improved by adding cases to `compile_exp_as`)

Should have made parts of
dfinity/motoko-base#535 not necessary.

Example code:
```
func multi_value_switch(n : Nat) : (Nat, Nat) {
  switch n {
    case 0 { returns_tuple() };
    case 1 { returns_tuple() };
    case _ { (5,6) };
  };
};
```

Changes (calls to `to_2_tuple` are allocating):
```diff
   (func $multi_value_switch (type 1) (param $clos i32) (param $n i32)
-    (local $switch_in i32) (local $switch_out i32) (local $n.1 i32)
+    (local $switch_in i32)
     local.get $n
     local.set $switch_in
-    block (result i32)  ;; label = @1
-      local.get $switch_in
-      i32.const 0
-      call $B_eq
-      if  ;; label = @2
-      else
-        i32.const 0
-        br 1 (;@1;)
-      end
-      i32.const 0
-      call $returns_tuple
-      global.get 17
-      global.get 18
-      call $to_2_tuple
-      local.set $switch_out
-      i32.const 1
-    end
-    if  ;; label = @1
-    else
+    block  ;; label = @1
       block (result i32)  ;; label = @2
         local.get $switch_in
-        i32.const 2
+        i32.const 0
         call $B_eq
         if  ;; label = @3
         else
           i32.const 0
           br 1 (;@2;)
         end
         i32.const 0
         call $returns_tuple
-        global.get 17
-        global.get 18
-        call $to_2_tuple
-        local.set $switch_out
-        i32.const 1
+        br 1 (;@1;)
       end
       if  ;; label = @2
       else
-        i32.const 2097763
-        local.set $switch_out
+        block (result i32)  ;; label = @3
+          local.get $switch_in
+          i32.const 2
+          call $B_eq
+          if  ;; label = @4
+          else
+            i32.const 0
+            br 1 (;@3;)
+          end
+          i32.const 0
+          call $returns_tuple
+          br 2 (;@1;)
+        end
+        if  ;; label = @3
+        else
+          i32.const 10
+          i32.const 12
+          global.set 18
+          global.set 17
+          br 2 (;@1;)
+        end
       end
-    end
-    local.get $switch_out
-    call $from_2_tuple)
+    end)
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants