Skip to content

Commit

Permalink
Automatically merged updates to draft EIP(s) 615 (ethereum#2044)
Browse files Browse the repository at this point in the history
Hi, I'm a bot! This change was automatically merged because:

 - It only modifies existing Draft or Last Call EIP(s)
 - The PR was approved or written by at least one author of each modified EIP
 - The build is passing
  • Loading branch information
expede authored and ilanolkies committed Nov 12, 2019
1 parent 47b3d0b commit 212a7f3
Showing 1 changed file with 14 additions and 14 deletions.
28 changes: 14 additions & 14 deletions EIPS/eip-615.md
Original file line number Diff line number Diff line change
Expand Up @@ -69,9 +69,9 @@ Especially important is efficient translation to and from [eWasm](https://github
These forms
> *`INSTRUCTION`*
>
> *`INSTRUCTION x`*
> *`INSTRUCTION x`*
>
> *`INSTRUCTION x, y`*
> *`INSTRUCTION x, y`*
name an *`INSTRUCTION`* with no, one and two arguments, respectively. An instruction is represented in the bytecode as a single-byte opcode. Any arguments are laid out as immediate data bytes following the opcode inline, interpreted as fixed length, MSB-first, two's-complement, two-byte positive integers. (Negative values are reserved for extensions.)

Expand Down Expand Up @@ -102,7 +102,7 @@ To support subroutines, `BEGINSUB`, `JUMPSUB`, and `RETURNSUB` are provided. Br
#### Switches, Callbacks, and Virtual Functions

Dynamic jumps are also used for `O(1)` indirection: an address to jump to is selected to push on the stack and be jumped to. So we also propose two more instructions to provide for constrained indirection. We support these with vectors of `JUMPDEST` or `BEGINSUB` offsets stored inline, which can be selected with an index on the stack. That constrains validation to a specified subset of all possible destinations. The danger of quadratic blow up is avoided because it takes as much space to store the jump vectors as it does to code the worst case exploit.
Dynamic jumps are also used for `O(1)` indirection: an address to jump to is selected to push on the stack and be jumped to. So we also propose two more instructions to provide for constrained indirection. We support these with vectors of `JUMPDEST` or `BEGINSUB` offsets stored inline, which can be selected with an index on the stack. That constrains validation to a specified subset of all possible destinations. The danger of quadratic blow up is avoided because it takes as much space to store the jump vectors as it does to code the worst case exploit.

Dynamic jumps to a `JUMPDEST` are used to implement `O(1)` jumptables, which are useful for dense switch statements. Wasm and most CPUs provide similar instructions.

Expand Down Expand Up @@ -193,7 +193,7 @@ frame | 21
______|___________ 22
<- SP
```
and after pushing two arguments and branching with `JUMPSUB` to a `BEGINSUB 2, 3`
and after pushing two arguments and branching with `JUMPSUB` to a `BEGINSUB 2, 3`
```
PUSH 10
PUSH 11
Expand Down Expand Up @@ -256,7 +256,7 @@ _Execution_ is as defined in the [Yellow Paper](https://ethereum.github.io/yello
>**5** Invalid instruction
We propose to expand and extend the Yellow Paper conditions to handle the new instructions we propose.
We propose to expand and extend the Yellow Paper conditions to handle the new instructions we propose.

To handle the return stack we expand the conditions on stack size:
>**2a** The size of the data stack does not exceed 1024.
Expand Down Expand Up @@ -290,7 +290,7 @@ All of the remaining conditions we validate statically.

#### Costs & Codes

All of the instructions are `O(1)` with a small constant, requiring just a few machine operations each, whereas a `JUMP` or `JUMPI` must do an O(log n) binary search of an array of `JUMPDEST` offsets before every jump. With the cost of `JUMPI` being _high_ and the cost of `JUMP` being _mid_, we suggest the cost of `JUMPV` and `JUMPSUBV` should be _mid_, `JUMPSUB` and `JUMPIF` should be _low_, and`JUMPTO` and the rest should be _verylow_. Measurement will tell.
All of the instructions are `O(1)` with a small constant, requiring just a few machine operations each, whereas a `JUMP` or `JUMPI` must do an `O(log n)` binary search of an array of `JUMPDEST` offsets before every jump. With the cost of `JUMPI` being _high_ and the cost of `JUMP` being _mid_, we suggest the cost of `JUMPV` and `JUMPSUBV` should be _mid_, `JUMPSUB` and `JUMPIF` should be _low_, and`JUMPTO` and the rest should be _verylow_. Measurement will tell.

We suggest the following opcodes:
```
Expand All @@ -315,7 +315,7 @@ These changes would need to be implemented in phases at decent intervals:
If desired, the period of deprecation can be extended indefinitely by continuing to accept code not versioned as new—but without validation. That is, by delaying or canceling phase 2.

Regardless, we will need a versioning scheme like [EIP-1702](https://github.com/ethereum/EIPs/pull/1702) to allow current code and EIP-615 code to coexist on the same blockchain.
Regardless, we will need a versioning scheme like [EIP-1702](https://github.com/ethereum/EIPs/pull/1702) to allow current code and EIP-615 code to coexist on the same blockchain.

## Rationale

Expand All @@ -325,7 +325,7 @@ As described above, the approach was simply to deprecate the problematic dynamic

## Implementation

Implementation of this proposal need not be difficult. At the least, interpreters can simply be extended with the new opcodes and run unchanged otherwise. The new opcodes require only stacks for the frame pointers and return offsets and the few pushes, pops, and assignments described above. The bulk of the effort is the validator, which in most languages can almost be transcribed from the pseudocode above.
Implementation of this proposal need not be difficult. At the least, interpreters can simply be extended with the new opcodes and run unchanged otherwise. The new opcodes require only stacks for the frame pointers and return offsets and the few pushes, pops, and assignments described above. The bulk of the effort is the validator, which in most languages can almost be transcribed from the pseudocode above.

A lightly tested C++ reference implementation is available in [Greg Colvin's Aleth fork.](https://github.com/gcolvin/aleth/tree/master/libaleth-interpreter) This version required circa 110 lines of new interpreter code and a well-commented, 178-line validator.

Expand All @@ -346,7 +346,7 @@ Validating that jumps are to valid addresses takes two sequential passes over th
is_sub[code_size] // is there a BEGINSUB at PC?
is_dest[code_size] // is there a JUMPDEST at PC?
sub_for_pc[code_size] // which BEGINSUB is PC in?
bool validate_jumps(PC)
{
current_sub = PC
Expand All @@ -366,7 +366,7 @@ Validating that jumps are to valid addresses takes two sequential passes over th
is_dest[PC] = true
sub_for_pc[PC] = current_sub
}
// check that targets are in subroutine
for (PC = 0; instruction = bytecode[PC]; PC = advance_pc(PC))
{
Expand All @@ -390,7 +390,7 @@ Note that code like this is already run by EVMs to check dynamic jumps, includin

#### Subroutine Validation

This function can be seen as a symbolic execution of a subroutine in the EVM code, where only the effect of the instructions on the state being validated is computed. Thus the structure of this function is very similar to an EVM interpreter. This function can also be seen as an acyclic traversal of the directed graph formed by taking instructions as vertexes and sequential and branching connections as edges, checking conditions along the way. The traversal is accomplished via recursion, and cycles are broken by returning when a vertex which has already been visited is reached. The time complexity of this traversal is `O(|E|+|V|): The sum of the number of edges and number of verticies in the graph.
This function can be seen as a symbolic execution of a subroutine in the EVM code, where only the effect of the instructions on the state being validated is computed. Thus the structure of this function is very similar to an EVM interpreter. This function can also be seen as an acyclic traversal of the directed graph formed by taking instructions as vertexes and sequential and branching connections as edges, checking conditions along the way. The traversal is accomplished via recursion, and cycles are broken by returning when a vertex which has already been visited is reached. The time complexity of this traversal is `O(|E|+|V|)`: The sum of the number of edges and number of verticies in the graph.

The basic approach is to call `validate_subroutine(i, 0, 0)`, for `i` equal to the first instruction in the EVM code through each `BEGINDATA` offset. `validate_subroutine()` traverses instructions sequentially, recursing when `JUMP` and `JUMPI` instructions are encountered. When a destination is reached that has been visited before it returns, thus breaking cycles. It returns true if the subroutine is valid, false otherwise.

Expand Down Expand Up @@ -440,7 +440,7 @@ The basic approach is to call `validate_subroutine(i, 0, 0)`, for `i` equal to t
return false
if instruction is STOP, RETURN, or SUICIDE
return true
return true
// violates single entry
if instruction is BEGINSUB
Expand All @@ -463,10 +463,10 @@ The basic approach is to call `validate_subroutine(i, 0, 0)`, for `i` equal to t
if instruction is JUMPTO
{
PC = jump_target(PC)
continue
continue
}
// recurse to jump to code to validate
// recurse to jump to code to validate
if instruction is JUMPIF
{
if not validate_subroutine(jump_target(PC), return_pc, SP)
Expand Down

0 comments on commit 212a7f3

Please sign in to comment.