feat(fw): implement execute to run tests on live networks (#686)

Co-authored-by: danceratopz <[email protected]>
ethereum · Oct 24, 2024 · ac615f6
1 parent 247c312
commit ac615f6
Showing 39 changed files with 2,926 additions and 151 deletions.
diff --git a/docs/executing_tests/index.md b/docs/executing_tests/index.md
@@ -0,0 +1,117 @@
+# Executing Tests on Local Networks or Hive
+
+@ethereum/execution-spec-tests can run tests on local Hive-controlled networks or on live remote networks, with a few considerations. This page describes how to do so.
+
+## The `execute` command and `pytest` plugin
+
+The `execute` command can parse and execute all tests in the `tests` directory: it collects the transactions each test requires, sends them to a client connected to a network, waits for the network to include them in a block and, finally, checks the resulting state of the involved smart-contracts against the expected state to validate the behavior of the clients.
+
+It does not check the state of the network itself, only the state of the smart-contracts, accounts and transactions involved in the tests. If the network becomes unstable or forks while the tests run, the command will not detect it.
+
+This is achieved with a pytest plugin that collects all the tests the same way the fill plugin does, but instead of compiling the transactions and sending them as a batch to the transition tool, it prepares them and sends them to the client one by one.
+
+Before sending the actual test transactions to the client, the plugin uses a special pre-allocation object that collects the contracts and EOAs used by the tests. Instead of pre-allocating them in a dictionary as the fill plugin does, it sends transactions that deploy the contracts and fund the accounts so that they are available in the network.
+
+The pre-allocation object requires a seed account with funds available in the network to be able to deploy contracts and fund accounts. In the case of a live remote network, the seed account needs to be provided via a command-line parameter, but in the case of a local hive network, the seed account is automatically created and funded by the plugin via the genesis file.
+
+At the end of each test, the plugin will also check the remaining balance of all accounts and will attempt to automatically recover the funds back to the seed account in order to execute the following tests.
+
+## Differences between the `fill` and `execute` plugins
+
+The test execution with the `execute` plugin is different from the `fill` plugin in a few ways:
+
+### EOA and Contract Addresses
+
+The `fill` plugin pre-allocates all the accounts and contracts used in the tests, so their addresses are known before the tests are executed. Furthermore, the test contracts start from the same address in every test, so account addresses collide across different tests. This is not the case with the `execute` plugin: accounts and contracts are deployed on the fly from sender keys that are randomly generated and therefore differ in each execution.
+
+The reasoning behind the random generation of the sender keys is that the same test can be executed multiple times in the same network without the plugin failing because the accounts and contracts are already deployed.
+
+### Transactions Gas Price
+
+The `fill` plugin uses a fixed, minimal gas price for all the transactions it uses for testing. This is not possible with the `execute` plugin, because the gas price is determined by the current state of the network.
+
+At the moment, the `execute` plugin does not query the client for the current gas price. Instead, it applies a fixed increment to the gas price to keep the transactions from getting stuck in the mempool.
+
+## Running Tests on a Hive Single-Client Local Network
+
+Tests can be executed on a local hive-controlled single-client network by running the `execute hive` command.
+
+This command requires hive to be running in `--dev` mode:
+
+```bash
+./hive --dev --client go-ethereum
+```
+
+This will start hive in dev mode with the single go-ethereum client available for launching tests.
+
+By default, the hive server will be listening on `http://127.0.0.1:3000`, but this can be changed by setting the `--dev.addr` flag:
+
+```bash
+./hive --dev --client go-ethereum --dev.addr http://127.0.0.1:5000
+```
+
+The `execute hive` command can now be run to connect to the hive server, but the environment variable `HIVE_SIMULATOR` must first be set to the address of the hive server:
+
+```bash
+export HIVE_SIMULATOR=http://127.0.0.1:3000
+```
+
+And the tests can be executed with:
+
+```bash
+uv run execute hive --fork=Cancun
+```
+
+This will execute all available tests in the `tests` directory on the `Cancun` fork by connecting to the hive server running on `http://127.0.0.1:3000` and launching a single client with the appropriate genesis file.
+
+The genesis file is passed to the client with the appropriate configuration for the fork schedule, system contracts and pre-allocated seed account.
+
+By default, all tests are executed serially in the same network and the same client. When the `-n auto` parameter is passed to the command, the tests are executed in parallel instead.
+
+One important feature of the `execute hive` command is that, since there is no consensus client running in the network, the command drives the chain by the use of the Engine API to prompt the execution client to generate new blocks and include the transactions in them.
+
+## Running Tests on a Live Remote Network
+
+Tests can be executed on a live remote network by running the `execute remote` command.
+
+The command also accepts the `--fork` flag which should match the fork that is currently active in the network (fork transition tests are not supported yet).
+
+The `execute remote` command must be pointed to an RPC endpoint of a client that is connected to the network, which is specified with the `--rpc-endpoint` flag:
+
+```bash
+uv run execute remote --rpc-endpoint=https://rpc.endpoint.io
+```
+
+Another requirement is that the command is provided with a seed account that has funds available in the network to deploy contracts and fund accounts. This can be done by setting the `--rpc-seed-key` flag:
+
+```bash
+uv run execute remote --rpc-endpoint=https://rpc.endpoint.io --rpc-seed-key 0x000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f
+```
+
+The value needs to be a private key that is used to sign the transactions that deploy the contracts and fund the accounts.
+
+One last requirement is that the `--rpc-chain-id` flag is set to the chain id of the network that is being tested:
+
+```bash
+uv run execute remote --rpc-endpoint=https://rpc.endpoint.io --rpc-seed-key 0x000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f --rpc-chain-id 12345
+```
+
+## `execute` Command Test Execution
+
+After running either `execute hive` or `execute remote`, the command first creates a random sender account from which all required test accounts are deployed and funded. By default, this sender account is funded by sweeping the seed account.
+
+The sweep amount can be configured by setting the `--seed-account-sweep-amount` flag:
+
+```bash
+--seed-account-sweep-amount "1000 ether"
+```
+
+Once the sender account is funded, the command will start executing tests one by one by sending the transactions from this account to the network.
+
+Test transactions are not sent from the main sender account, though. They are sent from a different, unique account that is created for each test (the accounts returned by `pre.fund_eoa`).
+
+If the command is run with the `-n` flag, the tests are executed in parallel and each process has its own separate sender account. The amount swept from the seed account is therefore divided by the number of processes, which has to be taken into account when setting the sweep amount and when funding the seed account.
+
+After finishing each test, the command checks the remaining balance of all accounts and attempts to recover the funds back to the sender account. At the end of all tests, the remaining balance of the sender account is swept back to the seed account.
+
+In some cases it is impossible to recover the funds from a test. For example, funds sent to a contract that has no built-in way to send them back remain stuck in the contract and cannot be recovered.
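
Review note on the sweep-amount paragraph above: the docs say the swept amount is divided by the number of processes when `-n` is used. The following is a minimal sketch of that arithmetic, not the plugin's actual code. The helper name `per_worker_share`, the constant `WEI_PER_ETHER`, and the use of integer division (leaving any remainder in the seed account) are assumptions made for illustration.

```python
# Hypothetical helper: illustrates the arithmetic only, not the plugin's implementation.
WEI_PER_ETHER = 10**18


def per_worker_share(sweep_amount_wei: int, workers: int) -> int:
    """Return the amount each parallel worker's sender account would receive.

    Assumes an even split using integer division; any remainder is assumed
    to stay in the seed account.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return sweep_amount_wei // workers


# A sweep amount of "1000 ether" split across 4 workers.
share = per_worker_share(1000 * WEI_PER_ETHER, 4)
print(share // WEI_PER_ETHER)  # 250
```

Under these assumptions, a sweep amount that is comfortable for a serial run may leave each worker underfunded in a parallel run, which is why the docs recommend accounting for the process count when choosing the amount.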
diff --git a/docs/navigation.md b/docs/navigation.md
@@ -27,6 +27,7 @@
     * [EOF Tests](consuming_tests/eof_test.md)
     * [Common Types](consuming_tests/common_types.md)
     * [Exceptions](consuming_tests/exceptions.md)
+  * [Executing Tests](executing_tests/index.md)
   * [Getting Help](getting_help/index.md)
   * [Developer Doc](dev/index.md)
     * [Managing Configurations](dev/configurations.md)

diff --git a/docs/writing_tests/test_markers.md b/docs/writing_tests/test_markers.md
@@ -271,6 +271,48 @@ def test_something_with_all_tx_types_but_skip_type_1(state_test_only, tx_type):
 
 In this example, the test will be skipped if `tx_type` is equal to 1 by returning a `pytest.mark.skip` marker, and return `None` otherwise.
 
+## Fill/Execute Markers
+
+These markers are used to apply different markers to a test depending on whether it is being filled or executed.
+
+### `@pytest.mark.fill`
+
+This marker is used to apply markers to a test when it is being filled.
+
+```python
+import pytest
+
+from ethereum_test_tools import Alloc, StateTestFiller
+
+@pytest.mark.fill(pytest.mark.skip(reason="Only for execution"))
+def test_something(
+    state_test: StateTestFiller,
+    pre: Alloc
+):
+    pass
+```
+
+In this example, the test will be skipped when it is being filled.
+
+### `@pytest.mark.execute`
+
+This marker is used to apply markers to a test when it is being executed.
+
+```python
+import pytest
+
+from ethereum_test_tools import Alloc, StateTestFiller
+
+@pytest.mark.execute(pytest.mark.xfail(reason="Depends on block context"))
+def test_something(
+    state_test: StateTestFiller,
+    pre: Alloc
+):
+    pass
+```
+
+In this example, the test is marked as expected to fail when it is being executed. This is particularly useful because the test still runs but does not fail the test run.
+
 ## Other Markers
 
 ### `@pytest.mark.slow`
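
Review note on the `fill`/`execute` markers documented in this hunk: the docs describe them as wrappers whose inner markers apply only in the matching mode. The sketch below shows one way such unwrapping could work, using plain stand-in objects rather than pytest itself. The `Marker` class and `resolve_markers` function are hypothetical and are not taken from `pytest_plugins.shared.execute_fill`.

```python
from dataclasses import dataclass
from typing import Any, List, Tuple


@dataclass(frozen=True)
class Marker:
    """Stand-in for a pytest mark: a name plus positional arguments."""

    name: str
    args: Tuple[Any, ...] = ()


def resolve_markers(markers: List[Marker], mode: str) -> List[Marker]:
    """Unwrap `fill`/`execute` wrappers for the active mode and drop the other mode's.

    `mode` is either "fill" or "execute". Markers that are not wrappers
    are passed through unchanged.
    """
    resolved: List[Marker] = []
    for marker in markers:
        if marker.name in ("fill", "execute"):
            if marker.name == mode:
                # The wrapped markers only take effect in the matching mode.
                resolved.extend(marker.args)
        else:
            resolved.append(marker)
    return resolved


markers = [Marker("slow"), Marker("execute", (Marker("xfail"),))]
print([m.name for m in resolve_markers(markers, "execute")])  # ['slow', 'xfail']
print([m.name for m in resolve_markers(markers, "fill")])  # ['slow']
```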

diff --git a/pyproject.toml b/pyproject.toml
@@ -43,6 +43,7 @@ dependencies = [
     "ethereum-types>=0.2.1,<0.3",
     "pyyaml>=6.0.2",
     "types-pyyaml>=6.0.12.20240917",
+    "pytest-json-report>=1.5.0,<2",
 ]
 
 [project.urls]
@@ -85,6 +86,7 @@ docs = [
 [project.scripts]
 fill = "cli.pytest_commands.fill:fill"
 phil = "cli.pytest_commands.fill:phil"
+execute = "cli.pytest_commands.execute:execute"
 tf = "cli.pytest_commands.fill:tf"
 checkfixtures = "cli.check_fixtures:check_fixtures"
 consume = "cli.pytest_commands.consume:consume"

diff --git a/pytest-execute-hive.ini b/pytest-execute-hive.ini
@@ -0,0 +1,24 @@
+[pytest]
+console_output_style = count
+minversion = 7.0
+python_files = *.py
+testpaths = tests/
+markers =
+    slow
+    pre_alloc_modify
+addopts = 
+    -p pytest_plugins.concurrency
+    -p pytest_plugins.execute.sender
+    -p pytest_plugins.execute.pre_alloc
+    -p pytest_plugins.solc.solc
+    -p pytest_plugins.execute.rpc.hive
+    -p pytest_plugins.execute.execute
+    -p pytest_plugins.shared.execute_fill
+    -p pytest_plugins.forks.forks
+    -p pytest_plugins.spec_version_checker.spec_version_checker
+    -p pytest_plugins.pytest_hive.pytest_hive
+    -p pytest_plugins.help.help
+    -m "not eip_version_check"
+    --tb short
+    --dist loadscope
+    --ignore tests/cancun/eip4844_blobs/point_evaluation_vectors/
diff --git a/pytest-execute-recover.ini b/pytest-execute-recover.ini
@@ -0,0 +1,15 @@
+[pytest]
+console_output_style = count
+minversion = 7.0
+python_files = *.py
+testpaths = src/pytest_plugins/execute/test_recover.py
+markers =
+    slow
+    pre_alloc_modify
+addopts = 
+    -p pytest_plugins.execute.rpc.remote
+    -p pytest_plugins.execute.recover
+    -p pytest_plugins.help.help
+    -m "not eip_version_check"
+    --tb short
+    --dist loadscope
diff --git a/pytest-execute.ini b/pytest-execute.ini
@@ -0,0 +1,24 @@
+[pytest]
+console_output_style = count
+minversion = 7.0
+python_files = *.py
+testpaths = tests/
+markers =
+    slow
+    pre_alloc_modify
+addopts = 
+    -p pytest_plugins.concurrency
+    -p pytest_plugins.execute.sender
+    -p pytest_plugins.execute.pre_alloc
+    -p pytest_plugins.solc.solc
+    -p pytest_plugins.execute.execute
+    -p pytest_plugins.shared.execute_fill
+    -p pytest_plugins.execute.rpc.remote_seed_sender
+    -p pytest_plugins.execute.rpc.remote
+    -p pytest_plugins.forks.forks
+    -p pytest_plugins.spec_version_checker.spec_version_checker
+    -p pytest_plugins.help.help
+    -m "not eip_version_check"
+    --tb short
+    --dist loadscope
+    --ignore tests/cancun/eip4844_blobs/point_evaluation_vectors/
diff --git a/pytest-framework.ini b/pytest-framework.ini
@@ -14,3 +14,4 @@ addopts =
     --ignore=src/pytest_plugins/consume/direct/test_via_direct.py
     --ignore=src/pytest_plugins/consume/hive_simulators/engine/test_via_engine.py
     --ignore=src/pytest_plugins/consume/hive_simulators/rlp/test_via_rlp.py
+    --ignore=src/pytest_plugins/execute/test_recover.py
diff --git a/pytest.ini b/pytest.ini
@@ -11,6 +11,7 @@ addopts =
     -p pytest_plugins.filler.pre_alloc
     -p pytest_plugins.solc.solc
     -p pytest_plugins.filler.filler
+    -p pytest_plugins.shared.execute_fill
     -p pytest_plugins.forks.forks
     -p pytest_plugins.spec_version_checker.spec_version_checker
     -p pytest_plugins.eels_resolver

diff --git a/src/cli/pytest_commands/common.py b/src/cli/pytest_commands/common.py
@@ -2,7 +2,7 @@
 Common functions for CLI pytest-based entry points.
 """
 
-from typing import Any, Callable, List
+from typing import Any, Callable, Dict, List
 
 import click
 
@@ -38,6 +38,31 @@ def common_click_options(func: Callable[..., Any]) -> Decorator:
     return click.argument("pytest_args", nargs=-1, type=click.UNPROCESSED)(func)
 
 
+REQUIRED_FLAGS: Dict[str, List[str]] = {
+    "fill": [],
+    "consume": [],
+    "execute": [
+        "--rpc-endpoint",
+        "x",
+        "--rpc-seed-key",
+        "x",
+        "--rpc-chain-id",
+        "1",
+    ],
+    "execute-hive": [],
+    "execute-recover": [
+        "--rpc-endpoint",
+        "x",
+        "--rpc-chain-id",
+        "1",
+        "--start-eoa-index",
+        "1",
+        "--destination",
+        "0x1234567890123456789012345678901234567890",
+    ],
+}
+
+
 def handle_help_flags(pytest_args: List[str], pytest_type: str) -> List[str]:
     """
     Modifies the help arguments passed to the click CLI command before forwarding to
@@ -49,7 +74,11 @@ def handle_help_flags(pytest_args: List[str], pytest_type: str) -> List[str]:
     ctx = click.get_current_context()
 
     if ctx.params.get("help_flag"):
-        return [f"--{pytest_type}-help"] if pytest_type in {"consume", "fill"} else pytest_args
+        return (
+            [f"--{pytest_type}-help", *REQUIRED_FLAGS[pytest_type]]
+            if pytest_type in {"consume", "fill", "execute", "execute-hive", "execute-recover"}
+            else pytest_args
+        )
     elif ctx.params.get("pytest_help_flag"):
         return ["--help"]
 

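Review note on the `handle_help_flags` change above: when the help flag is set, the new code appends the entries of `REQUIRED_FLAGS` to the command-specific help flag. The diff does not state why; presumably the placeholder values (`"x"`, `"1"`) satisfy options that are otherwise required, so that help can be printed without a real endpoint or key. The sketch below reproduces only the list construction, without `click`. The function name `build_help_args` is hypothetical, and the dictionary is abbreviated to two of the entries from the diff.

```python
from typing import Dict, List

# Abbreviated copy of the mapping from the diff (two of the five entries).
REQUIRED_FLAGS: Dict[str, List[str]] = {
    "fill": [],
    "execute": [
        "--rpc-endpoint",
        "x",
        "--rpc-seed-key",
        "x",
        "--rpc-chain-id",
        "1",
    ],
}


def build_help_args(pytest_type: str) -> List[str]:
    """Build the pytest arguments used when the command's help flag is set."""
    return [f"--{pytest_type}-help", *REQUIRED_FLAGS[pytest_type]]


print(build_help_args("fill"))  # ['--fill-help']
print(build_help_args("execute"))
```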
diff --git a/src/cli/pytest_commands/execute.py b/src/cli/pytest_commands/execute.py
@@ -0,0 +1,71 @@
+"""
+CLI entry point for the `execute` pytest-based command.
+"""
+
+import sys
+from typing import Tuple
+
+import click
+import pytest
+
+from .common import common_click_options, handle_help_flags
+
+
+@click.group(context_settings=dict(help_option_names=["-h", "--help"]))
+def execute() -> None:
+    """
+    Execute command to run tests in hive or live networks.
+    """
+    pass
+
+
+@execute.command(context_settings=dict(ignore_unknown_options=True))
+@common_click_options
+def hive(
+    pytest_args: Tuple[str, ...],
+    **kwargs,
+) -> None:
+    """
+    Execute tests using hive in dev-mode as backend, requires hive to be running
+    (using command: `./hive --dev`).
+    """
+    pytest_type = "execute-hive"
+    args = handle_help_flags(list(pytest_args), pytest_type=pytest_type)
+    ini_file = "pytest-execute-hive.ini"
+    args = ["-c", ini_file] + args
+    result = pytest.main(args)
+    sys.exit(result)
+
+
+@execute.command(context_settings=dict(ignore_unknown_options=True))
+@common_click_options
+def remote(
+    pytest_args: Tuple[str, ...],
+    **kwargs,
+) -> None:
+    """
+    Execute tests using a remote RPC endpoint.
+    """
+    pytest_type = "execute"
+    args = handle_help_flags(list(pytest_args), pytest_type=pytest_type)
+    ini_file = "pytest-execute.ini"
+    args = ["-c", ini_file] + args
+    result = pytest.main(args)
+    sys.exit(result)
+
+
+@execute.command(context_settings=dict(ignore_unknown_options=True))
+@common_click_options
+def recover(
+    pytest_args: Tuple[str, ...],
+    **kwargs,
+) -> None:
+    """
+    Recover funds from a failed test execution using a remote RPC endpoint.
+    """
+    pytest_type = "execute-recover"
+    args = handle_help_flags(list(pytest_args), pytest_type=pytest_type)
+    ini_file = "pytest-execute-recover.ini"
+    args = ["-c", ini_file] + args
+    result = pytest.main(args)
+    sys.exit(result)