perf(`anvil`): enhance block mining performance in Anvil node for high throughput and efficiency #7039

mshakeg · 2024-02-07T19:51:23Z

Component

Anvil

Describe the feature you would like

I propose a performance enhancement for the Anvil node, specifically targeting the efficiency of block mining. Through some tests I've observed that while Anvil demonstrates impressive transaction processing capabilities, there's a noticeable disparity in throughput efficiency primarily attributed to the time spent mining blocks. This feature request seeks optimizations in Anvil's block mining to reduce execution time, thereby increasing the overall transactions per second (TPS) throughput and making the node more suitable for applications requiring high transaction processing speeds as well as frequent mining of blocks.

Additional context

Anvil version: 0.2.0 (2cf84d9 2024-02-07T00:15:49.622159000Z)

To illustrate the current performance characteristics and provide a basis for this request, I conducted a test using a Uniswap V3 transaction replay script. The findings point to significant potential for performance gains in the block mining process. For instance, when increasing nullSwapsPerBlock from 1 to 2000, the average TPS improved by a factor of roughly 7 (from about 595 to about 4,041), indicating that the node spends a significant portion of its time mining blocks rather than executing transactions. To replicate this test:

clone this repo: anvil-backtester, then install deps (pnpm i)
start the anvil node: pnpm anvil:start
run the test script (pnpm test:anvil-memory) with nullSwapsPerBlock set to 1, then again with it set to 2000, and observe results similar to the following, which indicate significant overhead in mining blocks:

{
  blocksToMine: 25,
  nullSwapsPerBlock: 1,
  totalTxs: 50,
  executionTime: 0.084,
  averageTPS: 595.2380952380952,
  averageTimePerTx: 1.6800000000000002
}

{
  blocksToMine: 25,
  nullSwapsPerBlock: 2000,
  totalTxs: 100000,
  executionTime: 24.747,
  averageTPS: 4040.8938457186728,
  averageTimePerTx: 0.24747000000000002
}
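As a back-of-the-envelope reading of the two runs above, one can fit a simple linear cost model: time per block = fixed overhead + (txs per block × cost per tx). This is only a sketch. The linear model, the assumption that transactions are spread evenly across blocks, and the names `Run`, `BlockCost` and `fitBlockCost` are illustrative and not part of the test script.

```typescript
// Sketch: split per-block time into a fixed part and a per-transaction part,
// ASSUMING cost grows linearly with the number of transactions per block.
interface Run {
  blocksToMine: number;
  totalTxs: number;
  executionTime: number; // seconds, as reported by the test script
}

interface BlockCost {
  fixedMsPerBlock: number; // overhead paid once per mined block
  msPerTx: number; // marginal cost of one more transaction
}

function fitBlockCost(small: Run, large: Run): BlockCost {
  const perBlockMs = (r: Run) => (r.executionTime * 1000) / r.blocksToMine;
  const txsPerBlock = (r: Run) => r.totalTxs / r.blocksToMine;

  const dTxs = txsPerBlock(large) - txsPerBlock(small);
  if (dTxs === 0) throw new Error("runs must differ in txs per block");

  // Slope of the line through the two measurements.
  const msPerTx = (perBlockMs(large) - perBlockMs(small)) / dTxs;
  // Intercept: what is left of the small run once its txs are accounted for.
  const fixedMsPerBlock = perBlockMs(small) - msPerTx * txsPerBlock(small);
  return { fixedMsPerBlock, msPerTx };
}

// The two runs reported above.
const cost = fitBlockCost(
  { blocksToMine: 25, totalTxs: 50, executionTime: 0.084 },
  { blocksToMine: 25, totalTxs: 100000, executionTime: 24.747 },
);
console.log(cost);
```

Under these assumptions the fit gives roughly 2.9 ms of fixed cost per block against about 0.25 ms per transaction, so with only two transactions per block most of the block time would be fixed overhead. Two data points cannot validate the model, so treat the split as indicative only.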


mattsse · 2024-02-07T19:56:41Z

it likely spends most of the time cleaning up / updating old state

could you try with --prune-history if you notice any difference?

There's definitely room for significant improvements here

mshakeg · 2024-02-07T20:09:29Z

@mattsse I am using --prune-history in the anvil command as shown below

https://github.com/mshakeg/anvil-backtester/blob/main/shell/anvil.sh

Removing --prune-history and --transaction-block-keeper 4 from the above command does not result in any noticeable changes in performance.

mattsse · 2024-02-07T20:11:38Z

hmm, could you perhaps run this with samply https://github.com/mstange/samply and see if anything sticks out

I'll try to investigate shortly

mshakeg · 2024-02-07T20:56:10Z

@mattsse thanks, don't really know what to make of the profile, but I've attached the trace on evm_mine, maybe GPT4 could be a source of inspiration :)

Based on this call trace, here are a few points to consider for profiling and improving performance:

Database Interactions: The evm_mine operation involves interactions with an in-memory database. Optimizations here could involve reducing the number of reads and writes, caching frequently accessed data, or improving the database's data structures.

State Trie Manipulation: There are multiple calls to trie_db functions, which indicate manipulation of the state trie. This is an area that typically has a significant impact on performance. Optimizing trie algorithms or using a more efficient trie structure could yield performance improvements.

Hash Calculations: The keccak_hasher and tiny_keccak functions suggest that Keccak hashing is part of the operation. Optimizing hashing or reducing the number of hash calculations required could improve performance.

EVM Execution: The revm specific calls such as run_interpreter and preverified_inner imply that EVM bytecode execution is a part of the process. Profiling the EVM's interpreter loop, opcode execution, and context switching could reveal bottlenecks.

Smart Contract Calls: Calls to inspect_call_instruction and Host::call suggest that smart contract function calls are being made. Optimizing the way smart contracts are called and executed, possibly by reducing the overhead of call setup and teardown, could improve performance. This could include minimizing the overhead associated with setting up the environment for a contract call and efficiently handling the stack and memory operations.

Parallelism and Concurrency: Evaluate if any parts of the evm_mine process can be executed in parallel. Some operations, especially state-independent ones, may benefit from concurrent execution.

Memory Management: Functions like drop_in_place suggest that there is active management of memory, possibly with data structures being de-allocated. Improving memory allocation strategies, avoiding unnecessary allocations, and reusing memory buffers could reduce overhead and improve performance.

Opcode Optimization: Within the EVM execution, certain opcodes may be used more frequently or may be more resource-intensive. Profiling at the opcode level could help identify if specific opcodes are bottlenecks and could be optimized.

Caching Strategies: For repetitive operations, especially within the EVM interpreter, caching results of expensive computations could be beneficial if they're likely to be repeated with the same inputs.

Profiling and Instrumentation Tools: Utilize profiling tools that can provide granular insights into CPU and memory usage. Rust's performance tools, such as perf on Linux or DTrace/BPF on BSD/Mac, can help identify hot paths and functions that are taking the most time or consuming the most resources.

Algorithmic Efficiency: Review the algorithms used in the trie manipulation and hashing to ensure they are the most efficient for the use case. Sometimes, algorithmic improvements can yield better performance gains than low-level optimizations.

Code Review and Refactoring: There might be opportunities to refactor the code for efficiency. This could involve combining functions, inlining functions to reduce call overhead, or simplifying complex logic.

Batch Processing: If the evm_mine operation can be batched (i.e., processing multiple transactions or blocks in a single operation), it could reduce the per-operation overhead and take advantage of more efficient bulk processing techniques.

Asynchronous Processing: Look into asynchronous processing where applicable to avoid blocking operations, particularly for I/O bound tasks.

mattsse · 2024-02-08T13:14:36Z

thanks!

will investigate, but looks like stateroot

mshakeg · 2024-02-08T13:42:50Z

@mattsse thanks, might be a good idea to have flags that disable logic not really needed on a local node, similar to how the eth_sendUnsignedTransaction method can be used to send an unsigned transaction.

zerosnacks · 2024-07-11T09:44:14Z

Relevant conversation in #7546: #7546 (comment)

grandizzy · 2024-10-19T10:03:55Z

@mshakeg I retried your test driver with the latest anvil and got the following results

with nullSwapsPerBlock=1

{
  blocksToMine: 10,
  nullSwapsPerBlock: 1,
  totalTxs: 20,
  executionTime: 0.024,
  averageTPS: 833.3333333333334,
  averageTimePerTx: 1.2
}

with nullSwapsPerBlock=2000, consistently getting values around

{
  blocksToMine: 10,
  nullSwapsPerBlock: 2000,
  totalTxs: 40000,
  executionTime: 5.987,
  averageTPS: 6681.142475363287,
  averageTimePerTx: 0.149675
}

best result in a couple of tries (after reinstalling anvil)

{
  blocksToMine: 10,
  nullSwapsPerBlock: 2000,
  totalTxs: 40000,
  executionTime: 4.806,
  averageTPS: 8322.929671244277,
  averageTimePerTx: 0.12015
}

Note that I have to restart anvil between runs, because otherwise the test driver locks on the subsequent createPool call, ref ethers-io/ethers.js#4224

      const tx = await uniswapV3Factory.createPool(token0, token1, FeeAmount.LOW);
      const rc = await tx.wait(); // <- locks here, even though the tx is mined and included
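Until the underlying ethers.js behaviour is resolved, a test driver could guard the wait with a timeout instead of hanging indefinitely. The sketch below is a generic wrapper and is not part of anvil-backtester or ethers.js; `withTimeout` and its parameters are hypothetical names.

```typescript
// Sketch: reject if `promise` has not settled within `ms` milliseconds.
// Note: this does not cancel the underlying operation, it only stops waiting.
function withTimeout<T>(promise: Promise<T>, ms: number, label = "operation"): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;

  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms);
  });

  // Whichever settles first wins; always clear the timer so the process can exit.
  return Promise.race([promise, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}
```

In the driver this would wrap the call that hangs, for example `await withTimeout(tx.wait(), 10_000, "createPool receipt")`, so a stuck run fails fast with a clear error rather than requiring a manual restart to notice.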

Would this be a reasonable enhancement in scope of this ticket? Thank you

Cc @klkvr re #7546 comment

grandizzy · 2024-11-15T18:24:39Z

bump @mshakeg please check comment above. thanks!

mshakeg · 2024-11-17T15:32:58Z

@grandizzy are you referring to the locking issue? if so then I wouldn't say it's related to this issue, so maybe open a new issue if there isn't already one?

grandizzy · 2024-11-17T15:45:40Z

@mshakeg not to the locking issue (that's not something in our control but in ethers.js; see also #7275 (comment), #4399 (comment), and most probably ethers-io/ethers.js#4224) but to the new numbers I posted above using your test driver (which look better than the ones originally posted). Thank you

mshakeg · 2024-11-17T15:51:35Z

@grandizzy sure, though have you done some profiling to determine the cause of the large TPS discrepancy? If much of the time is spent on block mining work that could be disabled on a local node (computing Merkle roots when a block is mined, for example), then I agree with adding an option to skip these computations for an even more performant local node.
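Short of a full profile, a rough first measurement is to time repeated calls to the mining RPC with an empty mempool, which approximates the fixed cost of producing a block. The helper below is a sketch under that assumption; `timeCalls` is a hypothetical name and the call being timed is injected, so nothing here depends on a particular client library.

```typescript
// Sketch: time `n` sequential invocations of an async call and summarise them.
async function timeCalls(
  call: () => Promise<unknown>,
  n: number,
): Promise<{ samplesMs: number[]; medianMs: number }> {
  if (n < 1) throw new Error("n must be at least 1");

  const samplesMs: number[] = [];
  for (let i = 0; i < n; i++) {
    const start = performance.now();
    await call(); // calls run one after another so timings do not overlap
    samplesMs.push(performance.now() - start);
  }

  // The median is less sensitive to a slow first call than the mean.
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const medianMs = sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  return { samplesMs, medianMs };
}
```

With an ethers.js JSON-RPC provider this could be used as `await timeCalls(() => provider.send("evm_mine", []), 100)`. The result includes RPC round-trip time and may vary with the amount of accumulated state, so it complements rather than replaces a profiler such as samply.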

grandizzy · 2024-11-17T15:58:19Z

I just retested the original issue with the latest anvil version and noticed different numbers, hence my question: is this still a problem that warrants further investigation, or can it be closed?

grandizzy · 2024-11-17T16:09:28Z

> @grandizzy sure, though have you done some profiling to determine why the large TPS discrepancy? If much time is spent mining blocks (and computing Merkle roots when mined, for example) that could be disabled in a local node then agreed with adding an option to skip these computations for an even more performant local node.

I think this explains it: #5499 (comment)

grandizzy · 2024-11-19T10:02:23Z

going to merge this one with #5499 and track potential hardhat behavior there. @mshakeg please reopen if you think they should be addressed differently. thank you!

mshakeg added the T-feature Type: feature label Feb 7, 2024

gakonst added this to Foundry Feb 7, 2024

github-project-automation bot moved this to Todo in Foundry Feb 7, 2024

DaniPopes mentioned this issue Apr 5, 2024

fix: use alloy-trie for eth_getProof #7546

Merged

zerosnacks added T-perf Type: performance C-anvil Command: anvil and removed T-feature Type: feature labels Jul 11, 2024

zerosnacks changed the title ~~Enhance Block Mining Performance in Anvil Node for High Throughput and Efficiency~~ perf(anvil): enhance block mining performance in Anvil node for high throughput and efficiency Jul 11, 2024

zerosnacks changed the title ~~perf(anvil): enhance block mining performance in Anvil node for high throughput and efficiency~~ perf(anvil): enhance block mining performance in Anvil node for high throughput and efficiency Jul 11, 2024

zerosnacks mentioned this issue Jul 11, 2024

meta(anvil): tracking issue for Anvil improvements #8269

Open

zerosnacks added this to the v1.0.0 milestone Jul 26, 2024

jenpaff removed this from the v1.0.0 milestone Sep 26, 2024

grandizzy closed this as not planned Nov 19, 2024

github-project-automation bot moved this from Todo to Done in Foundry Nov 19, 2024

grandizzy mentioned this issue Dec 5, 2024

Anvil debug_traceTransaction works differently for transactions mined by anvil vs ones mined before #9497

Closed

2 tasks
