Perf/NoWAL during OldBodies #6227

Merged: 9 commits merged into master from perf/oldbodies-with-nowal on Oct 28, 2023
Conversation

@asdacap (Contributor) commented on Oct 26, 2023

  • Most of the write IO during old bodies is actually due to the WAL (Write Ahead Log) file, which is the mechanism by which RocksDB recovers from an unclean shutdown.
  • This PR disables WAL writes for bodies written by OldBodies.
  • For recovery, the LowestInsertedBlockNumber is no longer updated on every request; instead it is updated every 100,000 blocks, at which point an explicit flush is also triggered to make sure the memtables are persisted at that point (see the sketch after this list).
  • This reduces total writes during OldBodies by about 60-70%. The saving is more than 50% because the WAL file is not compressed, while the bodies are compressed to about 60-70% of their original size on flush.
  • No change in bodies sync time, unless your SSD could not sustain about 350 MB/s of writes before and you have a really fast internet connection.
  • The graph shows after, before, after, before.
    [Screenshot: disk write throughput during sync, 2023-10-26]
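
A minimal sketch of the checkpointing idea described above, assuming a RocksDB-style store. The type and member names here (IBodiesStore, OldBodiesInserter, CheckpointInterval, PersistLowestInsertedBlockNumber) are hypothetical stand-ins for illustration, not the actual Nethermind API:

```csharp
using System;

// Sketch only: bodies are written with the WAL disabled, and every
// CheckpointInterval blocks the memtables are flushed explicitly before
// the recovery pointer (LowestInsertedBlockNumber) is advanced.
public enum WriteFlags
{
    None = 0,
    DisableWAL = 1, // skip the write-ahead log for this write
}

public interface IBodiesStore
{
    void PutSpan(ReadOnlySpan<byte> key, ReadOnlySpan<byte> value, WriteFlags flags);
    void Flush(); // force memtables down to SST files
}

public class OldBodiesInserter
{
    private const long CheckpointInterval = 100_000;

    private readonly IBodiesStore _store;
    private long _checkpoint; // last persisted LowestInsertedBlockNumber

    public OldBodiesInserter(IBodiesStore store, long startBlock)
    {
        _store = store;
        _checkpoint = startBlock;
    }

    // OldBodies downloads backwards, from the pivot towards genesis,
    // so block numbers decrease as sync progresses.
    public void Insert(long blockNumber, byte[] key, byte[] body)
    {
        // No WAL: on an unclean shutdown anything written after the last
        // checkpoint may be lost, and sync simply redoes that range.
        _store.PutSpan(key, body, WriteFlags.DisableWAL);

        if (_checkpoint - blockNumber >= CheckpointInterval)
        {
            _store.Flush(); // make the un-WAL-ed writes durable first
            _checkpoint = blockNumber;
            PersistLowestInsertedBlockNumber(blockNumber);
        }
    }

    private void PersistLowestInsertedBlockNumber(long blockNumber)
    {
        // Written durably (with WAL) so a restart knows where to resume.
    }
}
```

Losing up to one checkpoint interval of writes on a crash is acceptable here because old bodies can always be re-downloaded; the flush before advancing the pointer is what keeps the recovery pointer honest.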

Changes

  • Add WriteFlags to PutSpan.
  • Add WriteFlags to BlockTree.Insert.
  • Set NoWAL during old bodies (illustrated below).
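
A rough illustration of how the flag could be threaded from BlockTree.Insert down to the database write, reusing the sketch types above; the signatures are assumptions for the sketch, not the exact ones from this PR:

```csharp
// Illustrative plumbing only.
public class BlockTreeSketch
{
    private readonly IBodiesStore _blockDb;

    public BlockTreeSketch(IBodiesStore blockDb) => _blockDb = blockDb;

    // Insert gains an optional WriteFlags parameter: callers that can
    // tolerate losing un-flushed writes (OldBodies) opt out of the WAL,
    // while every other caller keeps the safe default.
    public void Insert(byte[] blockHash, byte[] encodedBody,
        WriteFlags writeFlags = WriteFlags.None)
    {
        _blockDb.PutSpan(blockHash, encodedBody, writeFlags);
    }
}

// In the OldBodies download step:
//   blockTree.Insert(hash, rlpBody, WriteFlags.DisableWAL);
```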

Types of changes

What types of changes does your code introduce?

  • Optimization

Testing

Requires testing

  • Yes

If yes, did you write tests?

  • Yes

Notes on testing

  • Manually ran kill -s KILL on nethermind a couple of times during sync, then ran a custom Python script to verify that all expected blocks are present. Verified: sync resumed slightly before the point where the process was killed.

@LukaszRozmej (Member) left a comment


Why 100k? What would happen if we increase or decrease this number?
What is the current bottleneck? CPU? Network?

@asdacap (Contributor, Author) commented on Oct 26, 2023

The 100k is somewhat arbitrary. The only consideration is not to split the write buffer too often, as that would make the files small and increase the number of files. Each write buffer is 256 MB with the blob-file tune, if I'm not mistaken, so that's probably about 2.5k blocks per buffer. So if we set the interval to 5k, probably half of the write buffers would get split, which we don't want.
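
(Working through those numbers as a sanity check, assuming they are roughly right: 256 MB per write buffer across ~2.5k blocks is on the order of 100 KB of body data per block, so a 100k-block checkpoint spans roughly 40 full write buffers and only the last, partially filled one is flushed early, about a 2.5% split rate. With a 5k interval, each checkpoint would span only ~2 buffers, so roughly every second buffer would be split by a checkpoint flush.)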

I'm not sure what the current limit is. I'm guessing it's CPU, but it's not using 100% CPU. If it were the network, well, I've set up 4 geth nodes running locally as static peers, so that's probably not it. I know that per connection there is a single-thread limit due to devp2p decoding, but I've already set the network processing thread count to 32, so that's not it either.

@asdacap force-pushed the perf/oldbodies-with-nowal branch from e8c7704 to bb8b894 on October 28, 2023
@asdacap merged commit 9e3aa25 into master on Oct 28, 2023
@asdacap deleted the perf/oldbodies-with-nowal branch on October 28, 2023