feat: discard state writes in StateCompute and StateReplay. #10457

raulk · 2023-03-12T00:34:21Z

Related Issues

This originally started because Glif and other operators begun having blockstore ballooning issues starting with v1.20.3. The issue was the usage of StateCompute in the Eth API. That method finishes by flushing the new state to the blockstore. Badger is an LSM + value log store, and it happily accepts every write.

Proposed Changes

The fix is to add a flag to skip writing the state tree when executing a tipset. By default we write, except when the caller requests us not to. Instead of polluting APIs with yet another option, I decided to add an option struct (ExecutorOpts). Unfortunately, TipSetState is used a many places, so to avoid breakage I introduced a TipSetWithOpts method. We could also use the functional options pattern, but those are generally used in more public/user-facing APIs.

This change is no longer necessary for the Eth API, since we no longer need to compute state after #10446. However, they will be very beneficial to users of StateCompute (e.g. exchanges) and StateReplay.

Additional Info

Checklist

Before you mark the PR ready for review, please make sure that:

Commits have a clear commit message.
PR title is in the form of of <PR type>: <area>: <change being made>
- example: fix: mempool: Introduce a cache for valid signatures
- PR type: fix, feat, build, chore, ci, docs, perf, refactor, revert, style, test
- area, e.g. api, chain, state, market, mempool, multisig, networking, paych, proving, sealing, wallet, deps
New features have usage guidelines and / or documentation updates in
- Lotus Documentation
- Discussion Tutorials
Tests exist for new functionality or change in behavior
CI is green

This required a small refactor in some execution APIs. Instead of adding one more parameter in methods that are already parameter-heavy, I chose to wrap arguments in an options struct.

arajasek

Broadly looks good to me, and a nice improvement. One note.

arajasek · 2023-03-13T14:51:08Z

chain/stmgr/execute.go

+	return sm.TipSetStateWithOpts(ctx, ts, &TipSetStateOpts{})
+}
+
+type TipSetStateOpts struct {


I'm not convinced this is necessary? With the existing tipsetstate cache, and the newly added fast lookup in #10445, I think we can drop this. We'll only wind up executing the tipset (making the write) if:

we're executing a tipset that isn't in our main chain

we're missing the actual state / receipts root.

Stebalien · 2023-03-13T15:59:12Z

However, they will be very beneficial to users of StateCompute (e.g. exchanges) and StateReplay.

I'm not so sure. Those users likely need the state.

Is the issue here that we don't check if the blockstore has it before writing?

raulk · 2023-04-17T10:40:30Z

We're seeing reports from archival nodes used by analytics tools ballooning in state store size. I think this is the culprit.

Those users likely need the state.

I guess this applies to novel message applications? E.g. where a message is being applied to an alternative tipset or messages are being selectively replayed one after another? In that case, yes, this PR would break those use cases.

Is the issue here that we don't check if the blockstore has it before writing?

That's another way to think about this, probably the safest way to implement this? (But also the most unfortunate as it adds an index lookup for every write).

raulk · 2023-04-18T14:10:03Z

Superseded by #10680, which is simplier and less invasive.

eshon · 2023-05-25T11:07:42Z

FYI - Glif Nodes just reported that archival nodes now seem to be growing at about 19 GiB / day, down from about 50 GiB / day before. (An FEVM archival node is ~3.6 TiB currently.)

raulk requested a review from a team as a code owner March 12, 2023 00:34

raulk mentioned this pull request Mar 12, 2023

make StateCompute not write the output state to the blockstore. #10453

Closed

7 tasks

raulk added this to the Eth RPC perf + fixes milestone Mar 12, 2023

Base automatically changed from steb/optimize-eth-block to master March 12, 2023 15:46

raulk added the release/backport label Mar 12, 2023

raulk force-pushed the raulk/fix-statecompute-write branch from 38f2adc to 530b1d1 Compare March 12, 2023 20:43

add option to discard state writes in StateCompute and StateReplay.

7ea6676

This required a small refactor in some execution APIs. Instead of adding one more parameter in methods that are already parameter-heavy, I chose to wrap arguments in an options struct.

raulk force-pushed the raulk/fix-statecompute-write branch from 530b1d1 to 7ea6676 Compare March 13, 2023 11:09

arajasek reviewed Mar 13, 2023

View reviewed changes

jennijuju removed the release/backport label Mar 22, 2023

jennijuju modified the milestones: Eth RPC perf + fixes, Lotus - v1.23.0 Mar 22, 2023

Stebalien mentioned this pull request Apr 17, 2023

feat: badger: add a has check before writing to reduce duplicates #10680

Merged

4 tasks

raulk closed this Apr 18, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: discard state writes in StateCompute and StateReplay. #10457

feat: discard state writes in StateCompute and StateReplay. #10457

raulk commented Mar 12, 2023

arajasek left a comment

arajasek Mar 13, 2023

Stebalien commented Mar 13, 2023

raulk commented Apr 17, 2023

raulk commented Apr 18, 2023

eshon commented May 25, 2023

feat: discard state writes in StateCompute and StateReplay. #10457

feat: discard state writes in StateCompute and StateReplay. #10457

Conversation

raulk commented Mar 12, 2023

Related Issues

Proposed Changes

Additional Info

Checklist

arajasek left a comment

Choose a reason for hiding this comment

arajasek Mar 13, 2023

Choose a reason for hiding this comment

Stebalien commented Mar 13, 2023

raulk commented Apr 17, 2023

raulk commented Apr 18, 2023

eshon commented May 25, 2023