Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

worldstate refactor #6209

Merged
merged 32 commits into from
Mar 8, 2024
Merged

Conversation

matkt
Copy link
Contributor

@matkt matkt commented Nov 28, 2023

PR description

for verkle, I’m starting a refactor. In simple terms, many classes can distinguish whether they’re in ‘bonsai’ or ‘forest’ mode, so a generic interface doesn’t offer much value, except for some minor and synchronization parts.
So, I’ve developed a ’WorldstatestorageCoordinator class. This will have a storage strategy, which could be either ‘bonsai’ or ‘forest’. This coordinator is needed when we’re unsure about the storage mode we use, and it helps for the routing process. when we’re sure about the storage mode, we directly use the appropriate strategy.
This approach reduces the number of generic interfaces, eliminating locations that are not applicable to modes like ‘forest’ or other not clean things like that.
Will be better when we will have to add some custom method or field for verkle

Test done on holesky

Checkpoint with trielog and blockchain pruning Snapsync with codeByhash
Bonsai OK OK

Test done on sepolia

Full Checkpoint Full with Full flat DB
Bonsai OK OK ok
Forest OK OK not needed

Test done on mainnet

Checkpoint with Full flat DB
Bonsai OK

Test done on QBFT network

Deploy contract, read storage , selfdestruct, block creation and block import

Fixed Issue(s)

Copy link

  • I thought about documentation and added the doc-change-required label to this PR if updates are required.
  • I thought about the changelog and included a changelog update if required.
  • If my PR includes database changes (e.g. KeyValueSegmentIdentifier) I have thought about compatibility and performed forwards and backwards compatibility tests

@matkt matkt force-pushed the feature/worldstate-refactor branch from caa73c3 to 657a78e Compare November 28, 2023 10:37
Signed-off-by: Karim TAAM <[email protected]>
Signed-off-by: Karim TAAM <[email protected]>
Signed-off-by: Karim TAAM <[email protected]>
Signed-off-by: Karim Taam <[email protected]>
Signed-off-by: Karim Taam <[email protected]>
Signed-off-by: Karim Taam <[email protected]>
@matkt matkt force-pushed the feature/worldstate-refactor branch from 24a2750 to a368a35 Compare December 20, 2023 16:34
@matkt matkt marked this pull request as ready for review December 21, 2023 11:04
Copy link
Contributor

@gfukushima gfukushima left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm with a few minor questions

Copy link
Contributor

@garyschulte garyschulte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good refactor with WorldStateStorageCoordinator. I think we should take the opportunity to get rid of some stuff rather than perpetuate it (like the state backup stuff), but that can be in another PR (before or after this one).

Only gripe is in the coordinator where we unnecessarily perpetuate each storage format and its worldstate as a part of the method signature

@@ -81,15 +81,19 @@ public void run() {

final BesuController besuController = createBesuController();
final MutableBlockchain blockchain = besuController.getProtocolContext().getBlockchain();
final WorldStateStorage worldStateStorage =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this has been an undocumented experimental feature for 3 years. we should get rid of it. probably in a different PR

@@ -83,7 +83,7 @@ public class RestoreState implements Runnable {
private long trieNodeCount;
private boolean compressed;
private BesuController besuController;
private WorldStateStorage.Updater updater;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same, non-blocking. forest is deprecated and we should deprecate these commands too

}
}

public <RESPONSE> RESPONSE applyForStrategy(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is worse tbh. now we have both forest and bonsai details as part of the interface and package imports, and the method signature so generic it is meaningless.

This should be private and the only external place we use it, in fast sync, that will soon be the same (just code hash) after #5889

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comment about how to remove this via generics and/or WorldStateKeyValueStorage sub-interfaces like PathBasedWorldStorage, HashBasedWorldStorage, etc.

@@ -84,12 +84,12 @@ public StateBackupService(
final Blockchain blockchain,
final Path backupDir,
final EthScheduler scheduler,
final WorldStateStorage worldStateStorage) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔪 💀

@@ -58,7 +58,7 @@ public class MarkSweepPruner {
private static final int MAX_MARKING_THREAD_POOL_SIZE = 2;

private final int operationsPerTransaction;
private final WorldStateStorage worldStateStorage;
private final ForestWorldStateKeyValueStorage worldStateKeyValueStorage;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️

final WorldStatePreimageStorage preimageStorage,
final EvmConfiguration evmConfiguration) {
this.worldStateStorage = worldStateStorage;
this.worldStateKeyValueStorage = (ForestWorldStateKeyValueStorage) worldStateKeyValueStorage;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can probably get away from this explicit cast by changing the constructor signature. It looks like only evmtool BlockchainModule is calling this from a context that doesn't explicitly have storage as a ForestWorldStateKeyValueStorage.

Copy link
Contributor Author

@matkt matkt Jan 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried but if I remove this cast I will have to add it to the other contructor that call this one. I can maybe n ot use this constructor for the other one . I don't know what do you think

public ForestMutableWorldState(
      final WorldStateKeyValueStorage worldStateKeyValueStorage,
      final WorldStatePreimageStorage preimageStorage,
      final EvmConfiguration evmConfiguration) {
    this(
        MerkleTrie.EMPTY_TRIE_NODE_HASH,
        worldStateKeyValueStorage,
        preimageStorage,
        evmConfiguration);
  }

  public ForestMutableWorldState(
      final Bytes32 rootHash,
      final WorldStateKeyValueStorage worldStateKeyValueStorage,
      final WorldStatePreimageStorage preimageStorage,
      final EvmConfiguration evmConfiguration) {
    this.worldStateKeyValueStorage = (ForestWorldStateKeyValueStorage) worldStateKeyValueStorage;
    this.accountStateTrie = newAccountStateTrie(rootHash);
    this.preimageStorage = preimageStorage;
    this.evmConfiguration = evmConfiguration;
  }`

public class WorldStateStorageCoordinator {
private final WorldStateKeyValueStorage worldStateKeyValueStorage;

public WorldStateStorageCoordinator(final WorldStateKeyValueStorage worldStateKeyValueStorage) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 to the coordinator concept. I like that this is the central place to add/remove storage formats and their interfaces.

the only bits I do not care for are where we double-down on forest and bonsai storage modes in the signatures, the apply and consume ForStrategy methods basically. I think we could do some lambda magic to limit the bonsai/forest stuff in these methods, like:

public <RESPONSE, T extends WorldStateKeyValueStorage> RESPONSE applyForStrategy(
      final Function<T, RESPONSE> strategy) {

Copy link
Contributor Author

@matkt matkt Jan 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why I am adding Bonsai and Forest in the signature is because I no longer want a common interface between Bonsai and Forest. The coordinator will play the role of redirection and do the cast to simplify other calls. So, as I don't have a common interface, I can no longer create a single strategy interface that I pass to the coordinator. I can do that but I will have to add the cast logic in other calls and in order to simplify this call I will add complexity to the other calls . I must therefore pass both and the coordinator calls the right one.

If I recreate an interface, it's doing what we had before and I specifically wanted to avoid that. Knowing that Forest will be eliminated soon, this should also remove the coordinator and this mechanism. But maybe I didn't understand your idea.

Copy link
Contributor

@garyschulte garyschulte Jan 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what I was suggesting:

garyschulte@aab6187

I didn't remove all the applyWhen, consumeFor, etc but I think we can simplify the coordinator a lot by just keeping an interface hierarchy and using it as a WorldStateCoordinator generic param.

Ideally when we get rid of Forest, we can just remove the hash-based interface (and implementations) and not touch the coordinator interface

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you confirm on this PR that it's ok for you to merge with the current implementation ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed 👍

@jframe
Copy link
Contributor

jframe commented Jan 16, 2024

Don't see anything to worry about in terms of impact on private networks

Copy link
Contributor

@garyschulte garyschulte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚢

Signed-off-by: Karim Taam <[email protected]>
Signed-off-by: Karim Taam <[email protected]>
Signed-off-by: Karim Taam <[email protected]>
Signed-off-by: Karim Taam <[email protected]>
Signed-off-by: Karim Taam <[email protected]>
Signed-off-by: Karim Taam <[email protected]>
Copy link
Contributor

@gfukushima gfukushima left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left a few nits and non-blocker comments, lgtm in general and it is nice to avoid the nulls we use to pass to match the general storage format interface. This approach seems to be friendly and extensible to add verkle

@@ -51,7 +51,7 @@ public class Pruner {
private final ExecutorService executorService;

@VisibleForTesting
Pruner(
public Pruner(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

think this doesn't need to be public.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will fix it thanks

this.worldStateRootHash =
Hash.wrap(
Bytes32.wrap(worldStateStorage.getWorldStateRootHash().orElse(Hash.EMPTY_TRIE_HASH)));
Bytes32.wrap(
worldStateKeyValueStorage.getWorldStateRootHash().orElse(getEmptyTrieHash())));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I slightly prefer the previous version of this, using the Hash.EMPTY_TRIE_HASH

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's for verkle , because we will override the value . with verkle empty trie hash is different

return 1;
}

@Override
public Optional<Bytes> getExistingData(
final SnapWorldDownloadState downloadState, final WorldStateStorage worldStateStorage) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What did we use the downloadState for previously? 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it was just useless parameter that we forgot to remove. I checked and it was not used

@gfukushima
Copy link
Contributor

Took a stab fixing the only unit test broken in this PR, please have a look if it looks good to you.

@garyschulte garyschulte merged commit 2360908 into hyperledger:main Mar 8, 2024
55 checks passed
amsmota pushed a commit to Citi/besu that referenced this pull request Apr 16, 2024
* worldstate storage refactor

Signed-off-by: Karim TAAM <[email protected]>
Signed-off-by: Gabriel Fukushima <[email protected]>
Co-authored-by: Gabriel Fukushima <[email protected]>
Signed-off-by: amsmota <[email protected]>
amsmota pushed a commit to Citi/besu that referenced this pull request Apr 16, 2024
* worldstate storage refactor

Signed-off-by: Karim TAAM <[email protected]>
Signed-off-by: Gabriel Fukushima <[email protected]>
Co-authored-by: Gabriel Fukushima <[email protected]>
Signed-off-by: amsmota <[email protected]>
matthew1001 pushed a commit to kaleido-io/besu that referenced this pull request Jun 7, 2024
* worldstate storage refactor

Signed-off-by: Karim TAAM <[email protected]>
Signed-off-by: Gabriel Fukushima <[email protected]>
Co-authored-by: Gabriel Fukushima <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants