"Unexpected exception in pipline" stops state healing #6503

Closed
catwith1hat opened this issue Jan 31, 2024 · 4 comments

catwith1hat commented Jan 31, 2024

Besu-24.1.1

besu --sync-mode=X_CHECKPOINT --data-storage-format=FOREST --network=mainnet

fails to complete the state healing step on the initial sync:

2024-01-30 22:36:39.929+01:00 | EthScheduler-Services-26 (batchPersistTrieNodeData) | INFO  | SnapsyncMetricsManager | Healed 15816732 world state trie nodes, Peer count: 25
2024-01-30 22:37:35.344+01:00 | EthScheduler-Services-23 (requestLoadLocalTrieNodeData) | INFO  | Pipeline | Unexpected exception in pipeline. Aborting.

and then dies (no new log lines for 8 hours).
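
A stall like this can be spotted from the log itself. The following is a minimal sketch, not part of Besu: it assumes the log line layout shown above (timestamp, then ` | `-separated fields), and the function names and the 30-minute silence threshold are hypothetical choices for illustration.

```python
from datetime import datetime, timedelta

# Message taken verbatim from the log excerpt in this report.
ABORT_MARKER = "Unexpected exception in pipeline. Aborting."


def parse_timestamp(line):
    """Return the timezone-aware timestamp that starts a log line, or None."""
    stamp = line.split(" | ", 1)[0].strip()
    try:
        return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f%z")
    except ValueError:
        return None


def healing_looks_stuck(lines, now, max_silence=timedelta(minutes=30)):
    """True if the last timestamped line is the pipeline abort and the
    log has been silent for longer than max_silence (assumed threshold)."""
    last_seen = None
    last_was_abort = False
    for line in lines:
        ts = parse_timestamp(line)
        if ts is None:
            continue  # skip continuation lines such as stack traces
        last_seen = ts
        last_was_abort = ABORT_MARKER in line
    if last_seen is None:
        return False
    return last_was_abort and (now - last_seen) > max_silence


SAMPLE = [
    "2024-01-30 22:36:39.929+01:00 | EthScheduler-Services-26 (batchPersistTrieNodeData) | INFO  | SnapsyncMetricsManager | Healed 15816732 world state trie nodes, Peer count: 25",
    "2024-01-30 22:37:35.344+01:00 | EthScheduler-Services-23 (requestLoadLocalTrieNodeData) | INFO  | Pipeline | Unexpected exception in pipeline. Aborting.",
]
eight_hours_later = parse_timestamp(SAMPLE[-1]) + timedelta(hours=8)
print(healing_looks_stuck(SAMPLE, eight_hours_later))  # prints True
```

In practice `now` would be the current time (for example `datetime.now().astimezone()`) and `lines` the tail of the Besu log file.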

I restarted Besu a couple of times over the last 3 days, and this is the fourth time Besu has gotten stuck like this. I also tried taking the CL offline during state healing so that the pivot point does not move. In that case Besu eventually loses all peers and gets stuck as well.

The full log file is here

Versions (Add all that apply)

  • Software version: besu/v24.1.1/linux-x86_64/openjdk-java-19
  • Java version:
openjdk 19.0.2 2023-01-17
OpenJDK Runtime Environment (build 19.0.2+7-nixos)
OpenJDK 64-Bit Server VM (build 19.0.2+7-nixos, mixed mode, sharing)
  • OS Name & Version:
BUG_REPORT_URL="https://github.com/NixOS/nixpkgs/issues"
BUILD_ID="23.11pre-git"
DOCUMENTATION_URL="https://nixos.org/learn.html"
HOME_URL="https://nixos.org/"
ID=nixos
LOGO="nix-snowflake"
NAME=NixOS
PRETTY_NAME="NixOS 23.11 (Tapir)"
SUPPORT_END="2024-06-30"
SUPPORT_URL="https://nixos.org/community.html"
VERSION="23.11 (Tapir)"
VERSION_CODENAME=tapir
VERSION_ID="23.11"
  • Kernel Version: Linux node2 6.1.74-hardened1 #1-NixOS SMP PREEMPT_DYNAMIC Sat Jan 20 10:50:11 UTC 2024 x86_64 GNU/Linux
  • Consensus Client & Version if using Proof of Stake: Nimbus beacon node v24.1.2-24.1.2-stateofus

The datadir sits on a KINGSTON NVMe SSD, on ext4 over LVM on top of LUKS. The same SSD (and host) was able to sync geth.

Thanks!

siladu (Contributor) commented Jan 31, 2024

I just tested a FOREST + X_CHECKPOINT sync on the Holesky network and saw no issues.

@matkt may have some insights into the heal logs.

garyschulte (Contributor) commented

Unfortunately, this is a known issue with Forest and snap trie healing. I have recently been able to reproduce it readily on a local machine. However, since the Forest storage format is slated to be deprecated soon, we do not intend to fix it.

Bonsai is superior to Forest in all use cases except as an archive node. Is there a reason you want to use the Forest format with snap sync (no historical state)?
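
For reference, the suggested switch amounts to changing one flag in the command from the original report. The sketch below is only an illustration: the helper name `with_storage_format` is hypothetical, and it assumes the other flags stay exactly as reported.

```python
import shlex

# Command copied from the original report; only the storage format changes.
ORIGINAL = "besu --sync-mode=X_CHECKPOINT --data-storage-format=FOREST --network=mainnet"


def with_storage_format(command, storage_format):
    """Return the command with its --data-storage-format value replaced."""
    prefix = "--data-storage-format="
    args = shlex.split(command)
    swapped = [prefix + storage_format if a.startswith(prefix) else a for a in args]
    return shlex.join(swapped)


bonsai_command = with_storage_format(ORIGINAL, "BONSAI")
print(bonsai_command)
# prints: besu --sync-mode=X_CHECKPOINT --data-storage-format=BONSAI --network=mainnet
```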

catwith1hat added a commit to catwith1hat/besu-docs that referenced this issue Feb 1, 2024

catwith1hat (Author) commented

@garyschulte I picked Forest because I wanted to provide a syncing service to other new nodes on the network. But I am happy to switch to Bonsai if that's the only choice.

I also opened a PR against the docs to note that the Forest storage format doesn't sync with mainnet.

Closing.

garyschulte (Contributor) commented

> @garyschulte I picked Forest because I wanted to provide a syncing service to other new nodes on the network. But I am happy to switch to Bonsai if that's the only choice.
>
> I also opened a PR against the docs to note that the Forest storage format doesn't sync with mainnet.
>
> Closing.

It will snap sync with mainnet, but it takes a long time and can encounter bugs. This is only getting worse as the state grows.

Thanks for filing the issue. Also, we will soon be serving snap data from Bonsai (#5887), probably a couple of releases from now.
