Sync Halt with Best peer has wrong pivot block #7763

Open
2 tasks
siladu opened this issue Oct 14, 2024 · 4 comments
Labels: bug (Something isn't working), P4 Low (ex: Node doesn't start up when the configuration file has unexpected "end-of-line" character)


siladu commented Oct 14, 2024

Tasks TODO:

  • Add a test, e.g. simulate an FCU which later gets reorged and show that the SyncTargetManager holds onto the reorged hash.
  • Fix, e.g. only hold onto the bad hash for a certain time / number of blocks and let a subsequent FCU override it.
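The two tasks above can be sketched in a few lines. This is a minimal, language-neutral illustration in Python, not Besu code: `PivotHolder`, `on_fcu`, `max_age_blocks` and the `LATER_HEAD` placeholder are hypothetical names, and the 64-block window is an arbitrary assumption. With `max_age_blocks=None` the holder reproduces the reported behaviour (the first hash is kept forever); with a window set, a later FCU is allowed to replace the pivot.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Pivot:
    block_hash: str
    selected_at: int  # head number seen when this pivot was chosen


class PivotHolder:
    """Hypothetical stand-in for whatever remembers the pivot hash during sync."""

    def __init__(self, max_age_blocks: Optional[int]) -> None:
        # None models the reported bug: the first hash is never replaced.
        self.max_age_blocks = max_age_blocks
        self.pivot: Optional[Pivot] = None

    def on_fcu(self, head_hash: str, head_number: int) -> str:
        """Handle a forkchoice update and return the pivot hash now in use."""
        if self.pivot is None or self._expired(head_number):
            self.pivot = Pivot(head_hash, head_number)
        return self.pivot.block_hash

    def _expired(self, head_number: int) -> bool:
        # A pivot expires once the head has advanced max_age_blocks past it.
        if self.max_age_blocks is None or self.pivot is None:
            return False
        return head_number - self.pivot.selected_at >= self.max_age_blocks


# Hash the node kept expecting for #20939063 in the logs of this issue.
REORGED = "0x1e84b312979f4e6f2708e31fcc52bfbcc5a0aac7047565cbfd06fcb16a8623b8"
# Made-up placeholder for the head hash carried by a later FCU.
LATER_HEAD = "0xlaterhead"

# Test idea: without expiry, the reorged hash survives any later FCU.
stuck = PivotHolder(max_age_blocks=None)
stuck.on_fcu(REORGED, 20939063)
still_stuck = stuck.on_fcu(LATER_HEAD, 20939163) == REORGED

# Fix idea: with a window, a sufficiently later FCU overrides the pivot.
fixed = PivotHolder(max_age_blocks=64)
fixed.on_fcu(REORGED, 20939063)
recovered = fixed.on_fcu(LATER_HEAD, 20939163) == LATER_HEAD
```

A time-based window would work the same way, with timestamps in place of block numbers. Which of the two is the better fit for the SyncTargetManager is left open here.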

Bug Analysis

Observed on node dev-elc-bu-tk-mainnet-simon-perf-24.10.0-RC1-snap-01

Sync has stopped progressing and the node repeatedly logs this warning:
{"@timestamp":"2024-10-14T04:14:06,834","level":"WARN","thread":"nioEventLoopGroup-3-4","class":"SyncTargetManager","message":"Best peer has wrong pivot block (#20939063) expecting 0x1e84b312979f4e6f2708e31fcc52bfbcc5a0aac7047565cbfd06fcb16a8623b8 but received 0x7466118286794e63be64f59b31665db8297d640432ea084c3d201d5097d44908. Disconnect: PeerId: 0xdef2fc1486c65961... PeerReputation score: 102, timeouts: {}, useless: 0, validated? true, disconnected? false, client: Geth/v1.13.14-stable-2ecaf439/linux-amd64/go1.22.2, [Connection with hashCode -1435576429 inboundInitiated? false initAt 1728879241331], enode://def2fc1486c65961da0fdd8545dd55d8d3b4b23c855523b6f142a27d1c7a3f02365a25724c4b607ed778e68a1cad68dc5ce84c6c325ea77671a5d74dd9f55ba5@57.129.54.20:30000, isServingSnap false, has height 20961355, connected for 5503 ms","throwable":""}

When this error started:

{"@timestamp":"2024-10-11T01:21:09,424","level":"INFO","thread":"EthScheduler-Services-692 (importBlock)","class":"ImportBlocksStep","message":"Block import progress: 17339357 of 20938939 (82%), Peer count: 25","throwable":""}
{"@timestamp":"2024-10-11T01:21:31,278","level":"INFO","thread":"EthScheduler-Services-771","class":"PivotSelectorFromHeadBlock","message":"Returning head block hash 0x1e84b312979f4e6f2708e31fcc52bfbcc5a0aac7047565cbfd06fcb16a8623b8 as pivot","throwable":""}
{"@timestamp":"2024-10-11T01:21:31,279","level":"INFO","thread":"EthScheduler-Services-771","class":"PivotSelectorFromBlock","message":"Selecting new pivot block: Optional[0x1e84b312979f4e6f2708e31fcc52bfbcc5a0aac7047565cbfd06fcb16a8623b8]","throwable":""}
{"@timestamp":"2024-10-11T01:22:39,430","level":"INFO","thread":"EthScheduler-Services-772","class":"SnapWorldDownloadState","message":"Running world state heal process from peers with pivot block 20939063","throwable":""}
{"@timestamp":"2024-10-11T01:23:07,247","level":"WARN","thread":"nioEventLoopGroup-3-9","class":"SyncTargetManager","message":"Best peer has wrong pivot block (#20939063) expecting 0x1e84b312979f4e6f2708e31fcc52bfbcc5a0aac7047565cbfd06fcb16a8623b8 but received 0x7466118286794e63be64f59b31665db8297d640432ea084c3d201d5097d44908.  Disconnect: PeerId: 0x71bba39dce3d25d3... PeerReputation score: 150, timeouts: {}, useless: 0, validated? true, disconnected? false, client: Nethermind/v1.28.0+9c4816c2/linux-x64/dotnet8.0.8, [Connection with hashCode -1794742265 inboundInitiated? true initAt 1728609639692], enode://71bba39dce3d25d3d9eeade5b2c5f5544dbd9901ca9c6f46e35cf10d3d2fef646fed9b566151c7c41662621851b60a995a8205bf19e51c63d0409bc60ff02e64@108.21.71.217:30304?discport=0, isServingSnap true, has height 20939069, connected for 147555 ms","throwable":""}
{"@timestamp":"2024-10-11T01:23:07,516","level":"INFO","thread":"EthScheduler-Services-26 (batchPersistTrieNodeData)","class":"SnapSyncMetricsManager","message":"Healed 7143263 world state trie nodes, Peer count: 24","throwable":""}
{"@timestamp":"2024-10-11T01:23:12,274","level":"WARN","thread":"nioEventLoopGroup-3-4","class":"SyncTargetManager","message":"Best peer has wrong pivot block (#20939063) expecting 0x1e84b312979f4e6f2708e31fcc52bfbcc5a0aac7047565cbfd06fcb16a8623b8 but received 0x7466118286794e63be64f59b31665db8297d640432ea084c3d201d5097d44908.  Disconnect: PeerId: 0x92c502a57678977a... PeerReputation score: 150, timeouts: {}, useless: 0, validated? true, disconnected? false, client: Nethermind/v1.28.0+9c4816c2/linux-x64/dotnet8.0.8, [Connection with hashCode 1856665911 inboundInitiated? true initAt 1728605423917], enode://92c502a57678977a32615948a140a67a13ee8943cb4d30a06d4ed2955424a50f0d37d35539e76a55dd246826a1d4c97371bf3d165bad4dd31222a3b51973e99b@50.21.173.66:30303?discport=0, isServingSnap false, has height 20939063, connected for 4368357 ms","throwable":""}

Occasional "Unable to find sync target" message...

{"@timestamp":"2024-10-13T23:59:29,121","level":"WARN","thread":"nioEventLoopGroup-3-2","class":"SyncTargetManager","message":"Best peer has wrong pivot block (#20939063) expecting 0x1e84b312979f4e6f2708e31fcc52bfbcc5a0aac7047565cbfd06fcb16a8623b8 but received 0x7466118286794e63be64f59b31665db8297d640432ea084c3d201d5097d44908.  Disconnect: PeerId: 0x02f514ecd067931f... PeerReputation score: 103, timeouts: {}, useless: 0, validated? true, disconnected? false, client: Nethermind/v1.28.0+9c4816c2/linux-x64/dotnet8.0.8, [Connection with hashCode 1228643462 inboundInitiated? true initAt 1728863963616], enode://02f514ecd067931fc411adbdc182daecbf15a00786e1376da310f750f3250f72eb64f5378b52563ccbf73d8a02b390d260d32b28ffddb9803ce2db8b5ef3ce99@81.6.40.92:30404?discport=42163, isServingSnap true, has height 20960085, connected for 5505 ms","throwable":""}
{"@timestamp":"2024-10-13T23:59:34,122","level":"INFO","thread":"EthScheduler-Timer-0","class":"SyncTargetManager","message":"Unable to find sync target. Waiting for 5 peers minimum. Currently checking 0 peers for usefulness.","throwable":""}
{"@timestamp":"2024-10-13T23:59:47,304","level":"WARN","thread":"nioEventLoopGroup-3-5","class":"SyncTargetManager","message":"Best peer has wrong pivot block (#20939063) expecting 0x1e84b312979f4e6f2708e31fcc52bfbcc5a0aac7047565cbfd06fcb16a8623b8 but received 0x7466118286794e63be64f59b31665db8297d640432ea084c3d201d5097d44908.  Disconnect: PeerId: 0xa8b30e7f6a5446a3... PeerReputation score: 103, timeouts: {}, useless: 0, validated? true, disconnected? false, client: Geth/v1.14.11-stable-f3c696fa/linux-amd64/go1.23.1, [Connection with hashCode -1968126139 inboundInitiated? false initAt 1728863983228], enode://a8b30e7f6a5446a3e18414a0dd4d582c49d45d7a2a26d738727a8dac1e2a16e5942446b44ba576819c16d2336a2a5595e042c766fa34e9a3475af7ccb437faab@112.124.58.218:30303, isServingSnap true, has height 20960087, connected for 4076 ms","throwable":""}

Full log...
dev-elc-bu-tk-mainnet-simon-perf-24.10.0-RC1-snap-01.log.tar.gz
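
To confirm from the full log that every warning concerns the same pivot and the same pair of hashes, the JSON lines can be tallied. The sketch below is a hypothetical helper, not part of Besu. The regular expression is derived only from the message format shown in this issue, and `SAMPLE` is a trimmed copy of the first warning above (fields after the received hash are dropped).

```python
import json
import re
from collections import Counter

# Pattern assumed from the SyncTargetManager WARN messages quoted in this issue.
MISMATCH = re.compile(
    r"wrong pivot block \(#(?P<number>\d+)\) "
    r"expecting (?P<expected>0x[0-9a-fA-F]+) "
    r"but received (?P<received>0x[0-9a-fA-F]+)"
)


def tally_mismatches(lines):
    """Count (pivot number, expected hash, received hash) across JSON log lines."""
    counts = Counter()
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip blank or non-JSON lines
        if not isinstance(entry, dict):
            continue
        match = MISMATCH.search(entry.get("message", ""))
        if match:
            key = (int(match["number"]), match["expected"], match["received"])
            counts[key] += 1
    return counts


# Trimmed copy of the first warning in this issue.
SAMPLE = (
    '{"@timestamp":"2024-10-14T04:14:06,834","level":"WARN",'
    '"class":"SyncTargetManager","message":"Best peer has wrong pivot block '
    "(#20939063) expecting "
    "0x1e84b312979f4e6f2708e31fcc52bfbcc5a0aac7047565cbfd06fcb16a8623b8 "
    "but received "
    '0x7466118286794e63be64f59b31665db8297d640432ea084c3d201d5097d44908."}'
)

counts = tally_mismatches([SAMPLE, "not json", ""])
```

A single key in the result would mean that every disconnected peer disagreed with the node about the same block, which is what a reorged pivot would look like.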


siladu commented Oct 14, 2024

It looks like the pivot block was reorged. The block at slot 10148805 has the hash that peers are returning, not the one the node expects:
https://beaconcha.in/slot/10148805 is 0x7466118286794e63be64f59b31665db8297d640432ea084c3d201d5097d44908

The previous slot was missed:
https://beaconcha.in/slot/10148804


siladu commented Oct 14, 2024

Seems likely related to #7718


siladu commented Oct 14, 2024

Restarting the node seems to have got things going again.

siladu added the bug (Something isn't working) and P2 High (ex: Degrading performance issues, unexpected behavior of core features (DevP2P, syncing, etc)) labels on Oct 14, 2024
siladu mentioned this issue on Oct 14, 2024
siladu removed their assignment on Oct 15, 2024
macfarla added the P4 Low (ex: Node doesn't start up when the configuration file has unexpected "end-of-line" character) label and removed the P2 High (ex: Degrading performance issues, unexpected behavior of core features (DevP2P, syncing, etc)) label on Oct 28, 2024
@macfarla

Changing to P4, since this is unlikely to occur when using the safe block (i.e. since #7767 was reverted).
