[Bug]: Validator eventually stalls on Kusama #2261

Closed
Lederstrumpf opened this issue Nov 4, 2024 · 0 comments · Fixed by #2269
Labels
bug Something isn't working

Comments

@Lederstrumpf
Contributor

Bug Summary

Validator stalls after Grandpa cannot retrieve finalized block's authorities

Bug Description

After staying in sync for a couple of hours, the validator node on Kusama eventually can no longer retrieve the authorities for a finalized block and stalls.
Example: grandpa Warning Grandpa Can't retrieve authorities for finalized block #25635897 (0xd3b4…36d7)

To resume syncing, the node must be restarted.

Info-level logs (I will post trace-level logs when I hit the issue again):

Nov 05 00:51:56 v2-melb kagome[192728]: 24.11.05 00:51:56.419239  dispute          Warning   DisputeCoordinator  No known peers to receive dispute request
Nov 05 00:51:56 v2-melb kagome[192728]: 24.11.05 00:51:56.419394  dispute          Warning   DisputeCoordinator  No known peers to receive dispute request
Nov 05 00:51:56 v2-melb kagome[192728]: 24.11.05 00:51:56.741768  worker.10        Info      BlockStorage  Added block #25635892 (0xb9b2…8bff) as child of #25635891 (0xa920…8b00)
Nov 05 00:51:56 v2-melb kagome[192728]: 24.11.05 00:51:56.789525  main_runner      Warning   ParachainProcessorImpl  Prospective parachains leaf update failed. (relay_parent=0xb9b2…8bff, error=N5scale11DecodeErrorE(2) SCALE decode: unexpected value occurred)
Nov 05 00:51:56 v2-melb kagome[192728]: 24.11.05 00:51:56.800488  main_runner      Info      BlockExecutor  Imported block #25635892 (0xb9b2…8bff) within 385 ms. (lag 2800 ms.)
Nov 05 00:51:56 v2-melb kagome[192728]: 24.11.05 00:51:56.800516  main_runner      Info      Timeline  Caught up block #25635892 (0xb9b2…8bff)
Nov 05 00:51:56 v2-melb kagome[192728]: 24.11.05 00:51:56.855898  dispute          Warning   DisputeCoordinator  No known peers to receive dispute request
Nov 05 00:51:56 v2-melb kagome[192728]: 24.11.05 00:51:56.856077  dispute          Warning   DisputeCoordinator  No known peers to receive dispute request
Nov 05 00:51:56 v2-melb kagome[192728]: 24.11.05 00:51:56.856239  dispute          Warning   DisputeCoordinator  No known peers to receive dispute request
Nov 05 00:51:56 v2-melb kagome[192728]: 24.11.05 00:51:56.856389  dispute          Warning   DisputeCoordinator  No known peers to receive dispute request
Nov 05 00:51:56 v2-melb kagome[192728]: 24.11.05 00:51:56.856539  dispute          Warning   DisputeCoordinator  No known peers to receive dispute request
Nov 05 00:51:56 v2-melb kagome[192728]: 24.11.05 00:51:56.856689  dispute          Warning   DisputeCoordinator  No known peers to receive dispute request
Nov 05 00:51:56 v2-melb kagome[192728]: 24.11.05 00:51:56.856855  dispute          Warning   DisputeCoordinator  No known peers to receive dispute request
Nov 05 00:51:56 v2-melb kagome[192728]: 24.11.05 00:51:56.857009  dispute          Warning   DisputeCoordinator  No known peers to receive dispute request
Nov 05 00:51:56 v2-melb kagome[192728]: 24.11.05 00:51:56.857158  dispute          Warning   DisputeCoordinator  No known peers to receive dispute request
Nov 05 00:51:56 v2-melb kagome[192728]: 24.11.05 00:51:56.857309  dispute          Warning   DisputeCoordinator  No known peers to receive dispute request
Nov 05 00:51:57 v2-melb kagome[192728]: 24.11.05 00:51:57.589173  grandpa          Info      GrandpaEnvironment  Found best chain is longer than approved: #25635892 (0xb9b2…8bff) > #25635889 (0x386b…6a84); truncate it
Nov 05 00:51:57 v2-melb kagome[192728]: 24.11.05 00:51:57.628159  grandpa          Info      GrandpaEnvironment  Found best chain is longer than approved: #25635892 (0xb9b2…8bff) > #25635889 (0x386b…6a84); truncate it
Nov 05 00:52:00 v2-melb kagome[192728]: 24.11.05 00:52:00.136069  main_runner      Info      BlockTree  Finalized block #25635890 (0x564f…7b57)
Nov 05 00:52:02 v2-melb kagome[192728]: 24.11.05 00:52:02.135182  grandpa          Info      GrandpaEnvironment  Found best chain is longer than approved: #25635892 (0xb9b2…8bff) > #25635890 (0x564f…7b57); truncate it
Nov 05 00:52:02 v2-melb kagome[192728]: 24.11.05 00:52:02.234527  main_runner      Warning   SyncProtocolObserver  cannot find a requested block with id 22092206
Nov 05 00:52:02 v2-melb kagome[192728]: 24.11.05 00:52:02.234561  main_runner      Warning   SyncProtocolObserver  cannot find a requested block with id 22092206
Nov 05 00:52:02 v2-melb kagome[192728]: 24.11.05 00:52:02.590237  main_runner      Warning   SyncProtocolObserver  cannot find a requested block with id 22092206
Nov 05 00:52:02 v2-melb kagome[192728]: 24.11.05 00:52:02.958899  worker.31        Info      BlockStorage  Added block #25635893 (0x7bad…022f) as child of #25635892 (0xb9b2…8bff)
Nov 05 00:52:03 v2-melb kagome[192728]: 24.11.05 00:52:03.020723  main_runner      Warning   ParachainProcessorImpl  Prospective parachains leaf update failed. (relay_parent=0x7bad…022f, error=N5scale11DecodeErrorE(2) SCALE decode: unexpected value occurred)
Nov 05 00:52:03 v2-melb kagome[192728]: 24.11.05 00:52:03.031867  main_runner      Warning   Fetch  candidate=0xb028…1565 chunk=255 not found
Nov 05 00:52:03 v2-melb kagome[192728]: 24.11.05 00:52:03.032271  main_runner      Info      BlockExecutor  Imported block #25635893 (0x7bad…022f) within 435 ms. (lag 3032 ms.)
Nov 05 00:52:03 v2-melb kagome[192728]: 24.11.05 00:52:03.211393  main_runner      Warning   SyncProtocolObserver  cannot find a requested block with id 22092206
Nov 05 00:52:07 v2-melb kagome[192728]: 24.11.05 00:52:07.317299  main_runner      Info      Timeline  Caught up block #25635893 (0x7bad…022f)
Nov 05 00:52:08 v2-melb kagome[192728]: 24.11.05 00:52:08.446758  worker.3         Info      BlockStorage  Added block #25635894 (0x7eff…07eb) as child of #25635893 (0x7bad…022f)
Nov 05 00:52:08 v2-melb kagome[192728]: 24.11.05 00:52:08.484886  main_runner      Warning   ParachainProcessorImpl  Prospective parachains leaf update failed. (relay_parent=0x7eff…07eb, error=N5scale11DecodeErrorE(2) SCALE decode: unexpected value occurred)
Nov 05 00:52:08 v2-melb kagome[192728]: 24.11.05 00:52:08.496805  main_runner      Info      BlockExecutor  Imported block #25635894 (0x7eff…07eb) within 372 ms. (lag 2496 ms.)
Nov 05 00:52:08 v2-melb kagome[192728]: 24.11.05 00:52:08.548732  dispute          Warning   DisputeCoordinator  No known peers to receive dispute request
Nov 05 00:52:08 v2-melb kagome[192728]: 24.11.05 00:52:08.548910  dispute          Warning   DisputeCoordinator  No known peers to receive dispute request
Nov 05 00:52:08 v2-melb kagome[192728]: 24.11.05 00:52:08.549068  dispute          Warning   DisputeCoordinator  No known peers to receive dispute request
Nov 05 00:52:08 v2-melb kagome[192728]: 24.11.05 00:52:08.549225  dispute          Warning   DisputeCoordinator  No known peers to receive dispute request
Nov 05 00:52:08 v2-melb kagome[192728]: 24.11.05 00:52:08.549379  dispute          Warning   DisputeCoordinator  No known peers to receive dispute request
Nov 05 00:52:08 v2-melb kagome[192728]: 24.11.05 00:52:08.549529  dispute          Warning   DisputeCoordinator  No known peers to receive dispute request
Nov 05 00:52:08 v2-melb kagome[192728]: 24.11.05 00:52:08.549696  dispute          Warning   DisputeCoordinator  No known peers to receive dispute request
Nov 05 00:52:08 v2-melb kagome[192728]: 24.11.05 00:52:08.549851  dispute          Warning   DisputeCoordinator  No known peers to receive dispute request
Nov 05 00:52:08 v2-melb kagome[192728]: 24.11.05 00:52:08.549995  dispute          Warning   DisputeCoordinator  No known peers to receive dispute request
Nov 05 00:52:08 v2-melb kagome[192728]: 24.11.05 00:52:08.550149  dispute          Warning   DisputeCoordinator  No known peers to receive dispute request
Nov 05 00:52:09 v2-melb kagome[192728]: 24.11.05 00:52:09.386579  main_runner      Info      ApprovalDistribution  Make exhaustive validation. Candidate hash 0x1bd6…4399, validator index 255, block hash 0x564f…7b57
Nov 05 00:52:11 v2-melb kagome[192728]: 24.11.05 00:52:11.219750  main_runner      Info      ApprovalDistribution  Make exhaustive validation. Candidate hash 0xb965…e2b0, validator index 255, block hash 0x7eff…07eb
Nov 05 00:52:53 v2-melb kagome[192728]: 24.11.05 00:52:53.102950  main_runner      Info      BlockTree  Finalized block #25635892 (0xb9b2…8bff)
Nov 05 00:52:54 v2-melb kagome[192728]: 24.11.05 00:52:54.523839  grandpa          Warning   Grandpa  Can't retrieve authorities for finalized block #25635897 (0xd3b4…36d7)
Nov 05 00:52:55 v2-melb kagome[192728]: 24.11.05 00:52:55.103178  grandpa          Info      GrandpaEnvironment  Found best chain is longer than approved: #25635894 (0x7eff…07eb) > #25635892 (0xb9b2…8bff); truncate it
Nov 05 00:52:56 v2-melb kagome[192728]: 24.11.05 00:52:56.983719  main_runner      Warning   SyncProtocolObserver  cannot find a requested block with id 22092206
Nov 05 00:53:04 v2-melb kagome[192728]: 24.11.05 00:53:04.354602  main_runner      Warning   Fetch  candidate=0xe3b8…3d9d chunk=255 not found
Nov 05 00:53:04 v2-melb kagome[192728]: 24.11.05 00:53:04.354631  main_runner      Warning   Fetch  candidate=0x9ab8…25ee chunk=255 not found
Nov 05 00:53:04 v2-melb kagome[192728]: 24.11.05 00:53:04.354642  main_runner      Warning   Fetch  candidate=0x7ef4…d7d6 chunk=255 not found
Nov 05 00:53:04 v2-melb kagome[192728]: 24.11.05 00:53:04.354650  main_runner      Warning   Fetch  candidate=0x3147…34bd chunk=255 not found
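For triage, the warning can be picked out of a log like the one above with a short script. The following is only a sketch: the regular expressions are inferred from the line format in this excerpt and are not taken from kagome's logging code, and the names `STALL_RE`, `FINALIZED_RE`, and `scan` are made up for illustration. It records each "Can't retrieve authorities" warning together with the last block the log reported as finalized before it.

```python
import re

# Patterns inferred from the log excerpt in this issue (an assumption,
# not derived from kagome's source).
STALL_RE = re.compile(
    r"(?P<ts>\d{2}\.\d{2}\.\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+grandpa\s+Warning\s+Grandpa\s+"
    r"Can['’]t retrieve authorities for finalized block #(?P<number>\d+) \((?P<hash>[^)]+)\)"
)
FINALIZED_RE = re.compile(
    r"BlockTree\s+Finalized block #(?P<number>\d+) \((?P<hash>[^)]+)\)"
)


def scan(lines):
    """Return (last_finalized_number, stall_events) for an iterable of log lines."""
    last_finalized = None
    stalls = []
    for line in lines:
        m = FINALIZED_RE.search(line)
        if m:
            last_finalized = int(m.group("number"))
            continue
        m = STALL_RE.search(line)
        if m:
            stalls.append({
                "timestamp": m.group("ts"),
                "block": int(m.group("number")),
                "hash": m.group("hash"),
                # Last finalized block seen before this warning, if any.
                "last_finalized": last_finalized,
            })
    return last_finalized, stalls


# Two lines copied verbatim from the excerpt above.
SAMPLE = [
    "Nov 05 00:52:53 v2-melb kagome[192728]: 24.11.05 00:52:53.102950  main_runner      Info      BlockTree  Finalized block #25635892 (0xb9b2…8bff)",
    "Nov 05 00:52:54 v2-melb kagome[192728]: 24.11.05 00:52:54.523839  grandpa          Warning   Grandpa  Can't retrieve authorities for finalized block #25635897 (0xd3b4…36d7)",
]

last_finalized, stalls = scan(SAMPLE)
for s in stalls:
    print(s["timestamp"], "block", s["block"], s["hash"],
          "last finalized before warning:", s["last_finalized"])
```

On the two sample lines, the warning names block #25635897, while the last block reported as finalized before it is #25635892. The same function can be fed a saved journal export line by line.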

Steps to Reproduce

Mode: Validator
number of nodes: 1
Command: kagome --chain kusama -d [...] --validator --listen-addr [...] --public-addr [...] --name [...] --rpc-port [...]

Effects of the Bug

The validator stops syncing blocks and must be restarted to resume syncing.

Expected Behavior

The validator should keep syncing blocks and should be able to retrieve the authority set.

System Information

NixOS 24.5 with kernel 6.11.5
Compiler: gcc 13.2.0
CMake: cmake version 3.25.3

Built using flake from #2257

Additional Context

No response

Lederstrumpf added the bug label on Nov 4, 2024
turuslan reopened this on Nov 18, 2024
kamilsa closed this as completed on Nov 27, 2024