Witness node cannot switch to correct fork on restart if was shut down on another long fork #1703

abitmore · 2019-04-05T15:22:35Z

Bug Description
If a witness_node is shut down while the chain is forking and the node is not on the correct fork, and both forks are long, then on restart it is unable to switch to the correct fork and reports unlinkable_block_exception.

If the correct fork is a soft fork, running the soft fork code against the current data crashes on the forking block during replay.

Related logs:

2019-04-05T11:11:54 th_a:invoke handle_block         handle_block ] Got block: #36311740 022a12bcb9fc73ea319c5ff01f4adfb3eb545722 time: 2019-04-05T11:11:54 transaction(s): 18 latency: 449 ms from: abc123  irreversible: 36310625 (-1115)                     application.cpp:515
2019-04-05T11:11:54 th_a:invoke handle_block           push_block ] Pushing block to fork database that failed to link: 022a12bcb9fc73ea319c5ff01f4adfb3eb545722, 36311740                      fork_database.cpp:64
2019-04-05T11:11:54 th_a:invoke handle_block           push_block ] Head: 36311670, 022a12765598724d7f8c7cd5439906df0519368c                    fork_database.cpp:65
2019-04-05T11:11:54 th_a:invoke handle_block         handle_block ] Error when pushing block:
3080000 unlinkable_block_exception: unlinkable block

As analysed by @pmconrad:

It happens because replay switches to undo_db + push_block only 50 blocks before the end of the block database. That is fine for normal operation, but not after such a long fork.

That said, the fixed 50 here is probably too small:

bitshares-core/libraries/chain/db_management.cpp

Line 68 in 8c1a78b

uint32_t undo_point = last_block_num < 50 ? 0 : last_block_num - 50;

On the other hand, we probably should NOT rely on this last_block to determine the usable head block number, because _block_id_to_block may contain reversible blocks. But it's a bit tricky.

bitshares-core/libraries/chain/db_management.cpp

Line 56 in 8c1a78b

auto last_block = _block_id_to_block.last();
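
To make the effect of the fixed 50 concrete, here is a minimal sketch that applies the quoted undo_point expression to the block numbers from the log. The helper undo_covers and the function check_against_log are hypothetical names used only for illustration, treating the logged "Head" as last_block_num is an assumption, and the 10000-block window is just an example value, not something taken from the code.

```cpp
#include <cassert>
#include <cstdint>

// Same expression as the quoted line, with the fixed 50 turned into a parameter.
constexpr uint32_t undo_point(uint32_t last_block_num, uint32_t window)
{
    return last_block_num < window ? 0 : last_block_num - window;
}

// Hypothetical helper: true if undo data would reach back to block `target`.
constexpr bool undo_covers(uint32_t last_block_num, uint32_t window, uint32_t target)
{
    return undo_point(last_block_num, window) <= target;
}

// Numbers come from the log above; using "Head" as last_block_num is an assumption.
inline void check_against_log()
{
    const uint32_t head = 36311670; // "Head:" line
    const uint32_t lib  = 36310625; // "irreversible:" field

    assert(undo_point(head, 50) == 36311620); // undo data starts here with the fixed 50
    assert(!undo_covers(head, 50, lib));      // LIB is 1045 blocks behind head, out of reach
    assert(undo_covers(head, 10000, lib));    // an example 10000-block window would reach it
}
```

The real fork point lies somewhere between the LIB and the head, so reaching the LIB is the worst case; any fork point more than 50 blocks behind the last stored block would hit the same limit.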

Expected Behavior
The node should be able to switch to the correct fork after restart.


pmconrad · 2019-04-06T19:28:51Z

A simple solution would be to enable undo_db GRAPHENE_MAX_UNDO_HISTORY blocks before the last block in the index. That should cover all possible cases.
A more complex solution would write the last known LIB to disk somewhere, either in a separate file (probably best), or by modifying the index file structure or the blocks file structure. Not sure if it's worth it.

We should measure how much slower replaying 10k blocks with undo_db enabled really is, then decide how to solve this.
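
For the "separate file" variant of the more complex solution, a rough sketch could look like the following. Everything here is hypothetical: the function names, the plain-text file format and the temp-file-plus-rename approach are illustrative assumptions, not existing BitShares Core code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper: store the last irreversible block number as text.
// Writes to a temporary file first, then renames it over the target.
// Note: std::rename replaces an existing target on POSIX; on Windows it may fail.
inline bool save_last_irreversible(const std::string& path, uint32_t lib_num)
{
    assert(!path.empty());
    const std::string tmp = path + ".tmp";
    {
        std::ofstream out(tmp, std::ios::trunc);
        if (!out) return false;
        out << lib_num << '\n';
        out.flush();
        if (!out) return false;
    } // stream is closed here, before the rename
    return std::rename(tmp.c_str(), path.c_str()) == 0;
}

// Hypothetical helper: read the stored number back.
// Returns false if the file is missing or does not contain a number.
inline bool load_last_irreversible(const std::string& path, uint32_t& lib_num)
{
    std::ifstream in(path);
    uint32_t value = 0;
    if (!(in >> value)) return false;
    lib_num = value;
    return true;
}
```

On replay, a stored value could then serve as the point from which undo_db is enabled, with a fixed window as the fallback when the file is missing. This sketch does not fsync, so durability across a crash is not guaranteed.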

pmconrad · 2019-06-27T09:30:25Z

Resolved by #1832

abitmore added this to the Future Feature Release milestone Apr 5, 2019

abitmore mentioned this issue May 29, 2019: fork_db.set_max_size called too early, may make switching forks impossible #1679 (Open, 17 tasks)

pmconrad modified the milestones: Future Feature Release, 3.3.0 - Feature Release Jun 25, 2019

pmconrad self-assigned this Jun 26, 2019

pmconrad added the "2d Developing" label and removed the "2c Ready for Development" label Jun 26, 2019

pmconrad mentioned this issue Jun 26, 2019: Enable undo_db earlier during replay #1832 (Merged)

pmconrad closed this as completed Jun 27, 2019

abitmore mentioned this issue Jul 19, 2019: New witness node: Failed to push new block #1853 (Closed, 11 tasks)

MichelSantos mentioned this issue Aug 9, 2019: Release Notes: BitShares Core 3.3.0 #1892 (Closed)
