Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Witness node cannot switch to correct fork on restart if was shut down on another long fork #1703

Closed
1 of 17 tasks
abitmore opened this issue Apr 5, 2019 · 2 comments
Closed
1 of 17 tasks
Assignees
Labels
2d Developing Status indicating currently designing and developing a solution 3d Bug Classification indicating the existing implementation does not match the intention of the design 4a Low Priority Priority indicating minimal impact to system/user -OR- an inexpensive workaround exists 6 Deployment Impact flag identifying the deployment process (e.g. Docker, Travis, etc.)

Comments

@abitmore
Copy link
Member

abitmore commented Apr 5, 2019

Bug Description
If we shutdown a witness_node which is not on the correct fork when the chain is forking, if both forks are long, when restarting, it is unable to switch to the correct fork, and reports unlinkable_block_exception.

If the correct fork is a soft fork, if we run the soft fork code with current data, it will crash on the forking block when replaying.

Related logs:

2019-04-05T11:11:54 th_a:invoke handle_block         handle_block ] Got block: #36311740 022a12bcb9fc73ea319c5ff01f4adfb3eb545722 time: 2019-04-05T11:11:54 transaction(s): 18 latency: 449 ms from: abc123  irreversible: 36310625 (-1115)                     application.cpp:515
2019-04-05T11:11:54 th_a:invoke handle_block           push_block ] Pushing block to fork database that failed to link: 022a12bcb9fc73ea319c5ff01f4adfb3eb545722, 36311740                      fork_database.cpp:64
2019-04-05T11:11:54 th_a:invoke handle_block           push_block ] Head: 36311670, 022a12765598724d7f8c7cd5439906df0519368c                    fork_database.cpp:65
2019-04-05T11:11:54 th_a:invoke handle_block         handle_block ] Error when pushing block:
3080000 unlinkable_block_exception: unlinkable block

As analysed by @pmconrad:

It happens because replay switches to undo_db + push_block 50 blocks before end of block database. For normal operation that's ok but not after such a long fork.

That said, the fixed 50 here is probably too small:

uint32_t undo_point = last_block_num < 50 ? 0 : last_block_num - 50;

On the other hand, we probably should NOT rely on this last_block to determine usable head block number, because _block_id_to_block may contain reversible blocks. But it's a bit tricky.

auto last_block = _block_id_to_block.last();

Impacts
Describe which portion(s) of BitShares Core may be impacted by this bug. Please tick at least one box.

  • API (the application programming interface)
  • Build (the build process or something prior to compiled code)
  • CLI (the command line wallet)
  • Deployment (the deployment process after building such as Docker, Travis, etc.)
  • DEX (the Decentralized EXchange, market engine, etc.)
  • P2P (the peer-to-peer network for transaction/block propagation)
  • Performance (system or user efficiency, etc.)
  • Protocol (the blockchain logic, consensus, validation, etc.)
  • Security (the security of system or user data, etc.)
  • UX (the User Experience)
  • Other (please add below)

Expected Behavior
Able to switch forks.

CORE TEAM TASK LIST

  • Evaluate / Prioritize Bug Report
  • Refine User Stories / Requirements
  • Define Test Cases
  • Design / Develop Solution
  • Perform QA/Testing
  • Update Documentation
@abitmore abitmore added this to the Future Feature Release milestone Apr 5, 2019
@pmconrad
Copy link
Contributor

pmconrad commented Apr 6, 2019

A simple solution would be to enable undo_db GRAPHENE_MAX_UNDO_HISTORY before the last block in the index. That should cover all possible cases.
A more complex solution would write the last known LIB on disk somewhere, either in a separate file (probably best), or by modifying the index file structure or the blocks file structure. Not sure if it's worth it.

We should measure how much slower replaying 10k block with undo_db enabled really is, then decide how to solve.

@pmconrad pmconrad added 2c Ready for Development Status indicating the Requirements are sufficent to begin designing a solution 3d Bug Classification indicating the existing implementation does not match the intention of the design 4a Low Priority Priority indicating minimal impact to system/user -OR- an inexpensive workaround exists 6 Deployment Impact flag identifying the deployment process (e.g. Docker, Travis, etc.) labels Apr 11, 2019
@pmconrad pmconrad self-assigned this Jun 26, 2019
@pmconrad pmconrad added 2d Developing Status indicating currently designing and developing a solution and removed 2c Ready for Development Status indicating the Requirements are sufficent to begin designing a solution labels Jun 26, 2019
@pmconrad
Copy link
Contributor

Resolved by #1832

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2d Developing Status indicating currently designing and developing a solution 3d Bug Classification indicating the existing implementation does not match the intention of the design 4a Low Priority Priority indicating minimal impact to system/user -OR- an inexpensive workaround exists 6 Deployment Impact flag identifying the deployment process (e.g. Docker, Travis, etc.)
Projects
None yet
Development

No branches or pull requests

2 participants