This repository has been archived by the owner on Jun 11, 2024. It is now read-only.

Detected unapplied rounds in mem_round #2458

Closed
ManuGowda opened this issue Oct 9, 2018 · 2 comments

Comments

@ManuGowda
Contributor

ManuGowda commented Oct 9, 2018

Expected behavior

Restarting the node at any stage of syncing should not leave unapplied rounds in mem_round.

Actual behavior

When a node is restarted during syncing, unapplied rounds are detected in mem_round and the memory tables are recreated:

[inf] 2018-10-09 14:02:27 | Starting lisk with "mainnet" genesis block.
[inf] 2018-10-09 14:02:30 | Socket Cluster ready for incoming connections
[inf] 2018-10-09 14:02:31 | Releasing enqueued broadcasts
[inf] 2018-10-09 14:02:31 | Queue empty
[inf] 2018-10-09 14:02:32 | Lisk started: 0.0.0.0:8004
[inf] 2018-10-09 14:02:32 | Modules ready and launched
[inf] 2018-10-09 14:02:32 | INSERT peer - 13.78.12.96:8001 success
[inf] 2018-10-09 14:02:32 | INSERT peer - 13.230.14.157:8001 success
[inf] 2018-10-09 14:02:32 | INSERT peer - 47.104.195.252:8001 success
[inf] 2018-10-09 14:02:33 | INSERT peer - 13.250.45.119:8001 success
[inf] 2018-10-09 14:02:33 | INSERT peer - 5.135.141.51:8001 success
[inf] 2018-10-09 14:02:35 | Blocks 2291618
[inf] 2018-10-09 14:02:35 | Genesis block matched with database
[WRN] 2018-10-09 14:02:35 | Detected unapplied rounds in mem_round
[WRN] 2018-10-09 14:02:35 | Recreating memory tables
[inf] 2018-10-09 14:02:35 | INSERT peer - 18.188.59.171:8001 success
[inf] 2018-10-09 14:02:35 | INSERT peer - 192.121.166.208:8001 success
[inf] 2018-10-09 14:02:36 | Releasing enqueued broadcasts
[inf] 2018-10-09 14:02:36 | Queue empty
[inf] 2018-10-09 14:02:36 | INSERT peer - 13.125.252.0:8001 success
[inf] 2018-10-09 14:02:37 | INSERT peer - 122.219.139.184:8001 success
[inf] 2018-10-09 14:02:37 | INSERT peer - 213.222.210.80:8001 success
[inf] 2018-10-09 14:02:37 | INSERT peer - 13.114.212.150:8001 success
[inf] 2018-10-09 14:02:37 | INSERT peer - 195.201.74.253:8001 success
[inf] 2018-10-09 14:02:38 | INSERT peer - 220.178.235.85:8001 success
[inf] 2018-10-09 14:02:39 | INSERT peer - 13.114.212.150:8001 success
[inf] 2018-10-09 14:02:40 | INSERT peer - 54.36.172.7:8001 success
[inf] 2018-10-09 14:02:40 | INSERT peer - 88.99.235.240:8001 success
[inf] 2018-10-09 14:02:41 | REMOVE peer - 88.99.235.240:8001 success
[inf] 2018-10-09 14:02:41 | Releasing enqueued broadcasts
[inf] 2018-10-09 14:02:41 | Queue empty
[inf] 2018-10-09 14:02:42 | INSERT peer - 213.136.93.24:8001 success
[inf] 2018-10-09 14:02:42 | INSERT peer - 80.211.194.35:8001 success
[inf] 2018-10-09 14:02:43 | Rebuilding blockchain, current block height: 1

Steps to reproduce

Start syncing with the network, then restart the node. More often than not, you will encounter the warning "Detected unapplied rounds in mem_round".

Which version(s) does this affect? (Environment, OS, etc...)

1.1.0

@vitaly-t
Contributor

vitaly-t commented Oct 9, 2018

This is a high-level issue with any software. If you kill a process that's executing a business-level transaction, it ends up in limbo.

Only atomic transactions can guarantee integrity in such cases, but a transaction is atomic only when it is fully controlled by an outside process. A pure database transaction, for example, provides such atomic integrity. But it is often preceded or followed by additional processing in the application, and an interruption during that processing can still affect the system's integrity.

Business-level transactions are inevitably susceptible to such integrity issues, and there is no easy, much less automated, solution. Such situations can be mitigated by adding high-level transaction support to the business logic of the app.

That means internally maintaining and persisting a status for every business-level transaction, and then adding logic that can handle transaction data whose status indicates a forceful interruption.
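As a minimal sketch of that idea (not Lisk's implementation), the status of each business-level operation can be written to a journal before the work starts and updated once it finishes. The table name `op_journal`, the status values, and the function names are all hypothetical, and SQLite stands in for the real database only to keep the example self-contained.

```python
import sqlite3

def open_journal(path=":memory:"):
    """Open (or create) a hypothetical journal of business-level operations."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS op_journal ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, status TEXT NOT NULL)"
    )
    conn.commit()
    return conn

def run_operation(conn, name, work):
    """Persist 'started' before the work and 'done' after it."""
    cur = conn.execute(
        "INSERT INTO op_journal (name, status) VALUES (?, 'started')", (name,)
    )
    op_id = cur.lastrowid
    conn.commit()  # the status must be durable before the work begins
    work()         # if the process dies here, the row stays 'started'
    conn.execute("UPDATE op_journal SET status = 'done' WHERE id = ?", (op_id,))
    conn.commit()
    return op_id

def find_interrupted(conn):
    """Return the names of operations that started but never finished."""
    rows = conn.execute(
        "SELECT name FROM op_journal WHERE status = 'started' ORDER BY id"
    ).fetchall()
    return [name for (name,) in rows]
```

On the next start, anything returned by `find_interrupted` would need operation-specific repair logic, which is where most of the complexity lies.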

But let's not be hasty: this is one of the most complex tasks in software development, and very error-prone. So, from a practical point of view, it most often makes sense to deal with such cases individually, i.e. by running some checks when the system starts and trying to detect and patch the related data issues. It is ugly, I know, but a proper solution, as explained above, may require a monumental effort.


In all, these are the possible solutions to such a problem:

  1. Implicit transactions, where business-level transactions are turned into generic transactions by maintaining and persisting an internal status for each one. This is the generic, and the most complex, approach.

  2. Explicit transactions, where certain vulnerable cases are identified and given individual provisions for dealing with such unexpected interruptions.

  3. Start-up checks + corrections/patches, provided they are even possible, which is often not the case.
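Option 3 can be sketched roughly as below. This is an illustration under stated assumptions, not the node's actual check: it assumes a fixed round length of 101 blocks, treats any mem_round row that belongs to a round other than the one containing the last block as unapplied, and uses hypothetical function names.

```python
BLOCKS_PER_ROUND = 101  # assumption: fixed number of blocks per round

def round_of(height, blocks_per_round=BLOCKS_PER_ROUND):
    """Round number containing the block at the given height (rounds start at 1)."""
    return (height + blocks_per_round - 1) // blocks_per_round

def find_unapplied_rounds(mem_round_rows, last_height):
    """Return round numbers in mem_round that differ from the current round."""
    current = round_of(last_height)
    return sorted({r for r in mem_round_rows if r != current})

def startup_check(mem_round_rows, last_height):
    """Decide what to do at start-up: continue, or rebuild the memory tables."""
    if find_unapplied_rounds(mem_round_rows, last_height):
        return "rebuild"  # corresponds to "Recreating memory tables" in the log
    return "ok"
```

A check like this only detects the inconsistency. The correction here is a full rebuild, which is simple but expensive.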


I would say that if the reported case seems isolated, then option 2 should be the most efficient. But as we know, random process death can result in just about any conceivable integrity issue, so assuming the case is isolated may be a bit of a stretch.

UPDATE

In theory, Two-Phase Transactions can sometimes help. Note, however, that pg-promise does not have any automatic support for them, because they are an extension, not a standard.

@4miners
Contributor

4miners commented Oct 9, 2018

When restarting the application, we wait for all processing to finish gracefully. Secondly, even if the process is killed while an SQL transaction is open (atomic block save), the transaction should be rolled back and the database state left unaffected. I would bet more on some bug or inconsistency in the application logic itself.
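The rollback behaviour described above can be illustrated with a small, self-contained sketch. It uses SQLite rather than the node's actual database, simulates the kill by closing the connection without committing, and uses a simplified, hypothetical schema (only the table name mem_round comes from this issue).

```python
import sqlite3

def init_db(path):
    """Create a simplified, hypothetical schema for the demonstration."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE blocks (height INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE mem_round (round INTEGER, height INTEGER)")
    conn.commit()
    conn.close()

def save_block(path, height, round_number, interrupt=False):
    """Write a block and its round row in a single transaction."""
    conn = sqlite3.connect(path)
    conn.execute("INSERT INTO blocks (height) VALUES (?)", (height,))
    conn.execute(
        "INSERT INTO mem_round (round, height) VALUES (?, ?)",
        (round_number, height),
    )
    if interrupt:
        conn.close()  # no commit: the open transaction is discarded
        return
    conn.commit()
    conn.close()

def counts(path):
    """Return (number of blocks, number of mem_round rows)."""
    conn = sqlite3.connect(path)
    blocks = conn.execute("SELECT COUNT(*) FROM blocks").fetchone()[0]
    rounds = conn.execute("SELECT COUNT(*) FROM mem_round").fetchone()[0]
    conn.close()
    return blocks, rounds
```

If both writes share one transaction, an interrupted save leaves neither row behind. An inconsistency between the two tables would then point to writes made outside that transaction, which fits the suspicion of an application-level bug.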
