Small NodeManager refactoring #253

MarinX · 2017-08-15T08:44:46Z

Changes:

introduced NodeManager.isNodeAvailable(),
cleaned up mutex usage,
removed failing tests.

influx6 · 2017-08-16T13:13:58Z

cmd/statusd/utils.go

+	//time to sync
+	time.Sleep(10 * time.Second)
+
+	loginResponse := common.APIResponse{}


var loginResponse common.APIResponse is an equivalent and more reader-friendly style. ;)

Arguable. I prefer := style, for example.
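For reference, a minimal sketch of the two declaration styles under discussion. APIResponse here is a hypothetical stand-in for common.APIResponse (its real fields are not shown in this thread); the point is only that both forms yield the same zero value.

```go
package main

import "fmt"

// APIResponse is a hypothetical stand-in for common.APIResponse;
// the real struct's fields are not shown in this thread.
type APIResponse struct {
	Error string
}

// zeroValues declares a zero-valued struct in both styles.
func zeroValues() (APIResponse, APIResponse) {
	var a APIResponse  // explicit declaration, implicit zero value
	b := APIResponse{} // short declaration with an empty composite literal
	return a, b
}

func main() {
	a, b := zeroValues()
	fmt.Println(a == b) // both are the zero value of APIResponse
}
```

The choice between the two is purely stylistic, which is why the thread calls it arguable.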

influx6 · 2017-08-16T13:18:29Z

geth/node/manager.go

@@ -140,7 +141,11 @@ func (m *NodeManager) StopNode() (<-chan struct{}, error) {

 // stopNode stop Status node. Stopped node cannot be resumed.
 func (m *NodeManager) stopNode() (<-chan struct{}, error) {
-	if m.node == nil || m.nodeStarted == nil || m.nodeStopped == nil {
+	if m.node == nil {


Why do we have direct access to NodeManager.node unguarded by the mutex? We might need to create a new issue to refactor this properly, but we can't handle it here.

Created issue for this #258

influx6

PR looks good. For now we can at least ensure that any changes we make follow the proper mutex guard; we should create a separate issue for the overall refactoring of the package to ensure proper guarding of NodeManager.node.

Thanks.

adambabik

What worries me the most is that in NodeManager methods, we sometimes use a mutex to guard NodeManager.node access and sometimes not.

Also, there is a lot of repetition in checking whether the node exists, is started, or is stopped. I am not sure it's possible to extract this logic into a separate method, though...

adambabik · 2017-08-17T07:16:16Z

cmd/statusd/utils.go

@@ -1414,6 +1418,29 @@ func testJailFunctionCall(t *testing.T) bool {
 	return true
 }

+func testNodeOffline(t *testing.T) bool {
+


Style: an unnecessary blank line. The same at the end of the function's body.

adambabik · 2017-08-17T07:21:17Z

geth/node/manager.go

 	// make sure that node is fully started
-	if m.node == nil || m.nodeStarted == nil {
+	if m.nodeStarted == nil {


Above you use this condition m.nodeStarted == nil || m.nodeStopped == nil to return ErrNoRunningNode. Is there a reason why it's different here?

Also, if this pattern:

	m.RLock()
	defer m.RUnlock()
	if m.node == nil {
		return nil, ErrNodeOffline
	}
	if m.nodeStarted == nil || m.nodeStopped == nil {
		return nil, ErrNoRunningNode
	}

repeats so frequently, maybe it would be possible to extract it to a new private method?
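A minimal sketch of what such an extraction could look like, assuming a trimmed-down NodeManager. The field types, the error messages, and the names checkNode and NodeReady are hypothetical; only the error variable names and the checked conditions come from the snippet above.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Error names mirror the ones in the review; the messages are assumptions.
var (
	ErrNodeOffline   = errors.New("node is offline")
	ErrNoRunningNode = errors.New("there is no running node")
)

// NodeManager is a trimmed, hypothetical stand-in for the real type.
type NodeManager struct {
	sync.RWMutex
	node        *struct{} // placeholder for the underlying node
	nodeStarted chan struct{}
	nodeStopped chan struct{}
}

// checkNode (hypothetical name) holds the repeated checks in one place.
// It does not lock; callers must already hold the mutex.
func (m *NodeManager) checkNode() error {
	if m.node == nil {
		return ErrNodeOffline
	}
	if m.nodeStarted == nil || m.nodeStopped == nil {
		return ErrNoRunningNode
	}
	return nil
}

// NodeReady (hypothetical name) shows a public method reduced to
// "take the read lock, then call the helper".
func (m *NodeManager) NodeReady() error {
	m.RLock()
	defer m.RUnlock()
	return m.checkNode()
}

func main() {
	m := &NodeManager{}
	fmt.Println(m.NodeReady()) // node is nil, so ErrNodeOffline is returned
}
```

Each public method then only needs the lock and a single call to the helper, instead of repeating both conditions.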

MarinX · 2017-08-28T08:26:44Z

@adambabik I extracted it into a private func so we don't repeat that check.

adambabik

Requesting last small changes.

adambabik · 2017-08-28T08:46:11Z

geth/node/manager.go

@@ -351,8 +358,13 @@ func (m *NodeManager) RestartNode() (<-chan struct{}, error) {

 // restartNode restart running Status node, fails if node is not running
 func (m *NodeManager) restartNode() (<-chan struct{}, error) {
+	// check if we have a node running
+	if m.node == nil {


Could it be written like

	if err := m.isNodeAvailable(); err != nil {
		return nil, err
	}

here as well?

adambabik · 2017-08-28T08:46:13Z

geth/node/manager.go

-	if m.node == nil || m.nodeStarted == nil {
-		return nil, ErrNoRunningNode
+	// check if we have a node running
+	if m.node == nil {


This if is unnecessary as m.isNodeAvailable() is just below.

adambabik · 2017-08-28T08:47:10Z

geth/node/manager.go

@@ -563,3 +575,16 @@ func (m *NodeManager) RPCServer() (*rpc.Server, error) {

 	return m.rpcServer, nil
 }
+
+// Check if we have a node running and make sure is fully started


The comment should have the format: isNodeAvailable checks if we have a running node and....

MarinX · 2017-08-28T10:24:39Z

@adambabik fixed small changes.

influx6 · 2017-08-28T11:52:24Z

geth/node/manager.go

+
+// isNodeAvailable check if we have a node running and make sure is fully started
+func (m *NodeManager) isNodeAvailable() error {
+	if m.node == nil {


Since the node field is supposed to be guarded by the mutex, this should call:

	m.Lock()
	defer m.Unlock()

before its contents.

I agree, it would be safest to add it here. But NodeManager's methods need a closer look anyway, as some of them skip the mutex for no apparent reason.

adambabik · 2017-08-30T14:50:02Z

@influx6 could you review again? I made sure that locks are only in the main methods, not in helper methods.
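A sketch of the convention described here, under simplified assumptions: exported methods acquire the lock, unexported helpers assume it is already held. The manager type and its running flag are hypothetical stand-ins for NodeManager and its fields.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var ErrNoRunningNode = errors.New("there is no running node")

// manager is a hypothetical, simplified stand-in for NodeManager.
type manager struct {
	mu      sync.Mutex
	running bool
}

// Start is an exported entry point: it takes the lock, then delegates.
func (m *manager) Start() error {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.start()
}

// Restart composes two helpers under a single lock acquisition.
// If stop or start locked m.mu themselves, this would deadlock,
// because sync.Mutex is not reentrant.
func (m *manager) Restart() error {
	m.mu.Lock()
	defer m.mu.Unlock()
	if err := m.stop(); err != nil {
		return err
	}
	return m.start()
}

// start assumes the caller holds m.mu.
func (m *manager) start() error {
	m.running = true
	return nil
}

// stop assumes the caller holds m.mu.
func (m *manager) stop() error {
	if !m.running {
		return ErrNoRunningNode
	}
	m.running = false
	return nil
}

func main() {
	m := &manager{}
	fmt.Println(m.Restart()) // nothing is running yet, so stop fails
	fmt.Println(m.Start())
	fmt.Println(m.Restart())
}
```

Keeping the locking in the exported methods lets one of them call several helpers inside a single critical section.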

tiabc

There are a number of fixes I've requested, but there's one question that worries me the most.
Why is this happening at all? In the test we check Login after we've stopped the node, but the bug isn't actually about that. It's about what happens when we start the node without an internet connection and then try to Login.
Could you try to reproduce it this way, please, if you haven't done it yet?

tiabc · 2017-08-30T15:06:56Z

geth/node/manager.go

+
+	if m.node == nil {
+		return ErrNodeOffline
+	}


The check for the node itself should probably come before checking its nodeStarted value. However, I'm not a proponent of this kind of defensive programming.

Actually, m.nodeStarted is set before m.node.

You're right. Was too sleepy and thought that check was m == nil somehow %)

tiabc · 2017-08-30T15:07:21Z

geth/node/manager.go

-	if m.node == nil || m.nodeStarted == nil || m.nodeStopped == nil {
-		return nil, ErrNoRunningNode
+	if err := m.isNodeAvailable(); err != nil {
+		return nil, err
 	}


These 2 conditions are not equivalent.

Also, I think this check should rather be in m.StopNode, right?

It's not trivial, but it looks like they are. If m.nodeStopped is nil, then m.node must be nil as well, because m.nodeStopped is nil only before the node is started and is never set to nil afterwards. It gets its value under a lock right after m.node is assigned. Generally, this whole logic to start/stop a node is overcomplicated, with locks and channels floating around. We can refactor this file and save 50% LOC.

Correct.

Well, my concern is about this:

status-go/geth/node/manager.go

Line 101 in f445b11

m.nodeStopped = make(chan struct{}, 1)

First node is assigned, and only then nodeStopped. So if somebody calls stopNode in parallel with startNode, there may be a race in which node is already assigned while nodeStopped is not yet.
The initial if condition would return true here while yours would return false.

The node and nodeStopped assignments are guarded with a Lock(). StopNode() also takes a Lock(), so that's not possible. But I think we can just add this one condition; it won't hurt anything, it will just be an additional check.

Well, yeah, now they are. But when you move isNodeAvailable to StopNode they won't be, right?

They will, because isNodeAvailable() call will be after acquiring a lock. Anyway, I will add that condition.
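The argument in this thread can be sketched as follows, assuming a simplified manager with only the two fields in question. All names are hypothetical. Because both assignments happen inside one critical section, an observer that takes the same lock sees either neither field set or both, never node alone.

```go
package main

import (
	"fmt"
	"sync"
)

// manager is a hypothetical, simplified stand-in for NodeManager.
type manager struct {
	mu          sync.Mutex
	node        *int
	nodeStopped chan struct{}
}

// start assigns node first and nodeStopped second, inside one critical section.
func (m *manager) start() {
	m.mu.Lock()
	defer m.mu.Unlock()
	n := 1
	m.node = &n
	m.nodeStopped = make(chan struct{}, 1)
}

// reset clears both fields under the lock; it exists only so the demo can loop.
func (m *manager) reset() {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.node = nil
	m.nodeStopped = nil
}

// halfStarted reports whether node is set while nodeStopped is still nil.
func (m *manager) halfStarted() bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.node != nil && m.nodeStopped == nil
}

// countHalfStarted races start/reset against an observer and counts
// how many times the observer saw the partially assigned state.
func countHalfStarted(rounds int) int {
	m := &manager{}
	seen := 0
	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		defer wg.Done()
		for i := 0; i < rounds; i++ {
			m.start()
			m.reset()
		}
	}()
	go func() {
		defer wg.Done()
		for i := 0; i < rounds; i++ {
			if m.halfStarted() {
				seen++
			}
		}
	}()
	wg.Wait()
	return seen
}

func main() {
	fmt.Println(countHalfStarted(10000))
}
```

If the observer read the fields without taking the lock, the partial state described in the concern above would become observable, which is why the check has to stay after lock acquisition.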

tiabc · 2017-08-30T15:12:36Z

geth/node/manager.go

+	}
+
+	if m.node == nil {
+		return ErrNodeOffline


Hm. I may be wrong but I believe both these cases mean ErrNoRunningNode.

Yeah, I was wondering about this one too. I am not sure we can tell if a node is offline from here. Even if there is no Internet access m.node probably is not nil. I will check that as I guess it was not verified.

Follow up: #242 (comment)

After testing against the current develop using iOS simulator: status-im/status-mobile#1295 (comment)

tiabc · 2017-08-30T15:20:42Z

geth/node/manager.go

@@ -563,3 +565,16 @@ func (m *NodeManager) RPCServer() (*rpc.Server, error) {

 	return m.rpcServer, nil
 }
+
+// isNodeAvailable check if we have a node running and make sure is fully started
+func (m *NodeManager) isNodeAvailable() error {


I'm thinking about renaming this into isNodeStarted as it may be more expressive about what it actually does. What do you think?

It's also not entirely true because we have that <-m.nodeStarted floating around which only guarantees that node is started... We had a problem with naming this actually :D

So I think that it's not entirely true. Hm. Added a refactoring note.
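A small sketch of the channel pattern behind the naming problem, assuming the started channel is closed once startup finishes (the thread does not confirm how the real code signals it). The functions startNode and waitStarted are hypothetical.

```go
package main

import "fmt"

// startNode is a hypothetical stand-in: it returns a channel that is
// closed once startup work finishes.
func startNode() <-chan struct{} {
	started := make(chan struct{})
	go func() {
		// ... startup work would happen here ...
		close(started)
	}()
	return started
}

// waitStarted blocks until startup is signalled, then checks that the
// channel stays readable because it has been closed.
func waitStarted() bool {
	started := startNode()
	<-started            // blocks until the channel is closed
	_, open := <-started // a closed channel returns immediately
	return !open
}

func main() {
	fmt.Println(waitStarted())
}
```

A receive like this only says that startup was signalled at some point; it says nothing about whether the node is still running afterwards, which is why neither isNodeAvailable nor isNodeStarted describes the check exactly.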

tiabc · 2017-09-09T16:22:39Z

@adambabik as #242 has been closed, what do you think about applying this issue as a small refactoring and reverting changes related to the bug about an offline node? I like that checks got more structured.

adambabik · 2017-09-09T22:07:49Z

@tiabc it sounds good. However, NodeManager still requires more refactoring.

I will leave only checks and remove the rest.

Added check to see if there is an node

adambabik · 2017-09-11T13:19:11Z

@tiabc I removed all code related to the offline bug.

tiabc

As this part will be refactored, I don't insist on making m.isNodeAvailable() return bool.

MarinX requested review from influx6, adambabik and tiabc August 15, 2017 08:44

influx6 reviewed Aug 16, 2017


influx6 approved these changes Aug 16, 2017


adambabik suggested changes Aug 17, 2017


adambabik suggested changes Aug 28, 2017


influx6 suggested changes Aug 28, 2017


tiabc self-assigned this Aug 29, 2017

tiabc suggested changes Aug 30, 2017


adambabik force-pushed the bugfix/node-offline-#242 branch from 1c4ee41 to 3ef9a88 Compare August 31, 2017 11:19

MarinX and others added 9 commits September 11, 2017 14:40

Added new error

a517078

Added check to see if there is an node

Added tests for node check (online/offline)

aebef5a

Changes to PR by request

c99521d

Initial space remove

f04b813

Add private method for checking if node is available

cd45a30

Removed blank space

ae7204e

Small fixes

d0e0ac0

fix locks and tests

e1f78f7

add and move conditions in StopNode

f5c5062

adambabik force-pushed the bugfix/node-offline-#242 branch from 3ef9a88 to 65b7fcb Compare September 11, 2017 12:45

adambabik added 2 commits September 11, 2017 14:45

remove code testing offline

65b7fcb

remove unnecessary tests

e093f1a

adambabik approved these changes Sep 11, 2017


adambabik changed the title ~~Bugfix/node offline #242~~ Small NodeManager refactoring Sep 11, 2017

tiabc approved these changes Sep 11, 2017


tiabc merged commit 4fb0faa into develop Sep 11, 2017

adambabik deleted the bugfix/node-offline-#242 branch October 29, 2017 22:35
