Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Small NodeManager refactoring #253

Merged
merged 11 commits into from
Sep 11, 2017
Merged

Small NodeManager refactoring #253

merged 11 commits into from
Sep 11, 2017

Conversation

MarinX
Copy link
Contributor

@MarinX MarinX commented Aug 15, 2017

Changes:

  • introduced NodeManager.isNodeAvailable(),
  • clean up mutexes usage,
  • removed failing tests.

@MarinX MarinX requested review from influx6, adambabik and tiabc August 15, 2017 08:44
//time to sync
time.Sleep(10 * time.Second)

loginResponse := common.APIResponse{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

var loginResponse common.APIResponse is an equivalent and more reader friendly style. ;)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arguable. I prefer := style, for example.

@@ -140,7 +141,11 @@ func (m *NodeManager) StopNode() (<-chan struct{}, error) {

// stopNode stop Status node. Stopped node cannot be resumed.
func (m *NodeManager) stopNode() (<-chan struct{}, error) {
if m.node == nil || m.nodeStarted == nil || m.nodeStopped == nil {
if m.node == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have direct access to the NodeManager.node unguarded by mutex. We might need to create a new issue to refactor this properly, but we can't handle it here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created issue for this #258

Copy link
Contributor

@influx6 influx6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR looks good, we can atleast for now ensure any changes we make should follow the proper guard of the mutex, we should create a seperate issue for the overal refactoring of the package to ensure proper guarding of NodeManager.node.

Thanks.

Copy link
Contributor

@adambabik adambabik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What worries me the most is that in NodeManager methods, we sometimes use a mutex to guard NodeManager.node access and sometimes not.

Also, there is a lot of repetition with checking if node exists, is started or stopped. I am not sure it's possible to extract these logic to a separate method though...

@@ -1414,6 +1418,29 @@ func testJailFunctionCall(t *testing.T) bool {
return true
}

func testNodeOffline(t *testing.T) bool {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style: an unnecessary blank line. The same at the end of the function's body.

// make sure that node is fully started
if m.node == nil || m.nodeStarted == nil {
if m.nodeStarted == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Above you use this condition m.nodeStarted == nil || m.nodeStopped == nil to return ErrNoRunningNode. Is there a reason why it's different here?

Also, if this pattern:

m.RLock()
defer m.RUnlock()

if m.node == nil {
  return nil, ErrNodeOffline
}

if m.nodeStarted == nil || m.nodeStopped == nil {
  return nil, ErrNoRunningNode
}

repeats so frequently, maybe it would be possible to extract it to a new private method?

@MarinX
Copy link
Contributor Author

MarinX commented Aug 28, 2017

@adambabik I exported it to a private func so we dont repeat that check.

Copy link
Contributor

@adambabik adambabik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requesting last small changes.

@@ -351,8 +358,13 @@ func (m *NodeManager) RestartNode() (<-chan struct{}, error) {

// restartNode restart running Status node, fails if node is not running
func (m *NodeManager) restartNode() (<-chan struct{}, error) {
// check if we have a node running
if m.node == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could it be written like

if err := m.isNodeAvailable(); err != nil {
  return nil, err
}

here as well?

if m.node == nil || m.nodeStarted == nil {
return nil, ErrNoRunningNode
// check if we have a node running
if m.node == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This if is unnecessary as m.isNodeAvailable() is just below.

@@ -563,3 +575,16 @@ func (m *NodeManager) RPCServer() (*rpc.Server, error) {

return m.rpcServer, nil
}

// Check if we have a node running and make sure is fully started
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment should have a format isNodeAvailable checks if we have a running node and....

@MarinX
Copy link
Contributor Author

MarinX commented Aug 28, 2017

@adambabik fixed small changes.


// isNodeAvailable check if we have a node running and make sure is fully started
func (m *NodeManager) isNodeAvailable() error {
if m.node == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the node field is supposed to be secured through the mutex, this should call:

defer m.Unlock()
m.Lock()

Before it's contents.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, it would be the safest to add it here. But NodeManager's methods need a closer look anyway as some of them don't use the mutex without a reason.

@tiabc tiabc self-assigned this Aug 29, 2017
@adambabik
Copy link
Contributor

@influx6 could you review again? I made sure that locks are only in the main methods, not in helper methods.

Copy link
Contributor

@tiabc tiabc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There're a number of fixes I've requested but there's a question that worries me the most.
Why is this at all happening? In the test we're checking Login after we stopped the node but that bug isn't actually about that. It's about what happens when we've started the node without an internet connection and tried to Login.
Could you try to reproduce it this way, please, if you haven't done it yet?


if m.node == nil {
return ErrNodeOffline
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably, the check for starting node should be before checking its nodeStarted value. However, I'm not a proponent for such kind of defensive programming.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, m.nodeStarted is set before m.node.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. Was too sleepy and thought that check was m == nil somehow %)

if m.node == nil || m.nodeStarted == nil || m.nodeStopped == nil {
return nil, ErrNoRunningNode
if err := m.isNodeAvailable(); err != nil {
return nil, err
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. These 2 conditions are not equivalent.
  2. Also, I think this check should rather be in m.StopNode, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Not trivial, but it looks it is. If m.nodeStopped is nil, then m.node must be nil as well. It's because m.nodeStopped is nil only before the node is started and is never set to nil afterwards. It gets its value in a lock right after m.node is assigned. Generally, this whole logic to start/stop a node is overcomplicated with locks and channels floating around. We can refactor this file and save 50% LOC.
  2. Correct.

Copy link
Contributor

@tiabc tiabc Aug 31, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, my concern is about this:

m.nodeStopped = make(chan struct{}, 1)

First, node is assigned and only then nodeStopped. So if somebody calls stopNode in parallel with startNode, there may be a race condition that node is already assigned while nodeStopped is not yet.
The initial if condition would return true here while yours will return false.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

node and nodeStopped assignments are guarded with a Lock(). StopNode() also makes a Lock() so it's not possible. But I think we can just add this one condition, it won't hurt anything, it just will be an additional check.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, yeah, now they are. But when you move isNodeAvailable to StopNode they won't be, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They will, because isNodeAvailable() call will be after acquiring a lock. Anyway, I will add that condition.

}

if m.node == nil {
return ErrNodeOffline
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. I may be wrong but I believe both these cases mean ErrNoRunningNode.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was wondering about this one too. I am not sure we can tell if a node is offline from here. Even if there is no Internet access m.node probably is not nil. I will check that as I guess it was not verified.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow up: #242 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After testing against the current develop using iOS simulator: status-im/status-mobile#1295 (comment)

@@ -563,3 +565,16 @@ func (m *NodeManager) RPCServer() (*rpc.Server, error) {

return m.rpcServer, nil
}

// isNodeAvailable check if we have a node running and make sure is fully started
func (m *NodeManager) isNodeAvailable() error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking about renaming this into isNodeStarted as it may be more expressive about what it actually does. What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also not entirely true because we have that <-m.nodeStarted floating around which only guarantees that node is started... We had a problem with naming this actually :D

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So think I that it's not entirely true. Hm. Added a refactoring note.

@adambabik adambabik force-pushed the bugfix/node-offline-#242 branch from 1c4ee41 to 3ef9a88 Compare August 31, 2017 11:19
@tiabc
Copy link
Contributor

tiabc commented Sep 9, 2017

@adambabik as #242 has been closed, what do you think about applying this issue as a small refactoring and reverting changes related to the bug about an offline node? I like that checks got more structured.

@adambabik
Copy link
Contributor

adambabik commented Sep 9, 2017

@tiabc it sounds good. However, NodeManager still requires more refactoring.

I will leave only checks and remove the rest.

@adambabik adambabik force-pushed the bugfix/node-offline-#242 branch from 3ef9a88 to 65b7fcb Compare September 11, 2017 12:45
@adambabik
Copy link
Contributor

@tiabc I removed all code related to the offline bug.

@adambabik adambabik changed the title Bugfix/node offline #242 Small NodeManager refactoring Sep 11, 2017
Copy link
Contributor

@tiabc tiabc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this part will be refactored, I don't insist on making m.isNodeAvailable() return bool.

@tiabc tiabc merged commit 4fb0faa into develop Sep 11, 2017
@adambabik adambabik deleted the bugfix/node-offline-#242 branch October 29, 2017 22:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants