Insteon: Attempt to Reconnect PLM if it Appears to be Down #401

Merged
merged 2 commits into from
Apr 26, 2014


krkeegan
Collaborator

  • Add flag to track if a message sent to the PLM has been acknowledged by the PLM
  • On retry, if no receipt was acknowledged by the PLM, attempt to re-open the PLM serial port

Should fix #397
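
The two changes listed above can be sketched roughly as follows. This is a minimal Python illustration, not the MisterHouse Perl code: the class `PlmLink`, its method names, and the `write` and `reopen_port` callables are hypothetical names chosen for the example.

```python
class PlmLink:
    """Tracks whether the PLM acknowledged the last message (hypothetical sketch)."""

    def __init__(self, write, reopen_port):
        self._write = write              # callable that writes bytes to the serial port
        self._reopen_port = reopen_port  # callable that re-opens the serial port
        self._pending = None             # last message sent, kept for retries
        self.plm_acked = False           # the flag: did the PLM confirm receipt?

    def send(self, message):
        self._pending = message
        self.plm_acked = False           # reset the flag on every new send
        self._write(message)

    def on_plm_ack(self):
        # Called when the PLM confirms it received the pending message.
        self.plm_acked = True

    def retry(self):
        # No receipt from the PLM at all: suspect the port and re-open it first.
        if not self.plm_acked:
            self._reopen_port()
        self.plm_acked = False
        self._write(self._pending)
```

The point of the flag is that a retry caused by a silent device leaves the port alone, while a retry with no PLM receipt triggers the re-open.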

…rting

Add simple routine to mh "binary" that closes the serial port in the manner recommended by the Device::Serial_Port documentation.

Change Insteon_PLM serial port restart routine to close the port before re-creating it
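
The close-before-re-create ordering described in this commit might look like the sketch below. It is written in Python for illustration only; `PortManager` and the `open_port` factory are hypothetical and do not correspond to the actual mh or Device::Serial_Port interfaces.

```python
class PortManager:
    """Hypothetical wrapper that owns one serial-port-like object."""

    def __init__(self, open_port):
        self._open_port = open_port   # factory returning an object with close()
        self.port = open_port()

    def restart(self):
        # Close the old handle first so the OS can release the device,
        # then create a fresh port object.
        old, self.port = self.port, None
        if old is not None:
            try:
                old.close()
            except OSError:
                pass  # the device may already be gone; still try to re-open
        self.port = self._open_port()
        return self.port
```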
@rudybrian
Contributor

I'm running this now and will report back on what I see.

@rudybrian
Contributor

Initial testing with usbreset looks good, and MH recovers correctly. I will continue to run this to see if my MH system's intermittent USB reset issue causes different behavior.

@rudybrian
Contributor

It took a while, but my USB PLM finally reset on its own early this morning. MH correctly detected this, reinitialized the PLM and recovered.

Looks good!

@jsiddall
Contributor

Agreed, looks good. Once I figured out how to get git to do what I wanted, the code seems to work great.

@krkeegan
Collaborator Author

Awesome, thanks for testing this.

krkeegan added a commit that referenced this pull request Apr 26, 2014
Insteon: Attempt to Reconnect PLM if it Appears to be Down
@krkeegan krkeegan merged commit aa60bbe into hollie:master Apr 26, 2014
@jsiddall
Contributor

One thing I have noticed is that a couple of times nothing immediately came back from a transmit (I'm not sure of the root cause) and the reconnect code was triggered even though the port had not closed. Nothing bad came of it, so I think this is acceptable behavior for now. A future enhancement might be to attempt to contact the PLM itself (not sure if that is feasible) upon suspicion of a port closure, to confirm whether the PLM is really gone before re-opening the port.

@krkeegan
Collaborator Author

Yeah, I thought that might happen. The bad thing is that we may dump
incoming data by restarting the port.

Not really sure how to check if the port is still open. What I could do is
not reconnect after a single instance of a failure to respond, but instead
require two in a row. That would likely do it.
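
A rough Python sketch of the "two in a row" idea, assuming a simple counter; the class name `ReconnectPolicy` and its `threshold` parameter are hypothetical and not part of the merged code:

```python
class ReconnectPolicy:
    """Re-open the port only after N consecutive PLM non-responses (sketch)."""

    def __init__(self, threshold=2):
        self.threshold = threshold
        self.misses = 0

    def record(self, plm_responded):
        """Record one send attempt; return True when the port should be re-opened."""
        if plm_responded:
            self.misses = 0          # any response clears the streak
            return False
        self.misses += 1
        if self.misses >= self.threshold:
            self.misses = 0          # start counting afresh after a reconnect
            return True
        return False
```

With the default threshold, a single lost acknowledgement no longer restarts the port, which reduces the chance of dumping incoming data unnecessarily.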

@jsiddall
Contributor

I understand what you are saying, but failures to a single device aren't a good indicator of port state. For example, say a breaker has tripped that has an Insteon device on it. Transmits to that device will always fail even though there is nothing wrong with the port. That is why I was thinking a better test is the ability to talk to the PLM itself. I have not looked through the code, but something is generating this in the logs on startup:

[Insteon_PLM] PLM id: 24150e firmware: 9b

Which implies it is possible to talk to the PLM without any other Insteon network dependencies. So perhaps whatever code generated that message could be used as a test to see if the PLM is still connected to the port or if it should be re-opened.
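
A probe along these lines could be sketched as below. It assumes that the startup message comes from the Insteon modem's "Get IM Info" command (`0x02 0x60`) and that the reply is nine bytes: the two command bytes echoed back, a three-byte ID, device category, subcategory, firmware version, and a trailing `0x06` ACK. That layout is stated from memory of the modem serial protocol and should be checked against the modem developer's guide. The function names and the `transact` callable are hypothetical, and the code is Python rather than MisterHouse's Perl.

```python
GET_IM_INFO = bytes([0x02, 0x60])  # assumed "Get IM Info" request
ACK = 0x06                         # assumed trailing acknowledge byte


def parse_im_info(reply):
    """Return (plm_id, firmware) as hex strings, or None if the reply is not a valid ACK."""
    if len(reply) != 9 or reply[:2] != GET_IM_INFO or reply[8] != ACK:
        return None
    return reply[2:5].hex(), format(reply[7], "02x")


def plm_is_alive(transact):
    """transact: hypothetical callable that sends bytes and returns the reply (b"" on timeout)."""
    return parse_im_info(transact(GET_IM_INFO)) is not None
```

Because the request involves only the PLM, a valid reply would indicate that the port is still usable regardless of the state of any device on the Insteon network.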

@krkeegan
Collaborator Author

Oh, sorry, we have a miscommunication. Some technical details:

Whenever MH sends a command, the PLM first confirms receipt of the command,
and then we get a confirmation from a device once that message is received
by the device.

This code doesn't look at the confirmation from the device, but rather
confirmation from the PLM.

In the instance you described, your PLM did not respond to the command.
Now I agree that maybe a single failure of the PLM to respond shouldn't
cause MH to reopen the port, but to my knowledge there is nothing we
can do to "test" whether the port is still open other than attempting to
send a command to the PLM again.
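
The distinction between the two confirmations can be illustrated with a small Python sketch. The `Outcome` names and the `classify` function are hypothetical and only show which missing confirmation points at the port:

```python
from enum import Enum


class Outcome(Enum):
    OK = "ok"
    DEVICE_SILENT = "device_silent"  # PLM took the command, device never answered
    PLM_SILENT = "plm_silent"        # PLM itself never confirmed receipt


def classify(plm_acked, device_acked):
    """Map the two confirmations described above to a failure category."""
    if not plm_acked:
        return Outcome.PLM_SILENT    # only this case makes the serial port suspect
    if not device_acked:
        return Outcome.DEVICE_SILENT # e.g. a device on a tripped breaker
    return Outcome.OK
```

Only `PLM_SILENT` would be a reason to consider re-opening the port; a silent device says nothing about the port.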


@jsiddall
Contributor

OK, I understand. I thought the missing response was the one from the module, not the one from the PLM. Given that the PLM is supposed to respond to every command, waiting for a retry after a failed attempt before trying to re-open the port is probably a good idea, since it appears the acknowledgement from the PLM sometimes gets lost.

@krkeegan krkeegan deleted the fix_issue_397 branch June 13, 2014 23:56