feat: implement health reporting #2205

Merged
merged 21 commits into from
Jan 31, 2022

Conversation

robertsLando
Member

@robertsLando robertsLando commented Jan 24, 2022

Fixes #2113

Peek 24-01-2022 15-31

@coveralls

coveralls commented Jan 24, 2022

Pull Request Test Coverage Report for Build 1772814093

  • 0 of 73 (0.0%) changed or added relevant lines in 4 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.1%) to 27.069%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| lib/SocketManager.ts | 0 | 1 | 0.0% |
| src/plugins/socket.js | 0 | 1 | 0.0% |
| src/components/nodes-table/nodes-table.js | 0 | 2 | 0.0% |
| lib/ZwaveClient.ts | 0 | 69 | 0.0% |

| Totals | Coverage Status |
|---|---|
| Change from base Build 1738891310: | -0.1% |
| Covered Lines: | 3636 |
| Relevant Lines: | 14163 |

💛 - Coveralls

Member

@AlCalzone AlCalzone left a comment

I don't think the user needs to distinguish between the lifeline and the route health check. Just do the right thing™ depending on the target node

@LordMike
Contributor

  • "Failed pings" says "Node: 0", which is odd - as there is no Node 0 (I think this means "the node has 0 failed pings", but it reads weird).
  • I think the test doesn't route messages, but checks for line-of-sight communication? .. Maybe a small text on that, as I'm trying to check the health of far-away nodes, which have no ability to communicate with the controller directly.
    • The panel waits a long time to determine that a node has a health of "0", could this be done faster?
  • Does this also include the checkLifelineHealth call somehow?

@robertsLando
Member Author

robertsLando commented Jan 25, 2022

"Failed pings" says "Node: 0",

This means there were 0 failed pings to the node. It is labeled this way because the health response can also contain failed pings to the controller (in this case they are undefined). I noticed there are many props set to undefined in the response; I don't know if this is an error on the zwave-js side or something else. cc @AlCalzone

I think the test doesn't route messages, but checks for line-of-sight communication? .. Maybe a small text on that, as I'm trying to check the health of far-away nodes, which have no ability to communicate with the controller directly.
The panel waits a long time to determine that a node has a health of "0", could this be done faster?

@AlCalzone ?

Does this also include the checkLifelineHealth call somehow?

If you check @AlCalzone's comment: in my original implementation there were 2 buttons, one for the lifeline check and one for the route check. I have left just one button and trigger the correct API call based on the target node: if it is the controller I use the lifeline check, otherwise I use the route check.
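
As a rough illustration of the single-button dispatch described above, here is a minimal TypeScript sketch. The `HealthCheckNode` interface and the helpers `pickHealthCheck` and `runHealthCheck` are hypothetical names, and the method signatures are assumptions modelled on the `checkLifelineHealth` call mentioned earlier, not the actual code of this PR.

```typescript
// Hypothetical minimal shape of a node that can run both checks.
// The method names and signatures are assumptions for illustration.
interface HealthCheckNode {
  checkLifelineHealth(rounds: number): Promise<unknown>;
  checkRouteHealth(targetNodeId: number, rounds: number): Promise<unknown>;
}

type HealthCheckKind = "lifeline" | "route";

// Lifeline check when the target is the controller, route check otherwise.
function pickHealthCheck(
  targetNodeId: number,
  controllerNodeId: number,
): HealthCheckKind {
  return targetNodeId === controllerNodeId ? "lifeline" : "route";
}

// Runs whichever check applies and reports which one was used,
// so the UI can pick the matching table headers.
async function runHealthCheck(
  node: HealthCheckNode,
  targetNodeId: number,
  controllerNodeId: number,
  rounds = 1,
): Promise<{ kind: HealthCheckKind; result: unknown }> {
  const kind = pickHealthCheck(targetNodeId, controllerNodeId);
  const result =
    kind === "lifeline"
      ? await node.checkLifelineHealth(rounds)
      : await node.checkRouteHealth(targetNodeId, rounds);
  return { kind, result };
}
```

Returning the `kind` alongside the result keeps the "which API was called" decision in one place instead of repeating it in the view.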

@AlCalzone
Member

AlCalzone commented Jan 25, 2022

I noticed there are many props set to undefined in the response

Some statistics can't be determined for all nodes because they require support for the Powerlevel CC. Also, if you check the type definitions - route health has different properties than lifeline health.

I think the test doesn't route messages, but checks for line-of-sight communication?

It should route - got any driver logs? @LordMike

The panel waits a long time to determine that a node has a health of "0", could this be done faster?

This would have to go in the driver. Currently it does a fixed amount of checks and when the route health is low, this takes longer. It would have to stop quicker when the intermediate results are bad.
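
To illustrate the idea of stopping early, a minimal sketch follows. It is not how the driver works today: `shouldStopEarly`, `runRounds`, and the threshold are hypothetical, and ratings are assumed to be numbers where lower is worse.

```typescript
// Hypothetical helper: stop once the latest round rates at or below the threshold.
function shouldStopEarly(ratings: number[], abortAtOrBelow: number): boolean {
  const last = ratings[ratings.length - 1];
  return last !== undefined && last <= abortAtOrBelow;
}

// Runs up to maxRounds rounds, but bails out once an intermediate result is bad.
// runRound is an assumed callback that performs one round and resolves to its rating.
async function runRounds(
  runRound: (round: number) => Promise<number>,
  maxRounds: number,
  abortAtOrBelow = 0,
): Promise<number[]> {
  const ratings: number[] = [];
  for (let round = 1; round <= maxRounds; round++) {
    ratings.push(await runRound(round));
    if (shouldStopEarly(ratings, abortAtOrBelow)) break;
  }
  return ratings;
}
```

With a threshold of 0, a node that rates 0 in the first round would finish after one round instead of the full fixed count.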

@robertsLando
Member Author

Also, if you check the type definitions - route health has different properties than lifeline health.

Yeah I know that, in fact the table headers change based on the api called

@LordMike
Contributor

I think the test doesn't route messages, but checks for line-of-sight communication?

It should route - got any driver logs? @LordMike

The panel waits a long time to determine that a node has a health of "0", could this be done faster?

This would have to go in the driver. Currently it does a fixed amount of checks and when the route health is low, this takes longer. It would have to stop quicker when the intermediate results are bad.

I see - so it's actually a bug... I just assumed it was direct, hence I didn't grab anything. Will look into this later today. The Node I chose as target was Node 1, which is the controller.

"Failed pings" says "Node: 0",

This means there were 0 failed pings to the node. It is labeled this way because the health response can also contain failed pings to the controller (in this case they are undefined). I noticed there are many props set to undefined in the response; I don't know if this is an error on the zwave-js side or something else. cc @AlCalzone

I confirmed this when I then chose another, closer, node, which had 10 failed pings from the controller instead. Something stupid like suffixing "failures" could mitigate this: Node: 0 failures.

Imagine if I had two failed pings, it would say "Node: 2", which could leave me wondering why Node 2 has anything to do with the test I just made.

@robertsLando
Copy link
Member Author

Fails suffix added

@AlCalzone
Member

Or you could be specific, like node -> ctrlr: 10/10
Ctrlr -> node: 1/10

I have thought about this quite a bit when developing this feature and the reports in the driver logs could serve as a guideline for the UI.

@robertsLando
Member Author

Ctrlr -> node: 1/10

10 is the total number of pings you do, right?

@AlCalzone
Member

AlCalzone commented Jan 25, 2022

Yup. To show an example, I'm referring to this:

[Node 014] Lifeline health check complete in 32339 ms
rating:                   8 (good)
no. of routing neighbors: 1
 
Check rounds:
· round 1 - rating: 8 (good)
  failed pings → node:             0/10
  max. latency:                    10.0 ms
  route changes:                   0
  SNR margin:                      23 dBm
  min. node powerlevel w/o errors: -8 dBm
 
· round 2 - rating: 8 (good)
  failed pings → node:             0/10
  max. latency:                    10.0 ms
  route changes:                   0
  SNR margin:                      23 dBm
  min. node powerlevel w/o errors: -8 dBm
 
· round 3 - rating: 8 (good)
  failed pings → node:             0/10
  max. latency:                    10.0 ms
  route changes:                   0
  SNR margin:                      22 dBm
  min. node powerlevel w/o errors: -8 dBm

And if there were failures pinging the controller at normal power (here -8 dBm was ok), it would display this as well:

failed pings → controller:       3/10 at normal power
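
A UI could build its labels the same way. The following is only a sketch: `LifelineRound`, `formatFailedPings`, and the field names are assumptions for illustration, and the total of 10 is taken from the `x/10` shown in the log above.

```typescript
// Hypothetical subset of one lifeline check round; field names are assumptions.
interface LifelineRound {
  failedPingsNode: number;
  // Left undefined when the value could not be determined,
  // e.g. without Powerlevel CC support.
  failedPingsController?: number;
}

const PINGS_PER_ROUND = 10; // matches the "x/10" in the driver log

// Builds one label per direction, skipping directions with no data,
// so a bare "Node: 0" can no longer be mistaken for a node ID.
function formatFailedPings(
  round: LifelineRound,
  total: number = PINGS_PER_ROUND,
): string[] {
  const lines = [`failed pings → node: ${round.failedPingsNode}/${total}`];
  if (round.failedPingsController !== undefined) {
    lines.push(
      `failed pings → controller: ${round.failedPingsController}/${total}`,
    );
  }
  return lines;
}
```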

@robertsLando
Member Author

@AlCalzone That helps a lot! Thanks

@LordMike
Contributor

LordMike commented Jan 25, 2022

I've also made zwave-js/node-zwave-js#4130, which we might wanna look at first. But for completeness' sake, here is the driver output for my node 32, which is my farthest node according to the map. Starting a check for that node yields this (and no results).

Log snippet
2022-01-25T19:48:37.461Z CNTRLR   [Node 032] Starting lifeline health check (1 round)...
2022-01-25T19:48:37.461Z CNTRLR » [Node 032] requesting node neighbors...
2022-01-25T19:48:37.469Z SERIAL » 0x010700802001000059                                                 (9 bytes)
2022-01-25T19:48:37.469Z DRIVER » [REQ] [GetRoutingInfo]
                                    remove non-repeaters: true
                                    remove bad links:     false
2022-01-25T19:48:37.471Z SERIAL « [ACK]                                                                   (0x06)
2022-01-25T19:48:37.474Z SERIAL « 0x01200180240000600000020000000000000000000000000000000000000000000 (34 bytes)
                                  018
2022-01-25T19:48:37.474Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:37.475Z DRIVER « [RES] [GetRoutingInfo]
                                    node ids: 3, 6, 30, 31, 50
2022-01-25T19:48:37.476Z CNTRLR « [Node 032] node neighbors received: 3, 6, 30, 31, 50
2022-01-25T19:48:37.481Z SERIAL » 0x010800132001002513f3                                              (10 bytes)
2022-01-25T19:48:37.481Z DRIVER » [Node 032] [REQ] [SendData]
                                  │ transmit options: 0x25
                                  │ callback id:      19
                                  └─[NoOperationCC]
2022-01-25T19:48:37.482Z SERIAL « [ACK]                                                                   (0x06)
2022-01-25T19:48:37.489Z SERIAL « 0x0104011301e8                                                       (6 bytes)
2022-01-25T19:48:37.489Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:37.489Z DRIVER « [RES] [SendData]
                                    was sent: true
2022-01-25T19:48:37.571Z SERIAL « 0x0107001313000008f0                                                 (9 bytes)
2022-01-25T19:48:37.572Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:37.573Z DRIVER « [REQ] [SendData]
                                    callback id:     19
                                    transmit status: OK
2022-01-25T19:48:37.585Z SERIAL » 0x010800132001002514f4                                              (10 bytes)
2022-01-25T19:48:37.585Z DRIVER » [Node 032] [REQ] [SendData]
                                  │ transmit options: 0x25
                                  │ callback id:      20
                                  └─[NoOperationCC]
2022-01-25T19:48:37.587Z SERIAL « [ACK]                                                                   (0x06)
2022-01-25T19:48:37.593Z SERIAL « 0x0104011301e8                                                       (6 bytes)
2022-01-25T19:48:37.593Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:37.593Z DRIVER « [RES] [SendData]
                                    was sent: true
2022-01-25T19:48:37.674Z SERIAL « 0x0107001314000009f6                                                 (9 bytes)
2022-01-25T19:48:37.675Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:37.676Z DRIVER « [REQ] [SendData]
                                    callback id:     20
                                    transmit status: OK
2022-01-25T19:48:37.684Z SERIAL » 0x010800132001002515f5                                              (10 bytes)
2022-01-25T19:48:37.685Z DRIVER » [Node 032] [REQ] [SendData]
                                  │ transmit options: 0x25
                                  │ callback id:      21
                                  └─[NoOperationCC]
2022-01-25T19:48:37.686Z SERIAL « [ACK]                                                                   (0x06)
2022-01-25T19:48:37.692Z SERIAL « 0x0104011301e8                                                       (6 bytes)
2022-01-25T19:48:37.693Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:37.693Z DRIVER « [RES] [SendData]
                                    was sent: true
2022-01-25T19:48:37.774Z SERIAL « 0x0107001315000009f7                                                 (9 bytes)
2022-01-25T19:48:37.775Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:37.776Z DRIVER « [REQ] [SendData]
                                    callback id:     21
                                    transmit status: OK
2022-01-25T19:48:37.786Z SERIAL » 0x010800132001002516f6                                              (10 bytes)
2022-01-25T19:48:37.786Z DRIVER » [Node 032] [REQ] [SendData]
                                  │ transmit options: 0x25
                                  │ callback id:      22
                                  └─[NoOperationCC]
2022-01-25T19:48:37.787Z SERIAL « [ACK]                                                                   (0x06)
2022-01-25T19:48:37.793Z SERIAL « 0x0104011301e8                                                       (6 bytes)
2022-01-25T19:48:37.793Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:37.794Z DRIVER « [RES] [SendData]
                                    was sent: true
2022-01-25T19:48:37.875Z SERIAL « 0x0107001316000009f4                                                 (9 bytes)
2022-01-25T19:48:37.876Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:37.876Z DRIVER « [REQ] [SendData]
                                    callback id:     22
                                    transmit status: OK
2022-01-25T19:48:37.880Z SERIAL » 0x010800132001002517f7                                              (10 bytes)
2022-01-25T19:48:37.881Z DRIVER » [Node 032] [REQ] [SendData]
                                  │ transmit options: 0x25
                                  │ callback id:      23
                                  └─[NoOperationCC]
2022-01-25T19:48:37.882Z SERIAL « [ACK]                                                                   (0x06)
2022-01-25T19:48:37.888Z SERIAL « 0x0104011301e8                                                       (6 bytes)
2022-01-25T19:48:37.888Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:37.889Z DRIVER « [RES] [SendData]
                                    was sent: true
2022-01-25T19:48:37.970Z SERIAL « 0x0107001317000008f4                                                 (9 bytes)
2022-01-25T19:48:37.971Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:37.971Z DRIVER « [REQ] [SendData]
                                    callback id:     23
                                    transmit status: OK
2022-01-25T19:48:37.977Z SERIAL » 0x010800132001002518f8                                              (10 bytes)
2022-01-25T19:48:37.978Z DRIVER » [Node 032] [REQ] [SendData]
                                  │ transmit options: 0x25
                                  │ callback id:      24
                                  └─[NoOperationCC]
2022-01-25T19:48:37.979Z SERIAL « [ACK]                                                                   (0x06)
2022-01-25T19:48:37.985Z SERIAL « 0x0104011301e8                                                       (6 bytes)
2022-01-25T19:48:37.986Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:37.986Z DRIVER « [RES] [SendData]
                                    was sent: true
2022-01-25T19:48:38.067Z SERIAL « 0x0107001318000009fa                                                 (9 bytes)
2022-01-25T19:48:38.068Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:38.069Z DRIVER « [REQ] [SendData]
                                    callback id:     24
                                    transmit status: OK
2022-01-25T19:48:38.081Z SERIAL » 0x010800132001002519f9                                              (10 bytes)
2022-01-25T19:48:38.081Z DRIVER » [Node 032] [REQ] [SendData]
                                  │ transmit options: 0x25
                                  │ callback id:      25
                                  └─[NoOperationCC]
2022-01-25T19:48:38.083Z SERIAL « [ACK]                                                                   (0x06)
2022-01-25T19:48:38.089Z SERIAL « 0x0104011301e8                                                       (6 bytes)
2022-01-25T19:48:38.089Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:38.089Z DRIVER « [RES] [SendData]
                                    was sent: true
2022-01-25T19:48:38.172Z SERIAL « 0x0107001319000009fb                                                 (9 bytes)
2022-01-25T19:48:38.173Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:38.174Z DRIVER « [REQ] [SendData]
                                    callback id:     25
                                    transmit status: OK
2022-01-25T19:48:38.189Z SERIAL » 0x01080013200100251afa                                              (10 bytes)
2022-01-25T19:48:38.190Z DRIVER » [Node 032] [REQ] [SendData]
                                  │ transmit options: 0x25
                                  │ callback id:      26
                                  └─[NoOperationCC]
2022-01-25T19:48:38.191Z SERIAL « [ACK]                                                                   (0x06)
2022-01-25T19:48:38.197Z SERIAL « 0x0104011301e8                                                       (6 bytes)
2022-01-25T19:48:38.197Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:38.198Z DRIVER « [RES] [SendData]
                                    was sent: true
2022-01-25T19:48:38.279Z SERIAL « 0x010700131a000008f9                                                 (9 bytes)
2022-01-25T19:48:38.280Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:38.281Z DRIVER « [REQ] [SendData]
                                    callback id:     26
                                    transmit status: OK
2022-01-25T19:48:38.288Z SERIAL » 0x01080013200100251bfb                                              (10 bytes)
2022-01-25T19:48:38.289Z DRIVER » [Node 032] [REQ] [SendData]
                                  │ transmit options: 0x25
                                  │ callback id:      27
                                  └─[NoOperationCC]
2022-01-25T19:48:38.290Z SERIAL « [ACK]                                                                   (0x06)
2022-01-25T19:48:38.296Z SERIAL « 0x0104011301e8                                                       (6 bytes)
2022-01-25T19:48:38.296Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:38.297Z DRIVER « [RES] [SendData]
                                    was sent: true
2022-01-25T19:48:38.378Z SERIAL « 0x010700131b000009f9                                                 (9 bytes)
2022-01-25T19:48:38.379Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:38.380Z DRIVER « [REQ] [SendData]
                                    callback id:     27
                                    transmit status: OK
2022-01-25T19:48:38.395Z SERIAL » 0x01080013200100251cfc                                              (10 bytes)
2022-01-25T19:48:38.395Z DRIVER » [Node 032] [REQ] [SendData]
                                  │ transmit options: 0x25
                                  │ callback id:      28
                                  └─[NoOperationCC]
2022-01-25T19:48:38.396Z SERIAL « [ACK]                                                                   (0x06)
2022-01-25T19:48:38.403Z SERIAL « 0x0104011301e8                                                       (6 bytes)
2022-01-25T19:48:38.403Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:38.403Z DRIVER « [RES] [SendData]
                                    was sent: true
2022-01-25T19:48:38.514Z SERIAL « 0x010700131c00000cfb                                                 (9 bytes)
2022-01-25T19:48:38.515Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:38.516Z DRIVER « [REQ] [SendData]
                                    callback id:     28
                                    transmit status: OK
2022-01-25T19:48:38.520Z CNTRLR   [Node 032] Sending 10 pings to controller at -5 dBm...
2022-01-25T19:48:38.548Z SERIAL » 0x010d0013200673040105000a251d86                                    (15 bytes)
2022-01-25T19:48:38.548Z DRIVER » [Node 032] [REQ] [SendData]
                                  │ transmit options: 0x25
                                  │ callback id:      29
                                  └─[PowerlevelCCTestNodeSet]
                                      test node id:     1
                                      power level:      -5 dBm
                                      test frame count: 10
2022-01-25T19:48:38.550Z SERIAL « [ACK]                                                                   (0x06)
2022-01-25T19:48:38.556Z SERIAL « 0x0104011301e8                                                       (6 bytes)
2022-01-25T19:48:38.557Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:38.557Z DRIVER « [RES] [SendData]
                                    was sent: true
2022-01-25T19:48:40.111Z SERIAL « 0x010700131d00009c6a                                                 (9 bytes)
2022-01-25T19:48:40.111Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:40.112Z DRIVER « [REQ] [SendData]
                                    callback id:     29
                                    transmit status: OK
2022-01-25T19:48:41.119Z SERIAL » 0x0109001320027305251e8a                                            (11 bytes)
2022-01-25T19:48:41.120Z DRIVER » [Node 032] [REQ] [SendData]
                                  │ transmit options: 0x25
                                  │ callback id:      30
                                  └─[PowerlevelCCTestNodeGet]
2022-01-25T19:48:41.121Z SERIAL « [ACK]                                                                   (0x06)
2022-01-25T19:48:41.128Z SERIAL « 0x0104011301e8                                                       (6 bytes)
2022-01-25T19:48:41.128Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:41.128Z DRIVER « [RES] [SendData]
                                    was sent: true
2022-01-25T19:48:41.211Z SERIAL « 0x010700131e000008fd                                                 (9 bytes)
2022-01-25T19:48:41.212Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:41.213Z DRIVER « [REQ] [SendData]
                                    callback id:     30
                                    transmit status: OK
2022-01-25T19:48:41.288Z SERIAL « 0x010c0004002006730601020000a7                                      (14 bytes)
2022-01-25T19:48:41.290Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:41.291Z DRIVER « [Node 032] [REQ] [ApplicationCommand]
                                  └─[PowerlevelCCTestNodeReport]
                                      test node id:        1
                                      status:              In Progress
                                      acknowledged frames: 0
2022-01-25T19:48:42.296Z SERIAL » 0x0109001320027305251f8b                                            (11 bytes)
2022-01-25T19:48:42.297Z DRIVER » [Node 032] [REQ] [SendData]
                                  │ transmit options: 0x25
                                  │ callback id:      31
                                  └─[PowerlevelCCTestNodeGet]
2022-01-25T19:48:42.298Z SERIAL « [ACK]                                                                   (0x06)
2022-01-25T19:48:42.304Z SERIAL « 0x0104011301e8                                                       (6 bytes)
2022-01-25T19:48:42.305Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:42.305Z DRIVER « [RES] [SendData]
                                    was sent: true
2022-01-25T19:48:42.389Z SERIAL « 0x010700131f000009fd                                                 (9 bytes)
2022-01-25T19:48:42.390Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:42.391Z DRIVER « [REQ] [SendData]
                                    callback id:     31
                                    transmit status: OK
2022-01-25T19:48:42.468Z SERIAL « 0x010c0004002006730601020000a7                                      (14 bytes)
2022-01-25T19:48:42.469Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:42.470Z DRIVER « [Node 032] [REQ] [ApplicationCommand]
                                  └─[PowerlevelCCTestNodeReport]
                                      test node id:        1
                                      status:              In Progress
                                      acknowledged frames: 0
2022-01-25T19:48:43.477Z SERIAL » 0x01090013200273052520b4                                            (11 bytes)
2022-01-25T19:48:43.477Z DRIVER » [Node 032] [REQ] [SendData]
                                  │ transmit options: 0x25
                                  │ callback id:      32
                                  └─[PowerlevelCCTestNodeGet]
2022-01-25T19:48:43.479Z SERIAL « [ACK]                                                                   (0x06)
2022-01-25T19:48:43.485Z SERIAL « 0x0104011301e8                                                       (6 bytes)
2022-01-25T19:48:43.485Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:43.486Z DRIVER « [RES] [SendData]
                                    was sent: true
2022-01-25T19:48:43.569Z SERIAL « 0x0107001320000009c2                                                 (9 bytes)
2022-01-25T19:48:43.570Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:43.571Z DRIVER « [REQ] [SendData]
                                    callback id:     32
                                    transmit status: OK
2022-01-25T19:48:43.648Z SERIAL « 0x010c0004002006730601020000a7                                      (14 bytes)
2022-01-25T19:48:43.649Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:43.650Z DRIVER « [Node 032] [REQ] [ApplicationCommand]
                                  └─[PowerlevelCCTestNodeReport]
                                      test node id:        1
                                      status:              In Progress
                                      acknowledged frames: 0
2022-01-25T19:48:44.661Z SERIAL » 0x01090013200273052521b5                                            (11 bytes)
2022-01-25T19:48:44.662Z DRIVER » [Node 032] [REQ] [SendData]
                                  │ transmit options: 0x25
                                  │ callback id:      33
                                  └─[PowerlevelCCTestNodeGet]
2022-01-25T19:48:44.663Z SERIAL « [ACK]                                                                   (0x06)
2022-01-25T19:48:44.670Z SERIAL « 0x0104011301e8                                                       (6 bytes)
2022-01-25T19:48:44.670Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:44.670Z DRIVER « [RES] [SendData]
                                    was sent: true
2022-01-25T19:48:44.754Z SERIAL « 0x0107001321000009c3                                                 (9 bytes)
2022-01-25T19:48:44.755Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:44.756Z DRIVER « [REQ] [SendData]
                                    callback id:     33
                                    transmit status: OK
2022-01-25T19:48:44.828Z SERIAL « 0x010c0004002006730601020000a7                                      (14 bytes)
2022-01-25T19:48:44.829Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:44.829Z DRIVER « [Node 032] [REQ] [ApplicationCommand]
                                  └─[PowerlevelCCTestNodeReport]
                                      test node id:        1
                                      status:              In Progress
                                      acknowledged frames: 0
2022-01-25T19:48:45.839Z SERIAL » 0x01090013200273052522b6                                            (11 bytes)
2022-01-25T19:48:45.839Z DRIVER » [Node 032] [REQ] [SendData]
                                  │ transmit options: 0x25
                                  │ callback id:      34
                                  └─[PowerlevelCCTestNodeGet]
2022-01-25T19:48:45.841Z SERIAL « [ACK]                                                                   (0x06)
2022-01-25T19:48:45.847Z SERIAL « 0x0104011301e8                                                       (6 bytes)
2022-01-25T19:48:45.848Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:45.848Z DRIVER « [RES] [SendData]
                                    was sent: true
2022-01-25T19:48:45.931Z SERIAL « 0x0107001322000008c1                                                 (9 bytes)
2022-01-25T19:48:45.932Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:45.933Z DRIVER « [REQ] [SendData]
                                    callback id:     34
                                    transmit status: OK
2022-01-25T19:48:46.008Z SERIAL « 0x010c0004002006730601020000a7                                      (14 bytes)
2022-01-25T19:48:46.009Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:46.010Z DRIVER « [Node 032] [REQ] [ApplicationCommand]
                                  └─[PowerlevelCCTestNodeReport]
                                      test node id:        1
                                      status:              In Progress
                                      acknowledged frames: 0
2022-01-25T19:48:47.026Z SERIAL » 0x01090013200273052523b7                                            (11 bytes)
2022-01-25T19:48:47.027Z DRIVER » [Node 032] [REQ] [SendData]
                                  │ transmit options: 0x25
                                  │ callback id:      35
                                  └─[PowerlevelCCTestNodeGet]
2022-01-25T19:48:47.029Z SERIAL « [ACK]                                                                   (0x06)
2022-01-25T19:48:47.035Z SERIAL « 0x0104011301e8                                                       (6 bytes)
2022-01-25T19:48:47.035Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:47.036Z DRIVER « [RES] [SendData]
                                    was sent: true
2022-01-25T19:48:47.119Z SERIAL « 0x0107001323000009c1                                                 (9 bytes)
2022-01-25T19:48:47.121Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:47.121Z DRIVER « [REQ] [SendData]
                                    callback id:     35
                                    transmit status: OK
2022-01-25T19:48:47.198Z SERIAL « 0x010c0004002006730601020000a7                                      (14 bytes)
2022-01-25T19:48:47.199Z SERIAL » [ACK]                                                                   (0x06)
2022-01-25T19:48:47.200Z DRIVER « [Node 032] [REQ] [ApplicationCommand]
                                  └─[PowerlevelCCTestNodeReport]
                                      test node id:        1
                                      status:              In Progress
                                      acknowledged frames: 0

(repeats)

@AlCalzone
Member

2022-01-25T19:48:47.200Z DRIVER « [Node 032] [REQ] [ApplicationCommand]
                                  └─[PowerlevelCCTestNodeReport]
                                      test node id:        1
                                      status:              In Progress
                                      acknowledged frames: 0

Ahh ok I see what's happening. The node is instructed to ping the controller but it never makes any progress. The driver expects it to either make some progress or report failure at some point.
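
The "make some progress or report failure" expectation can be turned into a guard against the endless polling shown in the log. The following is only a sketch under assumptions: `TestNodeReport`, `evaluateReports` and `maxStalledPolls` are hypothetical names that mirror the log fields, not the actual zwave-js types or the fix the driver ended up with.

```typescript
// Hypothetical shape of one polled report; the field names mirror the log
// output above, not the real zwave-js types.
interface TestNodeReport {
	status: "In Progress" | "Success" | "Failed";
	acknowledgedFrames: number;
}

type Outcome = "success" | "failed" | "stalled" | "pending";

// Walks the reports in the order they were received and gives up once
// `maxStalledPolls` consecutive "In Progress" reports show no newly
// acknowledged frames.
function evaluateReports(
	reports: TestNodeReport[],
	maxStalledPolls: number,
): Outcome {
	let lastAcked = 0; // highest acknowledged frame count seen so far
	let stalled = 0; // consecutive polls without progress
	for (const report of reports) {
		if (report.status === "Success") return "success";
		if (report.status === "Failed") return "failed";
		if (report.acknowledgedFrames > lastAcked) {
			lastAcked = report.acknowledgedFrames;
			stalled = 0;
		} else {
			stalled++;
			if (stalled >= maxStalledPolls) return "stalled";
		}
	}
	// The test is still running and has not exceeded the stall limit.
	return "pending";
}
```

With a limit of 3, three reports in a row with `acknowledged frames: 0` (as in the log above) would yield `"stalled"` instead of repeating forever.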

@AlCalzone
Member

AlCalzone commented Jan 25, 2022

Some thoughts while testing this:

UIs need more feedback from the driver to show what's happening. The driver does a lot of different things but UIs can't show any actual progress. This is my 🚧👷🏻‍♂️.


@robertsLando The current results table is suboptimal, compare this

[Node 009] Lifeline health check complete in 205590 ms
rating:                   5 (acceptable)
no. of routing neighbors: 15
 
Check rounds:
· round 1 - rating: 5 (acceptable)
  failed pings → node:             0/10
  max. latency:                    164.0 ms
  min. node powerlevel w/o errors: -9 dBm
 
· round 2 - rating: 5 (acceptable)
  failed pings → node:             0/10
  max. latency:                    235.0 ms
  min. node powerlevel w/o errors: -9 dBm
 
· round 3 - rating: 5 (acceptable)
  failed pings → node:             0/10
  max. latency:                    240.0 ms
  min. node powerlevel w/o errors: -9 dBm
 
· round 4 - rating: 5 (acceptable)
  failed pings → node:             0/10
  max. latency:                    239.0 ms
  min. node powerlevel w/o errors: -9 dBm
 
· round 5 - rating: 5 (acceptable)
  failed pings → node:             0/10
  max. latency:                    209.0 ms
  min. node powerlevel w/o errors: -9 dBm

to the information included in the table:
[image]

  1. The rating should be explained to the user, including some hints on what to aim for (and what a 5, for example, means). The docs have some hints that should be included in the UI IMO.
  2. powerlevel is an enum and the actual value should be displayed. 9 actually means -9 dBm and could be misunderstood. Also, it is not clear what this value means (the minimum powerlevel, relative to normal power, at which there was no problem pinging the controller).
  3. Latency is the maximum measured latency
  4. Failed pings still doesn't make the direction clear - is this to the node or from the node?
  5. The no. of routing neighbors gets factored into the rating but isn't shown at all here.
  6. Overall rating isn't shown either
  7. And I'd hide columns without any values
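
For point 2, the UI could map the raw enum value to a label before displaying it. A minimal sketch, assuming the raw value is 0 for normal power and 1 to 9 for -1 to -9 dBm; `formatPowerlevel` is a hypothetical helper, not part of the driver or of this PR.

```typescript
// Assumption: 0 means normal power and 1..9 mean -1..-9 dBm, matching the
// remark above that "9 actually means -9 dBm".
function formatPowerlevel(raw: number): string {
	if (!Number.isInteger(raw) || raw < 0 || raw > 9) {
		// Anything outside the assumed range is shown as-is instead of guessed.
		return `unknown (${raw})`;
	}
	return raw === 0 ? "Normal Power" : `-${raw} dBm`;
}
```

The table cell would then show the returned string rather than the bare number, so a 9 can no longer be misread as +9 dBm.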

[image]
What am I canceling after the test is done?

@AlCalzone
Member

AlCalzone commented Jan 25, 2022

The route health check looks a bit different, including the powerlevel, which is wrong too (this is also -9 dBm):
[image]

Again, compare to the driver logs, which seem to be more informative:

[Node 002] Route health check to node 4 complete in 28667 ms
rating:                   10 (perfect)
no. of routing neighbors: 15
 
Check rounds:
· round 1 - rating: 10 (perfect)
  Node 2 min. powerlevel w/o errors: -9 dBm
  Node 4 min. powerlevel w/o errors: -9 dBm
 
· round 2 - rating: 10 (perfect)
  Node 2 min. powerlevel w/o errors: -9 dBm
  Node 4 min. powerlevel w/o errors: -9 dBm
 
· round 3 - rating: 10 (perfect)
  Node 2 min. powerlevel w/o errors: -9 dBm
  Node 4 min. powerlevel w/o errors: -9 dBm

  1. Failed pings are not relevant here, only whether one of the nodes needs normal power to communicate
  2. source and target should be replaced with the actual node IDs.
  3. Again, the users should have some help interpreting the results.

@robertsLando
Member Author

@AlCalzone Ready for new review

@AlCalzone
Member

AlCalzone commented Jan 26, 2022

Updated, this is still open:

  • powerlevel is an enum and the actual value should be displayed. 9 actually means -9 dBm and could be misunderstood.
  • The no. of routing neighbors gets factored into the rating but isn't shown at all in the Lifeline health check.
  • Overall rating isn't shown either

What about this for the failed ping display?
Instead of Node 15 ← 10/10, you could do 14 → 15: 10/10 and potentially also 14 ← 15: 10/10 and colorize the 10/10 according to the rating. Similar for lifeline: 14 → 1: 0/10 (0/10 colored green).
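
The proposed label format could be produced by a small helper like the one below. This is a sketch only: `formatFailedPings` is a hypothetical name, and the colour is derived from the failed-ping count as a stand-in, since the actual mapping from rating to colour is not specified here.

```typescript
// Hypothetical helper; the colour thresholds are placeholders and not the
// driver's rating rules.
type Color = "green" | "orange" | "red";

interface PingLabel {
	text: string;
	color: Color;
}

function formatFailedPings(
	from: number,
	to: number,
	failed: number,
	total: number,
	direction: "→" | "←" = "→",
): PingLabel {
	// No failures: green, all failed: red, anything in between: orange.
	const color: Color =
		failed === 0 ? "green" : failed < total ? "orange" : "red";
	return { text: `${from} ${direction} ${to}: ${failed}/${total}`, color };
}
```

For the lifeline case, `formatFailedPings(14, 1, 0, 10)` gives the text `14 → 1: 0/10` with the colour `green`.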

@LordMike
Contributor

I also gave this a run with a new test image (I assume that image is this PR ... but in theory it could be any PR ... now that I checked, many PRs are merged into one test branch, so in theory a bug in test could be due to any number of active PRs.. Hmm..):

  • Testing my node 32<->1 (far away) still gives an infinite loop - didn't see if Al did anything there.
  • The health check modal could use the name of the node, it currently just says "Node NN - Health Check"
  • Are there any plans to save these results?
    • I imagine for a stable network, it could be great to see a full picture of the health between many pairs of nodes & controller
    • When I move devices, it could make sense to clear the check results, either all or this node only
    • When the map is made, the signal strength between nodes can be used to place them closer or further from each other

@robertsLando
Member Author

robertsLando commented Jan 28, 2022

Are there any plans to save these results

Actually nope, I have only added an export button to allow users to export them

BTW I have synced the test PR with this, so feel free to give it a try with the latest changes

@AlCalzone
Member

  • didn't see if Al did anything there

I didn't yet: see zwave-js/node-zwave-js#4075 for updates

  • When the map is made, the signal strength between nodes can be used to place them closer or further from each other

Not sure if this is practical. Testing the signal takes a long time, and with many nodes the total time grows quadratically if you were to measure each possible connection.
The WIP v9 version of the driver will expose the actual routes taken (if that information is accessible) between the controller and nodes as well as RSSI on each node (if that information is accessible), which seems like a better solution IMO.
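
To put a rough number on the cost of measuring every connection: the number of node pairs is n(n-1)/2, so the total time scales with that count. The sketch below is a back-of-the-envelope estimate; `pairCount` and `estimateTotalMs` are hypothetical helpers, and the per-check duration is an assumed average taken from the one route check logged earlier (28667 ms).

```typescript
// Number of unordered node pairs in a network of `nodeCount` nodes.
function pairCount(nodeCount: number): number {
	return (nodeCount * (nodeCount - 1)) / 2;
}

// Rough total duration if every pair were checked one after another.
// `msPerCheck` is an assumed average, not a measured constant.
function estimateTotalMs(nodeCount: number, msPerCheck: number): number {
	return pairCount(nodeCount) * msPerCheck;
}
```

For a hypothetical 15-node network that is 105 pairs, or about 50 minutes at 28667 ms per check, and lifeline checks like the 205590 ms one above would take considerably longer.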

@robertsLando robertsLando merged commit 4ca3403 into master Jan 31, 2022
@robertsLando robertsLando deleted the feat#2113 branch January 31, 2022 13:31
Development

Successfully merging this pull request may close these issues.

[feat] Implement health reporting between controller<->nodes or node<->node
4 participants