Z2M is very unforgiving if the Zigbee adapter is connected over a network: it completely crashes the HA Z2M add-on and does not wait for a reconnect #24924

Open
bachmarc opened this issue Nov 26, 2024 · 10 comments
Labels
problem Something isn't working

Comments

@bachmarc

What happened?

If I connect a network-attached Zigbee adapter (Network2Zigbee) to Z2M, it works fine as long as the network is rock solid.

If the adapter is disconnected for a short time, or the network is unstable for a few seconds, the whole Home Assistant add-on container crashes/stops.

This is with a provoked IP conflict:
[06:58:10] INFO: Preparing to start...
[06:58:10] INFO: Socat not enabled
[06:58:10] INFO: Starting Zigbee2MQTT...
Starting Zigbee2MQTT without watchdog.
[2024-11-26 07:00:59] error: zh:zstack:znp: Socket error Error: read ECONNRESET
[2024-11-26 07:00:59] error: z2m: Adapter disconnected, stopping

Z2M is stopping ....

What did you expect to happen?

I would expect Z2M to catch the event and retry the connection to the Network2Zigbee adapter.
Chances are good that the adapter will be available again soon and that this is only a hiccup, for example because a VPN reconnects.
Even an IP conflict shuts Z2M down.

A network is not as reliable as a plugged-in USB stick, and the error handling should take that into account.

How to reproduce it (minimal and precise)

Connect an SLZB-06 net2zigbee adapter to Z2M and briefly disrupt the connection, or provoke an IP conflict.

Zigbee2MQTT version

1.41.0-1

Adapter firmware version

20221226

Adapter

SLZB-06

Setup

Home assistant Add-on

Debug log

[06:58:10] INFO: Preparing to start...
[06:58:10] INFO: Socat not enabled
[06:58:10] INFO: Starting Zigbee2MQTT...
Starting Zigbee2MQTT without watchdog.
[2024-11-26 07:00:59] error: zh:zstack:znp: Socket error Error: read ECONNRESET
[2024-11-26 07:00:59] error: z2m: Adapter disconnected, stopping

@bachmarc added the "problem" (Something isn't working) label Nov 26, 2024
@schtack

schtack commented Nov 26, 2024

I think I have the same kind of problem.
I'm using an SLZB-06 adapter over a cellular 4G link (and over a WireGuard VPN). Z2M stops working when the adapter is disconnected for a few seconds.

@Arn0uDz

Arn0uDz commented Nov 27, 2024

Also have this problem and would love a wait for reconnect option. Also SLZB-06.

@Koenkk
Owner

Koenkk commented Nov 27, 2024

You can use the watchdog option to automatically restart z2m

@schtack

schtack commented Nov 27, 2024

Hi @Koenkk, thanks for your reply.
From my understanding, using the watchdog is not a reliable solution: my adapter disconnects at least once every 2 to 6 hours.

Or can you explain to us how to configure the watchdog reliably, please?

Kind regards.

@Koenkk
Owner

Koenkk commented Nov 27, 2024

@schtack you can set the watchdog from the addon configuration page, for example 0.1,3,6,15. For more info see https://www.zigbee2mqtt.io/guide/installation/15_watchdog.html . But for a better experience, I would really recommend stabilising your connection.
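
To make the list format concrete: as far as I read the linked watchdog page, each custom value is a delay in minutes, and the values are used one after another for consecutive restarts. A minimal Python sketch of that conversion follows. `parse_watchdog_delays` is a hypothetical helper written for illustration, not Zigbee2MQTT code, and the minutes-to-milliseconds interpretation is an assumption based on the docs.

```python
def parse_watchdog_delays(spec: str) -> list[int]:
    """Convert a comma-separated list of minutes into millisecond delays.

    Assumption: values are minutes, one per consecutive restart attempt.
    """
    delays_ms = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate a stray trailing comma
        minutes = float(part)
        if minutes < 0:
            raise ValueError(f"negative delay: {part!r}")
        delays_ms.append(round(minutes * 60_000))
    return delays_ms


if __name__ == "__main__":
    delays = parse_watchdog_delays("0.1,3,6,15")
    for attempt, delay in enumerate(delays, start=1):
        print(f"restart {attempt}: wait {delay / 1000:.0f} s")
```

Under that assumption, 0.1,3,6,15 means waiting 6 s, 3 min, 6 min and 15 min before the first four restarts.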

@bachmarc
Author

In my case that was easy. But how would one stabilize a WireGuard connection to a beach house?
If my USB stick crashes during an OTA attempt, Z2M reports it and reconnects.

I would consider a network the classic use case for error tolerance. I am not talking about ages, just a short timeout, say 60 s, during which Z2M waits for a reconnection.
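
The kind of grace period meant here can be sketched in a few lines of Python. This is only an illustration of "keep retrying the TCP connection for up to 60 s before giving up". It is not how Zigbee2MQTT or zigbee-herdsman handle the socket, and a real reconnect would also have to re-initialise the adapter. `connect_with_grace`, its parameters and the address in the usage comment are all hypothetical.

```python
import socket
import time


def _tcp_connect(address):
    # Per-attempt timeout so one attempt cannot block for the whole window.
    return socket.create_connection(address, timeout=5.0)


def connect_with_grace(address, grace_s=60.0, retry_every_s=2.0,
                       connect=_tcp_connect, clock=time.monotonic,
                       sleep=time.sleep):
    """Retry `connect(address)` until it succeeds or `grace_s` has elapsed."""
    deadline = clock() + grace_s
    while True:
        try:
            return connect(address)
        except OSError as err:  # includes ECONNRESET, ECONNREFUSED, timeouts
            if clock() + retry_every_s > deadline:
                raise ConnectionError(
                    f"adapter unreachable for {grace_s} s"
                ) from err
            sleep(retry_every_s)


# Usage (hypothetical adapter address):
# sock = connect_with_grace(("192.0.2.10", 6638))
```

`connect`, `clock` and `sleep` are injectable only so the retry logic can be exercised without a real adapter or real waiting.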

@peschee

peschee commented Jan 8, 2025

@bachmarc what did you end up using? I’m running z2m in an LXC container on Proxmox. As soon as my SLZB disappears for a short time, z2m crashes.

@Koenkk is the watchdog setting also the way to go when using LXC? This just starts the node process afaik.

@Koenkk
Owner

Koenkk commented Jan 9, 2025

@peschee yes, the watchdog is the way to go; it just restarts the process after some backoff time.
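
"Restart after some backoff time" can be pictured as a small supervisor loop: run the process, and if it stops with an error, wait the next delay from the list before starting it again. The Python sketch below shows that idea only. `supervise` and `run` are hypothetical names, and giving up once the delay list is used up is an assumption made for the sketch, not something stated in this thread.

```python
import time
from typing import Callable, Sequence


def supervise(run: Callable[[], int], delays_s: Sequence[float],
              sleep: Callable[[float], None] = time.sleep) -> int:
    """Run `run` until it exits cleanly or every delay has been used.

    `run` returns an exit code; 0 means a clean, intentional stop.
    """
    attempt = 0
    while True:
        code = run()
        if code == 0:
            return 0
        if attempt >= len(delays_s):
            # Assumption: stop restarting once the list is exhausted.
            return code
        sleep(delays_s[attempt])  # back off before the next start
        attempt += 1
```

With delays of 2 s and 60 s, a process that fails twice and then stops cleanly is started three times, with a 2 s and then a 60 s pause in between.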

@peschee

peschee commented Jan 12, 2025

@Koenkk this does not seem to work. I've set up the systemd service to use the watchdog setting. As soon as the coordinator goes away, the process exits and systemd restarts it:

Jan 12 09:40:34 zigbee2mqtt pnpm[4901]: Starting Zigbee2MQTT with watchdog (2000,60000,300000,900000,1800000,3600000).
Jan 12 09:40:35 zigbee2mqtt pnpm[4901]: [2025-01-12 09:40:35] info:         z2m: Logging to console, file (filename: log.log)
Jan 12 09:40:35 zigbee2mqtt pnpm[4901]: [2025-01-12 09:40:35] info:         z2m: Starting Zigbee2MQTT version 2.0.0 (commit #unknown)
Jan 12 09:40:35 zigbee2mqtt pnpm[4901]: [2025-01-12 09:40:35] info:         z2m: Starting zigbee-herdsman (3.2.1)
Jan 12 09:40:35 zigbee2mqtt pnpm[4901]: [2025-01-12 09:40:35] info:         zh:zstack:znp: Opening TCP socket with 10.46.0.63:6638
Jan 12 09:40:35 zigbee2mqtt pnpm[4901]: [2025-01-12 09:40:35] error:         zh:zstack:znp: Socket error Error: connect ECONNREFUSED 10.46.0.63:6638
Jan 12 09:40:35 zigbee2mqtt pnpm[4901]: [2025-01-12 09:40:35] error:         z2m: Error while starting zigbee-herdsman
Jan 12 09:40:35 zigbee2mqtt pnpm[4901]: [2025-01-12 09:40:35] error:         z2m: Failed to start zigbee-herdsman
Jan 12 09:40:35 zigbee2mqtt pnpm[4901]: [2025-01-12 09:40:35] error:         z2m: Check https://www.zigbee2mqtt.io/guide/installation/20_zigbee2mqtt-fails-to-start_crashes-runtime.html for possible solutions
Jan 12 09:40:35 zigbee2mqtt pnpm[4901]: [2025-01-12 09:40:35] error:         z2m: Exiting...
Jan 12 09:40:35 zigbee2mqtt pnpm[4901]: [2025-01-12 09:40:35] error:         z2m: Error: Error while opening socket
Jan 12 09:40:35 zigbee2mqtt pnpm[4901]:     at Socket.<anonymous> (/opt/zigbee2mqtt/node_modules/.pnpm/[email protected]/node_modules/zigbee-herdsman/src/adapter/z-stack/znp/znp.ts:158:24)
Jan 12 09:40:35 zigbee2mqtt pnpm[4901]:     at Socket.emit (node:events:536:35)
Jan 12 09:40:35 zigbee2mqtt pnpm[4901]:     at emitErrorNT (node:internal/streams/destroy:170:8)
Jan 12 09:40:35 zigbee2mqtt pnpm[4901]:     at emitErrorCloseNT (node:internal/streams/destroy:129:3)
Jan 12 09:40:35 zigbee2mqtt pnpm[4901]:     at processTicksAndRejections (node:internal/process/task_queues:90:21)
Jan 12 09:40:35 zigbee2mqtt pnpm[4901]: /opt/zigbee2mqtt/node_modules/.pnpm/[email protected]/node_modules/readable-stream/lib/_stream_writable.js:264
Jan 12 09:40:35 zigbee2mqtt pnpm[4901]:   var er = new ERR_STREAM_WRITE_AFTER_END();
Jan 12 09:40:35 zigbee2mqtt pnpm[4901]:            ^
Jan 12 09:40:35 zigbee2mqtt pnpm[4901]: Error: write after end
Jan 12 09:40:35 zigbee2mqtt pnpm[4901]:     at writeAfterEnd (/opt/zigbee2mqtt/node_modules/.pnpm/[email protected]/node_modules/readable-stream/lib/_stream_writable.js:264:12)
Jan 12 09:40:35 zigbee2mqtt pnpm[4901]:     at DerivedLogger.Writable.write (/opt/zigbee2mqtt/node_modules/.pnpm/[email protected]/node_modules/readable-stream/lib/_stream_writable.js:300:21)
Jan 12 09:40:35 zigbee2mqtt pnpm[4901]:     at DerivedLogger.log (/opt/zigbee2mqtt/node_modules/.pnpm/[email protected]/node_modules/winston/lib/winston/logger.js:231:12)
Jan 12 09:40:35 zigbee2mqtt pnpm[4901]:     at Logger.log (/opt/zigbee2mqtt/lib/util/logger.ts:198:25)
Jan 12 09:40:35 zigbee2mqtt pnpm[4901]:     at Logger.info (/opt/zigbee2mqtt/lib/util/logger.ts:211:14)
Jan 12 09:40:35 zigbee2mqtt pnpm[4901]:     at Znp.onPortClose (/opt/zigbee2mqtt/node_modules/.pnpm/[email protected]/node_modules/zigbee-herdsman/src/adapter/z-stack/znp/znp.ts:88:16)
Jan 12 09:40:35 zigbee2mqtt pnpm[4901]:     at Object.onceWrapper (node:events:639:26)
Jan 12 09:40:35 zigbee2mqtt pnpm[4901]:     at Socket.emit (node:events:524:28)
Jan 12 09:40:35 zigbee2mqtt pnpm[4901]:     at TCP.<anonymous> (node:net:351:12)
Jan 12 09:40:35 zigbee2mqtt pnpm[4889]:  ELIFECYCLE  Command failed with exit code 1.

Shouldn't the process be kept running so the watchdog can coordinate the restarts? Or do I need to set up the systemd service differently?

@Koenkk
Owner

Koenkk commented Jan 12, 2025

You shouldn't, but the watchdog fails due to the "write after end" error, which has been fixed in #25737.

The fix is available in the dev branch.
