-
-
Notifications
You must be signed in to change notification settings - Fork 32.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ZHA becomes unavailable. Fails to trigger lights or other zigbee devices #99305
Comments
Hey there @dmulcahey, @Adminiuga, @puddly, mind taking a look at this issue as it has been labeled with an integration ( Code owner commandsCode owners of
(message by CodeOwnersMention) zha documentation |
Please include the diagnostics JSON for the ZHA integration. A delivery failure after three retries is just that: a delivery failure. It's not something that can really be fixed within ZHA because individual device connectivity is entirely controlled by your mesh and outside of ZHA's control. |
Hi ok, it's just weird that after an update i've begun experiencing very frequent delivery failures where before the zigbee mesh was solid and stable. Attaching Diagnostics JSON here. |
Same here. ZHA was working smoothly until this most recent version. Now I'm having issues losing control over lights on a daily basis. |
same here. for me its a door sensor and temp sensor. they are solid and stable for like 2-3 days, then they go unavailable. Sometimes I get lucky and can Re-Pair them and they will work again for a day or 2, sometimes they wont even Re-Pair and I have to delete the entire network, readd everything from scratch. |
I usually dont have to do anything, eventually stability comes back and lights work again. or I just reload the ZHA integration and everything works again for a random amount of time |
@GuyLewin @somerandomuser1 can you edit your comments to include the same diagnostic information for the ZHA integration (not individual devices, the integration itself!)? |
same issue with home assistant yellow. multiple lumi device become unavailable with error log: Traceback (most recent call last): The above exception was the direct cause of the following exception: Traceback (most recent call last): |
Aqara/Lumi devices require a "compatible" router device as a parent, otherwise those are dropped from the network and do not reconnect. There was a list of compatible aqara routers posted on forums. |
that is something I don't know about. when I was using zigbee2Mqtt I don't have this issue. this issue only present after I switch to HASS Yellow and ZHA |
Same Issue here had been working for years. What is the workaround here? |
Similar issue, lots of network failure messages nwk invalid request, mix of device manufacturers as they are all lights on this network. Moved my WIFI to avoid overlap, WIFI 1 and 6, ZHA is currently on Band 25. Which is showing only 20% utilisation.
Removed coordinator (Slae.sh CC2652R) for an hour then reconnected. Errors started re-appearing. Example below:
Anecdotal - First noticed issues when automations starting failing. |
some new stuff from my logs. it fails to start zigpy.application. and then says the <EmberStatus.NETWORK_BUSY: 161>. So it seems like since 2023.7 zigbee/ZHA is polling devices too often? and that causes a flooding of the zigbee mesh if there are a lot of devices? I have 106 zigbee devices, but it was all running super smooth before 2023.7, now it's at the point where I need to restart HA every day due to ZHA not being able to send commands to devices, due to (my guess) the zigbee mesh network being overloaded from polling or other checks? Here's an updated log excerpt from yesterday when I had issues where on my phone in the HA app it said Application Controller Not running instead of the usual EmberStatus.DELIVERY_FAILED 102 notificiation at the bottom of the screen when trying to turn on/off a light. Attaching a full log file here, along with some excerpts below. Starts off with this:
Then it keeps doing these clusterhandlers for a long time. Then it does this over and over:
Then in the end I get:
|
my ZHA just broke again. tried repairing, nothing, and i mean NOTHING, will pair, not my Endpoints, not my Router devices, nothing. I tried rolling back to the last stable build which usually fixes this for me, but its still busted. ETA on a fix for this nonsense plz? |
After a bit of testing, incase anyone else has this issue, the ONLY thing that seems to fix it for me is to move my Coordinator to a different USB port, literally any port, as long as its a different one from the one that it was previously in. After doing that I can now get pairing to work again. |
I also have the same issue. |
This is a newly acquired bug for me too. I have been fighting this problem for over a month now. I need to figure out how to revert and keep it from auto-updating. Ugh |
I have this problem since HA 2023.8.4 |
I might be ve completely off here but I spent quite some time with similar errors today (delivery errors, extremely unresponsive network) and after hours of debugging and checking interference and what else, it seems like.. I found a misbehaving device (Philips Hue light strip). As soon as I unplugged it for a few minutes everything instantly got snappy again and worked perfectly fine. Even though it seems like your issues seem to be related to HA versions, I thought I might just put this here, maybe someone is having the same issue and is spared a few very frustrating hours. |
I did find a light (ikea trådfri driver) that looked like it was fine in the interface, but in reality was not responding. (Closet we rarely open) so tried reconnecting it so its working again and will see if things improve |
I have downgraded from 2023.8 to 2023.7.3 as I note that in 2023.8 there is a fix #97539 which purportedly resolves an issue with service light.turn_on. Its this service failing in automations that alerted me to the original problem. I will report back any observations. |
How did you determine the defective device? |
A little better stability but still some devices failing |
Reverting back to 2023.7.2 fixes the issue for me finally. |
well, once again, my endpoints and a few routers became unavailable and I could not get them to pair. This time I decided to pull the trigger and downgrade to 7.1. Doing so enabled me to Pair my devices instantly with no issues, no changing USB ports, no wiping out the network and starting from scratch, it just pairs right away the first try. Will report back if I have the issue anymore, hopefully this fixes it for me. Incase anyone else is wanting to downgrade to an earlier version to solve this issue, here's how. In the WebGUI go to settings>addons> install the "Terminal and SSH" addon. Wait for it to finish installing, theres an option in the addon to enable viewing it in the sidebar, enable that, and Start the addon. Wait a few minutes, then on the left hand side you'll see "Terminal" appear. Click it, once in the Terminal, this is the command to type in: ha core update --version=2023.7.1 Hope this helps. |
The Thread integration with Zigbeed is no problem if its enabled or not then one of my 3 test system is running with over 30 devices most IKEA controllers also blinds and have Matter server on the same install and the thread and Zigbee is rock solid. |
I've added the lines with the ZHA logging to the config file and restarted HA. God is a clown so naturally it started to behave well again and I can't even provoke the problem. Well, sooner or later it will and then I will post the file for you. |
Now the ZHA has failed. Nothing Zigbee is working but all non Zigbee (wifi and Telldus) are still ok, so it's not HA that has crashed. |
ZIP them both and email them to me please, if they're too big to post here. |
Shouldve kept mybstupid mouth shut. After updating from 2023.9.2 to 2023.9.3 I get the same error 102 every day and all zigbee devices stop responding until restart of the system. |
I disabled thread, openthreadborderrouter, and matter and enabled them after boot and now it seems to work more reliably. |
Problem accelerating again. |
I located the issue to my problem happening again. My wife had moved the Zigbee SKyConnect stick on top of the HA NUC :) moved it away again and now it is stable once more. My own fault since I never informed her it cant lie on top of the NUC due to interference. So now all seems to be ok again since at least 4 days. |
Sirlabs have (yesterday evening CET) released one update with some bug fixes that can fitting in this case but we will see if the devs will cooking one updated firmware that fixing some of the network steering problems. |
I too have had the same issues, where Zigbee devices become unavailable/offline in Home Assistant. I deleted the ZHA integration and started again from scratch, using channel 25 which had a consistent energy_scan of 1-2%, re-pairing every device manually, but what I noticed was that as soon as I paired around 8-10 devices, the warning message "Zigbee channel 25 utilization is 89.67%!" would be between 85-95% and then devices would start to stuggle pairing and going offline. I fought this for a few days and tried 3 separate channels, using different USB extension cables (0.5m, 1m, 2m, 3m ,5m, 15m) with no difference in behaviour before I deleted the ZHA integration and set up the Zigbee2MQTT integration/addon which is working perfectly with the short 0.5m USB extension that comes with the SkyConnect (possibly even more responsive than ZHA was and returns more details about the Zigbee devices, with a better map showing the device connections), so I can definitely recommend Zigbee2MQTT for those considering another option (you can always take a backup an migrate back to ZHA once the issue is resolved) I wanted to share my information as there certainly appears to have been an issue introduced in the ZHA integration, which has gotten worse over time and all debugging and reconfiguration cannot resolve the issue. |
@hcross13 45 devices and only 3 routers |
Sorry but I seriously doubt that you can build a 48 device Zigbee network and only have 3 router, as nearly everything not running on battery is a router. |
@ekalle-swe I'm using powered Tuya Zigbee light switches, but they are 2 wire switches (no neutral wire), which do not function as routers, so I need the 3 repeaters to ensure a smooth and reliable network coverage within the house. As I mentioned, everything was working fine for over a year with the exact same hardware in ZHA, then I started getting devices going offline, so I included the details for reference and @MattWestb I have read through the articles to try and reduce interference since encountering the recent problems, but I suspect that there's something wrong in the ZHA integration, that may only manifest under some scenarios. After recently migrating to Zigbee2MQTT my Zigbee network and automations are stable, reliable and fast, which I hadn't had with ZHA for a few months. |
Your network is working in star mode the routers is only talking with the coordinator and not using the neighbors = no redundancy / rerouting possible. I have my production network overlapping 100% with my WiFi (with over 150 WiFi devices most ESPHome) and its working good but not perfect then i have one backbone of IKEA plugs that can going thru also then having very strong signals from other systems. So having over 10 children connected to one good router (the chip (MG21) is OK but the antenna is very bad) you cant getting it working well its only time you is getting more problems also with Z2M and DEConz then its not the host system that is bad its your network is wrong build and is better using only WiFi devices if you like building like that with all its bad things like more single point of failure. |
In ZHA you shall putting this in your config after have adding at least one router for getting your network working better:
Its making the coordinator not accepting children so its must using one other router and also disabling source routing then its only making problems in your configuration. But in the end one mesh network needs good routers for working well !!! |
@MattWestb thanks for the helpful information, I may look to change some hardware in the near future and give ZHA another go. I still don't understand why it was working in ZHA for over a year without issues and why it still works in Z2M now, but has started getting issues with ZHA in recent months with no changes to hardware, only software changes/updates to ZHA/HA? |
Did you start your Zigbee voyage with wall switches and have the rest on wifi/433/...? |
You is not knowing what radio interference have changed around you network and then its not one valid statement you have going over one limits and getting more and more problem. Some fundamental readings: https://www.silabs.com/documents/public/user-guides/ug103-02-fundamentals-zigbee.pdf |
I don't doubt that your network works now. I want to figure out why, as I have no direction to look without more information about what is being done differently, if anything. Can you describe:
|
Been running flawlessly since 23.10.5 |
Here we go again! |
I have been having issues again too. This time, though, it was a bunch of NETWORK_BUSY errors (hope I remember the wording correctly). Not a single device was reachable, nor did groups seem to work. The error could be resolved by restarting HA. |
Here we go again. @puddly, maybe a clue. Edit: After again removing those two temp sensors and adding them under a router, the Zigbee network took a big sigh of relief and has worked flawlessly since. |
Reload in 23.12.2 did not solve the problem. |
This issue hasn't had any activity from the original author in a few months. Please open a new issue if you have problems. The issue you're currently describing is not related to this one and is being tracked in other issues. |
Most interesting! |
It could be this one: #97662 |
The problem
Randomly ZHA stops being able to send requests to zigbee devices and reports back EmberStatus.delivery FAILED on for example light.turn on or off service. This can start happening after a few minutes of a restart, or after a week of running smoothly. I was finally able to find info in the log when this starts happening. I believe this started happening around 2023.7.x.
Only Zigbee lights or devices are affected, Wifi lights or other wifi devices work fine.
What version of Home Assistant Core has the issue?
2023.8.4
What was the last working version of Home Assistant Core?
No response
What type of installation are you running?
Home Assistant OS
Integration causing the issue
ZHA
Link to integration documentation on our website
https://www.home-assistant.io/integrations/zha/
Diagnostics information
No response
Example YAML snippet
No response
Anything in the logs that might be useful for us?
Additional information
No response
The text was updated successfully, but these errors were encountered: