I believe there is an issue with Z-stack 20240710 FW #518
Comments
Please post the debug log from starting z2m until this error. See this on how to enable debug logging. |
I need to do some work as they will total more than 100Mb in size. Hopefully in a few hrs from now will submit. |
Unfortunately I cannot post debug info yet as I downgraded FW last night and only had the log at info level. Today I will upgrade again to 20240716 and will update once I have the first failure. |
I have updated my Sonoff USB Dongle to the 20240710 FW (with Zigbee2MQTT and Home Assistant) and need to reboot every day because I lose all devices. Zigbee2MQTT version: I have changed the config to debug logging and will send the log before the next reboot. |
i'm having issues as well, CC2652RB with 20240710 FW, with ZigBee2MQTT and HomeAssistant. my devices were working smooth for almost 2 years until i updated. i flashed 20221226 back and will continue monitoring but so far so good. EDIT: woke up with the zigbee network down again, at this point i regret updating so bad |
I have done some deeper digging. For some reason the SLZB-06P7 coordinator decided not to join the network following a power-off, so I replaced it with a UZG-01 flashed with 20240710. Trying different coordinator FW versions I have observed that routing is wildly different. Versions earlier than 20240710 do not cluster well around routers. 20240710 seems to elect the key routers properly, and all routers seem to connect to the coordinator as expected. Images with the map and the errors are provided below. All of the routers, even if they are really close (<5m, no obstacles) to the coordinator, seem to have low LQI (<99); if I re-pair them they seem to go to triple-digit LQI, but eventually they settle on low double-digit LQI, so I concluded that re-pairing is not useful. Although the battery-powered devices are not connected, they all seem to work flawlessly. Possibly some improvements are required to get the map right. It seems to take a couple of hours for connection routing to settle. It takes close to 20 mins for the above map to be generated. One last thought: it would be a great addition to have an option to clear all displayed error messages. Sometimes it takes more than 2 mins to manually clear error messages, which obscure action buttons. |
After almost 2 days of operation Z2M lost connection to the UZG, but automatically recovered. I timed map generation at just 2.5 mins. This zip file has all the debug logs. Logs dated 2024-09-05 were created using SLZB-06P7. Logs dated 2024-09-06 were created using UZG-01/20240710. The logs also include the instance where the connection to the UZG was reset by something. |
I see a (0xc7: NWK_TABLE_FULL) error in the debug logs, if this is of any assistance. |
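For anyone else digging through debug logs of this size, a minimal Python sketch of how the NWK_TABLE_FULL occurrences could be counted per hour without loading the whole file into memory. The timestamp pattern is an assumption about the log layout (a `YYYY-MM-DD HH:MM:SS` stamp somewhere on each line), and the function names are made up for illustration; adjust both to match your actual log files.

```python
import re
from collections import Counter
from typing import Iterable

# Assumption: each log line carries a timestamp such as "2024-09-06 10:15:42".
TIMESTAMP = re.compile(r"(\d{4}-\d{2}-\d{2}) (\d{2}):\d{2}:\d{2}")


def count_marker_per_hour(lines: Iterable[str], marker: str = "NWK_TABLE_FULL") -> Counter:
    """Count lines containing `marker`, bucketed by 'YYYY-MM-DD HH'."""
    buckets: Counter = Counter()
    for line in lines:
        if marker not in line:
            continue
        match = TIMESTAMP.search(line)
        # Lines without a recognisable timestamp are still counted, under "unknown".
        key = f"{match.group(1)} {match.group(2)}" if match else "unknown"
        buckets[key] += 1
    return buckets


def scan_file(path: str, marker: str = "NWK_TABLE_FULL") -> Counter:
    """Stream a log file line by line so a 100 MB log never sits in memory at once."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return count_marker_per_hour(handle, marker)
```

Calling `scan_file("log.log")` (hypothetical file name) returns a `Counter`, so `.most_common(5)` would show the hours with the most hits, which might help correlate the error with the times the network went down.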
20240315 allowed more devices to connect to the coordinator, 20240710 less and thus relies more on routers to improve stability of the coordinator. If those routers are crap, you will get very poor performance. I would suggest to first power off some spammy devices, e.g. |
I have already done this twice, powered off the entire section of 00901, waited a few mins and power on the entire section again. I will try once more and provide feedback. |
Do we have (or can we have) any tools which allow us to influence affinity to certain routers? For SLZB, which router FW version works better with coordinator FW 20240710? I was thinking of a tool which allows us to group routers and lets the system automatically balance between them, so that certain routers can be avoided during pairing. |
I am wondering if it is possible to save the map in a nice format. |
I have had 20240710 working for some weeks now. Thanks for the good job! |
Oops, I forgot to better explain what I did. |
I confirm there is an issue with FW20240710. The coordinator resets sometimes after 20 mins, sometimes after hours, but it resets nevertheless. I will keep FW20240710 to assist in troubleshooting and because, since v1.40.0, Z2M automatically reconnects. My experience is that I completely lose Ethernet connectivity and I know it is not a network issue as there are hundreds of devices on the network (Ethernet and WiFi) with no issues. Please let me know how I can assist debug this issue. Log file attached. The crash debug is at the top of the log. |
This info might be useful to some. I have lots of Tuya QS relay devices which in theory can act as routers. When the UZG-01 was directly connected to some of them I had constant communication errors, although everything seemed to work with no issues, but with delays, sometimes considerable. Once I paired a SLZB-06P7 router to the UZG-01 coordinator most of the communication issues between QS routers and coordinator vanished. I get the occasional error now, but the errors are not show stoppers. The SLZB-06P7 router (00902-Router) is configured as follows: Since I introduced the SLZB-06P7 router everything seems to have smoothed out. However, because I have issues with the UZG-01 (the coordinator) restarting at random intervals, I will upgrade to the latest SLZB-06P7 router FW and give an update if the UZG-01 issues have been eliminated or some other gremlins were introduced. |
Zigbee2mqtt has the latest update: 1.40.1-1. Again last night a total adapter crash... (the adapter web GUI is working, and a Zigbee reset does not help). After a PoE reset, and after I left it for 1 hour to stabilize, the network was a mess. Ikea bulbs were reporting no network route. I had to reset all router devices with the apartment's electrical breaker. Yesterday before the crash I noticed that everything was very laggy. And it has been like this every 1-2 weeks since the day this firmware version was released. I have already rejoined all devices. Are there any ongoing actions on this, as reports like this have been posted from the first day it was released as beta? |
So I updated the SLZB-06P7 router FW to 20240716 from 20240315. The device goes offline. To get it back online follow this:
Will update once I am confident of any positive/negative/neutral changes to the Zigbee network. |
@Koenkk my logs are posted here: Koenkk/zigbee2mqtt#23869 (comment) |
Running Zigbee2MQTT Edge. Where to start? I updated a week or so ago. The update went well. I noticed Aqara devices dropped off a day later. Re-paired. No issue. I then started seeing a few devices drop off. I then started to get complete network crashes. Fast forward to now and my network is struggling to stay up.
Update
I tried to find any device with these numbers and I get nothing Update |
That is the weirdest thing here.. rolling back is not fixing the problems.. it's like the updated coordinator pushed something to the router devices and the network is not stable with older firmwares any more... |
having the same thoughts as @cloudbr34k84 and @dankocrnkovic. |
i seem to have stabilised for now, but I flashed the last 3 firmwares on the device and restarted the device each time, then I flashed the latest and it seems to be stable again. i have 2 devices offline but that's because they are bulbs in lamps which my wife switches off manually |
Give it time. I thought that also, but then in 5-10 days router devices start to die again, bringing the network down. I will never again update firmware once (and if) it's stable again with some fix. |
as a workaround and fed up with the sloppy cronjob, i made this HomeAssistant automation to restart the zigbee2mqtt docker container when my philips hue light goes offline (
please note that you need to create an entry for
make sure to replace now when the light goes offline (the simpler way of detecting when the entire zigbee network is down) it will restart the z2m container bringing everything up and running in under 15 seconds i really can't wait for a proper fix tho! |
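For readers who are not running Home Assistant, the same idea (restart the container once a known-reliable device has been offline for a while) could be sketched as a small standalone Python watchdog. This is not the automation from the comment above, only an illustration under assumptions: `is_online` is a hypothetical callable you would have to supply yourself (for example from the availability state of a mains-powered light), the container name is whatever yours is called, and the grace period and poll interval are arbitrary. The only external command used is `docker restart <container>`.

```python
import subprocess
import time
from typing import Callable, List, Optional


def should_restart(offline_since: Optional[float], now: float, grace_seconds: float) -> bool:
    """True once the canary device has been offline for at least the grace period."""
    return offline_since is not None and (now - offline_since) >= grace_seconds


def restart_command(container: str) -> List[str]:
    """Build the docker CLI call; the container name is an assumption of this sketch."""
    return ["docker", "restart", container]


def watch(
    is_online: Callable[[], bool],   # hypothetical: returns the canary device's availability
    container: str,
    grace_seconds: float = 60.0,
    poll_seconds: float = 10.0,
    run=subprocess.run,              # injectable so the loop can be tested without docker
    sleep=time.sleep,
    clock=time.monotonic,
    max_polls: Optional[int] = None,  # None means loop forever
) -> int:
    """Poll the canary device and restart the container after a sustained outage."""
    offline_since: Optional[float] = None
    restarts = 0
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        if is_online():
            offline_since = None
        else:
            if offline_since is None:
                offline_since = clock()
            if should_restart(offline_since, clock(), grace_seconds):
                run(restart_command(container), check=False)
                restarts += 1
                offline_since = None  # start a fresh grace period after restarting
        sleep(poll_seconds)
    return restarts
```

The grace period is there so that a single missed check-in does not trigger a restart; as with the automation above, this only papers over the problem and does not fix the firmware.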
Just in case a few are facing the same issues as mine, maybe this could help someone.
This has been driving me crazy during 2 weeks (!!!!) My advice:
Hope it helps. |
good thing i didn't update, i was having problems updating it with the add-on repository |
being super frustrated with the lack of support or work being done to fix this, i bought a new i think the new firmware ruins the coordinator somehow but it's just my guess. i tried flashing the @nicolasvila what you described was the most basic troubleshooting one can do, the same instructions are found everywhere on the web, including in the docs. your issue is not related to the one here and does not help but dilute the actual problem @emaayan i strongly recommend you to not update! EDIT: one month in, still no issues with the new zigstar i still see people here updating and then having issues. STOP UPDATING THE ZIGBEE FW!!! you gain nothing but cause issues |
Thanks, now i have to figure out how to switch the sonoff from zha to z2m cause i think the zigbee led usb light bars aren't compatible with zha |
Following testing over the last couple of weeks I now have feedback, which I hope might be of some value to some. The physical area where the ZB network has been deployed is as follows:
Initial deployment, experiencing lots of device communication errors and coordinator disconnections every few mins:
Re-deployment, no disconnections of any sort with coordinator or devices:
I have therefore concluded that the coordinator (UZG-01 and/or SLZB06x) when flashed with FW 20240710 runs stable when it communicates with dedicated routers (using better router-devices might have solved the issue as well, but have not tested this scenario). To stress test the system, all end-devices have been configured to report instantly on every change and they have been running faultless for a few days. No disconnections or errors of any sort. One last observation is that Tuya devices are definitely not the right way to go, especially on larger ZB networks, for a number of reasons, the primary being stability, configurability and reliability. I am slowly replacing Tuya with Sonoff devices and so far all is good. If I find any subsequent issues related to FW20240710 over the long run I will post on this thread again. The only issue which remains unsolved, but not related to Z2M, is the coordinator (UZG-01) disconnects whenever I try to access its webUI; tried latest ESP32 FW 20240915 but issue persists. |
After 3 weeks of trying to run 20240710 on a Sonoff Dongle-P, re-flashing the firmware and re-pairing all devices, I always had the same problems after a couple of hours: losing devices, lag in commands, and not being able to pair new devices. I needed to unplug the dongle twice a day. I also tried to downgrade to a 2023 FW, but finally I downgraded to the 20221226 FW and everything is working fine now! No problems for 7 days! Zigbee2MQTT version Frontend version |
After updating to 20240710, all my sensors nearby and directly connected to the coordinator started to fail. I installed the old firmware from may 2023 and have not experienced any problem the last week. So rolling back to the previous version worked fine for me. |
I'm now using launchpad_coordinator_20221226 and will report back if this solves the issue. |
@rursache - By reference to your "success" comment (#518 (comment)) I fear I'm at the same point you reached (but likely less technically adept myself). I suspect there has been an element of user error on my side in how I've migrated between different Sonoff Dongle-P coordinators, leap-frogging one FW date onto a "spare" and swapping out the "original", or in flashing firmware (on Windows via the python method - I cannot get into bootloader mode with the button pressed on plugging in USB etc), that has resulted in some element of corruption lingering. Of late my network has devices going offline within as little as circa 2 hrs, and as each eureka moment of a perceived fix does little more than increase randomness, I'm in need of a fresh start. Can I ask if your network has remained stable since your replacement coordinator and what FW that is running? @habitats-tech - Noting you've had success with this coordinator, could I ask what FW you flashed it to (noting that the SM Light Webflasher naming convention differs, with v2.5.6 seemingly the most recent)? I've become somewhat uncertain where to turn and what nuggets of information to take from assorted posts, as the feedback on coordinator FW flashing results is hugely varied. As such I'm likely to start a topic to gauge what is considered best practice to ensure a clean end result. Did either of you try to retain the IEEE address of a previous coordinator, or rebuild from scratch? I'm concerned I might need to remove all devices, reset them and re-build progressively to avoid falling back into previous traps. With you both having had success (applying different methods) I thought I'd ask for additional clarity on how you went about restoring your network after plugging in the "new coordinator", and thus whether I should be removing devices, powering them down, deleting the coordinator_backup.json file or taking any other such step in any specific sequence.
Everyone appears to have either had minimal issues and fathomed a permanent fix, or continues to struggle (and suffer). I started with issues that seemed related to one device losing connection too regularly and ended up with a largely unusable network that I'm not likely to fix by attempting similar methods. The attached log is the pair of logs merged from a restart this morning circa 6am that had the network down again by 8am. Koenkk/zigbee2mqtt#24387 |
it did, perfectly stable. fw and details are in my initial comment
i did not but it didn't seem to matter to any accessory, everything works fine with a new IEEE address. it was just a simple swap for me, nothing else |
Many thanks - great to hear somewhat painless for you and gives me confidence to attempt the same (perhaps playing safe and seeking out the appropriate 20230507 FW as a starting point for my SLZB-06 trial). |
I tried the 20230507 FW and now Z2M just does not start. [09:04:28] INFO: Preparing to start... |
After doing the below, it's still having issues readding devices.
|
I flashed a sonoff dongle P (CC2652P) with 20240710. It worked, and everything came up, but switching lights was randomly very laggy - particularly if I switched any one light (instant) and then another a few seconds later (took 10-15 secs), and viewing the map in Z2M took minutes compared to normally just a few seconds - I only have 16 devices. Re-flashed back to 20230507 and all is well again. |
I also have issues with a slzb06p7 with the latest firmware: when I start pairing, I get the error: and when I restart zigbee2mqtt, I also often get the error: I have 125 devices and after a week, 50 devices (mostly router devices) show status offline |
So, I ended up nuking everything again, moved from the SLZB-06 to the SLZB-06M I had, and it's more or less been fine. For anyone playing along at home who needs a workaround for now, this is what's been working for me and is decently stable, other than one time I did something odd and had to go around and just push all the buttons on battery-operated stuff 1-2 times (just a click to wake it up?), and it's been fine. SLZB-06M port: tcp://192.168.100.26:6638 |
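Since several reports in this thread involve network-attached coordinators dropping off, here is a minimal Python sketch for checking whether the TCP endpoint in a `port: tcp://host:port` setting like the one above is accepting connections at all. The function names are made up for illustration and the timeout is an arbitrary choice. Note that a successful TCP connect only shows the network side is up; as reported earlier in the thread, the adapter's web GUI can keep working while the Zigbee side is down, so this cannot prove the Zigbee chip is healthy.

```python
import socket
from typing import Tuple
from urllib.parse import urlsplit


def parse_adapter_port(port: str) -> Tuple[str, int]:
    """Split a 'tcp://host:port' adapter setting into (host, port)."""
    parts = urlsplit(port)
    if parts.scheme != "tcp" or parts.hostname is None or parts.port is None:
        raise ValueError(f"expected tcp://host:port, got {port!r}")
    return parts.hostname, parts.port


def is_reachable(port: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the adapter endpoint can be opened."""
    host, tcp_port = parse_adapter_port(port)
    try:
        with socket.create_connection((host, tcp_port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False
```

For example, `is_reachable("tcp://192.168.100.26:6638")` would report whether that address answers within 3 seconds; a USB serial path such as `/dev/ttyUSB0` is rejected with a `ValueError` because it is not a TCP endpoint.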
@Helgimagg01 there is no fix once you updated and have issues. see my comment |
I managed to revert the update. Just one of my shade covers didn't fully recover and I ended up changing the device itself. Guess I was lucky |
So what firmware is the most stable now for usage? 20230507? |
I have updated my SMLIGHT Zigbee LAN Adapter CC2652P Model SLZB-05 with ZigStarGW-MT.exe from In Z2M the devices that didn't work anymore showed up normally as online. I have now reverted back to Network: 87 devices |
Has no one else come across this behaviour of getting this error after updating an SLZB-06? All I did was apply the latest firmware 20240710 through the SLZB-06 web UI, and it immediately resulted in HA Zigbee2MQTT seeing multiple errors coming in. Going back to firmware 20221226 didn't fix the issue. I can't get it to work anymore - not even with a plain vanilla instance. [07:30:41] INFO: Preparing to start... |
20221226 is what I use. Works well and the releases that have come afterwards seem to be riddled with issues looking at the comments. |
Can confirm 20221226 works well. I've been running 20221226 for a couple of weeks and everything runs well. |
I changed now to slzb-06p7, i can't find a firmware from 20221226 for that new zigbee chip in that device :( |
Hey folks, in general 20240710 has been the most solid firmware for me in my network of ~110 devices. While I've had some issues in the past few months, every time I've traced it down to a bad router or device. Today, I had one of my light switches disconnect (https://www.zigbee2mqtt.io/devices/43076.html#enbrighten-43076). The switch would respond to commands from the coordinator. However, no responses from the switch would make it back to zigbee2mqtt. I sniffed the traffic and saw it was routing through another Enbrighten switch, which was then sending it to the coordinator. However, there were no debug logs from herdsman indicating it got the packet. First, I simply restarted zigbee2mqtt and it made no change. Then, I shut down zigbee2mqtt, pulled the Sonoff adapter out, plugged it back in, and restarted. After zigbee2mqtt started up, and I sent an "on" command to the switch, after a second or two the response showed back up in zigbee2mqtt. However, the routing path remained exactly the same. The only thing that changed was that the firmware on the stick was restarted. So far, this seems like a once-every-few-months issue, and I bet it's dependent on how often I restart my server. Anyone else see anything similar? |
Flashed SLZB-06P7 with 20240710 FW a couple of weeks ago. Since the new FW was applied the device randomly stops receiving updates from devices, while Zigbee2MQTT reports no issues except communication errors.
After restarting the ESP32 or Zigbee, the SLZB-06P7 starts processing packets again. (The last time, restarting Zigbee was not enough and I had to restart the ESP32 for the device to start responding; it could be that I did not give the devices enough time to start reporting - I gave it a couple of minutes, which was my prior experience with the failure.)
Time between failures: one took 2 days, and another 6 days. In the last 2 weeks the device failed with the same issue 3 times.
Initially I thought it was an isolated incident, but now I am more confident it is a FW issue.
Are these types of issues related to TI chipsets only?