Zombie fix (Refresh command 0x12) #491
base: master
This state occurs on some newer devices when they lose power and come back online with no internet access.

Will fix devices failing to come back online due to this issue, and also handle devices that came back online but were not controllable until their state was switched.

Every manufacturer I have handles this situation differently; there are likely still outstanding issues on manufacturers I do not have devices from, or even on firmware versions of devices from the manufacturers I have tested.

Tested on: Gosung, Sunco, Treatlife, Helloify, Supernight

Add a REFRESH command (0x12) to pytuya.
Handle the refresh command in pytuya similarly to the handshake poll.
Call the refresh command if we successfully connect but fail to update the initial status.

Handle new edge cases, including:
* Incorrect switches to type_0d
* Heartbeats timing out before refresh is done
* Refresh command triggering two responses, one of which is an empty 0x08 status response

Yet to do:
* Configurable dpIds to refresh per device
* Decode status responses before deciding who to dispatch them to
* General cleanup for a better flow
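For readers skimming the thread, the fallback described above (query status, and only send a refresh if that initial status update fails) can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `FakeDevice`, `DecodeError`, and `initial_status` are hypothetical names, and the device is modeled as a simple in-memory object rather than a real socket connection.

```python
class DecodeError(Exception):
    """Hypothetical error for a status reply that cannot be decoded."""


class FakeDevice:
    """Stand-in for a device that stays a 'zombie' until it is refreshed."""

    def __init__(self):
        self.initialized = False
        self.sent = []  # commands sent so far, kept for inspection

    def status(self):
        self.sent.append("status")
        if not self.initialized:
            raise DecodeError("undecodable status reply")
        return {"dps": {"1": True}}

    def refresh(self):
        # Models the REFRESH (0x12) command from the PR description.
        self.sent.append("refresh")
        self.initialized = True


def initial_status(device):
    """Query status; if that fails, send one refresh and query again."""
    try:
        return device.status()
    except DecodeError:
        device.refresh()
        return device.status()


dev = FakeDevice()
result = initial_status(dev)
print(dev.sent)
```

Note that the refresh is sent only on failure, which matches the later remark in the thread that some devices drop the connection if refreshed while already initialized.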
So, yesterday I set up a single-gang WiFi Smart Switch in HA + localtuya after getting the required keys. This one has the WA2 board, so no custom firmware is possible. It worked perfectly in its own VLAN with no DNS or WAN access, completely isolated. HA reports the firmware version as 3.3. After power cycling, it went zombie, but it would still connect to its access point and respond to pings. After applying this PR and rebooting HA, it picked up the switch's status and started working right away, just like yesterday. I didn't need to resync the device with the Smart Life app or give it any connectivity. Great work!
This PR fixed my issues as well. LGTM!
Worked for me as well on some newer smart bulbs I have. Good job! I think this also fixes #87
Hi, I'm struggling with the same issue, but I am completely new to HA. How do I implement your fix? Thanks for all the effort!
If you want to download it before it is merged, you can check it out with git using "Open with" in the upper right, or you can download it as a zip file at: https://github.com/Elendilon/localtuya/archive/2ee7e5c78831588f1a4da7428baad135c605bb62.zip Then install the code as you would any custom component: upload the custom_components/localtuya files from the zip to your HA's config/custom_components/localtuya directory. You can do so over SSH/SCP (install the HA SSH addon), or via a file manager addon. HA's forum is the right place to search for more information on how to install, run and debug custom components.
Unfortunately it doesn't. When trying it all my Tuya plugs became and remained unavailable.
Applied on my HASS instance. It solved the "unavailable" problem while the internet access was blocked. Need one more reviewer to approve the changes.
I can confirm that it's working for my Novostella bulb as well.
Works fine.
Can confirm that it fixes my issue as well
This fixed my down lights too.
Seems to be working fine from here after running for a while.
@Arakon If you want to turn on debug logging and then post a log, I can see if I can tell what your devices are doing. None of mine behave that way, but as noted above every device manufacturer seems to treat refresh slightly differently.
Okay, I'm not sure if it was just some really weird glitch or what, but I just reinstalled your version and the issue no longer happens. The only difference is that I have also added a Tuya power socket since I last tried. It's possible that I messed up copying all the files and folders over last time, or that it simply glitched out. If it comes back, I'll try to get a debug log.
Doesn't seem to work on my mid-2021 Mirabella globe (DPs in the 20--26 range)... I get a Debug log of trying to add the device
Confusingly, I don't think there are any of the debug log messages added by this PR in the log above, but I can confirm that the code in my
as suggested as a fix in #574 |
Ugh, this is yet another unique way this interaction can happen.
The flow that the code is using is: (Original) Connect
In this debug log it looks like your path through this part is: Connect
So you are not making it into my code at all, because your device responds with a valid packet instead of a failure - it's just that the valid packet doesn't have any useful information. Eventually, during the normal/original code flow you try to get a list of usable DPS to continue config flow, but that list is empty and config flow bombs out.
There is one missing line from your debug log that I would expect with my code - "Started heartbeat loop" - so it is possible you are not actually using my code? But given the rest of the log you wouldn't have made it into any other new code anyway.
I can add something that detects empty dps as an error pretty easily, but I'm not sure what other repercussions that might have. It's been very hard to dance around all the various ways different devices respond to different scenarios.
Your device, given the command you have run that works, will initialize if sent any "set" command. But not all devices will, and it is most likely a bug in the manufacturer's implementation that it does so. So far, they all initialize if sent a "refresh" command - but some devices will kill the connection if sent a refresh command while they are already initialized, so we can't just send it every time we try to connect (or periodically, like another PR is trying to do to solve the energy options not updating).
Some devices respond with a single packet of type refresh when sent a refresh command; others respond with two packets - and this tuya library was always set up for "one message sent, one response". So I had to implement a hack to handle that.
Some devices will respond to the "heartbeat" command even when not initialized, but others just close the connection or respond with garbage if you do anything except send "refresh" - so I had to move starting the heartbeat to after we successfully get a good response back.
Ideally there would be some way to ask the device if it is initialized - and that "garbage" response we get at first may be saying exactly that, but since we can't decode it we don't know. Ideally there is some concrete flow we can use that will work with all devices from all manufacturers - but I'm not even sure that is the case, given googling finds plenty of people with issues trying to use the official app when the internet is down and it is in local mode. This probably isn't something every manufacturer tests for.
Anyway, I will make an update later tonight that detects empty dps responses as an error - but I can't test it, so I'll send you a branch to test if you are willing.
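To make the "one refresh, sometimes two replies" case above concrete, here is a rough sketch of how the empty 0x08 status reply could be filtered out. It is an assumption-laden illustration rather than the hack actually used in the PR: `pick_refresh_reply` is a hypothetical helper, and frames are simplified to `(command, payload)` tuples instead of real decoded Tuya messages.

```python
STATUS = 0x08   # status response type, as named in the PR description
REFRESH = 0x12  # refresh command type, as named in the PR title


def pick_refresh_reply(messages):
    """Return the first reply that carries data, skipping empty status packets.

    `messages` is a list of (command, payload_bytes) tuples, a simplified
    stand-in for decoded frames.
    """
    for command, payload in messages:
        if command == STATUS and not payload:
            # The extra, empty 0x08 reply some devices send with the real one.
            continue
        return command, payload
    return None


# A device that answers with a single refresh-type packet.
one_packet = [(REFRESH, b'{"dps":{"1":true}}')]
# A device that answers with an empty status packet plus the real reply.
two_packets = [(STATUS, b""), (REFRESH, b'{"dps":{"1":true}}')]

print(pick_refresh_reply(one_packet))
print(pick_refresh_reply(two_packets))
```

Both device styles end up yielding the same useful reply, which keeps the rest of a "one message sent, one response" code path unchanged.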
Sorry for the delay, I've been busy with Thanksgiving stuff. I haven't even had a chance to upload and run this, so it may even have a syntax error, heh. But I added a line that should throw an exception if we get a valid status packet that contains an empty dps object. Just one file modified.
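The kind of check described here, treating a well-formed status reply with an empty dps object as an error, might look something like the sketch below. The one-line change itself is not shown in the thread, so `EmptyDpsError` and `parse_status` are hypothetical names, and the payload is assumed to be already-decrypted JSON bytes.

```python
import json


class EmptyDpsError(Exception):
    """Hypothetical error for a valid status reply with no usable dps."""


def parse_status(payload: bytes) -> dict:
    """Decode a JSON status payload and reject a missing or empty dps object."""
    data = json.loads(payload)
    dps = data.get("dps")
    if not dps:
        # A valid packet that carries no information is treated as a failure,
        # so the caller can fall back to its refresh path.
        raise EmptyDpsError("status reply contained no dps")
    return dps


print(parse_status(b'{"dps": {"20": true}}'))
```

Raising instead of returning an empty dict means an "empty but valid" device looks the same to the caller as one that failed outright.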
@Elendilon thanks for that. I'll give that a go as soon as I can but.... The globe has stopped misbehaving. I did notice before that, despite all DNS traffic being blocked, it somehow magically managed to know an IP for the Tuya cloud, and try to connect to it, which resulted in a zombie state with no DPs. It now seems to have stopped doing that. I suspect it might have had some sort of DNS cache that would survive a power-cycle (even with a day's wait), and that cache would have eventually expired, leaving it to behave normally, and be happy to expose its DPs. It's all conjecture at this point, but I'll keep an eye on it, and try your fix when things go wrong again.
This fixed my Tuya Lights which I also power on/off manually. Please accept this PR.
There is another PR implementing this same 0x12 command (for a different purpose). That PR is almost ready to be pushed. Once it is, I will pull that code here (they conflict slightly) and also rework how a lack of response to the 0x12 command is handled. We found a few more ways devices handle receiving the command, and one of them is to revive but not respond at all, so I need to keep trying the connection attempt if the response times out.
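As a rough idea of what "keep trying if the response times out" could look like, here is a small asyncio sketch. It is only an illustration under stated assumptions: `refresh_with_retry` and `SilentThenAwake` are hypothetical, the retry re-sends the refresh rather than rebuilding a real connection, and the attempt count and timeout are arbitrary values.

```python
import asyncio


async def refresh_with_retry(send_refresh, attempts=3, timeout=0.05):
    """Await `send_refresh()` with a timeout, retrying up to `attempts` times."""
    for attempt in range(1, attempts + 1):
        try:
            return await asyncio.wait_for(send_refresh(), timeout)
        except asyncio.TimeoutError:
            if attempt == attempts:
                raise  # give up after the last attempt


class SilentThenAwake:
    """Hypothetical device: revives on the first refresh but never answers it."""

    def __init__(self):
        self.calls = 0

    async def send_refresh(self):
        self.calls += 1
        if self.calls == 1:
            await asyncio.sleep(1)  # no reply arrives within the timeout
        return {"dps": {"1": True}}


device = SilentThenAwake()
reply = asyncio.run(refresh_with_retry(device.send_refresh))
print(device.calls, reply)
```

Bounding the number of attempts avoids looping forever against a device that is genuinely offline.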
Hi @Elendilon - happy to help test once you've merged this change in with the most recent changes. I've been working through "awakening" zombie devices, but your solution seems to be a lot more elegant than I've been able to figure out yet.
@Elendilon thank you, you saved my day. I'm new to localtuya and got it working yesterday with some LED strips and ceiling lights. I was happy that everything worked; everything was blocked in my isolated VLAN with pfSense, and I also saw that all devices were displayed as "offline" in the Tuya IoT dev console... until my wife switched off the power. I searched for hours and enabled and disabled my pfSense firewall rules, with no luck. After a power cycle nothing worked anymore. EDIT: With your branch, Lampux-RGBceilinglight is working after a power cycle. Tested Devices Working with your fix, not the Lumary Stripes: Tested Devices that stay unavailable with your fix:
Is this likely to be integrated soon? If not - I have localtuya installed via HACS. How would I install this fix, and would it interfere with HACS's ability to update/manage localtuya?
I'm really waiting for this to be merged as well, |
Merge original request rospogrigio#491 from upstream branch
Confirming that with this PR, my Globe bulbs have started working (the unavailability problem after power restart is gone). Thanks!
Workaround for rospogrigio/localtuya#445, at least until rospogrigio/localtuya#491 is merged.
I copied the files over my existing localtuya files, and all my lights were unavailable after a reboot (before rebooting, they were working). Rebooted twice for good measure. I have two types of lights; what information would be useful to help troubleshoot this? Rolled back using the HACS redownload option, and they were working again.
Sooooo, how's progress on this? It would be nice to finally see this merged.
@Pirateguybrush this branch is getting pretty old; it will only work with Home Assistant 2021.12 and earlier. The latest master of rospogrigio:master needs to be merged into this branch for it to work with newer versions. Any chance of merging in master, @Elendilon?
The last LocalTuya update looked like it addressed a similar issue. After removing my LocalTuya devices and setting them up anew, all issues are gone for me. If you're waiting for this PR to be merged, maybe check to see if that works for you too.
Thanks for the tip, @codingcatgirl. Grabbing the latest did not fix my lights initially; however, I have ported the additional relevant changes in a new PR here: #817
Discussed in issue #445; see also the commit comment. This may need further work (due to edge cases in the way each manufacturer handles this command) or restructuring.