-
-
Notifications
You must be signed in to change notification settings - Fork 32.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Error in lutron.py -- Hass.io loses ability to control LutronRA2 entities #20348
Comments
Still occurring in 0.86.1
|
This sounds like thecynic/pylutron#17 |
I've created thecynic/pylutron#23 to patch the pylutron library and hopefully resolve this bug. Once that patch is accepted, we still need to wait for a new version to get published and update home-assistant to use the new version. |
Still seeing this in 0.87.1
(Apologies if my attempts to get this noticed are in the wrong place. I'm surprised this isn't getting much uptake. Is no one else is seeing this problem?) |
I'll not seeing this error. Can you describe the behavior in more detail? Does it work at all? Does it stop working? |
After rebooting things seem to be OK for a short period of time -- maybe as much as 30 minutes. Then on/off commands from Hass.io start failing. There's a second or two of hesitation when clicking a light toggle, no result, and then the toggle returns to its prior position. Checking the logs shows the above error. |
Do you reprogram or make other changes to your RadioRA setup at all? I've only experienced the behavior you describe when updating the programming in the main repeater, or otherwise suffer some odd network problem. I know it's annoying, and I'd love to get this fixed, but right now it's at the mercy of the pylutron maintainer. We could consider forking the library, or asking the maintainer if they would consider adding additional people to accept/reject pull requests. |
@cdheiser I really do appreciate the feedback and willingness to help. Yes, I make changes to my setup fairly frequently, but before now it seemed that cold-restarting everything usually resolved any issues. This problem does seem different. Yesterday I did a completely new setup of HA 0.87.1 on a newly formatted SD card. It ran well for a few hours, but by bedtime, toggling lights was beginning to flake out again. Then by this morning the toggles were completely unresponsive. Slightly different symptom: the toggle would stay in the ON position, but the Lutron entity was unaffected. The logs show this familiar info:
I'll just revert to 0.84.6 and wait it out. That version is pretty much rock-solid for me. Again, my thanks. |
I had this same problem. Works for me in 0.86.4, failed when I updated to 0.88.1, I rolled back. Error Log Entry:
|
There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. |
This is still an unresolved issue in 95.4. |
@ToddNJ your issue looks different than this one, appears your using caseta? @grantalewis is using RadioRa2, and a different library altogether. I'd suggest you open another issue for yours. @grantalewis can you describe your setup a bit? I'm afraid the troubleshooting that @cdheiser is doing may possibly provide a workaround for your issue, but I don't think it addresses the root cause: why is your HASS instance losing connection to your Ra2 main receiver? I believe we've chatted about this before on Reddit, but I'll add that my RadioRa2 install is working well with the latest home assistant release (still), so I think we need to diagnose the why. |
The core of my system is OmniPro II / Lutron RadioRA2. Nearly all of my light and fan controls are Lutron. I got my start with Home Assistant in 2017 (v0.7x-ish). I started on Hasbian, moved to Hass.io, and then have gone back/forth a couple of times since. I enjoyed a very smooth HA experience until January, 2019 with v0.85x at which time I had to revert to 0.84.6. I've been stuck on that version since January. (Despite its increasing age and ever-growing list of missing features, 0.84.6 still performs really well for me.) I needed an environment that would let me experiment more and so moved Hass.io to VirtualBox earlier this year (Ubuntu 64). That didn't provide me with a solution to my Lutron problem, but it at least gave me a way to test new versions while still having an easy road back to my stable 0.84.6. I also keep an RPi3 updated with the latest HA release in case a solution suddenly becomes available. (I hope to eventually be able to use Hass.io on RPi3 as my day-to-day system.) I've tried literally every release since 0.84.6, and they all exhibit exactly the same issue. After rebooting, the system seems to be fine for the first hour or two and then I'll notice that the UI begins losing the ability to control entities. This video demonstrates the problem ( <-- Dropbox link to MP4 video). When the issue begins happening, I can control an entity 1 more time using its toggle. The entity will reflect the change, but the toggle will revert to its prior state as if somehow indicating that the change was unsuccessful. At that point, the entity will not respond to any additional changes from the UI. The only way to regain control is to reboot. Sorry if this is long-winded but I didn't want to leave anything out. I'm happy to help in any way I'm able, so please feel free to call on me for testing, more information, etc. |
Thats a good explanation, I appreciate it. The video helps me understand the behavior, which is different than I was understanding from your previous stuff in writing. Going to continue our convo on discord to keep this issue clean, if we come up with anything, we can post it here. |
I’m glad I found this because I’m experiencing exactly the same issue with HA and Ra2. At first everything seems to work just fine, but after an hour or so I experience the exact behavior seen in the video. My automations and service calls seems to still work just fine, but controlling through the front end becomes unusable. Happy to do what I can to provide more info to help troubleshoot. I’m on 93.1 using Hass.io. |
@Tangston311 Interesting. So in your case the actual communication with the lutron controller itself is not affected? Just that frontend gets out of sync? If it's the latter than I'll need some help from someone who's more higher-level stack in HASS oriented to help figuring that part out. @grantalewis Do you happen to have logs for when this happens? I'd also be curious to see if manual service calls work fine as well. Would be good to identify where in the stack the issue is creeping in. |
@grantalewis Oh I see the stacks above. Ok, yes, my pylutron PR should help with that. The connection management previously was very optimistic with respect to how well things remain connected, and thus was not really robust. I wonder if your network is somewhat flaky. Maybe you have things on wifi and they drop, etc? Doesn't really matter, just curious as to why you see it so much more often. |
@thecynic, @JonGilmore suggested debug logging via Discord DM, so I'll do that. Going with
for now but let me know if you want other options enabled, etc. It seems that @Tangston311 and I are seeing the exact same symptoms. UI misbehaves but entities still respond to homeassistant.toggle, homeassistant.turn_on, homeassistant.turn_off, etc. via the Services panel. Network: I suppose it could be network-related, but that doesn't account for rock-solid performance from 0.84.6. It's like night and day: fantastic performance with 0.84.6 and before; completely predictable failure with every version since. |
@thecynic , @grantalewis is correct, I think we're both experiencing the same behavior. I have a few automatons that work by simply turning things on or off (regardless of current state) with "turn_on" or "turn_off" service calls, but controlling through the UI becomes inoperable. Additionally, when this happens, HA no longer knows the current state of a light, so if I attempt to create an automation based on the current state of an entity it won't work because it thinks the light is currently off when it's actually on, or vice versa. Network issues could definitely be a factor for me: I noticed that sometimes the Ra2 Inclusive software couldn't find the main repeater, so I tried relocated the main repeater to a different location. Now the Ra2 software can always find the main repeater, and I noticed this lengthened the period of time that the HA UI would function: it went from bombing out after ~30 minutes to lasting most of the day, but inevitably it still exhibits the same behavior. I can't explain why because the main repeater was hard-wired in both locations, but it does point to some network instability that may be causing the issue. |
I don't mean to be stubborn, but unless 0.84.6 is just more tolerant of borderline network issues, I just don't think it's network related. I have no problems with 0.84.6 whatsoever -- not one, not ever. |
OK, let 0.97.2 run for a while today. As soon as I saw the problem occur, I grabbed the logs from 3 or so minutes prior. I deleted a lot of log entries that I'm fairly certain are not involved in the issue. The .txt file is attached. Thanks for taking a look. |
Hm, this log doesn't include the "normal" stack trace. I'm curious, did it show up again this time? I'll go through it just in case, but I'm not sure this capture is going to be much help. |
In my configuration.yaml:
As soon as I saw the problem occur I grabbed the logs from the prior 3 minutes. Maybe I need to back up further? |
I found the bug. Proposing an edit now.
…On Wed, Sep 4, 2019 at 3:04 PM Chris Heiser ***@***.***> wrote:
HAH!!!
2019-09-04 17:14:15 DEBUG (Thread-3) [custom_components.lutron.cdhlutron]
Uncaught exception: %d format: a number is required, not str
Of course... my logging of the exception is terrible so I don't know yet
exactly where it hit...
On Wed, Sep 4, 2019 at 2:57 PM grantalewis ***@***.***>
wrote:
> Never mind. Problem just occurred.
>
> 5:41 PM tried turning on
>
> light.dining_room_dining_pendant
> light.family_room_family_recessed
> light.kitchen_kitchen_recessed
>
> Lights would turn ON, but respective UI sliders toggled back to OFF
>
> Was able to use Services to turn the entities off.
>
> Light switches continued to work as did the Lutron app.
>
> home-assistant.log.zip
> <https://github.com/home-assistant/home-assistant/files/3576646/home-assistant.log.zip>
>
> —
> You are receiving this because you were mentioned.
> Reply to this email directly, view it on GitHub
> <#20348>,
> or mute the thread
> <https://github.com/notifications/unsubscribe-auth/ACQARWU4BHN2FQ637BTQW23QIAVN3ANCNFSM4GR2ZVYA>
> .
>
|
Fix a logging format error in the keypad component to use %s instead of %d for the keypad name. Also add a try/except around the main loop to catch and log exceptions with the lutron thread for future debugging.
thecynic/pylutron#38
Man I don't know how to use github...
…On Wed, Sep 4, 2019 at 3:07 PM Chris Heiser ***@***.***> wrote:
I found the bug. Proposing an edit now.
On Wed, Sep 4, 2019 at 3:04 PM Chris Heiser ***@***.***>
wrote:
> HAH!!!
>
> 2019-09-04 17:14:15 DEBUG (Thread-3) [custom_components.lutron.cdhlutron]
> Uncaught exception: %d format: a number is required, not str
>
> Of course... my logging of the exception is terrible so I don't know yet
> exactly where it hit...
>
> On Wed, Sep 4, 2019 at 2:57 PM grantalewis ***@***.***>
> wrote:
>
>> Never mind. Problem just occurred.
>>
>> 5:41 PM tried turning on
>>
>> light.dining_room_dining_pendant
>> light.family_room_family_recessed
>> light.kitchen_kitchen_recessed
>>
>> Lights would turn ON, but respective UI sliders toggled back to OFF
>>
>> Was able to use Services to turn the entities off.
>>
>> Light switches continued to work as did the Lutron app.
>>
>> home-assistant.log.zip
>> <https://github.com/home-assistant/home-assistant/files/3576646/home-assistant.log.zip>
>>
>> —
>> You are receiving this because you were mentioned.
>> Reply to this email directly, view it on GitHub
>> <#20348>,
>> or mute the thread
>> <https://github.com/notifications/unsubscribe-auth/ACQARWU4BHN2FQ637BTQW23QIAVN3ANCNFSM4GR2ZVYA>
>> .
>>
>
|
Ok, I have updated the custom component in dropbox. It should fix the
problem, and if it doesn't, well, there's better exception logging.
…On Mon, Sep 2, 2019 at 2:36 PM grantalewis ***@***.***> wrote:
Well, the good news is that it's no longer taking hours for the problem to
occur. This time it happened only about 8 minutes after rebooting (maybe
due to the heavy-duty logging tasks?).
The two entities I experimented with this time:
light.main_stairs_main_hall
light.office_office_fan_light
Same results as before: turn the toggle on, the light goes on and stays
on; the toggle returns to off; further changes to the entity can't be made
using the toggle; but changes are successful via Services
These seem to be the relevant log entries:
2019-09-02 16:44:41 DEBUG (MainThread) [homeassistant.core] Bus:Handling <Event state_changed[L]: entity_id=light.main_stairs_main_hall, old_state=None, new_state=<state light.main_stairs_main_hall=off; lutron_integration_id=18, friendly_name=Downstairs Hallway, supported_features=1 @ 2019-09-02T16:44:41.008955-04:00>>
2019-09-02 16:44:41 INFO (SyncWorker_8) [custom_components.lutron.cdhlutron] Sending: ?OUTPUT,45,1
2019-09-02 16:44:41 DEBUG (Thread-2) [custom_components.lutron.cdhlutron] handle_update 45 -- ['1', '0.00']
2019-09-02 16:44:40 DEBUG (MainThread) [homeassistant.core] Bus:Handling <Event state_changed[L]: entity_id=light.office_office_fan_light, old_state=None, new_state=<state light.office_office_fan_light=off; lutron_integration_id=9, friendly_name=Office Fan Light, supported_features=1 @ 2019-09-02T16:44:40.410877-04:00>>
2019-09-02 16:44:40 INFO (SyncWorker_3) [custom_components.lutron.cdhlutron] Sending: ?OUTPUT,25,1
2019-09-02 16:44:40 DEBUG (Thread-2) [custom_components.lutron.cdhlutron] handle_update 25 -- ['1', '0.00']
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#20348>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ACQARWWK64I4XDBHOCVC5RLQHWBN5ANCNFSM4GR2ZVYA>
.
|
Note once pylutron is updated, it will still require another push to Pypi and an update of the component dependencies. |
.... wait - is this the cause of all of @grantalewis issues? Or just a logging bug that is hiding his issues? |
AFAICT, it's definitely what's breaking it now. So there's an updated
custom component, and the logging is still in the custom component, and we
can see if it breaks again or if this actually fixes the issue.
It would make sense, as the keypads and leds were a major addition to
pylytron 0.2, which is where the exception appears to be coming from. What
I don't understand is why my own system doesn't throw these errors, as I
have pico remotes and seetouch keypads. Something about the fan switches
perhaps?
…On Wed, Sep 4, 2019 at 4:45 PM Jon Gilmore ***@***.***> wrote:
I found the bug. Proposing an edit now.
.... wait - is this the cause of all of @grantalewis
<https://github.com/grantalewis> issues? Or just a logging bug that is
hiding his issues?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#20348>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ACQARWXGKWKN3NFDEVSXIB3QIBCAVANCNFSM4GR2ZVYA>
.
|
Interesting - I don't think it has anything to do with fans as I have 5 of them, and don't see these issues either. FWIW, I have picos, hybrid seetouch keypads, 10 key seetouch keypad, and dimmers/switches and have never seen this. Regardless, awesome find! |
Pretty excited about this! Fingers crossed. I've got the new custom component installed on 0.98.3 and am going to let it run overnight. I also re-enabled my customary (non-mimimal) configuration.yaml, the OmniProBridge, and all automations, so I admit to throwing caution to the wind a bit. If things are still working in the AM, that'll be a very good sign. I still have all of the loggers enabled in case something goes wrong. |
And if they're not, we'll have more logs
…On Wed, Sep 4, 2019, 6:43 PM grantalewis ***@***.***> wrote:
Pretty excited about this! Fingers crossed. I've got the new custom
component installed on 0.98.3 and am going to let it run overnight. I also
re-enabled my customary (non-mimimal) configuration.yaml, the
OmniProBridge, and all automations, so I admit to throwing caution to the
wind a bit. If things are still working in the AM, that'll be a very good
sign.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#20348>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ACQARWRXMX7W2BPOPGKWPZTQIBP5PANCNFSM4GR2ZVYA>
.
|
Everything's still working this morning! I will of course keep tabs on it throughout the day, but the outlook seems positive. |
Fantastic, nice find @cdheiser. I'll post feedback on the patch tonight. I
hate python for these kind of dynamic landmines :(
|
Checking in to say that it's been smooth sailing all day long. Completely stoked to have this problem finally taken care of! Thanks to @cdheiser and @thecynic for your determination and patience. I'd appreciate a recommendation on how long to leave the issue open. And also I'd appreciate your giving me a heads up on when to remove the custom component. (I know it's not yet, but if you happen to know the version # that will include the fixed pylutron, please let me know.) Once again, my thanks. If you guys were in my neighborhood I'd buy you a beer. Or three. |
This is awesome - just want to echo @grantalewis comments to say it’s amazing to see how dedicated all of you have been to solving this. This is why HA and this community are so incredible. |
Once we go over the patch and get it submitted, the process is as follows:
1) Update the version and push to pypi
2) Submit a pull request to HA to update the version of pylutron. That
pull request will be linked to this issue.
3) HA will eventually release a new version, and include this issue as
fixed in the release notes. That will be the time to remove the custom
component.
…On Thu, Sep 5, 2019 at 2:11 PM Tangston311 ***@***.***> wrote:
This is awesome - just want to echo @grantalewis
<https://github.com/grantalewis> comments to say it’s amazing to see how
dedicated all of you have been to solving this. This is why HA and this
community are so incredible.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#20348>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ACQARWWPTVPSFV5NU5GTL3LQIFYYNANCNFSM4GR2ZVYA>
.
|
WOOHHOOO! This is great news! I assume this is the longest you've gone without issues since 0.86.4? If so, do you think you can confidently say it's solved? |
Yes, this is definitely the longest I've gone without problems, and I am nearly certain the problem is solved. I couldn't say this to most people and expect them to understand, but I know you guys can relate: this is huge. It's like having a weight off my shoulders. I am a happy man! |
Fix a logging format error in the keypad component to use %s instead of %d for the keypad name.
Fix a logging format error in the keypad component to use %s instead of %d for the keypad name.
I've pushed version 0.2.5 to pypi, will need to update HASS for new version and hopefully we can close this. |
Thanks @thecynic ! Does this mean 99.2 contains the fix, or will it be the next version? |
@Tangston311 Will be in the next version. |
Still happening in Home Assistant 0.101.3 |
I'm running 0.103.6 and have not encountered the problem I first described since 9/5/2019. @amattas I've seen something similar if my RA2 system loses power/connectivity. I suggest shutting down HA, restarting your Lutron system, waiting for ~5 mins, and then rebooting HA. |
The initial issue was due to a typo:
That's been fixed. It appears you're after something else, so you might want to create a new thread. |
I'm neither a Github guru nor a competent lutron.py programmer so I'm going to leave this in others' hands. I'll just say that from my perspective the initial problem mentioned in this thread is resolved and is unrelated to your #29642. I'm guessing you're likely to get more traction in that other thread. |
Home Assistant release with the issue:
0.85.1
Last working Home Assistant release (if known):
0.84.6
Operating environment (Hass.io/Docker/Windows/etc.):
RPi3, Hass.io
Component/platform:
https://www.home-assistant.io/components/lutron/
Description of problem:
Problem-relevant
configuration.yaml
entries and (fill out even if it seems unimportant):Traceback (if applicable):
Additional information:
The text was updated successfully, but these errors were encountered: