-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mosquitto 5.1.1 is broken. #1887
Comments
I know OP has left zero information here and it will probably be closed out, but on a surface level I have to agree. Something has gone wrong with this point update. I get my whole Home Assistant instance go down overnight. When I try to access it over the network I cant connect. After a week of troubleshooting I have narrowed it down to my Mqtt server. What seems to be happening is it loses connection to HA and throws the whole thing off. HA is still running and inaccessible. I have to power cycle to get it up and running again and when I do, I get errors in HA logs about "timed out waiting for mid 2" for Mqtt, and errors in mosquitto about being unable to find HA. I also find my Zigbee2mqtt devices are found at startup then immediately lost again (presumably when the time out occurs) and have to restart Zigbee2mqtt a couple of times to get it running again. I wish i could prove that the HA hang is due to mosquitto but since it's inaccessible (but still running) I can't. What I can say is that in Recorder I can see the only devices that are unavailable are Zigbee2mqtt and this drops because of mosquitto (there's zero errors in the Zigbee2mqtt logs in debug mode). I've restored an old backup of the previous version of Mosquitto to try and combat the issue. I'll try to remember to report back. Edit: I think the same issue is #1817 and #1814 Also this one home-assistant/core#45036 |
Ok so after rolling back to 5.1 version of Mosquitto I no longer get the timeout errors on startup in the core log and Zigbee2mqtt seems to boot from the get go again (instead of requiring multiple reboots to find Mosquitto) Hopefully this will also fix the issue of my whole HA server becoming unresponsive periodically too. This has been a week long hunt to try and find out why my system is dying which I was initially blaming on a new cable for my SSD. I've also tried looking at my Powerline system thinking it was cutting connection to HA. |
Yup, I updated last night and the add-on CPU usage spiked up to 150%. All my zwave2mqtt devices stopped working. Rolled back to 5.1 this morning and it's back to normal |
Confirming same. Very strange, sluggish system recently with hundreds of lines similar to
in home-assistant.log. After rolling Mosquitto back to 5.1, no more errors in the log, performance back to normal. |
I also needed to rollback to 5.1. I had the feeling Hassio couldn't log in any longer, but didn't put time in investigating. 😔 |
I have the same problem and the home assistant has become slow to read the topics |
Yes, that was what I've noticed first: most of my stuff is MQTT-based, and it suddenly took a few seconds to switch on a light through MQTT, controlled by a door- or motion-sensor, also connected through MQTT. I tried restart, then hundreds of "Timed out waiting for mid xxx" messages in the log. Found this topic, rolled back MQTT to 5.1 and no messages any more, and light is switched on well under a second again. |
Yep, same here - none of the things using MQTT in my home works anymore - Openzwave, Ebus, Zigbee2MQTT, Valetudo - all the devices went "Unavailable". I have tried restarting every component but still cant get any of my devices to work.
But the devices those components are responsible for are unavailable in HA. |
Same here, MQTT not working well anymore. Getting a ton of [homeassistant.components.mqtt] Timed out waiting for mid messages |
I have one instance running 5.1 and it's working fine. My main machine I upgraded to 5.1.1 - big mistake - all devices through mqtt are now showing as unavailable and tons of "timeouts waiting for mid..." |
Same issue here! How do I force 5.1.0 installation from Supervisor? Thanks |
Don't quote me as I'm guessing. I had used an old snapshot. |
I think that restoring a snapshot (it is possible to restore JUST the addon) is the only option |
after restart of home assistant and upgrading i have the same problem. |
this is srsly annoying right now! I have to switch out lights manually and close shutters now ^^ |
Same symptoms here with 5.1.1 update In the log I was getting a lot of 'Socket error on client DVES..' errors so I thought this was caused by my recent changes to inter-VLAN routing of Unfi-Wifi config. Wasted too many hours on this before I saw this thread and reverted just the Mosquitto part of a recent snapshot which instantly fixed the problem. |
Exact same issue. Everything's broken with 5.1.1. Luckily had an old snapshot and I'll be trying to restore from that. |
I did the same. Rolled back only Mosquitto to 5.1 and it works well again. |
@JoJa1101 How did you roll back? |
Hey can anyone help me roll back? How would i do that? |
Same here only rollback from old snapshot restored the mqtt function Would be a nice feature to get a version dropdown by the addons so a rollback would be easier for everyone |
I made a snapshot on my HASS Testsystem (was still running on Mosquitto 5.1) and did the partial rollback on my Prod System. |
The issue is that socat (which serves as auth point) is being continuously executed again and again!
|
Glad I found this; I was just about to install and try to migrate from CloudMQTT to a local instance. I only used CM because I needed an external server to use with an older mobile location method that I don't use anymore since the HA App came out. Guess I'll subscribe to this thread and wait it out. |
very much an issue here, especially since the mqtt binary_sensors keep triggering the automations on each reload (don't know how to call it but Mqtt gets re-triggered quite often now...) do we have an add-on dev listening in on this? |
Have same issue... It seems that the speed of mqtt is also changed... I don't know if this is because of the bug that's how in latest version |
Any Devs looking into this? thx. |
See #1887 (comment) |
I just installed the eclipse-mosquitto in docker, exposed the nessecary ports and its working perfectly for now. hope the add-on will be fixed asap.. for now, everything is up and running and working smoothly.. |
Did the same and it's now working really fast. In the core addon, when I initially connected with MQTT Explorer, it needed 1-2s to connect and then sometimes 5s+ to get topics. Reconnect was faster. This most likely also resolves my issues with automations not working: https://community.home-assistant.io/t/on-motion-wait-until-illuminance-sensor-is-updated |
Status update time 🕐 I've created quite a few versions of the add-on the past week and ended up with the simplest of them all...
I'm happy with the result, it seems to perform better. Currently got Z-Wave JS 2 MQTT, Home Assistant, a local mosquitto_sub subscribed to all topics and MQTT explorer connected; Around 25 ESPChip on ESPHome sending all a lot of status updates each second. Additionally opened up a script that tries to send in updates as fast as possible (while disconnecting/connecting between each update). This is how fast it is now, each line in the screenshot below is a client that publishes a time (with microseconds) as fast as possible while re-connecting between each update: I'll leave this all running overnight to see how it holds and do a final round of reviewing my own code in the morning. Morning update: Everything still alive and kicking. |
Do you maybe have numbers from the current mqtt addon to compare, to see if the poor results can be replicated in the old addon:) |
Very well done, this is hard to test as the problem always was only the first login (of each device) that took very long (> 5 sec). Thanks for putting you head into this! |
@frenck if you connect to the MQTT server, do you get all topics right away? In the problematic server, my connection took some time (between 0-2s), but then I had to wait 5s-6s for topics to apear. Is that mitigated in your new version? |
Yes, however, this of course is affected by the amount of data you store, and how fast both the client and server are.
True! There is a big difference in the handling now though, but will run a test without caching layers to measure the first-time every time. |
@CubieMedia Auth/ACL caching disabled, local user: Auth/ACL caching disabled, Home Assistant user account: |
Doesn't seem to do much difference, around 3ms for local users and a few more when asking HA. This is fast!
I'm currently running the obsolete MQTT addon and compared to the core addon, it connects and shows all topics instantly compared to core addon where it takes a few seconds to get them. I wanted to do some mosquito_pub tests today but seeing as you got pretty far with the updated version, I'll just wait for the new release and test with that. |
With the HA user it looks like 8-10 ms and Auth/ACL disabled should create an environment like a restart and devices newly connect to the broker. Test approved! The 10ms are putting my hopes up that this will work soon again! Can i get any info of what you changed when the new version is released? Startup parameters from MQTT Broker or something container specific? |
There is a changelog and, of course, it is open source: Yes, you can see that. The main difference in this case, is that part of the authentication handling is now done by the NGINX webserver. That thing is able to handle a lot of requests: fast. This replaces the bash scripts piped to socat in the current version. |
@frenck Thank you for that short inside. |
Will report results once the PR is merged and new version is available. Or is there an easy way to test it before it's merged? |
I did some quick testing and initial filing is that it works as it should now. MQTT client connects really fast and gets all it's topic at the same time. Motion and button responses are also way faster then before, the same as with deprecated mqtt client from community addons. There are also no reconnect issues in home assistant with MQTT. Great work @frenck |
After a couple hours on 6.0.1, I'm happy to report zero slow down issues. Very quick response.
|
Yes it is working again, well done! |
I'm happy to report things are looking a lot more responsive now that I've upgraded to 6.0.1, the errors in the logs and laggy switch/light responses seem to be gone. Many thanks @frenck |
Yes working perfectly and fast! Thanks @frenck ! |
Thank you @frenck - great job! |
Is anyone else seeing errors in the supervisor log after upgrading to 6.0.1.? If I disable the watchdog on Mosquitto the errors quit. I also rolled back to 5.1.1 and verified no errors with watchdog enabled.
|
@jeffgraham-hub I see the same error in the logs. |
Same here, though still having errors like these after reverting back to Mosquitto 5.1, but only during startup, so probably from another addon (with MQTT 6.x the supervisor log was flooded with these even during the day, not only at startup) |
Yes same errors |
these errors are gone/fixed in supervisor-2021.05.1 seeing this though in the MQTT log:
|
Please stop digging into old issues reporting unrelated new issues. Instead, just create a new issue for each individual thing so it can be tracked and looked at. Thanks. |
I upgraded to 6.0.1 and it was fine for a day but now I’m experiencing delays again 5.1 was fine though |
@frenck I've come up with a slightly less abrasive option for a similar comment. It's a bit more verbose but feel free to copy and paste it. The goal is to avoid discouraging good faith participation while still adhering to your preferred workflow. While it's not as efficient to add a buffer for the emotional state of the reader it can be considered a "social optimization" to add a bit more explanation. "If this isn't resolved or there's a new problem please create a new issue for each individual thing so it can be tracked and looked at. We have a workflow that involves opening a new issue instead of reviving old ones; this greatly helps with our tracking and organizing. Re-opening an issue or combining multiple subjects into a single issue adds complexity to the tracking." |
The problem
Environment
Problem-relevant configuration
Traceback/Error logs
Additional information
The text was updated successfully, but these errors were encountered: