Configuration backup/safeguard against corruption #9046
Comments
If you have frequent power outages, use |
I do not have power outages often. The previous one before this incident was maybe 4-5 years ago. But the severity of the incident is what makes me worried, not the frequency. |
There is no problem with the current Tasmota version. It is as stable as older versions. |
If the consensus is that a case where all devices in a section of a house get soft-bricked is by design and nothing will be done about it, then I accept it and take note. However, I do not agree with the statement that there is no problem and that this cannot be solved in software. I probably have far more "smart home" devices in my house than the average user, ranging from Z-Wave to Zigbee to WiFi, not to mention wired devices. During this incident, none of the other devices showed any problems. Everything just worked after the power was fixed. Do not get me wrong, I absolutely love Tasmota. I've been using and advocating it for years. I just feel that this should not be dismissed with "Not our problem, our firmware works great". Rather, it could be a point of discussion on "what can we do to overcome these situations, however unlikely, in the future". |
Please describe how to reproduce the issue. Since we do not have this issue ourselves, we can't search for the cause. |
Unfortunately it will be difficult to reproduce. The best I can do is offer one of my Tasmota flashed devices that is affected, to someone who is willing to troubleshoot this. Not sure if it would help in troubleshooting or not though. |
Thanks, but this would not help. |
I think I have a similar issue. But in my case it is caused by a test setup with a breadboard and unreliable wiring. The ESP32 in my case sometimes just does not start up anymore and does other weird things. Is there a way to do a clean shutdown to prevent any corruption when doing maintenance or shutting down power? |
Currently there is no way to do a shutdown; only a clean restart using a command. A shutdown without a restart would mean executing the restart code and staying in an early loop after the restart, or finding a way to halt the processor(s) during restart. The only way out would be a power cycle. I haven't thought about it yet. |
Maybe it is possible to have a special restart option. Unfortunately I cannot code it. |
Add command ``Restart 2`` to halt system. Needs hardware reset or power cycle to restart (#9046)
@Ingenieur89 try the latest development commit, which introduces the new command. Please report whether this solves your issue. |
Works fine. No bugs or unexpected behaviour. Thanks a lot! What I see when looking at "Flash write Count": |
I can also comment on the bricked devices due to config corruption. If you lose power or have undervoltage during a flash write, you're in trouble. The checksum is already a big step forward. If we now change the pointer to the new config AFTER it was written, that may also help to start without a bad config. I use the user config override to provide WiFi and a minimal MQTT connection. This currently lets me regain control of the device in 99% of cases, even if it was completely reset. The only 1% issue I still have is when the config is detected as good by Tasmota but the WiFi AP and credentials are garbage. I have disabled AP mode for security. In this case the USB cable comes into play, or pressing reset 10 times in short succession. |
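To make the "change the pointer AFTER the write" idea above concrete, here is a minimal simulation in Python. It is only a sketch of the suggestion, not Tasmota code: the `TwoSlotStore` class, the `fail_after` parameter and the two-slot layout are all hypothetical, and the pointer update is assumed to be atomic, which is exactly the part that is hard to guarantee on real flash.

```python
import zlib


class PowerLoss(Exception):
    """Raised by the simulation to model power failing in the middle of a write."""


class TwoSlotStore:
    """Hypothetical two-slot config store: write the spare slot, verify, then commit."""

    def __init__(self, initial: bytes):
        self.slots = [self._frame(initial), b""]
        self.active = 0  # the "pointer": index of the last known-good slot

    @staticmethod
    def _frame(payload: bytes) -> bytes:
        # Prefix the payload with its CRC32 so a torn write can be detected.
        return zlib.crc32(payload).to_bytes(4, "little") + payload

    @staticmethod
    def _unframe(frame: bytes):
        # Return the payload if the stored CRC matches, otherwise None.
        if len(frame) < 4:
            return None
        payload = frame[4:]
        if zlib.crc32(payload).to_bytes(4, "little") != frame[:4]:
            return None
        return payload

    def save(self, payload: bytes, fail_after=None) -> None:
        target = 1 - self.active  # never overwrite the active copy
        frame = self._frame(payload)
        if fail_after is not None:
            # Simulated brownout: only the first fail_after bytes reach the slot.
            self.slots[target] = frame[:fail_after]
            raise PowerLoss()
        self.slots[target] = frame
        # Read back and verify before touching the pointer.
        if self._unframe(self.slots[target]) != payload:
            raise IOError("read-back verification failed")
        self.active = target  # commit only after verification

    def load(self):
        return self._unframe(self.slots[self.active])
```

In this model a write that is cut short only damages the spare slot, so the next boot still loads the previous good config. It does not cover the case where the pointer itself is damaged.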
Regarding damaged hardware, can microcontrollers like an ESP get permanently damaged by brownouts? (No fix by just deleting and reflashing) |
I have never had a defective ESP82xx. All defects were in the power supply or the flash chip. |
There is no pointer to the latest config after a restart, for the same reason: it may be corrupt too. The way config resiliency currently works after a restart is that it tries to find the most recently updated config in the config pool of eight 4k flash pages. Without corruption this works as expected. With corruption, in theory it searches the pool for the next older config and uses that one. As this is theory, it depends on legacy situations:
As said, due to legacy reasons the timestamp isn't used yet. The config In all cases of detected corruption it should have loaded the default configuration, but in practice it seems it often fails to detect corruption. Considering that the 32-bit CRC AND the timestamp have now been active for almost a year, I think it's time to drop config resiliency support for versions before 6.6.0.11 and try to fix resiliency based on CRC AND timestamp only. As an important note, this will definitely break OTA upgrades from versions before 6.6.0.11 to 8.4.0.2 in one step. But then I already noted in the ReleaseNotes what the supported upgrade path is. |
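The recovery scheme described above (scan the pool of eight 4k pages, accept only pages whose 32-bit CRC checks out, pick the newest by timestamp) can be sketched roughly as follows. This is an illustrative Python model, not the actual Tasmota implementation: the page layout (CRC32 first, then a 32-bit timestamp) and the names `make_page`, `page_timestamp` and `load_newest_valid` are assumptions made for the example.

```python
import struct
import zlib

PAGE_SIZE = 4096  # one 4k flash page per config copy
POOL_SIZE = 8     # eight pages in the config pool


def make_page(timestamp: int, payload: bytes) -> bytes:
    """Build a page: CRC32 (4 bytes), then timestamp + payload padded with 0xFF.

    Hypothetical layout. The CRC covers everything after the CRC field.
    """
    body = (struct.pack("<I", timestamp) + payload).ljust(PAGE_SIZE - 4, b"\xff")
    return struct.pack("<I", zlib.crc32(body)) + body


def page_timestamp(page: bytes):
    """Return the page's timestamp if its CRC is valid, otherwise None."""
    if len(page) != PAGE_SIZE:
        return None
    (stored_crc,) = struct.unpack_from("<I", page, 0)
    body = page[4:]
    if zlib.crc32(body) != stored_crc:
        return None
    (timestamp,) = struct.unpack_from("<I", body, 0)
    return timestamp


def load_newest_valid(pool):
    """Return the index of the newest page that passes the CRC check.

    Returns None when every page is corrupt, in which case the caller
    would fall back to the default configuration.
    """
    best_index = None
    best_timestamp = None
    for index, page in enumerate(pool):
        timestamp = page_timestamp(page)
        if timestamp is None:
            continue  # corrupt page: skip it and keep looking for an older copy
        if best_timestamp is None or timestamp > best_timestamp:
            best_index, best_timestamp = index, timestamp
    return best_index
```

With a pool built from `make_page(ts, ...)` for increasing timestamps, damaging the newest page makes `load_newest_valid` return the index of the next older valid copy, which is the fallback behaviour the comment describes.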
- Add better config corruption recovery (#9046)
- Remove support for 1-step upgrade from versions before 6.6.0.11 to versions after 8.4.0.1
Give it a try. |
Thank you @arendst for putting your time and effort into figuring this out and trying to find a solution that would prevent scenarios like I experienced from happening in the future. |
Have you looked for this feature in other issues and in the docs?
Yes
Is your feature request related to a problem? Please describe.
A prolonged period of power problems led to 6 separate devices with corrupted configs, needing to be re-flashed over serial.
#8929
Describe the solution you'd like
It would be great to have a backup of the config stored in flash, along with a sanity check/journaling to fall back to a working config should the primary config get corrupted for whatever reason. Removing installed devices to re-flash them can be a real pain and can damage the interior decoration of a finished setup.
Describe alternatives you've considered
Currently I'm having to reconsider installing Tasmota devices in limited access locations.
Additional context
I know that a corrupted config is something that is not very common. But it can be extremely disruptive if it happens to a large number of devices at the same time (as I recently experienced). I've been recommending Tasmota to friends and even helped less technically inclined people set up home automation systems based on Tasmota. I've been telling them how reliable and trouble-free my experience of many years has been so far. Now I need to be aware of the fact that a power outage or any other problems on the power line can cause their entire setup to fail.