Fixed: Failing Sonoff LED - Upgrade power regulator #122

Closed
ecsfang opened this issue Feb 28, 2017 · 99 comments
Labels
help needed Action - Asking for help from the community

Comments

@ecsfang

ecsfang commented Feb 28, 2017

Hi!

I have two Sonoff LEDs running, and one of them seems to be failing. It starts OK, but after a while it begins restarting itself.
I have also noticed that as soon as I point my browser at the web server on the Sonoff, it restarts as well. Serial or MQTT seems to work, at least well enough that I can communicate with the device.

Typically the log looks like this (after a restart and then trying to access the webserver via WiFi):

00:00:13 Wifi: Connect failed with AP incorrect password
00:00:13 Wifi: Connecting to AP2 TIGERGAP in mode 11N as kitchen_left-2505...
00:00:20 Wifi: Connect failed as AP cannot be reached
00:00:21 Wifi: Connect failed as AP cannot be reached
00:00:21 Wifi: Connecting to AP1 TIGERWOLF in mode 11N as kitchen_left-2505...
00:00:26 Wifi: Connected
00:00:26 HTTP: Webserver active on kitchen_left-2505 with IP address 1xx.xxx.xxx.xxx
19:25:17 MQTT: Attempting connection...
19:25:17 MQTT: Connected
19:25:17 MQTT: tele/kitchen_left/LWT = Online (retained)
19:25:17 MQTT: cmnd/kitchen_left/POWER = 
19:25:17 MQTT: tele/kitchen_left/INFO1 = {"Module":"Sonoff LED", "Version":"3.9.19",     "FallbackTopic":"DVES_CC09C9", "GroupTopic":"sonoffleds"}
19:25:17 MQTT: tele/kitchen_left/INFO2 = {"WebserverMode":"Admin", "Hostname":"kitchen_left-2505", "IPaddress":"1xx.xxx.xxx.xxx"}
19:25:18 MQTT: stat/kitchen_left/RESULT = {"POWER":"ON"}
19:25:18 MQTT: stat/kitchen_left/POWER = ON
19:25:25 MQTT: tele/kitchen_left/STATE = {"Time":"2017-02-28T19:25:25", "Uptime":0, "POWER":"ON", "Wifi":{"AP":1, "SSID":"TIGERWOLF", "RSSI":56}}

Exception (29):
epc1=0x4021d3d6 epc2=0x00000000 epc3=0x00000000 excvaddr=0x00000010 depc=0x00000000

ctx: cont 
sp: 3fff1ff0 end: 3fff2300 offset: 01a0

>>>stack>>>
3fff2190:  00000000 00000000 00000000 00000000  
3fff21a0:  00000000 00000000 00000001 3fff02d8  
3fff21b0:  3fff02d7 3ffe8528 00000002 40211aef  
3fff21c0:  00000000 00000000 00000000 3fff3854  
3fff21d0:  0000000f 00000000 3fff4f3c 00000a5f  
3fff21e0:  00000a5b 00000001 3fff2230 4021be2f  
3fff21f0:  3fff11e8 00000178 00000178 40218558  
3fff2200:  00000001 00000001 3fff389c 4021ccfa  
3fff2210:  00000000 3fffdad0 3fff389c 4021854e  
3fff2220:  3fff389c 3fff39c4 3fff389c 4021858a  
3fff2230:  00000000 00000000 00000000 4021bf90  
3fff2240:  3fff389c 3fff39c4 3fff3984 4021861d  
3fff2250:  3fff4e64 0000000f 00000001 40215224  
3fff2260:  3fff39c4 00001388 3fff3984 00000001  
3fff2270:  00000001 40217abc 0000000f 4021cae4  
3fff2280:  00000000 00000000 3fff3984 3fff12cc  
3fff2290:  00000001 3fff39a8 3fff3984 402187a7  
3fff22a0:  3ffe9510 00000000 000003e8 00000000  
3fff22b0:  00000000 3fff4c9c 4021ca2c 3fff12e0  
3fff22c0:  3fffdad0 00000000 3fff12c4 40203fce  
3fff22d0:  00000000 00000000 00000001 40213fc8  
3fff22e0:  3fffdad0 00000000 3fff12c4 4021ca78  
3fff22f0:  feefeffe feefeffe 3fff12e0 40100718  
<<<stack<<<

 ets Jan  8 2013,rst cause:1, boot mode:(1,7)


 ets Jan  8 2013,rst cause:4, boot mode:(1,7)

wdt reset

I have tried reflashing, erasing the flash, etc., but I always end up in this scenario with this device.
The other Sonoff LED I have, loaded with the same software, seems to work fine.

I suspect bad hardware, anyone else with another opinion?

This is status after flash erase and default values loaded:

19:48:07 MQTT: stat/sonoff/STATUS = {"Status":{"Module":1, "FriendlyName":"Sonoff", "Topic":"sonoff", "ButtonTopic":"0", "Subtopic":"POWER", "Power":1, "PowerOnState":3, "LedState":1, "SaveData":1, "SaveState":1, "ButtonRetain":0, "PowerRetain":0}}
19:48:07 MQTT: stat/sonoff/STATUS1 = {"StatusPRM":{"Baudrate":115200, "GroupTopic":"sonoffs", "OtaUrl":"http://1xx.xxx.yyy.zzz:80/api/arduino/sonoff.ino.bin", "Uptime":0, "Sleep":0, "BootCount":3, "SaveCount":6}}
19:48:07 MQTT: stat/sonoff/STATUS2 = {"StatusFWR":{"Program":"3.9.19", "Boot":31, "SDK":"1.5.3(aec24ac9)"}}
19:48:07 MQTT: stat/sonoff/STATUS3 = {"StatusLOG":{"Seriallog":2, "Weblog":2, "Syslog":0, "LogHost":"1xx.xxx.xxx.xxx", "SSId1":"TIGERWOLF", "SSId2":"TIGERGATE", "TelePeriod":300}}
19:48:07 MQTT: stat/sonoff/STATUS4 = {"StatusMEM":{"ProgramSize":443, "Free":496, "Heap":28, "SpiffsStart":940, "SpiffsSize":64, "FlashSize":1024, "ProgramFlashSize":1024, "FlashChipMode",3}}
19:48:07 MQTT: stat/sonoff/STATUS5 = {"StatusNET":{"Host":"sonoff-2505", "IP":"1xx.xxx.xxx.xxx", "Gateway":"1xx.xxx.1.1", "Subnetmask":"255.255.255.0", "Mac":"xx.xx.xx:CC:09:C9", "Webserver":2, "WifiConfig":3}}
19:48:07 MQTT: stat/sonoff/STATUS6 = {"StatusMQT":{"Host":"1xx.xxx.yyy.zzz", "Port":1883, "ClientMask":"DVES_%06X", "Client":"DVES_CC09C9", "User":"DVES_USER", "MAX_PACKET_SIZE":512, "KEEPALIVE":15}}
19:48:07 MQTT: stat/sonoff/STATUS7 = {"StatusTIM":{"UTC":"Tue Feb 28 18:48:07 2017", "Local":"Tue Feb 28 19:48:07 2017", "StartDST":"Sun Mar 26 02:00:00 2017", "EndDST":"Sun Oct 29 03:00:00 2017", "Timezone":1}}
19:48:07 MQTT: stat/sonoff/STATUS10 = {"StatusSNS":{"Time":"2017-02-28T19:48:07"}}

(Another "problem" I see quite often in the log is that the first attempt(s) to connect to WiFi fail with "incorrect password". The second (or third) attempt works. As a result, many devices end up connected to AP2, even though AP1 is closer and stronger ...)

Cheers!

@arendst
Owner

arendst commented Feb 28, 2017

I see your FlashChipMode is 3 (ESP8285). For the Sonoff LED, 2 should be used. Try the command flashchipmode 2 and reflash using OTA or web upgrade.

@ecsfang
Author

ecsfang commented Feb 28, 2017

Hi!

I tried setting the mode, and then updated using OTA (from smadds binaries):

stat/kitchen_left/RESULT : msg.payload : string [19]
{"FlashChipMode":2}

stat/kitchen_left/RESULT : msg.payload : string [83]
{"Upgrade":"Version 3.9.19 from http://sonoff.maddox.co.uk/tasmota/sonoff.ino.bin"}

stat/kitchen_left/UPGRADE : msg.payload : string [22]
Successful. Restarting

stat/kitchen_left/RESULT : msg.payload : string [19]
{"FlashChipMode":2}

But, that results in the same problem as before ... ?

20:39:34 MQTT: stat/kitchen_left/POWER = ON
20:39:41 MQTT: tele/kitchen_left/STATE = {"Time":"2017-02-28T20:39:41", "Uptime":0, "POWER":"ON", "Wifi":{"AP":1, "SSID":"TIGERWOLF", "RSSI":36}}
20:39:53 MQTT: stat/kitchen_left/RESULT = {"FlashChipMode":2}

Exception (29):
epc1=0x4021d3d6 epc2=0x00000000 epc3=0x00000000 excvaddr=0x00000010 depc=0x00000000

ctx: cont 
sp: 3fff1ff0 end: 3fff2300 offset: 01a0

I also noticed that the other device has flashchipmode 0 ... ?

20:43:04 CMND: flashchipmode
20:43:04 MQTT: stat/kitchen_right/RESULT = {"FlashChipMode":0}

@ecsfang
Author

ecsfang commented Mar 2, 2017

Yep, after some fiddling back and forth, I have now done a clean flash, restored all settings, and flashed a build for the 8266 via serial, and now it seems to work!
I must have flashed a build made for the 8285 (the build for the 4CH), which left the device in a half-working state, i.e. almost working but not quite ... setting flashchipmode and upgrading via OTA didn't help.

Thanks for pointing me in the right direction!
Cheers!

@ecsfang ecsfang closed this as completed Mar 2, 2017
@ecsfang
Author

ecsfang commented Mar 3, 2017

Hi again,

Is this expected behaviour on the Sonoff LED?
I saw it blink (restart), and the log below shows that the device started because of the hardware watchdog.
And look at the Uptime: it does not seem to have been reset by the watchdog. Shouldn't it be reset when the device restarts?

00:00:00 APP: Project sonoff Sonoff (Topic kitchen_left, Fallback DVES_CC09C9, GroupTopic sonoffleds) Version 3.9.22
00:00:00 Wifi: Connecting to AP2 TIGERGATE in mode 11N as kitchen_left-2505...
00:00:15 Wifi: Connect failed with AP timeout
00:00:15 Wifi: Connecting to AP1 TIGERWOLF in mode 11N as kitchen_left-2505...
00:00:20 Wifi: Connected
00:00:20 HTTP: Webserver active on kitchen_left-2505 with IP address 1xx.xxx.xxx.xxx
00:00:22 MQTT: Attempting connection...
07:02:15 MQTT: Connected
07:02:15 MQTT: tele/kitchen_left/LWT = Online (retained)
07:02:15 MQTT: cmnd/kitchen_left/POWER = 
07:02:15 MQTT: tele/kitchen_left/INFO1 = {"Module":"Sonoff LED", "Version":"3.9.22", "FallbackTopic":"DVES_CC09C9", "GroupTopic":"sonoffleds"}
07:02:15 MQTT: tele/kitchen_left/INFO2 = {"WebserverMode":"Admin", "Hostname":"kitchen_left-2505", "IPaddress":"1xx.xxx.xxx.xxx"}
07:02:15 MQTT: tele/kitchen_left/INFO3 = {"Started":"Hardware Watchdog"}
07:02:16 MQTT: stat/kitchen_left/RESULT = {"POWER":"ON"}
07:02:16 MQTT: stat/kitchen_left/POWER = ON
07:02:23 MQTT: tele/kitchen_left/STATE = {"Time":"2017-03-03T07:02:23", "Uptime":0, "POWER":"ON", "Wifi":{"AP":1, "SSID":"TIGERWOLF", "RSSI":38}}
07:02:30 MQTT: tele/kitchen_left/UPTIME = {"Time":"2017-03-03T07:02:30", "Uptime":1}

@ecsfang ecsfang reopened this Mar 3, 2017
@arendst
Owner

arendst commented Mar 3, 2017

I prefer not to call it expected, but I've also noticed that the Sonoff LED has an appetite for Hardware Watchdog timeouts. I can't figure out the reason; I thought it had something to do with the webserver. That's also why I implemented the INFO3 message, to get more info out of the reboots.

The uptime observation is a nice one: the reboot happened just before xx:02:30, which is the moment I increment the hourly uptime counter. As you can see, at 07:02:23 it says Uptime 0 and at 07:02:30 it is incremented by one. Just a way of saving precious memory and code for more important features...
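
To illustrate the uptime behaviour, here is a minimal Python sketch (not the firmware code). It assumes the counter is bumped whenever the wall clock reads xx:02:30, regardless of when the device booted; the name `tick_uptime` is hypothetical.

```python
def tick_uptime(uptime_hours: int, minute: int, second: int) -> int:
    """Return the hourly uptime counter after one clock tick.

    Assumption: the counter is incremented when the wall clock hits
    xx:02:30, not one hour after boot.
    """
    if minute == 2 and second == 30:
        return uptime_hours + 1
    return uptime_hours


# A device that restarted shortly before 07:02:30 already reports 1 "hour".
uptime = 0
for second in range(15, 31):  # 07:02:15 .. 07:02:30
    uptime = tick_uptime(uptime, minute=2, second=second)
print(uptime)
```

Under that assumption, a reboot just before the increment moment shows Uptime 1 within seconds, which matches the log in the previous comment.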

@ecsfang
Author

ecsfang commented Mar 3, 2017

OK, then I'll just continue to monitor the logs and see if I can get some more hints about what's going on :)
It is not that common (compared to when I flashed the wrong software ... ;)).

And uptime - ok, then I get it - hehe!

Cheers!

@ecsfang ecsfang closed this as completed Mar 3, 2017
arendst added a commit that referenced this issue Mar 3, 2017
4.0.0 20170303
* Add define to remove config migration code for versions below 3.0 (See
Wiki-Upgrade-Migration path)
* Free memory by switching from String to char[]
* Raised Sonoff Led PWM frequency from 200Hz to 432Hz in search of
stability (hardware watchdog timeouts) (#122)
* Increase message size and suggested minimum MQTT_MAX_PACKET_SIZE to
512 (#114, #124)
* Remove runtime warning message regarding MQTT_MAX_PACKET_SIZE too
small as it is now moved to compile time (#124)
* Fix possible panics with web console and http commands while UDP
syslog is active (#127)
* Add optional static IP address (#129)
* Add define ENERGY_RESOLUTION in user_config.h to allow user control
over precision (#136)
@ecsfang
Author

ecsfang commented Mar 5, 2017

Hi again!

Yes, still HW Watchdog every now and then, and very often I see the following problem as well:

00:00:00 APP: Project sonoff Sonoff (Topic kitchen_right, Fallback DVES_07261C, GroupTopic sonoffleds) Version 4.0.0
00:00:00 Wifi: Connecting to AP1 TIGERWOLF in mode 11N as kitchen_right-1564...
00:00:15 Wifi: Connect failed with AP timeout
00:00:15 Wifi: Connecting to AP2 TIGERGATE in mode 11N as kitchen_right-1564...
00:00:20 Wifi: Connected

Any idea why? AP1 has the strongest WiFi signal, and I think it is the first attempts that fail. Sometimes it is "connection timeout", sometimes "Incorrect password" and sometimes "AP cannot be reached". Sometimes it connects directly, sometimes after several attempts. Is there a way of logging what actually happens there?
I'm using Webconfig 4, so I guess that all the restarts I see are caused by the HWWD.

@ecsfang ecsfang reopened this Mar 5, 2017
@arendst
Owner

arendst commented Mar 5, 2017

Perhaps the latest version 4.0.1 can provide more stability, as I leave more memory available for normal tasks. If you are using Wemo or Hue emulation too, then this version should also provide more stability, as it disables syslog if emulation is active.

I've seen the WiFi issue once in a while, but reflashing or moving the box a few inches usually solves it.

You might see more info when logging option 3 or 4 is enabled.

@ecsfang
Author

ecsfang commented Mar 6, 2017

One thing I noticed was that using MQTT to control the device works fine most of the time (except for the sporadic HWWD restarts), but as soon as I try the web interface the device restarts. Just entering the IP address in the browser is usually enough (I guess the browser does some pre-fetching) to make the device restart. I will make a build without the web server and see if that affects the stability.

@arendst
Owner

arendst commented Mar 6, 2017

Does it also happen with 4.0.1? Do you use Hue or Wemo?

@ecsfang
Author

ecsfang commented Mar 6, 2017

Yes, but I will double-check - I tried to upgrade to 4.0.1 last night and I think it upgraded ok, and that I still had the same problem. I have some time tonight to play around ... :)

@ecsfang
Author

ecsfang commented Mar 6, 2017

I was unclear in my last comment: I'm not using Hue or Wemo (if they are enabled by default, they are included), but yes, I think it happens with 4.0.1. That is what I will double-check tonight!
Sorry for the confusion!

@ecsfang
Author

ecsfang commented Mar 6, 2017

Ok, now I have checked, and yes, I'm getting the same problem with 4.0.1.
I have now disabled the Webserver etc., and it behaves much better (at least so far), but I still see HWWD now and then (I have two devices that behave similarly, so I suspect the software):

Mar  6 19:58:26 kitchen_left-2505 ESP-MQTT: Attempting connection...
Mar  6 19:58:26 kitchen_left-2505 ESP-MQTT: Connected
Mar  6 19:58:26 kitchen_left-2505 ESP-MQTT: tele/kitchen_left/LWT = Online (retained)
Mar  6 19:58:26 kitchen_left-2505 ESP-MQTT: cmnd/kitchen_left/POWER = 
Mar  6 19:58:26 kitchen_left-2505 ESP-MQTT: tele/kitchen_left/INFO1 = {"Module":"Sonoff LED", "Version":"4.0.1", "FallbackTopic":"DVES_CC09C9", "GroupTopic":"sonoffleds"}
Mar  6 19:58:26 kitchen_left-2505 ESP-MQTT: tele/kitchen_left/INFO3 = {"Started":"Hardware Watchdog"}
Mar  6 19:58:27 kitchen_left-2505 ESP-MQTT: stat/kitchen_left/RESULT = {"POWER":"ON"}
Mar  6 19:58:27 kitchen_left-2505 ESP-MQTT: stat/kitchen_left/POWER = ON
Mar  6 19:58:34 kitchen_left-2505 ESP-MQTT: tele/kitchen_left/STATE = {"Time":"2017-03-06T19:58:34", "Uptime":0, "POWER":"ON", "Wifi":{"AP":1, "SSID":"TIGERWOLF", "RSSI":26}}

I also got one of these:

Mar  6 18:48:49 kitchen_left-2505 ESP-MQTT: Connected
Mar  6 18:48:49 kitchen_left-2505 ESP-MQTT: tele/kitchen_left/LWT = Online (retained)
Mar  6 18:48:49 kitchen_left-2505 ESP-MQTT: cmnd/kitchen_left/POWER = 
Mar  6 18:48:49 kitchen_left-2505 ESP-MQTT: tele/kitchen_left/INFO1 = {"Module":"Sonoff LED", "Version":"4.0.1", "FallbackTopic":"DVES_CC09C9", "GroupTopic":"sonoffleds"}
Mar  6 18:48:49 kitchen_left-2505 ESP-MQTT: tele/kitchen_left/INFO3 = {"Started":"Fatal exception:0 flag:2 (EXCEPTION) epc1:0x40201ee5 epc2:0x00000000 epc3:0x00000000 excvaddr:0x3ffffe60 depc:0x000
Mar  6 18:48:50 kitchen_left-2505 ESP-RTC: (UTC) Mon Mar 06 17:48:50 2017

Status 0 is as follows:

Mar  6 20:08:18 kitchen_left-2505 ESP-APP: Serial logging disabled
Mar  6 20:08:23 kitchen_left-2505 ESP-RSLT: Receive topic cmnd/kitchen_left/status, data size 1, data 0
Mar  6 20:08:23 kitchen_left-2505 ESP-RSLT: DataCb Topic kitchen_left, Group 0, Index 1, Type STATUS, Data 0 (0)
Mar  6 20:08:24 kitchen_left-2505 ESP-MQTT: stat/kitchen_left/STATUS = {"Status":{"Module":11, "FriendlyName":"Sonoff", "Topic":"kitchen_left", "ButtonTopic":"0", "Subtopic":"POWER", "Power":1, "PowerOnState":3, "LedState":1, "SaveData":1, "SaveState":1, "ButtonRetain":0, "PowerRetain":0}}
Mar  6 20:08:24 kitchen_left-2505 ESP-MQTT: stat/kitchen_left/STATUS1 = {"StatusPRM":{"Baudrate":115200, "GroupTopic":"sonoffleds", "OtaUrl":"http://1xx.xxx.xxx.xxx:80/api/arduino/sonoff.ino.bin", "Uptime":1, "Sleep":0, "BootCount":102, "SaveCount":146}}
Mar  6 20:08:24 kitchen_left-2505 ESP-MQTT: stat/kitchen_left/STATUS2 = {"StatusFWR":{"Program":"4.0.1", "Boot":31, "SDK":"1.5.3(aec24ac9)"}}
Mar  6 20:08:24 kitchen_left-2505 ESP-MQTT: stat/kitchen_left/STATUS3 = {"StatusLOG":{"Seriallog":2, "Weblog":4, "Syslog":4, "LogHost":"1xx.xxx.xxx.xxx", "SSId1":"TIGERWOLF", "SSId2":"TIGERGATE", "TelePeriod":300}}
Mar  6 20:08:24 kitchen_left-2505 ESP-MQTT: stat/kitchen_left/STATUS4 = {"StatusMEM":{"ProgramSize":364, "Free":572, "Heap":37, "SpiffsStart":940, "SpiffsSize":64, "FlashSize":1024, "ProgramFlashSize":1024, "FlashChipMode":2}}
Mar  6 20:08:24 kitchen_left-2505 ESP-MQTT: stat/kitchen_left/STATUS5 = {"StatusNET":{"Host":"kitchen_left-2505", "IP":"1yy.yyy.yyy.yyy", "Gateway":"1yy.yyy.1.1", "Subnetmask":"255.255.255.0", "Mac":"zz:zz:zz:CC:09:C9", "Webserver":2, "WifiConfig":4}}
Mar  6 20:08:24 kitchen_left-2505 ESP-MQTT: stat/kitchen_left/STATUS6 = {"StatusMQT":{"Host":"192.168.1.72", "Port":1883, "ClientMask":"DVES_%06X", "Client":"DVES_CC09C9", "User":"DVES_USER", "MAX_PACKET_SIZE":512, "KEEPALIVE":15}}
Mar  6 20:08:24 kitchen_left-2505 ESP-MQTT: stat/kitchen_left/STATUS7 = {"StatusTIM":{"UTC":"Mon Mar 06 19:08:23 2017", "Local":"Mon Mar 06 20:08:23 2017", "StartDST":"Sun Mar 26 02:00:00 2017", "EndDST":"Sun Oct 29 03:00:00 2017", "Timezone":1}}
Mar  6 20:08:24 kitchen_left-2505 ESP-MQTT: stat/kitchen_left/STATUS10 = {"StatusSNS":{"Time":"2017-03-06T20:08:23"}}
Mar  6 20:08:25 kitchen_left-2505 ESP-Wifi: Checking connection...
Mar  6 20:08:25 kitchen_left-2505 ESP-Wifi: Connected

@ecsfang
Author

ecsfang commented Mar 6, 2017

Now I have flashed the original 4.0.1, with Webserver enabled, and it is just as stable as without ... :S
One HWWD, and one other restart in almost two hours ... AND the Webserver works fine from a browser.
I don't like the idea that software gets better if you flash it several times, but this is how it feels at the moment. Or is the moon just in another place right now? :)
I'll give it some days runtime and count the number of restarts ...

@arendst
Owner

arendst commented Mar 6, 2017

I see you are using logging option 4. Do you use it all the time? If so, it impacts the analogwrite interrupts used by the LEDs.

Do you see a difference in WDTs if you use only one of the two LED colors? I.e. color 0022 uses only the warm LEDs, while color 2200 uses only the cold LEDs. In those cases only one of the two analogwrite ports is used, with half the number of interrupts.

@ecsfang
Author

ecsfang commented Mar 6, 2017

I increased it earlier, but I can lower the logging to 2.
No, I have not tried with different ports (colors), right now both are being used (0x80FF).
But, right now I have 2+ in uptime on both devices - never happened before - touch wood ... ;)
Should it start to generate wdt again I will try that approach and see if it makes a difference.

@ecsfang
Author

ecsfang commented Mar 7, 2017

No, didn't see any difference using only one port.
By the way - setting color 0022 ends up as 0021 in the log (and 0080 as 007F):

Mar  7 19:45:21 kitchen_left-2505 ESP-MQTT: stat/kitchen_left/RESULT = {"Color":"0021"}

Apart from the wdt's, I see many exceptions (0, 9 and 28). In this log I only extracted the INFO3 messages from the two devices:

Mar  7 17:31:35 kitchen_right-1564 ESP-MQTT: tele/kitchen_right/INFO3 = {"Started":"Power on"}
Mar  7 17:32:14 kitchen_left-2505 ESP-MQTT: tele/kitchen_left/INFO3 = {"Started":"Power on"}
Mar  7 17:35:41 kitchen_right-1564 ESP-MQTT: tele/kitchen_right/INFO3 = {"Started":"Fatal exception:28 flag:2 (EXCEPTION) epc1:0x402505e7 epc2:0x00000000 epc3:0x00000024 excvaddr:0x00000024 depc:0x00
Mar  7 19:42:14 kitchen_left-2505 ESP-MQTT: tele/kitchen_left/INFO3 = {"Started":"Fatal exception:0 flag:2 (EXCEPTION) epc1:0x4021e8a1 epc2:0x00000000 epc3:0x00000000 excvaddr:0x00000000 depc:0x000
Mar  7 19:42:59 kitchen_left-2505 ESP-MQTT: tele/kitchen_left/INFO3 = {"Started":"Hardware Watchdog"}
Mar  7 19:44:19 kitchen_left-2505 ESP-MQTT: tele/kitchen_left/INFO3 = {"Started":"Hardware Watchdog"}
Mar  7 19:48:47 kitchen_left-2505 ESP-MQTT: tele/kitchen_left/INFO3 = {"Started":"Hardware Watchdog"}
Mar  7 19:49:46 kitchen_left-2505 ESP-MQTT: tele/kitchen_left/INFO3 = {"Started":"Hardware Watchdog"}
Mar  7 19:51:18 kitchen_left-2505 ESP-MQTT: tele/kitchen_left/INFO3 = {"Started":"Hardware Watchdog"}
Mar  7 20:09:45 kitchen_right-1564 ESP-MQTT: tele/kitchen_right/INFO3 = {"Started":"Hardware Watchdog"}
Mar  7 20:20:39 kitchen_right-1564 ESP-MQTT: tele/kitchen_right/INFO3 = {"Started":"Fatal exception:9 flag:2 (EXCEPTION) epc1:0x4022e15f epc2:0x00000000 epc3:0x00000000 excvaddr:0x401055ab depc:0x000
Mar  7 20:38:42 kitchen_right-1564 ESP-MQTT: tele/kitchen_right/INFO3 = {"Started":"Fatal exception:9 flag:2 (EXCEPTION) epc1:0x4022e154 epc2:0x00000000 epc3:0x00000000 excvaddr:0x00000036 depc:0x000
Mar  7 20:43:58 kitchen_right-1564 ESP-MQTT: tele/kitchen_right/INFO3 = {"Started":"Fatal exception:0 flag:2 (EXCEPTION) epc1:0x4022e240 epc2:0x00000000 epc3:0x00000000 excvaddr:0x401055ab depc:0x000
Mar  7 21:47:39 kitchen_right-1564 ESP-MQTT: tele/kitchen_right/INFO3 = {"Started":"Hardware Watchdog"}

Sometimes it can last for quite a while, sometimes there are many wdt in a row ...
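
A rough way to tally these restart reasons per device is sketched below in Python. It assumes the syslog line format shown above; the helper names `restart_reason` and `tally` are made up for illustration, and the regex fallback is there because syslog cuts off the long exception messages before the JSON is closed.

```python
import json
import re
from collections import Counter

# Matches the payload of an INFO3 telemetry line in the syslog format above.
INFO3_RE = re.compile(r"tele/(?P<topic>[^/]+)/INFO3 = (?P<payload>\{.*)")


def restart_reason(line: str):
    """Return (topic, reason) for an INFO3 line, or None for other lines."""
    match = INFO3_RE.search(line)
    if match is None:
        return None
    payload = match.group("payload")
    try:
        reason = json.loads(payload)["Started"]
    except (json.JSONDecodeError, KeyError):
        # Truncated payloads are not valid JSON, so fall back to a regex.
        started = re.search(r'"Started":"([^"]*)', payload)
        if started is None:
            return None
        reason = started.group(1)
    # Collapse "Fatal exception:28 flag:2 ..." to "Fatal exception:28".
    reason = re.sub(r"^(Fatal exception:\d+).*", r"\1", reason)
    return match.group("topic"), reason


def tally(lines):
    """Count restart reasons per device topic."""
    return Counter(r for r in map(restart_reason, lines) if r is not None)


sample = [
    'Mar  7 19:42:59 kitchen_left-2505 ESP-MQTT: tele/kitchen_left/INFO3 = {"Started":"Hardware Watchdog"}',
    'Mar  7 19:44:19 kitchen_left-2505 ESP-MQTT: tele/kitchen_left/INFO3 = {"Started":"Hardware Watchdog"}',
]
for (topic, reason), count in tally(sample).items():
    print(topic, reason, count)
```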

@arendst
Owner

arendst commented Mar 7, 2017

Color is scaled by the dimmer value, and since color has a range of 0-255 while dimmer only has 0-100, some precision is lost in the conversion.
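
As an illustration only (this is not the firmware code), the Python sketch below converts a 0-255 color channel to a 0-100 dimmer value and back. It assumes both steps use truncating integer division; under that assumption it reproduces the 0022 to 0021 and 0080 to 007F values reported above.

```python
def color_after_dimmer_round_trip(channel: int) -> int:
    """Convert a 0-255 color channel to a 0-100 dimmer value and back.

    Assumption: both conversions use truncating integer division.
    """
    dimmer = channel * 100 // 255   # 0-255 -> 0-100, fraction dropped
    return dimmer * 255 // 100      # 0-100 -> 0-255, fraction dropped


for requested in (0x22, 0x80, 0xFF):
    print(f"{requested:02X} -> {color_after_dimmer_round_trip(requested):02X}")
# prints:
# 22 -> 21
# 80 -> 7F
# FF -> FF
```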

Today I used your color code of 80FF for over 8 hours (lots of energy used today) but no wdt or any exception received. In fact it is running for over 24 hours without any anomaly...

It looks like you have a power problem with so many different exceptions.

Do you see this on both of your sonoff led devices?
```
22:31:51 MQTT: domoticz/in = {"idx":159,"nvalue":2,"svalue":"6"}
22:31:51 MQTT: domoticz/in = {"idx":159,"nvalue":1,"svalue":""}
22:31:51 MQTT: stat/led1/RESULT = {"POWER":"ON"}
22:31:51 MQTT: stat/led1/POWER = ON
22:33:04 MQTT: domoticz/in = {"idx":159,"nvalue":2,"svalue":"6"}
22:33:04 MQTT: domoticz/in = {"idx":159,"nvalue":0,"svalue":""}
22:33:04 MQTT: stat/led1/RESULT = {"POWER":"OFF"}
22:33:04 MQTT: stat/led1/POWER = OFF
22:33:19 MQTT: tele/led1/STATE = {"Time":"2017-03-07T22:33:19", "Uptime":31, "POWER":"OFF", "Wifi":{"AP":2, "SSID":"indebuurt2", "RSSI":76}}
```

@ecsfang
Author

ecsfang commented Mar 8, 2017

Ok, thanks for the explanation, then I know.
Yes, both devices behave the same, even though one of them (left) is worse than the other.

@ecsfang
Author

ecsfang commented Mar 8, 2017

Maybe I'll just buy two new devices and check with them; maybe I was just unlucky and got two from a bad batch ...
In the meantime (postage takes a while from over there) I could go through all the solder joints etc. to make sure there is no obvious problem on the board itself. It is quite a modern house, and I have no reason to assume noise or something similar on the power line.
Had it been just one of the two, the idea of bad hardware would have been much easier to swallow. But, as I usually say, you get what you pay for ... ;)

@ecsfang
Author

ecsfang commented Mar 9, 2017

Been playing around a bit with different settings. Most of the time, the MTBR (Mean Time Between Restarts) is about 5 minutes ... :P Sometimes as low as a couple of seconds, sometimes up to 15-20 minutes (on the "good one", the "bad one" is seldom above 5 min MTBR).
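
For reference, the MTBR figure can be computed from restart timestamps along these lines. This is a small Python sketch assuming all timestamps fall on the same day; the function name is hypothetical and the sample times are the kitchen_left Hardware Watchdog restarts from the Mar 7 log above.

```python
from datetime import datetime


def mean_time_between_restarts(timestamps):
    """Return the mean gap in seconds between consecutive restart times."""
    times = sorted(datetime.strptime(t, "%H:%M:%S") for t in timestamps)
    if len(times) < 2:
        raise ValueError("need at least two restarts to compute a gap")
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    return sum(gaps) / len(gaps)


# Hardware Watchdog restarts of kitchen_left on Mar 7 (same-day assumption).
restarts = ["19:42:59", "19:44:19", "19:48:47", "19:49:46", "19:51:18"]
print(mean_time_between_restarts(restarts))  # 124.75 seconds, about 2 minutes
```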

I even had one case where the device was reset after a restart (at least module, topic and grouptopic were back to their default values (basic/sonoff/sonoffs), but the APs etc. were still there).

The settings I have been using are Speed=4, Fade=on, Ledtable=on, Dimmer=70, Color=80FF, Syslog=4. I'm not toggling the power nor changing the dimmer, the device has just been turned on.

In my last build, I decreased ANALOG_WRITE_FREQ to 100, and right now it is very stable. After 3 hours of runtime I have not yet had a single restart on either of the two devices!
Not only that, but the Webserver is included and has been running flawlessly as well ... knock on wood!
The blinking is hardly visible, but I will try to increase the frequency somewhat tomorrow and see how that goes.

@Nayar

Nayar commented Mar 12, 2017

I think I may be having the same problem here. One of my Sonoff Pow devices is restarting every 2-3 minutes. My logs in Graylog look like this.

screen shot 2017-03-12 at 3 06 40 pm

{"Started":"Fatal exception:9 flag:2 (EXCEPTION) epc1:0x40105e99 epc2:0x00000000 epc3:0x00000000 excvaddr:0x3ffea785 depc:0x00000000"}

It was restarting very often last week. Then stabilised, to uptime like 24 hours.

screen shot 2017-03-12 at 3 15 16 pm

@AlexTransit

Do you use the button to turn it on/off?
I use a classic Sonoff connected to the LED driver (I use a relay to power the driver, and set the device to Sonoff LED).
Previously it was unstable.
I changed the initialization and the problem is gone.
I offered the code before, but it was not used.
If you need it, I can share it again :)

@ecsfang
Author

ecsfang commented Mar 13, 2017

Hi, I use a physical button to turn the power on/off to the Sonoff Led.
Please describe the changes you made and why.
In my case, it feels a bit like different flashings give different results. I.e. using different settings (e.g. PWM frequency) gives different results, but changing back to "good" values sometimes still gives bad results. Right now it works OK, with several hours of uptime between HWWDs.
Cheers, Thomas


arendst added a commit that referenced this issue Apr 10, 2017
4.1.3 20170410
* Add user configurable GPIO to module S20 Socket and Slampher
* Add support for Sonoff SC (#112)
* Set PWM frequency from 1000Hz to 910Hz as used on iTead Sonoff Led
firmware (#122)
* Set Sonoff Led unconfigured floating outputs to 0 to reduce exceptions
due to power supply instabilities (#122)
* Add Access Point Mac Address to Status 11 and Telemetry (#329)
* Fix DS18B20 negative temperature readings (#334)
@pnuding

pnuding commented Apr 17, 2017

Without hardware modifications I still have issues running 4.1.3.
Yesterday I ripped out the 7533-1 regulator of one of my Sonoff LEDs and connected a big (TO220 case) 1117 regulator outside the case. As expected, it gets ridiculously hot, but so far the device has been stable at low dimmer settings and an odd colour setting for more than 24 hours.
If this remains stable I'll use a buck converter instead of the linear regulator so we don't get this much heat in the first place.
I'll report back if it's still stable over the next few days.

@ecsfang
Author

ecsfang commented Apr 17, 2017

Good news! I have some LM1117s in the post coming my way; I was thinking of doing the same experiment. I hadn't thought about the heat though...
I have also been measuring the voltage, and it is obvious that after a while the temperature rises, which results in a lower voltage and more problems. There is also quite a big difference between devices.

@hamwong

hamwong commented Apr 18, 2017

I believe ITEAD has also realized the hardware is not well designed: the LED series stopped being sold on their website and on Taobao (official store) a few weeks ago.

@hamwong

hamwong commented Apr 18, 2017

@pnuding @ecsfang would you mind creating a video or picture how-to with detailed steps, so that people with no electronics knowledge can replicate your modification? A simple electronic schematic is difficult for us.

@pnuding

pnuding commented Apr 18, 2017

I'll take a few pictures, sure! so far 47 hours uptime and counting... :)

@ecsfang
Author

ecsfang commented Apr 21, 2017

OK, I'll try to explain how I did it. Please note that this just shows how I did it, and I take no responsibility for any failures or other problems that might occur if you try to repeat what I have done.

As mentioned earlier, I bought some LM1117mp-3.3 in a SOT package, since it is very similar to the original 7533 in form and size. It is a little bit larger, though, and two pins are swapped (Vout and Vin), so I could not simply drop it in as a replacement for the regulator.

First picture shows the original 7533 in place, note that I have removed the capacitor (the two holes in the circle - it is a 470uF 10V connected between GND and Vout, i.e. on the 3.3V side).

dsc_0294

Second picture shows the PCB with the 7533 removed and the pads cleaned somewhat.

dsc_0295

The SOT package has three pins on one side (GND, Vout and Vin). The big tab on the other side is also connected to the Vout. To simplify for me, I just removed the middle pin (the smaller Vout pin) and soldered GND and Vin to the old pads.
Remember that Vout and Vin are swapped, so the right-most pin should be connected to the middle pad, and the middle pin (tab) connected to the rightmost pad on the pcb.

dsc_0296

Then I added a small wire connecting Vout to the big tab on the device (it was easier to solder the wire to the hole than to the old pad). This shows the final placement of the LM1117 on the PCB.

dsc_0297

And finally, I just had to reinsert the capacitor (note the orientation - the minus (-) side of the capacitor should be connected to gnd).

dsc_0298

Put it all back into the box, connect the LEDs and mains, and voilà!
Right now I have just been running for almost 2 hours, but the voltage is VERY stable at about 3.16V! With the old 7533, the voltage sometimes dropped down to 2.6V and was very affected by temperature: turn the LEDs on, and you could really see in real-time the voltage drop ...

So now it only remains to run it for a longer time and see how it copes; so far no restarts at all :)
I did run one yesterday for several hours to check the temperature of the LM1117. Yes, it got hot, but not burning hot. I had no thermometer, but I could touch the device without burning myself; it just felt hot, so I don't think it gets hotter than the old one.
This was quite "un-scientifically" done - and be careful - never put your fingers close to the pcb without disconnecting the mains first!

17:57:17 MQTT: tele/kitchen_left/STATE = {"Time":"2017-04-21T17:57:17", "Uptime":0, "Vcc":3.160, "POWER":"ON", "Wifi":{"AP":1, "SSID":"TIGERWOLF", "RSSI":56, "APMac":"90:94:E4:39:B1:2C"}}
18:02:00 MQTT: tele/kitchen_left/UPTIME = {"Time":"2017-04-21T18:02:00", "Uptime":1}
18:02:17 MQTT: tele/kitchen_left/STATE = {"Time":"2017-04-21T18:02:17", "Uptime":1, "Vcc":3.160, "POWER":"ON", "Wifi":{"AP":1, "SSID":"TIGERWOLF", "RSSI":56, "APMac":"90:94:E4:39:B1:2C"}}
18:07:18 MQTT: tele/kitchen_left/STATE = {"Time":"2017-04-21T18:07:18", "Uptime":1, "Vcc":3.160, "POWER":"ON", "Wifi":{"AP":1, "SSID":"TIGERWOLF", "RSSI":54, "APMac":"90:94:E4:39:B1:2C"}}
18:12:18 MQTT: tele/kitchen_left/STATE = {"Time":"2017-04-21T18:12:18", "Uptime":1, "Vcc":3.160, "POWER":"ON", "Wifi":{"AP":1, "SSID":"TIGERWOLF", "RSSI":52, "APMac":"90:94:E4:39:B1:2C"}}
18:17:18 MQTT: tele/kitchen_left/STATE = {"Time":"2017-04-21T18:17:18", "Uptime":1, "Vcc":3.160, "POWER":"ON", "Wifi":{"AP":1, "SSID":"TIGERWOLF", "RSSI":56, "APMac":"90:94:E4:39:B1:2C"}}
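
If you want to watch the voltage over time, something like the Python sketch below could pull the Vcc readings out of the STATE telemetry lines above. The function names are made up, and the 3.0 V threshold is only an arbitrary example value, not a limit taken from any datasheet.

```python
import json


def vcc_readings(lines):
    """Yield (time, vcc) pairs from tele/.../STATE log lines."""
    for line in lines:
        _, sep, payload = line.partition("/STATE = ")
        if not sep:
            continue  # not a STATE telemetry line
        state = json.loads(payload)
        if "Vcc" in state:
            yield state["Time"], state["Vcc"]


def undervoltage(lines, threshold=3.0):
    """Return the readings below `threshold` volts (example limit only)."""
    return [(t, v) for t, v in vcc_readings(lines) if v < threshold]


log = [
    '17:57:17 MQTT: tele/kitchen_left/STATE = {"Time":"2017-04-21T17:57:17", "Uptime":0, "Vcc":3.160, "POWER":"ON"}',
    '18:02:00 MQTT: tele/kitchen_left/UPTIME = {"Time":"2017-04-21T18:02:00", "Uptime":1}',
]
print(list(vcc_readings(log)))
print(undervoltage(log))
```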

@ecsfang
Author

ecsfang commented Apr 21, 2017

It should also be noted that I have still seen some restarts when I play with different color combinations.
Right now I'm just using "color 00FF", which failed every now and then with the 7533, but so far (knock on wood) there have been no restarts at all with this configuration ("wife happy").
So these restarts (with different color combinations) are not, as far as I can see, related to the voltage, but rather to some timing problem.

I'll update after the weekend with my findings - if I see any other problems, or if it is stable enough for me :)

Cheers,
Thomas

@pnuding

pnuding commented Apr 22, 2017

Thanks Thomas! I've only managed crappy pictures of my setup. Yours are much clearer!

What I've done:
Since my 1117 is the big TO220 format, I couldn't nicely solder it in place.
So I attached three wires to it and connected them to the solder pads of the 7533-1.
Again, important to remember that the input and output pins of the 1117 are switched compared to the 7533-1.
So I connected:
- the left pin of the 1117 to the left solder pad
- the right pin of the 1117 to the middle solder pad
- the middle pin of the 1117 to the right solder pad

It's been almost a week now and I've not had a single restart. I use color 88ff here, which previously gave me a boatload of trouble, and the device has been through at least a hundred brightness changes.

@pnuding

pnuding commented Apr 22, 2017

PS: I wrapped the wire connections to the 1117's pins in heat-shrink tubing, and also the whole regulator, so I could drop it into the case after all. Since it would inevitably touch other components, it needed to be insulated properly.

@ecsfang
Author

ecsfang commented Apr 22, 2017

Thanks @pnuding! I had the opportunity to use a microscope and took some pictures through the eyepiece with my handheld Xperia phone. It was a bit hard to get the angle right, but they came out quite OK :)
So far mine (I modified all 4 units I have) have also been very stable, with not a single restart yet. They still don't have many hours of runtime, but enough to say that it is much better than before ;)

@arendst
Owner

arendst commented Apr 29, 2017

@ecsfang and @pnuding can we conclude that replacing the 7533 with a better-suited voltage regulator solves the issues observed?

If so I'll add your great pictures and workaround to the Wiki for all to know.

@pnuding

pnuding commented Apr 29, 2017

Hey Theo, still running stable here so yes, I think it's now safe to declare this a good solution. Yay! :)

@ecsfang
Author

ecsfang commented Apr 29, 2017

Hi Theo!
Yes, in my case it is much better. I've been running for about a week now, and it is much more stable!
But (isn't there always a but?), I've seen some sporadic restarts (on one unit of two; few = can be counted on the fingers of my left hand) which according to the log were due to HWWD (hardware watchdog). Why I see these I don't know, but I don't think they are related to the voltage. The voltage is very, very stable (+/-0.05V).
In my opinion you could close this issue for now.
Best regards,
Thomas

@timtimsson

timtimsson commented Apr 30, 2017

Hi,
I also replaced the 7533, with an LM1085IT-3.3 (TO220). It was not so easy for me, because the TO220 is very big and the pin order is also different, but now it works perfectly.

@arendst arendst changed the title Failing Sonoff LED? Fixed: Failing Sonoff LED - Upgrade power regulator Aug 9, 2017
@arendst arendst closed this as completed Aug 9, 2017
@arendst
Owner

arendst commented Aug 19, 2017

It took a while but today I finally installed a TS1117 the @ecsfang Thomas way and it seems to hold just fine. Power is indeed stable now at 3.226V

Thnx for your experiments and great pictures

@arendst
Owner

arendst commented Aug 19, 2017

Well, I'm afraid my TS1117 is not up to the task.... First I ran Color 00FF, during which the device got as hot as 60 degrees C. When I tried to change color it started to reboot with status "power on", so it's not a software reboot.

After it had cooled down to 37 degrees I tried again with color 0016, and at 40 degrees it started to reboot again.

Looking at the specs of the TS1117 I see it only allows 12V input voltage, so that's probably too low. I'll order an LM1117 in TO-220 package and try again :-(

@pnuding

pnuding commented Aug 19, 2017

You can also use one of those little MP2307-based step-down converter modules sold from China. They handle even such steep input voltage differences nicely without heat. They're rapidly becoming my favorite way of powering ESP8266s, and with the adjustable output voltage they're also useful for all sorts of other cases.
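To illustrate why a linear regulator such as the 1117 runs hot while a step-down converter does not, here is a rough sketch of the usual dissipation estimate: a linear regulator burns the whole voltage drop as heat, P = (Vin - Vout) * I. The thread does not state the actual input voltage or load current, so the 24 V and 80 mA below are purely illustrative assumptions, and the function names are made up for this example.

```python
def linear_regulator_dissipation(v_in, v_out, i_load):
    """Heat (in watts) dissipated by a linear regulator: (v_in - v_out) * i_load.

    Quiescent current is ignored, so this is a slight underestimate.
    """
    if v_in < v_out:
        raise ValueError("a linear regulator cannot step the voltage up")
    return (v_in - v_out) * i_load


def linear_regulator_efficiency(v_in, v_out):
    """Approximate efficiency of a linear regulator: v_out / v_in."""
    return v_out / v_in


if __name__ == "__main__":
    # ASSUMED values, not taken from the thread: 24 V in, 3.3 V out, 80 mA load.
    v_in, v_out, i_load = 24.0, 3.3, 0.08
    heat = linear_regulator_dissipation(v_in, v_out, i_load)
    eff = linear_regulator_efficiency(v_in, v_out)
    print(f"Dissipated as heat: {heat:.2f} W")
    print(f"Efficiency: {eff:.1%}")
```

Under these assumed numbers most of the input power ends up as heat in the regulator package, which is consistent with the temperatures reported above. A switching converter does not drop the difference across a pass element, so it avoids most of that loss.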

@arendst
Owner

arendst commented Aug 24, 2017

Today I replaced the failing TS1117 with an LM1085-3.3 in TO-220 package.

It has worked for at least two hours so far, with the temperature rising to 54 degrees C. Will continue to monitor...

@ageorgios

Does the LM1117 work at high temperatures?
Are there any other alternatives?

@arendst
Owner

arendst commented Sep 26, 2017

The LM1085 is working OK at 54 C, but that's quite hot for an LED light...

@ageorgios

ageorgios commented Nov 30, 2017

I accidentally bought this one, an L1117:
Will it do the job? If yes, how should I make the connections?

screen shot 2017-11-30 at 15 56 41

curzon01 pushed a commit to curzon01/Tasmota that referenced this issue Sep 6, 2018
4.0.0 20170303
* Add define to remove config migration code for versions below 3.0 (See
Wiki-Upgrade-Migration path)
* Free memory by switching from String to char[]
* Raised Sonoff Led PWM frequency from 200Hz to 432Hz in search of
stability (hardware watchdog timeouts) (arendst#122)
* Increase message size and suggested minimum MQTT_MAX_PACKET_SIZE to
512 (arendst#114, arendst#124)
* Remove runtime warning message regarding MQTT_MAX_PACKET_SIZE too
small as it is now moved to compile time (arendst#124)
* Fix possible panics with web console and http commands while UDP
syslog is active (arendst#127)
* Add optional static IP address (arendst#129)
* Add define ENERGY_RESOLUTION in user_config.h to allow user control
over precision (arendst#136)
curzon01 pushed a commit to curzon01/Tasmota that referenced this issue Sep 6, 2018
4.1.3 20170410
* Add user configurable GPIO to module S20 Socket and Slampher
* Add support for Sonoff SC (arendst#112)
* Set PWM frequency from 1000Hz to 910Hz as used on iTead Sonoff Led
firmware (arendst#122)
* Set Sonoff Led unconfigured floating outputs to 0 to reduce exceptions
due to power supply instabilities (arendst#122)
* Add Access Point Mac Address to Status 11 and Telemetry (arendst#329)
* Fix DS18B20 negative temperature readings (arendst#334)