Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

intermittent watchdog messages on esp32/esp32c3 #2914

Closed
pljakobs opened this issue Nov 24, 2024 · 38 comments
Closed

intermittent watchdog messages on esp32/esp32c3 #2914

pljakobs opened this issue Nov 24, 2024 · 38 comments

Comments

@pljakobs
Copy link
Contributor

I'm getting

E (54167) task_wdt: esp_task_wdt_reset(713): task not found

intermittently.
Sometimes, that's followed by a restart, but I can't see any reason yet.

@mikee47
Copy link
Contributor

mikee47 commented Nov 24, 2024

All changing in #2913 so should disappear.

@slaff
Copy link
Contributor

slaff commented Nov 25, 2024

@pljakobs can you check if the latest develop version fixes the reported issue?

@slaff
Copy link
Contributor

slaff commented Nov 26, 2024

@pljakobs I will close the issue for now. If it happens again in the latest develop feel free to re-open the issue.

@slaff slaff closed this as completed Nov 26, 2024
@mikee47
Copy link
Contributor

mikee47 commented Nov 26, 2024

By way of explanation, the message appears because the task watchdog is created for the Sming task but is getting called on the tcpip task on occasion. The revisions revert to the IDLE task watchdog so the message won't appear any more, but more importantly tasks are correctly serialised so general reliabiity will be improved.

@pljakobs
Copy link
Contributor Author

pljakobs commented Nov 26, 2024

built my current version using the current sming develop branch fails an assert on the esp32c3:

assert failed: xQueueGenericSendFromISR queue.c:1180 (pxQueue)
Core  0 register dump:
MEPC    : 0x40380698  RA      : 0x4038558e  SP      : 0x3fc9bbc0  GP      : 0x3fc90800  
TP      : 0x3fc44c64  T0      : 0x37363534  T1      : 0x7271706f  T2      : 0x33323130  
S0/FP   : 0x00000001  S1      : 0x3fc9bd1e  A0      : 0x3fc9bc18  A1      : 0x3fc91c99  
A2      : 0x00000001  A3      : 0x00000029  A4      : 0x00000001  A5      : 0x3fc98000  
A6      : 0x7a797877  A7      : 0x76757473  S2      : 0x00000089  S3      : 0x3fc9bc18  
S4      : 0x3fc9bc18  S5      : 0x3c0d2ef8  S6      : 0x00000029  S7      : 0x3c10760c  
S8      : 0x3c10d400  S9      : 0x3c0c4000  S10     : 0x00000000  S11     : 0x00000000  
T3      : 0x6e6d6c6b  T4      : 0x6a696867  T5      : 0x66656463  T6      : 0x62613938  
MSTATUS : 0x00001801  MTVEC   : 0x40380001  MCAUSE  : 0x00000007  MTVAL   : 0x00000000  
MHARTID : 0x00000000  

same on the esp32:

assert failed: xQueueGenericSendFromISR queue.c:1180 (pxQueue)


Backtrace: 0x4008115a:0x3ffb1ce0 0x400880c1:0x3ffb1d00 0x4008e77d:0x3ffb1d20 0x40088716:0x3ffb1e40 0x400833c3:0x3ffb1e70 0x40083fe1:0x3ffb1ea0 0x40112772:0x3ffb1ec0 0x401100fa:0x3ffb1ee0 0x4010fe4d:0x3ffb1f00 0x400d88f5:0x3ffb1f20 0x400d8a62:0x3ffb2120 0x400d4749:0x3ffb2170 0x40174a6c:0x3ffb2190 0x40088d55:0x3ffb21b0

@pljakobs
Copy link
Contributor Author

@mikee47 I can't reopen this since I can only re-open issues that I have closed myself. I don't know if you get a notification on a closed issue, so: ping

@slaff slaff reopened this Nov 26, 2024
@mikee47
Copy link
Contributor

mikee47 commented Nov 26, 2024

That assert indicates the queue hasn't been created yet: I'll try Basic_Wifi on the esp32c3, see what I can dig up.

@mikee47
Copy link
Contributor

mikee47 commented Nov 26, 2024

So I've tried the Basic_Wifi sample on an esp32c3 and runs without issue. We need to know what called xQueueGenericSendFromISR and, more specifically, why the queue handle is invalid.

If you rebuild with ENABLE_GDB=1 then when it crashes you should be able to run make gdb, then bt to get a backtrace.

@mikee47
Copy link
Contributor

mikee47 commented Nov 26, 2024

I can't reproduce the issue but I suspect something is trying to post to the Sming event queue before it's been created, probably during network initialisation. Can you try this patch to Sming/Arch/Esp32/Components/esp32/src/startup.cpp:

@@ -30,15 +30,16 @@ extern "C" void app_main(void)
 	hw_timer_init();
 
 	smg_uart_detach_all();
 	esp_log_set_vprintf(m_vprintf);
 
+	System.initialize();
+
 #ifndef DISABLE_WIFI
 	esp_network_initialise();
 #endif
 
-	System.initialize();
 	Storage::initialize();
 
 	// Application gets called outside main thread at startup
 	// Things like smartconfig won't work if called via task queue
 	init();

@pljakobs
Copy link
Contributor Author

will try, I have not looked at this more deeply yet (day job taking excessive cycles) - it happens right after initializing ConfigDB, but I will need to debug it more.

@pljakobs
Copy link
Contributor Author

with this startup.cpp, the issue persists

/****
 * Sming Framework Project - Open Source framework for high efficiency native ESP8266 development.
 * Created 2015 by Skurydin Alexey
 * http://github.com/SmingHub/Sming
 * All files of the Sming Core are provided under the LGPL v3 license.
 *
 * startup.cpp
 *
 */

#include <esp_system.h>
#include <esp_log.h>
#include <esp_event.h>
#include <debug_progmem.h>
#include <Platform/System.h>
#include <driver/hw_timer.h>
#include <driver/uart.h>
#include <Storage.h>

extern void init();
extern void esp_network_initialise();
extern void start_sming_task_loop();

extern "C" void __wrap_esp_newlib_init_global_stdio(const char*)
{
}

extern "C" void app_main(void)
{
	hw_timer_init();

	smg_uart_detach_all();
	esp_log_set_vprintf(m_vprintf);

	System.initialize();

#ifndef DISABLE_WIFI
	esp_network_initialise();
#endif

	Storage::initialize();

	// Application gets called outside main thread at startup
	// Things like smartconfig won't work if called via task queue
	init();

	start_sming_task_loop();
}

@mikee47
Copy link
Contributor

mikee47 commented Nov 26, 2024

Going to need a reproducible sample or backtrace to diagnose this further.

@pljakobs
Copy link
Contributor Author

pljakobs commented Nov 27, 2024

so it took me a while to set this up, broke out the jtag interface. Here's what I got:

Temporary breakpoint 1, app_main () at /opt/Sming/Sming/Arch/Esp32/Components/esp32/src/startup.cpp:30
30		hw_timer_init();
(gdb) c
Continuing.
[esp32.cpu0] Target halted, PC=0x400DADE9, debug_reason=00000002
Set GDB target to 'esp32.cpu0'

Program received signal SIGTRAP, Trace/breakpoint trap.
ArduinoJson::V6213PB2::detail::storeString<ArduinoJson::V6213PB2::detail::SizedRamString, ArduinoJson::V6213PB2::detail::SlotKeySetter> (pool=0x3ffb1970, str=..., callback=...) at /opt/Sming/Sming/Libraries/ArduinoJson6/ArduinoJson/src/ArduinoJson/Memory/MemoryPool.hpp:226
226	  JsonString storedString(copy, str.size(), JsonString::Copied);
(gdb) bt
#0  ArduinoJson::V6213PB2::detail::storeString<ArduinoJson::V6213PB2::detail::SizedRamString, ArduinoJson::V6213PB2::detail::SlotKeySetter> (pool=0x3ffb1970, str=..., callback=...)
    at /opt/Sming/Sming/Libraries/ArduinoJson6/ArduinoJson/src/ArduinoJson/Memory/MemoryPool.hpp:226
#1  0x400dae20 in ArduinoJson::V6213PB2::detail::storeString<ArduinoJson::V6213PB2::detail::SizedRamString, ArduinoJson::V6213PB2::detail::SlotKeySetter> (pool=0x3ffb1970, str=..., 
    callback=...)
    at /opt/Sming/Sming/Libraries/ArduinoJson6/ArduinoJson/src/ArduinoJson/Memory/MemoryPool.hpp:250
#2  0x400dae5a in ArduinoJson::V6213PB2::detail::slotSetKey<ArduinoJson::V6213PB2::detail::SizedRamString> (pool=0x3ffb1970, key=..., var=0x3ffcdde0)
    at /opt/Sming/Sming/Libraries/ArduinoJson6/ArduinoJson/src/ArduinoJson/Variant/SlotFunctions.hpp:13
#3  ArduinoJson::V6213PB2::detail::CollectionData::addMember<ArduinoJson::V6213PB2::detail::SizedRamString> (this=0x3ffb1988, key=..., pool=0x3ffb1970)
    at /opt/Sming/Sming/Libraries/ArduinoJson6/include/../ArduinoJson/src/ArduinoJson/Collection/CollectionImpl.hpp:40
#4  0x400dae98 in ArduinoJson::V6213PB2::detail::CollectionData::getOrAddMember<ArduinoJson::V6213PB2::detail::SizedRamString> (this=0x3ffb1988, key=..., pool=0x3ffb1970)
    at /opt/Sming/Sming/Libraries/ArduinoJson6/include/../ArduinoJson/src/ArduinoJson/Collection/CollectionImpl.hpp:125
#5  0x400daed8 in ArduinoJson::V6213PB2::detail::VariantData::getOrAddMember<ArduinoJson::V6213PB2::detail::SizedRamString> (this=0x3ffb1988, key=..., pool=0x3ffb1970)
--Type <RET> for more, q to quit, c to continue without paging--c
    at /opt/Sming/Sming/Libraries/ArduinoJson6/ArduinoJson/src/ArduinoJson/Variant/VariantData.hpp:277
#6  0x400daefc in ArduinoJson::V6213PB2::detail::variantGetOrAddMember<ArduinoJson::V6213PB2::detail::SizedRamString> (var=0x3ffb1988, key=..., pool=0x3ffb1970) at /opt/Sming/Sming/Libraries/ArduinoJson6/ArduinoJson/src/ArduinoJson/Variant/VariantFunctions.hpp:89
#7  0x400f5470 in ArduinoJson::V6213PB2::detail::MemberProxy<ArduinoJson::V6213PB2::JsonObject, String>::getOrCreateData (this=0x3ffb1438) at /opt/Sming/Sming/Libraries/ArduinoJson6/ArduinoJson/src/ArduinoJson/Object/MemberProxy.hpp:54
#8  ArduinoJson::V6213PB2::detail::VariantAttorney::getOrCreateData<ArduinoJson::V6213PB2::detail::MemberProxy<ArduinoJson::V6213PB2::JsonObject, String> const> (client=...) at /opt/Sming/Sming/Libraries/ArduinoJson6/ArduinoJson/src/ArduinoJson/Variant/VariantAttorney.hpp:44
#9  ArduinoJson::V6213PB2::detail::VariantRefBase<ArduinoJson::V6213PB2::detail::MemberProxy<ArduinoJson::V6213PB2::JsonObject, String> >::getOrCreateData (this=0x3ffb1438) at /opt/Sming/Sming/Libraries/ArduinoJson6/ArduinoJson/src/ArduinoJson/Variant/VariantRefBase.hpp:290
#10 ArduinoJson::V6213PB2::detail::VariantRefBase<ArduinoJson::V6213PB2::detail::MemberProxy<ArduinoJson::V6213PB2::JsonObject, String> >::getOrCreateVariant (this=0x3ffb1438) at /opt/Sming/Sming/Libraries/ArduinoJson6/include/../ArduinoJson/src/ArduinoJson/Variant/VariantImpl.hpp:120
#11 ArduinoJson::V6213PB2::detail::VariantRefBase<ArduinoJson::V6213PB2::detail::MemberProxy<ArduinoJson::V6213PB2::JsonObject, String> >::set<char const> (value=0x3f40c6ec "2.0", this=0x3ffb1438) at /opt/Sming/Sming/Libraries/ArduinoJson6/ArduinoJson/src/ArduinoJson/Variant/VariantRefBase.hpp:135
#12 ArduinoJson::V6213PB2::detail::MemberProxy<ArduinoJson::V6213PB2::JsonObject, String>::operator=<char const> (src=0x3f40c6ec "2.0", this=0x3ffb1438) at /opt/Sming/Sming/Libraries/ArduinoJson6/ArduinoJson/src/ArduinoJson/Object/MemberProxy.hpp:39
#13 JsonRpcMessage::JsonRpcMessage (this=<optimized out>, name=...) at /home/pjakobs/devel/esp_rgbww_firmware/app/jsonrpcmessage.cpp:6
#14 0x400ed611 in AppWIFI::broadcastWifiStatus (this=0x3ffc1140 <app>, message=...) at /home/pjakobs/devel/esp_rgbww_firmware/app/networking.cpp:433
#15 0x400eea14 in AppWIFI::startAp (this=0x3ffc1140 <app>) at /home/pjakobs/devel/esp_rgbww_firmware/app/networking.cpp:416
#16 0x400eefdc in AppWIFI::init (this=0x3ffc1140 <app>) at /home/pjakobs/devel/esp_rgbww_firmware/app/networking.cpp:168
#17 0x400d7aea in Application::init (this=0x3ffc1140 <app>) at /home/pjakobs/devel/esp_rgbww_firmware/app/application.cpp:361
#18 0x400d7c40 in init () at /home/pjakobs/devel/esp_rgbww_firmware/app/application.cpp:145
#19 0x400d44d0 in app_main () at /opt/Sming/Sming/Arch/Esp32/Components/esp32/src/startup.cpp:45
#20 0x4018bc4b in main_task (args=0x0) at /opt/esp-idf-5.2/components/freertos/app_startup.c:208
#21 0x40088e1c in vPortTaskWrapper (pxCode=0x4018bc04 <main_task>, pvParameters=0x0) at /opt/esp-idf-5.2/components/freertos/FreeRTOS-Kernel/portable/xtensa/port.c:134

I'm unsure that this really shows the issue since this is an unconfigured chip, meaning it does not connect to the wifi yet but should open it's AP for configuration - which it doesn't seem to do.
But: it also doesn't boot loop, hence I wonder if the debug output shows anything at all.
(the inline gdb didn't work for some odd reason. Done it multiple times with the 8266, but with the 32c3, I would always just end up with normal terminal output on the serial link, causing gdb to time out.)

@mikee47
Copy link
Contributor

mikee47 commented Nov 27, 2024

Using GDB on the esp32 requires the ESP_SYSTEM_GDBSTUB_PANIC setting via make sdk-menuconfig, so when the fault occurs we get:

Entering gdb stub now.
$T0b#e6

Should be able to break out then run make gdb:

Remote debugging using /dev/ttyUSB0
warning: multi-threaded target stopped without sending a thread-id, using first non-exited thread
0x4008111d in panic_abort (details=0x3ffb0b30 "assert failed: xQueueGenericSendFromISR queue.c:1180 (pxQueue)")
    at /opt/esp-idf-5.2/components/esp_system/panic.c:466
466	    *((volatile int *) 0) = 0; // NOLINT(clang-analyzer-core.NullDereference) should be an invalid operation on targets
(gdb) bt
#0  0x4008111d in panic_abort (details=0x3ffb0b30 "assert failed: xQueueGenericSendFromISR queue.c:1180 (pxQueue)")
    at /opt/esp-idf-5.2/components/esp_system/panic.c:466
#1  0x40087d74 in esp_system_abort (details=0x3ffb0b30 "assert failed: xQueueGenericSendFromISR queue.c:1180 (pxQueue)")
    at /opt/esp-idf-5.2/components/esp_system/port/esp_system_chip.c:93
#2  0x4008e2d8 in __assert_func (file=<optimized out>, line=<optimized out>, func=<optimized out>, expr=<optimized out>)
    at /opt/esp-idf-5.2/components/newlib/assert.c:81
#3  0x400883c9 in xQueueGenericSendFromISR (xQueue=0x0, pvItemToQueue=<optimized out>, pxHigherPriorityTaskWoken=0x3ffb0c88, 
    xCopyPosition=<optimized out>) at /opt/esp-idf-5.2/components/freertos/FreeRTOS-Kernel/queue.c:1181
#4  0x40083345 in system_os_post (prio=<optimized out>, sig=1074687320, par=1073470200)
    at ../../../../../../../Arch/Esp32/Components/esp32/src/tasks.cpp:96
#5  0x40083f9c in SystemClass::queueCallback (callback=<optimized out>, param=0x3ffbdaf8) at ../../../../../../../Platform/System.cpp:87
#6  0x400e6d24 in SystemClass::queueCallback(Delegate<void ()>) (callback=...) at ../../../../../../../Platform/System.cpp:116
#7  0x400d4c90 in SystemClass::onReady(Delegate<void ()>) (this=<optimized out>, readyHandler=...)
    at ../../../../../../../../../Sming/Platform/System.h:166
#8  init () at ../../../../../../../app/application.cpp:74
#9  0x400d4724 in app_main () at ../../../../../../../Arch/Esp32/Components/esp32/src/startup.cpp:44
#10 0x4012cc02 in main_task (args=0x0) at /opt/esp-idf-5.2/components/freertos/app_startup.c:208
#11 0x40088a08 in vPortTaskWrapper (pxCode=0x4012cb98 <main_task>, pvParameters=0x0)
    at /opt/esp-idf-5.2/components/freertos/FreeRTOS-Kernel/portable/xtensa/port.c:134
(gdb) 

NB. I'm forcing this exception by patching system_os_post to call xQueueSendToBackFromISR(nullptr, ...).

NOTE: For some reason the esp32c3 isn't responding to GDB over serial (esp32 is fine). I'm not sure why, but it's annoying.

@mikee47
Copy link
Contributor

mikee47 commented Nov 27, 2024

OK, so at least with my esp32c3 dev. board I suspect that the auto-reset lines (RTS/CTS) on the serial port aren't being set correctly with GDB (both should be low). Can't find any gdb commands to override that behaviour, closest is flow control (which is disabled). Here's a workaround:

make tcp-serial-redirect
make gdb COM_PORT_GDB=192.168.1.40:7780

Replacing the IP as required.

@pljakobs
Copy link
Contributor Author

not sure how helpful this is, seems that gdb cannot identify the error?

Registered storage devices:
  spiFlash: type flash, size 0x400000. Partitions:
    spiFlash/nvs, data/nvs @ 0x9000, size 0x6000
    spiFlash/phy_init, data/phy @ 0xf000, size 0x1000
    spiFlash/factory, app/factory @ 0x10000, size 0x160000
    spiFlash/lfs0, data/littlefs @ 0x170000, size 0x90000
    spiFlash/rom1, app/ota1 @ 0x200000, size 0x180000
    spiFlash/lfs1, data/littlefs @ 0x380000, size 0x80000


[/home/pjakobs/devel/esp_rgbww_firmware/app/otaupdate.cpp:512] ApplicationOTA::saveStatus 3 to rom partition rom0

[/home/pjakobs/devel/esp_rgbww_firmware/app/application.cpp:558] mouting primary littlefs partition at 170000, length 589824
[Core/FileSystem.cpp:45] mount() returned 0 (Success)
[Core/FileSystem.cpp:54] File system initialised
Guru Meditation Error: Core  0 panic'ed (Unhandled debug exception). 
Debug exception reason: Stack canary watchpoint triggered (main) 
Core  0 register dump:
PC      : 0x40089036  PS      : 0x00060a36  A0      : 0x4008318d  A1      : 0x3ffb1200  
A2      : 0x00000000  A3      : 0x00060a23  A4      : 0x00060a20  A5      : 0x3ffb15c4  
A6      : 0x00000002  A7      : 0x3ffcb450  A8      : 0x801212ac  A9      : 0x3ffb13a0  
A10     : 0x3ffcb624  A11     : 0x0000005d  A12     : 0x00000003  A13     : 0x3ffb17fc  
A14     : 0x00000002  A15     : 0x3ffb16d0  SAR     : 0x0000001b  EXCCAUSE: 0x00000001  
EXCVADDR: 0x00000000  LBEG    : 0x4000c2e0  LEND    : 0x4000c2f6  LCOUNT  : 0xffffffff  


Backtrace: 0x40089033:0x3ffb1200 0x4008318a:0x3ffb12d0 0x4008c572:0x3ffb12f0 0x4008c630:0x3ffb1310 0x40081d2d:0x3ffb1330 0x40081d59:0x3ffb1360 0x40081d89:0x3ffb1380 0x4008e82d:0x3ffb13a0 0x401212a9:0x3ffb13c0 0x40121317:0x3ffb13e0 0x40121416:0x3ffb1410 0x4011fda3:0x3ffb1430 0x4011ff6d:0x3ffb1450 0x40125917:0x3ffb1480 0x4012486f:0x3ffb14a0 0x40125155:0x3ffb14d0 0x4012516f:0x3ffb1590 0x4013fd42:0x3ffb15b0 0x40140027:0x3ffb15f0 0x401403fd:0x3ffb1610 0x40140431:0x3ffb1630 0x40124615:0x3ffb1690 0x4011f65a:0x3ffb1c30 0x401225b3:0x3ffb1cd0 0x40122ee2:0x3ffb1d70 0x40123c16:0x3ffb1dd0 0x40123d8e:0x3ffb1e20 0x4012195d:0x3ffb1e70 0x400d5d83:0x3ffb1e90 0x400d7849:0x3ffb1ec0 0x400d7cc1:0x3ffb2120 0x400d4551:0x3ffb2170 0x401899bc:0x3ffb2190 0x40088ded:0x3ffb21b0




ELF file SHA256: 6e7439947

Entering gdb stub now.
$T1f#eb
--- exit ---
pjakobs@t460 ~/d/esp_rgbww_firmware (feature/pinConfig)> make gdb

esp_rgbww_firmware: Invoking 'gdb' for Esp32/esp32 (debug) architecture
COM_PORT is /dev/serial/by-id/usb-Silicon_Labs_CP2102_USB_to_UART_Bridge_Controller_0001-if00-port0@115200 for Esp32
using firmware version pj-1-441-gc0ab-dirty"-["feature/pinConfig"]"
using WEBapp V5-pj-235-g2390-[feature/manualPinConfig]
using SMING 6.0.x-dev-23-g7b3ef-dirty"-["develop"]"
trap '' INT; /opt/esp32/tools/xtensa-esp-elf-gdb/12.1_20231023/xtensa-esp-elf-gdb/bin/xtensa-esp32-elf-gdb -x /opt/Sming/Sming/Arch/Esp32/Components/gdbstub/gdbcmds -b 115200 -ex "target remote /dev/serial/by-id/usb-Silicon_Labs_CP2102_USB_to_UART_Bridge_Controller_0001-if00-port0" out/Esp32/esp32/debug/build/app.out
GNU gdb (esp-gdb) 12.1_20231023
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "--host=x86_64-linux-gnu --target=xtensa-esp-elf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from out/Esp32/esp32/debug/build/app.out...

Welcome to SMING!
Type 'c' (continue) to run application

Remote debugging using /dev/serial/by-id/usb-Silicon_Labs_CP2102_USB_to_UART_Bridge_Controller_0001-if00-port0
warning: multi-threaded target stopped without sending a thread-id, using first non-exited thread
_frxt_int_enter ()
--Type <RET> for more, q to quit, c to continue without paging--
    at /opt/esp-idf-5.2/components/freertos/FreeRTOS-Kernel/portable/xtensa/portasm.S:127
127	    s32i    a13, a1, XT_STK_A13
(gdb) 

but, here's the resulting backtrace:

(gdb) bt
#0  _frxt_int_enter ()
    at /opt/esp-idf-5.2/components/freertos/FreeRTOS-Kernel/portable/xtensa/portasm.S:127
#1  0x4008318d in _xt_lowint1 () at /opt/esp-idf-5.2/components/xtensa/xtensa_vectors.S:1225
#2  0x4008c575 in vPortExitCriticalSafe (mux=0x3ffaea40)
    at /opt/esp-idf-5.2/components/freertos/FreeRTOS-Kernel/portable/xtensa/include/freertos/portmacro.h:600
#3  multi_heap_internal_unlock (heap=0x3ffc6b90)
    at /opt/esp-idf-5.2/components/heap/multi_heap.c:170
#4  0x4008c633 in multi_heap_realloc_impl (heap=0x3ffc6b90, p=0x3ffcb624, size=93)
    at /opt/esp-idf-5.2/components/heap/multi_heap.c:265
#5  0x40081d30 in heap_caps_realloc_base (ptr=0x3ffcb624, size=93, caps=6144)
    at /opt/esp-idf-5.2/components/heap/heap_caps.c:456
#6  0x40081d5c in heap_caps_realloc (ptr=0x3ffcb624, size=93, caps=6144)
    at /opt/esp-idf-5.2/components/heap/heap_caps.c:492
#7  0x40081d8c in heap_caps_realloc_default (ptr=0x3ffcb624, size=93)
    at /opt/esp-idf-5.2/components/heap/heap_caps.c:261
#8  0x4008e830 in realloc (ptr=0x3ffcb624, size=93)
    at /opt/esp-idf-5.2/components/newlib/heap.c:34
#9  0x401212ac in ConfigDB::PoolData::ensureCapacity (this=0x3ffcb438, capacity=93)
    at /opt/Sming/Sming/Libraries/ConfigDB/src/Pool.cpp:55
#10 0x4012131a in ConfigDB::PoolData::allocate (this=0x3ffcb438, itemCount=17)
    at /opt/Sming/Sming/Libraries/ConfigDB/src/Pool.cpp:76
#11 0x40121419 in ConfigDB::StringPool::add (this=0x3ffcb438, string=...)
--Type <RET> for more, q to quit, c to continue without paging--c
    at /opt/Sming/Sming/Libraries/ConfigDB/src/Pool.cpp:119
#12 0x4011fda6 in ConfigDB::StringPool::findOrAdd (this=0x3ffcb438, string=...) at /opt/Sming/Sming/Libraries/ConfigDB/src/include/ConfigDB/Pool.h:197
#13 0x4011ff70 in ConfigDB::Store::parseString (this=0x3ffcb424, prop=..., dst=..., defaultData=0x3f4126c6 <AppConfig::ContainedHardware::ContainedPinconfigsItem::defaultData+2>, value=0x3ffb1801 "Lightinator mini", valueLength=16) at /opt/Sming/Sming/Libraries/ConfigDB/src/Store.cpp:99
#14 0x4012591a in ConfigDB::Property::setJsonValue (this=0x3ffb14d0, value=0x3ffb1801 "Lightinator mini", valueLength=16) at /opt/Sming/Sming/Libraries/ConfigDB/src/Property.cpp:54
#15 0x40124872 in ConfigDB::Json::WriteStream::setProperty (this=0x3ffb169c, element=..., object=..., prop=...) at /opt/Sming/Sming/Libraries/ConfigDB/src/Json/WriteStream.cpp:305
#16 0x40125158 in ConfigDB::Json::WriteStream::startElement (this=0x3ffb169c, element=...) at /opt/Sming/Sming/Libraries/ConfigDB/src/Json/WriteStream.cpp:138
#17 0x40125172 in ConfigDB::Json::WriteStream::_ZThn12_N8ConfigDB4Json11WriteStream12startElementERKN4JSON7ElementE (this=0x3ffb169c, element=...) at /opt/Sming/Sming/Libraries/ConfigDB/src/Json/WriteStream.h:95
#18 0x4013fd45 in JSON::StreamingParser::startElement (this=0x3ffb17b4, type=JSON::Element::Type::String) at /opt/Sming/Sming/Libraries/JsonStreamingParser/src/StreamingParser.cpp:356
#19 0x4014002a in JSON::StreamingParser::parse (this=0x3ffb17b4, c=34 '"') at /opt/Sming/Sming/Libraries/JsonStreamingParser/src/StreamingParser.cpp:148
#20 0x40140400 in JSON::StreamingParser::parse (this=0x3ffb17b4, data=0x3ffb165d ",\"soc\":\"esp32c3\",\"m", length=19) at /opt/Sming/Sming/Libraries/JsonStreamingParser/src/StreamingParser.cpp:110
#21 0x40140434 in JSON::StreamingParser::parse (this=0x3ffb17b4, stream=...) at /opt/Sming/Sming/Libraries/JsonStreamingParser/src/StreamingParser.cpp:123
#22 0x40124618 in ConfigDB::Json::WriteStream::parse (object=..., source=...) at /opt/Sming/Sming/Libraries/ConfigDB/src/Json/WriteStream.cpp:49
#23 0x4011f65d in ConfigDB::Json::Format::importFromStream (this=0x3ffbe068 <ConfigDB::Json::format>, object=..., source=...) at /opt/Sming/Sming/Libraries/ConfigDB/src/Json/Format.cpp:66
#24 0x401225b6 in ConfigDB::Object::importFromFile (this=0x3ffcb424, format=..., filename=...) at /opt/Sming/Sming/Libraries/ConfigDB/src/Object.cpp:449
#25 0x40122ee5 in ConfigDB::Store::importFromFile (this=0x3ffcb424, format=...) at /opt/Sming/Sming/Libraries/ConfigDB/src/include/ConfigDB/Store.h:145
#26 0x40123c19 in ConfigDB::Database::loadStore (this=<optimized out>, storeInfo=...) at /opt/esp32/tools/xtensa-esp-elf/esp-13.2.0_20230928/xtensa-esp-elf/xtensa-esp-elf/include/c++/13.2.0/bits/shared_ptr_base.h:1665
#27 0x40123d91 in ConfigDB::Database::openStore (this=0x3ffcb1bc, index=<optimized out>) at /opt/Sming/Sming/Libraries/ConfigDB/src/Database.cpp:105
#28 0x40121960 in ConfigDB::Object::openStore (this=0x3ffb1f54, db=..., storeIndex=5) at /opt/Sming/Sming/Libraries/ConfigDB/src/Object.cpp:126
#29 0x400d5d86 in ConfigDB::OuterObjectTemplate<AppConfig::ContainedHardware, AppConfig::HardwareUpdater, AppConfig, 5u, AppConfig, 5u, 0u>::OuterObjectTemplate (this=0x3ffb1f54, db=...) at /opt/Sming/Sming/Libraries/ConfigDB/src/include/ConfigDB/Object.h:365
#30 0x400d784c in AppConfig::Hardware::OuterObjectTemplate (this=0x3ffb1f54) at /home/pjakobs/devel/esp_rgbww_firmware/out/ConfigDB/app-config.h:1846
#31 Application::init (this=0x3ffc1130 <app>) at /home/pjakobs/devel/esp_rgbww_firmware/app/application.cpp:289
#32 0x400d7cc4 in init () at /home/pjakobs/devel/esp_rgbww_firmware/app/application.cpp:145
#33 0x400d4554 in app_main () at /opt/Sming/Sming/Arch/Esp32/Components/esp32/src/startup.cpp:45
#34 0x401899bf in main_task (args=0x0) at /opt/esp-idf-5.2/components/freertos/app_startup.c:208
#35 0x40088df0 in vPortTaskWrapper (pxCode=0x40189978 <main_task>, pvParameters=0x0) at /opt/esp-idf-5.2/components/freertos/FreeRTOS-Kernel/portable/xtensa/port.c:134
(gdb) 

@mikee47
Copy link
Contributor

mikee47 commented Nov 27, 2024

So this is a different error to the one you reported above. The inconsistency suggests memory corruption of some sort.

@pljakobs
Copy link
Contributor Author

let's pause this for a bit, there's currently too much going on otherwise, I can't fully concentrate.
I might be able to provide more concise data tomorrow, right now I'm just trying to get stuff between other things.

@slaff
Copy link
Contributor

slaff commented Nov 27, 2024

@pljakobs when you have more time make sure to get the latest develop, run make -C $SMING_HOME dist-clean, clean your project and then compile again.

@pljakobs
Copy link
Contributor Author

pljakobs commented Dec 3, 2024

I'm currently chasing a bug in my own code that seems to try to configure an empty channels array, which might explain the stack canary error message.

@pljakobs
Copy link
Contributor Author

pljakobs commented Dec 12, 2024

gdb on the esp32 keeps defeating me. I had no luck using an esp-prog to jtag debug (kept getting errors that core 1 was inaccessible) and I seem to not understand the built-in gdb either.

Those results might not be overly helpful, but I could not get a true functional debug session going.

I have built with the following GDB related sdk-config options:

ESP_GDBSTUB_ENABLED(=y)
ESP_GDBSTUB_MAX_TASKS(=32) "Maximum number of tasks supported by GDB Stub"
ESP_GDBSTUB_SUPPORT_TASKS(=y) "Enable listing FreeRTOS tasks through GDB Stub"
ESP_SYSTEM_GDBSTUB_RUNTIME(=n) "GDBStub at runtime"
ESP_SYSTEM_PANIC_GDBSTUB(=y) "GDBStub on panic"

which is the only way I could get a stack trace as seen below.

I've tried ESP_SYSTEM_GDBSTUB_RUNTIME but then gdb can't connect - I'm not sure what the output from the system is then, it is certainly not the "normal" log output but seems not to be the gdb stub either.

Anyway, the below stacktrace seems to be what is reproducible in this setup.
I've also seen two different errors: stack corruption

[/home/pjakobs/devel/esp_rgbww_firmware/app/otaupdate.cpp:512] ApplicationOTA::saveStatus 3 to rom partition rom0

[/home/pjakobs/devel/esp_rgbww_firmware/app/application.cpp:558] mouting primary littlefs partition at 170000, length 589824
[Core/FileSystem.cpp:45] mount() returned 0 (Success)
[Core/FileSystem.cpp:54] File system initialised
Guru Meditation Error: Core  0 panic'ed (Unhandled debug exception). 
Debug exception reason: Stack canary watchpoint triggered (main) 
Core  0 register dump:
PC      : 0x40083189  PS      : 0x00060636  A0      : 0x3ffb12f0  A1      : 0x3ffb1230  
A2      : 0x3ffbf540  A3      : 0x00000100  A4      : 0xefffffff  A5      : 0x00000002  
A6      : 0x80000003  A7      : 0x3ffcae74  A8      : 0x00040000  A9      : 0x3ff42000  
A10     : 0x00800000  A11     : 0x00002000  A12     : 0xdfffffff  A13     : 0x00000000  
A14     : 0xbfffffff  A15     : 0x3ffcadc0  SAR     : 0x00000008  EXCCAUSE: 0x00000001  
EXCVADDR: 0x00000000  LBEG    : 0x4000c46c  LEND    : 0x4000c477  LCOUNT  : 0x00000000  


Backtrace: 0x40083186:0x3ffb1230 0x3ffb12ed:0x3ffb1310 |<-CORRUPTED




ELF file SHA256: 5bce551a5

CPU halted.

and I've also seen doble exceptions:

[/home/pjakobs/devel/esp_rgbww_firmware/app/networking.cpp:162] reading network config from configDB
Guru Meditation Error: Core  0 panic'ed (Double exception). 

Core  0 register dump:
PC      : 0x400941b2  PS      : 0x00040236  A0      : 0x80114cec  A1      : 0x3ffb1190  
A2      : 0x3ffbb168  A3      : 0x0017b000  A4      : 0x3ffcbf28  A5      : 0x00000020  
A6      : 0x00000000  A7      : 0x3ffb1628  A8      : 0x40080080  A9      : 0x3ffb1290  
A10     : 0x00060436  A11     : 0x00040026  A12     : 0x00000020  A13     : 0x00000000  
A14     : 0x3ffcae64  A15     : 0x3ffb1b94  SAR     : 0x00000018  EXCCAUSE: 0x00000002  
EXCVADDR: 0xffffffe0  LBEG    : 0x40113090  LEND    : 0x4011309c  LCOUNT  : 0x00000000  


Backtrace: 0x400941af:0x3ffb1190 0x40114ce9:0x3ffb11b0 0x400941af:0x3ffb11d0 0x400941af:0x3ffb11f0 0x400941af:0x3ffb1210 0x400941af:0x3ffb1230 0x400941af:0x3ffb1250 0x400941af:0x3ffb1290 0x4008007d:0x3ffb1190 0x400d32bf:0x3ffb11b0 0x400941af:0x3ffb11d0 0x400941af:0x3ffb11f0 0x400941af:0x3ffb1210 0x400941af:0x3ffb1230 0x400941af:0x3ffb1250 0x400941af:0x3ffb1290 0x4008007d:0x3ffb1190 0x400d32bf:0x3ffb11b0 0x400941af:0x3ffb11d0 0x400941af:0x3ffb11f0 0x400941af:0x3ffb1210 0x400941af:0x3ffb1230 0x400941af:0x3ffb1250 0x400941af:0x3ffb1290 0x4008007d:0x3ffb1190 0x400d32bf:0x3ffb11b0 0x400941af:0x3ffb11d0 0x400941af:0x3ffb11f0 0x400941af:0x3ffb1210 0x400941af:0x3ffb1230 0x400941af:0x3ffb1250 0x400941af:0x3ffb1290 0x4008007d:0x3ffb1190 0x400d32bf:0x3ffb11b0 0x400941af:0x3ffb11d0 0x400941af:0x3ffb11f0 0x400941af:0x3ffb1210 0x400941af:0x3ffb1230 0x400941af:0x3ffb1250 0x400941af:0x3ffb1290 0x4008007d:0x3ffb1190 0x400d32bf:0x3ffb11b0 0x400941af:0x3ffb11d0 0x400941af:0x3ffb11f0 0x400941af:0x3ffb1210 0x400941af:0x3ffb1230 0x400941af:0x3ffb1250 0x400941af:0x3ffb1290 0x4008007d:0x3ffb1190 0x400d32bf:0x3ffb11b0 0x400941af:0x3ffb11d0 0x400941af:0x3ffb11f0 0x400941af:0x3ffb1210 0x400941af:0x3ffb1230 0x400941af:0x3ffb1250 0x400941af:0x3ffb1290 0x4008007d:0x3ffb1190 0x400d32bf:0x3ffb11b0 0x400941af:0x3ffb11d0 0x400941af:0x3ffb11f0 0x400941af:0x3ffb1210 0x400941af:0x3ffb1230 0x400941af:0x3ffb1250 0x400941af:0x3ffb1290 0x4008007d:0x3ffb1190 0x400d32bf:0x3ffb11b0 0x400941af:0x3ffb11d0 0x400941af:0x3ffb11f0 0x400941af:0x3ffb1210 0x400941af:0x3ffb1230 0x400941af:0x3ffb1250 0x400941af:0x3ffb1290 0x4008007d:0x3ffb1190 0x400d32bf:0x3ffb11b0 0x400941af:0x3ffb11d0 0x400941af:0x3ffb11f0 0x400941af:0x3ffb1210 0x400941af:0x3ffb1230 0x400941af:0x3ffb1250 0x400941af:0x3ffb1290 0x4008007d:0x3ffb1190 0x400d32bf:0x3ffb11b0 0x400941af:0x3ffb11d0 0x400941af:0x3ffb11f0 0x400941af:0x3ffb1210 0x400941af:0x3ffb1230 0x400941af:0x3ffb1250 0x400941af:0x3ffb1290 0x4008007d:0x3ffb1190 0x400d32bf:0x3ffb11b0 0x400941af:0x3ffb11d0 0x400941af:0x3ffb11f0 0x400941af:0x3ffb1210 0x400941af:0x3ffb1230 0x400941af:0x3ffb1250 0x400941af:0x3ffb1290 0x4008007d:0x3ffb1190 0x400d32bf:0x3ffb11b0 0x400941af:0x3ffb11d0 0x400941af:0x3ffb11f0 0x400941af:0x3ffb1210 |<-CONTINUES




ELF file SHA256: 816370be7

CPU halted.

the following is with ENABLE_GDB=1 and the GDB on panic handler. The backtrace pointing to somewhere in lfs / flash read is rather consistent.

Welcome to SMING!
Type 'c' (continue) to run application

Remote debugging using /dev/serial/by-id/usb-Silicon_Labs_CP2102_USB_to_UART_Bridge_Controller_0001-if00-port0
warning: multi-threaded target stopped without sending a thread-id, using first non-exited thread
0x4000c2e4 in ?? ()
(gdb) bt
#0  0x4000c2e4 in ?? ()
#1  0x4011a83e in lfs_bd_read (lfs=0x3ffcb424, pcache=0x0, rcache=0x3ffcc4d0, hint=4096, block=11, off=0, buffer=<optimized out>, 
    buffer@entry=0x3ffb1240, size=size@entry=64) at /opt/Sming/Sming/Libraries/LittleFS/littlefs/lfs.c:79
#2  0x4011b6e2 in lfs_file_flushedread (lfs=0x3ffcb424, file=0x3ffcc490, buffer=<optimized out>, size=64)
    at /opt/Sming/Sming/Libraries/LittleFS/littlefs/lfs.c:3539
#3  0x4011c6ba in lfs_file_read_ (lfs=0x3ffcb424, file=0x3ffcc490, buffer=0x3ffb1240, size=64)
    at /opt/Sming/Sming/Libraries/LittleFS/littlefs/lfs.c:3574
#4  0x4011f2ac in lfs_file_read (lfs=0x3ffcb424, file=0x3ffcc490, buffer=0x3ffb1240, size=64)
    at /opt/Sming/Sming/Libraries/LittleFS/littlefs/lfs.c:6365
#5  0x40118aa6 in IFS::LittleFS::FileSystem::read (this=0x3ffcb370, file=<optimized out>, data=0x3ffb1240, size=64)
    at /opt/Sming/Sming/Libraries/LittleFS/src/FileSystem.cpp:414
#6  0x40112220 in IFS::FileStream::readBytes (this=0x3ffb18f8, buffer=0x3ffb1240 '\245' <repeats 64 times>, length=<optimized out>)
    at /opt/Sming/Sming/Core/Data/Stream/IFS/FileStream.cpp:76
#7  0x40142922 in JSON::StreamingParser::parse (this=0x3ffb13c4, stream=...)
    at /opt/Sming/Sming/Libraries/JsonStreamingParser/src/StreamingParser.cpp:122
#8  0x401245a4 in ConfigDB::Json::WriteStream::parse (object=..., source=...) at /opt/Sming/Sming/Libraries/ConfigDB/src/Json/WriteStream.cpp:49
#9  0x4011f5e1 in ConfigDB::Json::Format::importFromStream (this=0x3ffbe068 <ConfigDB::Json::format>, object=..., source=...)
    at /opt/Sming/Sming/Libraries/ConfigDB/src/Json/Format.cpp:66
#10 0x4012253e in ConfigDB::Object::importFromFile (this=0x3ffcb7c0, format=..., filename=...)
    at /opt/Sming/Sming/Libraries/ConfigDB/src/Object.cpp:449
#11 0x40122e71 in ConfigDB::Store::importFromFile (this=0x3ffcb7c0, format=...)
    at /opt/Sming/Sming/Libraries/ConfigDB/src/include/ConfigDB/Store.h:145
#12 0x40123ba5 in ConfigDB::Database::loadStore (this=<optimized out>, storeInfo=...)
    at /opt/esp32/tools/xtensa-esp-elf/esp-13.2.0_20230928/xtensa-esp-elf/xtensa-esp-elf/include/c++/13.2.0/bits/shared_ptr_base.h:1665
#13 0x40123d1d in ConfigDB::Database::openStore (this=0x3ffcb2e8, index=<optimized out>) at /opt/Sming/Sming/Libraries/ConfigDB/src/Database.cpp:105
#14 0x401218e4 in ConfigDB::Object::openStore (this=0x3ffb1e28, db=..., storeIndex=4) at /opt/Sming/Sming/Libraries/ConfigDB/src/Object.cpp:126
#15 0x400d6132 in ConfigDB::OuterObjectTemplate<AppConfig::ContainedNetwork, AppConfig::NetworkUpdater, AppConfig, 4u, AppConfig, 4u, 0u>::OuterObjectTemplate (this=0x3ffb1e28, db=...) at /opt/Sming/Sming/Libraries/ConfigDB/src/include/ConfigDB/Object.h:365
#16 0x400eecbc in AppConfig::Network::OuterObjectTemplate (this=0x3ffb1e28) at /home/pjakobs/devel/esp_rgbww_firmware/out/ConfigDB/app-config.h:1835
#17 AppWIFI::init (this=0x3ffc1140 <app>) at /home/pjakobs/devel/esp_rgbww_firmware/app/networking.cpp:163
#18 0x400d7b32 in Application::init (this=0x3ffc1140 <app>) at /home/pjakobs/devel/esp_rgbww_firmware/app/application.cpp:361
#19 0x400d7c88 in init () at /home/pjakobs/devel/esp_rgbww_firmware/app/application.cpp:145
#20 0x400d4514 in app_main () at /opt/Sming/Sming/Arch/Esp32/Components/esp32/src/startup.cpp:44
#21 0x4018c19b in main_task (args=0x0) at /opt/esp-idf-5.2/components/freertos/app_startup.c:208
--Type <RET> for more, q to quit, c to continue without paging--c
#22 0x40088e08 in vPortTaskWrapper (pxCode=0x4018c154 <main_task>, pvParameters=0x0) at /opt/esp-idf-5.2/components/freertos/FreeRTOS-Kernel/portable/xtensa/port.c:134

@pljakobs
Copy link
Contributor Author

also, I can confirm that checking out 6d94935b18801fa02fff35320fedc426b4993bd1 (the commit adding the RP2040 pwm) 'fixes' the issue, so it seems to be related to the changes after that.

@mikee47
Copy link
Contributor

mikee47 commented Dec 12, 2024

If you can point me to the code you're trying to debug, I'll see if I can reproduce the issue.

@pljakobs
Copy link
Contributor Author

it is, as usual, my rgbww firmware (https://github.com/pljakobs/esp_rgbww_firmware) - I wish I could provide a demonstrator here, but all I have is a basic hunch where this goes wrong (which is in initializing the network - or thereabout - as said, the stack trace seems to always be somewhere in reading flash for ConfigDB - which might just be coincidental)

@mikee47
Copy link
Contributor

mikee47 commented Dec 12, 2024

Which branch should I test?

@pljakobs
Copy link
Contributor Author

actually, while trying for a preproducer, I just re-built the Basic_IFS example and got this:

I (31) boot: ESP-IDF v5.2.1-sming-1 2nd stage bootloader
I (31) boot: compile time Dec 12 2024 08:36:12
W (31) boot: Unicore bootloader
I (35) boot: chip revision: v1.0
I (39) boot.esp32: SPI Speed      : 40MHz
I (44) boot.esp32: SPI Mode       : DIO
I (48) boot.esp32: SPI Flash Size : 4MB
I (53) boot: Enabling RNG early entropy source...
I (58) boot: Partition Table:
I (62) boot: ## Label            Usage          Type ST Offset   Length
I (69) boot:  0 spiFlash         unknown          02 01 00000000 00400000
I (77) boot:  1 nvs              WiFi data        01 02 00009000 00006000
I (84) boot:  2 phy_init         RF data          01 01 0000f000 00001000
I (92) boot:  3 factory          factory app      00 00 00010000 000f0000
I (99) boot:  4 lfs1             Unknown data     01 f2 00100000 00203000
I (107) boot:  5 spiffs0          Unknown data     01 82 00303000 00080000
I (114) boot:  6 fwfs1            Unknown data     01 f1 00383000 00077000
I (122) boot: End of partition table
I (126) esp_image: segment 0: paddr=00010020 vaddr=3f400020 size=1f08ch (127116) map
I (178) esp_image: segment 1: paddr=0002f0b4 vaddr=3ffbdb60 size=00f64h (  3940) load
I (180) esp_image: segment 2: paddr=00030020 vaddr=400d0020 size=7fc0ch (523276) map
I (363) esp_image: segment 3: paddr=000afc34 vaddr=3ffbeac4 size=02018h (  8216) load
I (366) esp_image: segment 4: paddr=000b1c54 vaddr=40080000 size=1403ch ( 81980) load
I (411) boot: Loaded app from partition at offset 0x10000
I (411) boot: Disabling RNG early entropy source...
I (422) cpu_start: Unicore app
I (423) cpu_start: Single core mode
I (431) cpu_start: Pro cpu start user code
I (431) cpu_start: cpu freq: 160000000 Hz
I (431) cpu_start: Application information:
I (436) cpu_start: Project name:     Sming
I (441) cpu_start: App version:      6.0.x-dev-26-g4a691cf2
I (447) cpu_start: Compile time:     Dec 12 2024 10:42:02
I (453) cpu_start: ELF file SHA256:  a93816ccd...
I (458) cpu_start: ESP-IDF:          v5.2.1-sming-1
I (464) cpu_start: Min chip rev:     v0.0
I (468) cpu_start: Max chip rev:     v3.99 
I (473) cpu_start: Chip rev:         v1.0
I (478) heap_init: Initializing. RAM available for dynamic allocation:
I (485) heap_init: At 3FFAE6E0 len 0000F480 (61 KiB): DRAM
I (491) heap_init: At 3FFC5360 len 0001ACA0 (107 KiB): DRAM
I (498) heap_init: At 3FFE0440 len 0001FBC0 (126 KiB): D/IRAM
I (504) heap_init: At 40078000 len 00008000 (32 KiB): IRAM
I (510) heap_init: At 4009403C len 0000BFC4 (47 KiB): IRAM
I (516) heap_init: At 3FF80000 len 00002000 (8 KiB): RTCRAM
I (523) spi_flash: detected chip: generic
I (527) spi_flash: flash io: dio

assert failed: xQueueGenericSendFromISR queue.c:1180 (pxQueue)


Backtrace: 0x4008112e:0x3ffe3a00 0x40087e31:0x3ffe3a20 0x4008e521:0x3ffe3a40 0x40088486:0x3ffe3b60 0x40083353:0x3ffe3b90 0x40083fa5:0x3ffe3bc0 0x400ffe1d:0x3ffe3be0 0x400d1b8e:0x3ffe3c00 0x40081051:0x3ffe3c50 0x400795f6:0x3ffe3c90 0x400799e1:0x3ffe3cc0 0x4008067d:0x3ffe3df0 0x40007c31:0x3ffe3eb0 0x4000073d:0x3ffe3f20




ELF file SHA256: a93816ccd

CPU halted.

(I have set the sdk to print registers and halt on panic instead of rebooting. Seemed easier)

@pljakobs
Copy link
Contributor Author

otherwise, the current branch I'm actively working on for the rgbww firmware is feature/pinConfig
sorry, forgot that one

@pljakobs
Copy link
Contributor Author

okay, after a full rebild (dist-clean, components-clean, clean, flash) the above error in Basic_IFS is gone. Sigh.

@mikee47
Copy link
Contributor

mikee47 commented Dec 12, 2024

Is the webapp available anywhere pre-built?

@mikee47
Copy link
Contributor

mikee47 commented Dec 12, 2024

I can get the firmware built by excising the contents of include/fileList.h (exception for VERSION, which I hacked) and runs without issue on an esp-c3-32s dev board. I just get no filesystem mounted in the web API though so need some way to get the filesystem. Any hints on how to build it?

@mikee47
Copy link
Contributor

mikee47 commented Dec 12, 2024

OK, so hacked together some files and debugging throws on use of std::string in VersionListener because key is null. Just change that to String instead. Now getting the above filing system exception so looking at that.

if (std::string(element.key) == "version")

to

if (String(element.key) == "version")

or, better

if (F("version") == element.key)

@mikee47
Copy link
Contributor

mikee47 commented Dec 12, 2024

In application.cpp you have uint32_t getVersion() returning a signed value.

@mikee47
Copy link
Contributor

mikee47 commented Dec 12, 2024

app/networking.cpp line 149 dangerous printf should read printf("%s", SSID.c_str());.

@mikee47
Copy link
Contributor

mikee47 commented Dec 12, 2024

The problem is due to stack overflow. Because RAM is fragmented by freeRTOS we have to decide how much stack to allocate for each task. The init() function is run in the main task context which now get just the default stack allocation of 3584 bytes.

This issue arose because of #2913. The init function is called in the main task context. After that's done, the task is destroyed and Sming continues running in the tiT (tcpip) task which has 16K of stack by default.

Two ways to fix this:

  1. Increase main task stack allocation via make sdk-menuconfig - value is ESP_MAIN_TASK_STACK_SIZE
  2. Move filesystem initialisation code to ready handler which runs with larger stack.
  3. Consider if stack usage is excessive and modify code to use heap instead

I've done a full clean and rebuild and the crash has disappeared. Re-build with ENABLE_GDB=1 and it crashes again.

@pljakobs
Copy link
Contributor Author

pljakobs commented Dec 12, 2024

increasing the stack size will only impact esp32/eps32c3 builds, right? there, overall memory is not an issue so this should be good enough then.

also: will changing the sdk-config be reflected in the project? (I can't see a change so far) or is there a way to make it part of a commit?

also: I'm impressed with how fast you got to that.

@mikee47
Copy link
Contributor

mikee47 commented Dec 12, 2024

increasing the stack size will only impact esp32/eps32c3 builds, right? there, overall memory is not an issue so this should be good enough then.

It took a lot of fiddling with #2913 so couldn't actually remember how it ended up. At one point I had the main task managing the IDF event queue, so inflating the stack for that would be very wasteful.

However, turns out that for both networked and non-networked builds the main task (and stack) gets destroyed. The initialisation therefore goes something like this:

  1. Create main task and stack (the IDF does this)
  2. Initialise networking: wifi and tiT tasks are created
  3. Call application init
  4. Activate processing of Sming callbacks from tiT task (networked tasks)
  5. Destroy main task and stack

For non-networked applications step(2) is omitted and the main Sming task and heap is created at step (4).

So the only side effect is heap fragmentation.

@mikee47
Copy link
Contributor

mikee47 commented Dec 13, 2024

@pljakobs OK, latest develop should fix this, default stack size is now 16K for init.

@pljakobs
Copy link
Contributor Author

I need to double check it, but I assume it will be fixed since changing the stack value in sdk-menuconfig already fixed it locally.
Thank you for all the help once again.
(I still wonder what I was missing on the gdb side. I used to be able to use it with the 8266 but here, I did not get results that looked meaningful to me)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants