Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wifi:trc.c rcReachRetryLimit 1741 (IDFGH-6267) #7936

Closed
achao1104 opened this issue Nov 19, 2021 · 21 comments
Closed

wifi:trc.c rcReachRetryLimit 1741 (IDFGH-6267) #7936

achao1104 opened this issue Nov 19, 2021 · 21 comments
Labels
Resolution: Done Issue is done internally Status: Done Issue is done internally

Comments

@achao1104
Copy link

idf4.3.1, we connect two cameras with esp32s2 and keep it running, after a couple of hours, it will crash as below.

E (187517575) wifi:sched is null, if=0 tid=0 flags=0
E (187517575) wifi:trc.c rcReachRetryLimit 1741

E (187522565) task_wdt: Task watchdog got triggered. The following tasks did not reset the watchdog in time:
E (187522565) task_wdt: - IDLE (CPU 0)
E (187522565) task_wdt: Tasks currently running:
E (187522565) task_wdt: CPU 0: wifi

abort() was called at PC 0x40027357 on core 0

Backtrace:0x4002eade:0x3ffcd8e0 0x4002f259:0x3ffcd900 0x40038972:0x3ffcd920 0x40027357:0x3ffcd990 0x4002748d:0x3ffcd9c0 0x4002751a:0x3ffcd9e0 0x400f8af2:0x3ffcda10 0x400fbb25:0x3ffcdd20 0x40105c72:0x3ffcdd50 0x40034b4d:0x3ffcdd80 0x400c6ae1:0x3ffcddd0 0x400d950d:0x3ffcde40 0x400c86c6:0x3ffcde90 0x400cbf55:0x3ffcdeb0 0x400ccb6d:0x3ffcded0 0x400cbbae:0x3ffcdef0 0x400cd802:0x3ffcdf10 0x400c5d42:0x3ffcdf30 0x4002895d:0x3ffcdf60 0x4008abf7:0x3ffcdf80 0x400c3a13:0x3ffcdfa0 0x40027132:0x3ffcdfc0 0x400df6c9:0x3ffe3830 0x400db865:0x3ffe3860 0x400dbce5:0x3ffe38a0 0x400dbf47:0x3ffe38d0 0x4003a199:0x3ffe3910

ELF file SHA256: b017d89a7614548a

@espressif-bot espressif-bot added the Status: Opened Issue is new label Nov 19, 2021
@github-actions github-actions bot changed the title wifi:trc.c rcReachRetryLimit 1741 wifi:trc.c rcReachRetryLimit 1741 (IDFGH-6267) Nov 19, 2021
@achao1104
Copy link
Author

commit:2e74914051

@atanisoft
Copy link

@achao1104 please decode the backtrace.

@achao1104
Copy link
Author

Can I have your email? I can send you the elf file.

@achao1104
Copy link
Author

D:\project\esp32\idf-v4.3.1>xtensa-esp32-elf-addr2line -pfiaC -e D:\project\esp32\temp\2021-11-19-wifidump\solardevice.elf 0x4002eade:0x3ffcd8e0 0x4002f259:0x3ffcd900 0x40038972:0x3ffcd920 0x40027357:0x3ffcd990 0x4002748d:0x3ffcd9c0 0x4002751a:0x3ffcd9e0 0x400f8af2:0x3ffcda10 0x400fbb25:0x3ffcdd20 0x40105c72:0x3ffcdd50 0x40034b4d:0x3ffcdd80 0x400c6ae1:0x3ffcddd0 0x400d950d:0x3ffcde40 0x400c86c6:0x3ffcde90 0x400cbf55:0x3ffcdeb0 0x400ccb6d:0x3ffcded0 0x400cbbae:0x3ffcdef0 0x400cd802:0x3ffcdf10 0x400c5d42:0x3ffcdf30 0x4002895d:0x3ffcdf60 0x4008abf7:0x3ffcdf80 0x400c3a13:0x3ffcdfa0 0x40027132:0x3ffcdfc0 0x400df6c9:0x3ffe3830 0x400db865:0x3ffe3860 0x400dbce5:0x3ffe38a0 0x400dbf47:0x3ffe38d0 0x4003a199:0x3ffe3910
0x4002eade: panic_abort at D:/project/esp32/idf-v4.3.1/components/esp_system/panic.c:379
0x4002f259: esp_system_abort at D:/project/esp32/idf-v4.3.1/components/esp_system/system_api.c:112
0x40038972: abort at D:/project/esp32/idf-v4.3.1/components/newlib/abort.c:46
0x40027357: lock_acquire_generic at D:/project/esp32/idf-v4.3.1/components/newlib/locks.c:139
0x4002748d: _lock_acquire_recursive at D:/project/esp32/idf-v4.3.1/components/newlib/locks.c:167
0x4002751a: __retarget_lock_acquire_recursive at D:/project/esp32/idf-v4.3.1/components/newlib/locks.c:323
0x400f8af2: _vfprintf_r at /builds/idf/crosstool-NG/.build/HOST-x86_64-w64-mingw32/xtensa-esp32s2-elf/src/newlib/newlib/libc/stdio/vfprintf.c:853 (discriminator 1)
0x400fbb25: vprintf at /builds/idf/crosstool-NG/.build/HOST-x86_64-w64-mingw32/xtensa-esp32s2-elf/src/newlib/newlib/libc/stdio/vprintf.c:34 (discriminator 5)
0x40105c72: esp_log_writev at D:/project/esp32/idf-v4.3.1/components/log/log.c:189
0x40034b4d: esp_log_write at D:/project/esp32/idf-v4.3.1/components/log/log.c:199
0x400c6ae1: wifi_log at ??:?
0x400d950d: cnx_node_leave at ??:?
0x400c86c6: wifi_softap_stop at ??:?
0x400cbf55: _do_wifi_stop at ??:?
0x400ccb6d: wifi_stop_process at ??:?
0x400cbbae: ieee80211_ioctl_process at ??:?
0x400cd802: ieee80211_ioctl at ??:?
0x400c5d42: esp_wifi_stop at ??:?
0x4002895d: esp_restart at D:/project/esp32/idf-v4.3.1/components/esp_system/system_api.c:71
0x4008abf7: esp_task_wdt_isr_user_handler at D:\project\esp32\project\solardevice\build/../main/main.c:56
0x400c3a13: task_wdt_isr at D:/project/esp32/idf-v4.3.1/components/esp_common/src/task_wdt.c:173 (discriminator 1)
0x40027132: _xt_lowint1 at D:/project/esp32/idf-v4.3.1/components/freertos/port/xtensa/xtensa_vectors.S:1105
0x400df6c9: rcReachRetryLimit at ??:?
0x400db865: lmacProcessTxSuccess at ??:?
0x400dbce5: lmacProcessAckTimeout at ??:?
0x400dbf47: lmacProcessTxComplete at ??:?
0x4003a199: ppTask at ??:?

@atanisoft
Copy link

it looks like something in your code might be blocking wifi from acquiring a semaphore lock while an ISR is running. I'd suggest reviewing what is going on in:

0x4008abf7: esp_task_wdt_isr_user_handler at D:\project\esp32\project\solardevice\build/../main/main.c:56

@achao1104
Copy link
Author

thanks for your reply, but not really.
we added a idle task watch dog like below:
void esp_task_wdt_isr_user_handler(void)
{
//ESP_LOGE(TAG, "watch dog restart");
esp_restart();
}
it is the wifi which is blocking the idle task, so it triggers the interrupt where it call restart.

@achao1104
Copy link
Author

line 56 is: esp_restart();

@achao1104
Copy link
Author

since there are two cameras connected with the esp32s2, they might send too much traffic, so the wifi is blocking other task, just my guess.

@atanisoft
Copy link

your call to esp_restart is blocking WiFi from printing a message (likely about shutdown) due to your ISR hook point blocking the UART ISR from locking itself etc.

ISRs have very strict requirements as to what can/can not be done inside them. This looks like a case where you should be waking up another task to do the restart without blocking the ISR context. You can do this via xTaskNotifyFromISR or posting to a semaphore that a task is blocked on.

@achao1104
Copy link
Author

D:\project\esp32\idf-v4.3.1>python D:\project\esp32\idf-v4.3.1\components\espcoredump\espcoredump.py info_corefile -t raw -c D:\project\esp32\temp\2021-11-19-wifidump\863949040975157-2021-11-19-17-32-40-0913277.dump D:\project\esp32\project\solardevice\build\solardevice.elf
espcoredump.py v0.4-dev

==================== ESP32 CORE DUMP START ====================

Crashed task handle: 0x3ffe3950, name: 'wifi', GDB name: 'process 1073625424'

================== CURRENT THREAD REGISTERS ===================
exccause 0x1d (StoreProhibitedCause)
excvaddr 0x0
epc1 0x0
epc2 0x0
epc3 0x0
epc4 0x0
epc5 0x0
epc6 0x0
epc7 0x0
eps2 0x0
eps3 0x0
eps4 0x0
eps5 0x0
eps6 0x0
eps7 0x40089248
pc 0x4002eae1 0x4002eae1 <panic_abort+21>
lbeg 0x0 0
lend 0x0 0
lcount 0x0 0
sar 0x8 8
ps 0x60623 394787
threadptr
br
scompare1
acclo
acchi
m0
m1
m2
m3
expstate
f64r_lo
f64r_hi
f64s
fcr
fsr
a0 0x8002f25c -2147290532
a1 0x3ffcd8e0 1073535200
a2 0x3ffcd920 1073535264
a3 0x3ffcd96d 1073535341
a4 0xa 10
a5 0x14 20
a6 0x3 3
a7 0x3ffd2cac 1073556652
a8 0x0 0
a9 0x1 1
a10 0x3ffcd96d 1073535341
a11 0x3ffcd96d 1073535341
a12 0xa 10
a13 0xc 12
a14 0x3 3
a15 0x1c 28

==================== CURRENT THREAD STACK =====================
#0 0x4002eae1 in panic_abort (details=0x3ffcd920 "abort() was called at PC 0x40027357 on core 0") at D:/project/esp32/idf-v4.3.1/components/esp_system/panic.c:379
#1 0x4002f25c in esp_system_abort (details=0x3ffcd920 "abort() was called at PC 0x40027357 on core 0") at D:/project/esp32/idf-v4.3.1/components/esp_system/system_api.c:112
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

======================== THREADS INFO =========================
Id Target Id Frame

  • 1 process 1073625424 0x4002eae1 in panic_abort (details=0x3ffcd920 "abort() was called at PC 0x40027357 on core 0") at D:/project/esp32/idf-v4.3.1/components/esp_system/panic.c:379
    2 process 1073344660 0x4001a8d4 in ?? ()
    3 process 1073598240 0x4002fd12 in xQueueReceive (xQueue=0x3fffd818, pvBuffer=0x3ffdcee0, xTicksToWait=) at D:/project/esp32/idf-v4.3.1/components/hal/esp32s2/include/hal/cpu_ll.h:33
    4 process 1073619708 0x4002fd12 in xQueueReceive (xQueue=0x3ffff650, pvBuffer=0x3ffe22c0, xTicksToWait=) at D:/project/esp32/idf-v4.3.1/components/hal/esp32s2/include/hal/cpu_ll.h:33
    5 process 1073615264 0x4002fd12 in xQueueReceive (xQueue=0x3ffef788, pvBuffer=0x3ffe0f80, xTicksToWait=) at D:/project/esp32/idf-v4.3.1/components/hal/esp32s2/include/hal/cpu_ll.h:33
    6 process 1073347272 0x40103e6e in esp_pm_impl_waiti () at D:/project/esp32/idf-v4.3.1/components/hal/esp32s2/include/hal/cpu_ll.h:202
    7 process 1073345388 0x4002fe66 in xQueueSemaphoreTake (xQueue=0x3ff9ff14, xTicksToWait=) at D:/project/esp32/idf-v4.3.1/components/hal/esp32s2/include/hal/cpu_ll.h:33
    8 process 1073347616 0x40031a50 in prvProcessTimerOrBlockTask (xListWasEmpty=, xNextExpireTime=0) at D:/project/esp32/idf-v4.3.1/components/hal/esp32s2/include/hal/cpu_ll.h:33
    9 process 1073610528 0x4002fd12 in xQueueReceive (xQueue=0x3fffe078, pvBuffer=0x3ffdfee8, xTicksToWait=) at D:/project/esp32/idf-v4.3.1/components/hal/esp32s2/include/hal/cpu_ll.h:33
    10 process 1073601404 0x4002fd12 in xQueueReceive (xQueue=0x3fffd9bc, pvBuffer=0x3ffddb10, xTicksToWait=) at D:/project/esp32/idf-v4.3.1/components/hal/esp32s2/include/hal/cpu_ll.h:33
    11 process 1073606084 0x4002fd12 in xQueueReceive (xQueue=0x3fffdc6c, pvBuffer=0x3ffded50, xTicksToWait=) at D:/project/esp32/idf-v4.3.1/components/hal/esp32s2/include/hal/cpu_ll.h:33

==================== THREAD 1 (TCB: 0x3ffe3950, name: 'wifi') =====================
#0 0x4002eae1 in panic_abort (details=0x3ffcd920 "abort() was called at PC 0x40027357 on core 0") at D:/project/esp32/idf-v4.3.1/components/esp_system/panic.c:379
#1 0x4002f25c in esp_system_abort (details=0x3ffcd920 "abort() was called at PC 0x40027357 on core 0") at D:/project/esp32/idf-v4.3.1/components/esp_system/system_api.c:112
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

==================== THREAD 2 (TCB: 0x3ff9f094, name: 'esp_timer') =====================
#0 0x4001a8d4 in ?? ()
#1 0x40031ce8 in vPortExitCritical (mux=0x3ffcdfc8 ) at D:/project/esp32/idf-v4.3.1/components/freertos/port/xtensa/port.c:473
#2 0x40031260 in ulTaskNotifyTake (xClearCountOnExit=1, xTicksToWait=4294967295) at D:/project/esp32/idf-v4.3.1/components/freertos/tasks.c:5203
#3 0x4008a27a in timer_task (arg=) at D:/project/esp32/idf-v4.3.1/components/esp_timer/src/esp_timer.c:347

==================== THREAD 3 (TCB: 0x3ffdcf20, name: 'tiT') =====================
#0 0x4002fd12 in xQueueReceive (xQueue=0x3fffd818, pvBuffer=0x3ffdcee0, xTicksToWait=) at D:/project/esp32/idf-v4.3.1/components/hal/esp32s2/include/hal/cpu_ll.h:33
#1 0x40036ad8 in sys_arch_mbox_fetch (mbox=, msg=0x3ffdcee0, timeout=250) at D:/project/esp32/idf-v4.3.1/components/lwip/port/esp32/freertos/sys_arch.c:325
#2 0x40036d62 in tcpip_timeouts_mbox_fetch (mbox=, msg=) at D:/project/esp32/idf-v4.3.1/components/lwip/lwip/src/api/tcpip.c:110
#3 tcpip_thread (arg=) at D:/project/esp32/idf-v4.3.1/components/lwip/lwip/src/api/tcpip.c:148

==================== THREAD 4 (TCB: 0x3ffe22fc, name: 'uart_event') =====================
#0 0x4002fd12 in xQueueReceive (xQueue=0x3ffff650, pvBuffer=0x3ffe22c0, xTicksToWait=) at D:/project/esp32/idf-v4.3.1/components/hal/esp32s2/include/hal/cpu_ll.h:33
#1 0x4009fad4 in uart_event_task_entry (param=0x3ffff148) at ../components/modem/src/esp_modem.c:240

==================== THREAD 5 (TCB: 0x3ffe11a0, name: 'cloud_action_ta') =====================
#0 0x4002fd12 in xQueueReceive (xQueue=0x3ffef788, pvBuffer=0x3ffe0f80, xTicksToWait=) at D:/project/esp32/idf-v4.3.1/components/hal/esp32s2/include/hal/cpu_ll.h:33
#1 0x40036ad8 in sys_arch_mbox_fetch (mbox=, msg=0x3ffe0f80, timeout=3000) at D:/project/esp32/idf-v4.3.1/components/lwip/port/esp32/freertos/sys_arch.c:325
#2 0x40034ccb in netconn_recv_data (conn=0x3ffefc0c, new_buf=0x3ffe1020, apiflags=) at D:/project/esp32/idf-v4.3.1/components/lwip/lwip/src/api/api_lib.c:628
#3 0x40034da3 in netconn_recv_data_tcp (conn=0x3ffefc0c, new_buf=0x3ffe1020, apiflags=8 '\b') at D:/project/esp32/idf-v4.3.1/components/lwip/lwip/src/api/api_lib.c:740
#4 0x40034e15 in netconn_recv_tcp_pbuf_flags (conn=, new_buf=0x3ffe1020, apiflags=8 '\b') at D:/project/esp32/idf-v4.3.1/components/lwip/lwip/src/api/api_lib.c:821
#5 0x400362fb in lwip_recv_tcp (sock=0x3ffd416c <sockets+24>, mem=0x3fd80974, len=, flags=0) at D:/project/esp32/idf-v4.3.1/components/lwip/lwip/src/api/sockets.c:1005
#6 0x400363c5 in lwip_recvfrom (s=, mem=0x3fd80974, len=1151, flags=0, from=0x0, fromlen=) at D:/project/esp32/idf-v4.3.1/components/lwip/lwip/src/api/sockets.c:1272
#7 0x40109031 in lwip_recv (s=55, mem=0x3fd80974, len=1151, flags=0) at D:/project/esp32/idf-v4.3.1/components/lwip/lwip/src/api/sockets.c:1337
#8 0x4009410e in recv (flags=0, len=1151, mem=, s=) at D:/project/esp32/idf-v4.3.1/components/lwip/lwip/src/include/lwip/sockets.h:646
#9 server_send_execute (musthaversp=0 '\000', recvtimeoutshort=3 '\003', datalen=0, data=0x0, sockin=0x3ffce188 , prefix=0x3f00b0b9 "cloud") at ../main/bizutil.c:392
#10 server_send (sockin=0x3ffce188 , prefix=0x3f00b0b9 "cloud", data=0x0, datalen=0, retrytimes=1 '\001', recvtimeoutshort=3 '\003', musthaversp=0 '\000', issetfailed=0 '\000') at ../main/bizutil.c:426
#11 0x40094580 in cloud_action_task (pvParameters=) at ../main/bizutil.c:84

==================== THREAD 6 (TCB: 0x3ff9fac8, name: 'IDLE') =====================
#0 0x40103e6e in esp_pm_impl_waiti () at D:/project/esp32/idf-v4.3.1/components/hal/esp32s2/include/hal/cpu_ll.h:202
#1 0x400c366e in esp_vApplicationIdleHook () at D:/project/esp32/idf-v4.3.1/components/esp_common/src/freertos_hooks.c:63
#2 0x40030247 in prvIdleTask (pvParameters=) at D:/project/esp32/idf-v4.3.1/components/freertos/tasks.c:3839

==================== THREAD 7 (TCB: 0x3ff9f36c, name: 'main') =====================
#0 0x4002fe66 in xQueueSemaphoreTake (xQueue=0x3ff9ff14, xTicksToWait=) at D:/project/esp32/idf-v4.3.1/components/hal/esp32s2/include/hal/cpu_ll.h:33
#1 0x40094fec in system_sleep () at ../main/bizutil.c:999
#2 0x4008b0be in app_main () at ../main/main.c:287
#3 0x40105a72 in main_task (args=) at D:/project/esp32/idf-v4.3.1/components/freertos/port/port_common.c:133

==================== THREAD 8 (TCB: 0x3ff9fc20, name: 'Tmr Svc') =====================
#0 0x40031a50 in prvProcessTimerOrBlockTask (xListWasEmpty=, xNextExpireTime=0) at D:/project/esp32/idf-v4.3.1/components/hal/esp32s2/include/hal/cpu_ll.h:33
#1 prvTimerTask (pvParameters=) at D:/project/esp32/idf-v4.3.1/components/freertos/timers.c:555

==================== THREAD 9 (TCB: 0x3ffdff20, name: 'pir_task') =====================
#0 0x4002fd12 in xQueueReceive (xQueue=0x3fffe078, pvBuffer=0x3ffdfee8, xTicksToWait=) at D:/project/esp32/idf-v4.3.1/components/hal/esp32s2/include/hal/cpu_ll.h:33
#1 0x4008df59 in pir_task (pvParameters=) at ../main/util.c:1210

==================== THREAD 10 (TCB: 0x3ffddb7c, name: 'sys_evt') =====================
#0 0x4002fd12 in xQueueReceive (xQueue=0x3fffd9bc, pvBuffer=0x3ffddb10, xTicksToWait=) at D:/project/esp32/idf-v4.3.1/components/hal/esp32s2/include/hal/cpu_ll.h:33
#1 0x40104d74 in esp_event_loop_run (event_loop=0x3fffd9a0, ticks_to_run=4294967295) at D:/project/esp32/idf-v4.3.1/components/esp_event/esp_event.c:624
#2 0x40104ee7 in esp_event_loop_run_task (args=0x3fffd9a0) at D:/project/esp32/idf-v4.3.1/components/esp_event/esp_event.c:115

==================== THREAD 11 (TCB: 0x3ffdedc4, name: 'lora_task') =====================
#0 0x4002fd12 in xQueueReceive (xQueue=0x3fffdc6c, pvBuffer=0x3ffded50, xTicksToWait=) at D:/project/esp32/idf-v4.3.1/components/hal/esp32s2/include/hal/cpu_ll.h:33
#1 0x4008fdec in lora_task (pvParameters=) at ../main/lorautil.c:320

======================= ALL MEMORY REGIONS ========================
Name Address Size Attrs
.rtc.text 0x40070000 0x2c R XA
.rtc.force_fast 0x3ff9e02c 0x0 RW
.rtc.data 0x50000000 0x1c RW A
.rtc_noinit 0x5000001c 0x0 RW
.rtc.force_slow 0x5000001c 0x0 RW
.iram0.vectors 0x40024000 0x403 R XA
.iram0.text 0x40024404 0x1915c R XA
.dram0.data 0x3ffcd560 0x56e0 RW A
.noinit 0x3ffd2c40 0x0 RW
.flash.appdesc 0x3f000020 0x100 R A
.flash.rodata 0x3f000120 0x22ca8 RW A
.flash.text 0x40080020 0x89b9f R XA
.iram0.text_end 0x4003d560 0x0 RW
.dram0.heap_start 0x3ffd81e8 0x0 RW
.coredump.tasks.data 0x3ffe3950 0x154 RW
.coredump.tasks.data 0x3ffe37a0 0x1a8 RW
.coredump.tasks.data 0x3ff9f094 0x154 RW
.coredump.tasks.data 0x3ffd9a40 0x118 RW
.coredump.tasks.data 0x3ffdcf20 0x154 RW
.coredump.tasks.data 0x3ffdcde0 0x138 RW
.coredump.tasks.data 0x3ffe22fc 0x154 RW
.coredump.tasks.data 0x3ffe21f0 0x104 RW
.coredump.tasks.data 0x3ffe11a0 0x154 RW
.coredump.tasks.data 0x3ffe0e80 0x318 RW
.coredump.tasks.data 0x3ff9fac8 0x154 RW
.coredump.tasks.data 0x3ff9f9d0 0xf0 RW
.coredump.tasks.data 0x3ff9f36c 0x154 RW
.coredump.tasks.data 0x3ffda9e0 0x17c RW
.coredump.tasks.data 0x3ff9fc20 0x154 RW
.coredump.tasks.data 0x3ffdb2a0 0xc0 RW
.coredump.tasks.data 0x3ffdff20 0x154 RW
.coredump.tasks.data 0x3ffdfdf0 0x128 RW
.coredump.tasks.data 0x3ffddb7c 0x154 RW
.coredump.tasks.data 0x3ffdda30 0x144 RW
.coredump.tasks.data 0x3ffdedc4 0x154 RW
.coredump.tasks.data 0x3ffdec60 0x15c RW

===================== ESP32 CORE DUMP END =====================

Exception in thread Thread-2:
Traceback (most recent call last):
File "C:\Users\admin\AppData\Local\Programs\Python\Python37\lib\threading.py", line 917, in _bootstrap_inner
self.run()
File "C:\Users\admin\AppData\Local\Programs\Python\Python37\lib\threading.py", line 865, in run
self._target(*self._args, **self._kwargs)
File "C:\Users\admin\AppData\Local\Programs\Python\Python37\lib\subprocess.py", line 1238, in _readerthread
buffer.append(fh.read())
OSError: [Errno 22] Invalid argument
Exception in thread Thread-1:
Traceback (most recent call last):
File "C:\Users\admin\AppData\Local\Programs\Python\Python37\lib\threading.py", line 917, in _bootstrap_inner
self.run()
File "C:\Users\admin\AppData\Local\Programs\Python\Python37\lib\threading.py", line 865, in run
self._target(*self._args, **self._kwargs)
File "C:\Users\admin\AppData\Local\Programs\Python\Python37\lib\subprocess.py", line 1238, in _readerthread
buffer.append(fh.read())
OSError: [Errno 22] Invalid argument

WARNING: Attempt to terminate the GDB process failed, because it is already terminated. Skip
Done!

@achao1104
Copy link
Author

thanks, I got your point.
but what is triggering the watch dog? if I just connect one camera, it runs ok. I checked the full dump(as above), all the application level tasks are not busy.

@atanisoft
Copy link

It has little to do with your current tasks being blocked or not. Your ISR is calling esp_restart() which will suspend all tasks:

void IRAM_ATTR esp_restart(void)
{
for (int i = SHUTDOWN_HANDLERS_NO - 1; i >= 0; i--) {
if (shutdown_handlers[i]) {
shutdown_handlers[i]();
}
}
// Disable scheduler on this core.
vTaskSuspendAll();
bool digital_reset_needed = false;
#if CONFIG_ESP_SYSTEM_MEMPROT_FEATURE
if (esp_memprot_is_intr_ena_any() || esp_memprot_is_locked_any()) {
digital_reset_needed = true;
}
#endif
if (digital_reset_needed) {
esp_restart_noos_dig();
}
esp_restart_noos();
}

Since you are calling this within an ISR context there are restrictions on acquiring semaphores as seen here:

if (mutex_type == queueQUEUE_TYPE_RECURSIVE_MUTEX) {
abort(); /* recursive mutexes make no sense in ISR context */
}

Which is triggering the abort from your backtrace. The solution I'd suggest is a light weight task that runs at a high priority that blocks on a semaphore OR uses ulTaskNotifyTake to wait for wake up from the ISR callback context you are using. Inside that task you can safely call esp_restart() and not have the esp32 crash and restart.

@achao1104
Copy link
Author

achao1104 commented Nov 19, 2021 via email

@atanisoft
Copy link

I can't answer why your ISR is being called as you haven't shared how you connected it to the rest of FreeRTOS callbacks.

But I CAN tell you why it crashed and it is not a watchdog crash. It is crashing due to your code calling esp_restart() within an ISR context which prevents all other calls from acquiring their semaphores since the code being called is not expected to be called from within an ISR context.

@achao1104
Copy link
Author

achao1104 commented Nov 19, 2021 via email

@xueyunfei998
Copy link

D:\project\esp32\idf-v4.3.1>xtensa-esp32-elf-addr2line -pfiaC -e D:\project\esp32\temp\2021-11-19-wifidump\solardevice.elf 0x4002eade:0x3ffcd8e0 0x4002f259:0x3ffcd900 0x40038972:0x3ffcd920 0x40027357:0x3ffcd990 0x4002748d:0x3ffcd9c0 0x4002751a:0x3ffcd9e0 0x400f8af2:0x3ffcda10 0x400fbb25:0x3ffcdd20 0x40105c72:0x3ffcdd50 0x40034b4d:0x3ffcdd80 0x400c6ae1:0x3ffcddd0 0x400d950d:0x3ffcde40 0x400c86c6:0x3ffcde90 0x400cbf55:0x3ffcdeb0 0x400ccb6d:0x3ffcded0 0x400cbbae:0x3ffcdef0 0x400cd802:0x3ffcdf10 0x400c5d42:0x3ffcdf30 0x4002895d:0x3ffcdf60 0x4008abf7:0x3ffcdf80 0x400c3a13:0x3ffcdfa0 0x40027132:0x3ffcdfc0 0x400df6c9:0x3ffe3830 0x400db865:0x3ffe3860 0x400dbce5:0x3ffe38a0 0x400dbf47:0x3ffe38d0 0x4003a199:0x3ffe3910 0x4002eade: panic_abort at D:/project/esp32/idf-v4.3.1/components/esp_system/panic.c:379 0x4002f259: esp_system_abort at D:/project/esp32/idf-v4.3.1/components/esp_system/system_api.c:112 0x40038972: abort at D:/project/esp32/idf-v4.3.1/components/newlib/abort.c:46 0x40027357: lock_acquire_generic at D:/project/esp32/idf-v4.3.1/components/newlib/locks.c:139 0x4002748d: _lock_acquire_recursive at D:/project/esp32/idf-v4.3.1/components/newlib/locks.c:167 0x4002751a: __retarget_lock_acquire_recursive at D:/project/esp32/idf-v4.3.1/components/newlib/locks.c:323 0x400f8af2: _vfprintf_r at /builds/idf/crosstool-NG/.build/HOST-x86_64-w64-mingw32/xtensa-esp32s2-elf/src/newlib/newlib/libc/stdio/vfprintf.c:853 (discriminator 1) 0x400fbb25: vprintf at /builds/idf/crosstool-NG/.build/HOST-x86_64-w64-mingw32/xtensa-esp32s2-elf/src/newlib/newlib/libc/stdio/vprintf.c:34 (discriminator 5) 0x40105c72: esp_log_writev at D:/project/esp32/idf-v4.3.1/components/log/log.c:189 0x40034b4d: esp_log_write at D:/project/esp32/idf-v4.3.1/components/log/log.c:199 0x400c6ae1: wifi_log at ??:? 0x400d950d: cnx_node_leave at ??:? 0x400c86c6: wifi_softap_stop at ??:? 0x400cbf55: _do_wifi_stop at ??:? 0x400ccb6d: wifi_stop_process at ??:? 0x400cbbae: ieee80211_ioctl_process at ??:? 0x400cd802: ieee80211_ioctl at ??:? 0x400c5d42: esp_wifi_stop at ??:? 0x4002895d: esp_restart at D:/project/esp32/idf-v4.3.1/components/esp_system/system_api.c:71 0x4008abf7: esp_task_wdt_isr_user_handler at D:\project\esp32\project\solardevice\build/../main/main.c:56 0x400c3a13: task_wdt_isr at D:/project/esp32/idf-v4.3.1/components/esp_common/src/task_wdt.c:173 (discriminator 1) 0x40027132: _xt_lowint1 at D:/project/esp32/idf-v4.3.1/components/freertos/port/xtensa/xtensa_vectors.S:1105 0x400df6c9: rcReachRetryLimit at ??:? 0x400db865: lmacProcessTxSuccess at ??:? 0x400dbce5: lmacProcessAckTimeout at ??:? 0x400dbf47: lmacProcessTxComplete at ??:? 0x4003a199: ppTask at ??:?

hi @atanisoft

I have a question. Why does wifi call esp_restart in your project?
Is your esp32s2 as ap mode?

@achao1104
Copy link
Author

when there is a problem, the device will restart itself.
As discussed in above,, the applicaiton codes call restart in the interrupt of watch dog handler. we have known it is an issue, but it is not root cause, the root cause is why the watch dog is triggered. because we used two cameras to connect the esp32 at the same time, so we guess it might be due to the wifi is too busy, so we want to know if there is problem in sdk to processing huge traffic.

@achao1104
Copy link
Author

Hello,

I used blow codes before to restart device once task wdt is triggered, we found some times 'wifi' task will be always busy to trigger the task wdt, but once esp_task_wdt_isr_user_handler called esp_restart(), it will cause another panic due to ungracefully shutdown of the wifi.
void esp_task_wdt_isr_user_handler(void)
{
esp_restart();
}
so I am told to code like below, the handler will send to release a semaphore and then in turn, in another task (named restart task) which is waiting for this semaphore, there are codes to restart device, however, due to the cpu is occupied by the 'wifi' task, I guess the 'restart' task’won't get chance to execute, thus, the device won't be restarted though the wdt handler is called for muliple times. so how to solve it?

void esp_task_wdt_isr_user_handler(void)
{
BaseType_t xHigherPriorityTaskWoken = pdFALSE;
xSemaphoreGiveFromISR( restart_semaphore_handle, &xHigherPriorityTaskWoken );
}

E (1753869) task_wdt: Task watchdog got triggered. The following tasks did not reset the watchdog in time:
E (1753869) task_wdt: - IDLE (CPU 0)
E (1753869) task_wdt: Tasks currently running:
E (1753869) task_wdt: CPU 0: wifi
E (1753869) task_wdt: Print CPU 0 (current core) backtrace

Backtrace:0x400C3CD6:0x3FFCDFA0 0x40027132:0x3FFCDFC0 0x400DF8ED:0x3FFE0D60 0x400DBA89:0x3FFE0D90 0x400DBF09:0x3FFE0DD0 0x400DC05F:0x3FFE0E00 0x400DC17B:0x3FFE0E30 0x4003A199:0x3FFE0E70
0x400c3cd6: adc_gpio_init at D:/project/esp32/idf-v4.3.1/components/driver/adc_common.c:279 (discriminator 4)

0x40027132: _xt_lowint1 at D:/project/esp32/idf-v4.3.1/components/freertos/port/xtensa/xtensa_vectors.S:1105

0x400df8ed: rcUpdatePhyMode at ??:?

0x400dba89: ic_set_trc at ??:?

0x400dbf09: lmacInit at ??:?

0x400dc05f: lmac_record_txtime at ??:?

0x400dc17b: lmacRetryTxFrame at ??:?

0x4003a199: ppResortTxAMPDU at ??:?

@xueyunfei998
Copy link

hi @achao1104

I see your backtrace contains the operation of adc.

In your project, can you shield the function of adc, test WIFI separately, and eliminate the interference of other modules on WIFI

@jack0c
Copy link
Collaborator

jack0c commented May 20, 2022

@achao1104 can you try latest v4.3? since we fix a bug that ADC impact Wi-Fi recently.

@Alvin1Zhang
Copy link
Collaborator

As suggested in previous post, let us track in #7897

@espressif-bot espressif-bot added Resolution: Done Issue is done internally Status: Done Issue is done internally and removed Status: Opened Issue is new labels May 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Resolution: Done Issue is done internally Status: Done Issue is done internally
Projects
None yet
Development

No branches or pull requests

6 participants