ESPAsync_WiFiManager::startConfigPortal() will cause a watchdog timeout when called from a higher-priority task. #39
Hi @russelljahn, thanks for your interest and good research on the issue. I'm currently doing exactly the same thing, introducing delay(1) in addition to the mentioned yield(), to support ESP32-S2. But this delay(1) is intended only for ESP32-S2. For your WDT issue, I think it's better to make the priority of the WDT task higher than that of the task you're monitoring. Check Watchdog-Timers
|
I appreciate the quick response. 👍 I'll check out the article you linked. |
@khoih-prog I've read the article and done more research. I'm not convinced that increasing the WDT task's priority is a good design decision. The lower-priority watchdog is doing its job and catching a legitimate bug: the WiFi Portal's loop is starving all lower-priority tasks. There are other tasks in my system (logging, battery, sensor polling) that can't meet their schedules while the WiFi Portal's loop is hogging the CPU. |
In that case, you have to lower the priority of the Config-Portal, even to the lowest possible priority. |
Thanks for the replies and perspective, as always. My vantage is from a seasoned software design perspective of designing commercial APIs and SDKs for others. If end-users have to understand the internal mechanics of the library, rather than using a well-designed API that works for a variety of use-cases, it breaks the basic principle of encapsulation. As this is an asynchronous library, I imagine you'll get similar tickets to mine. There is a non-arbitrary rhyme and reason for assigning priority order. If you're reluctant to allow the library to delay() rather than yield(), I do think wrapping the Config Manager in its own low-priority task would be a good compromise that avoids modifying this library and simplifies updating it. May I understand your reluctance to switch out the yield()? If the library needs to apply a delay anyway for the ESP32-S2 boards, wouldn't it make sense to have one common code path that works for both ESP32 and ESP32-S2 in a broader variety of situations? |
You actually don't need to know the internal details of the library, just the function you're using it for. Then assign the priority accordingly after comparing it to all other tasks. Config-Portal and GUI-related tasks are generally time-consuming and don't need high priority. Some delay is still acceptable for them. The other tasks (measuring, etc.) require higher or the highest priority, as they must do something, even if very briefly, on time, and delay is normally not acceptable.
I only do what's necessary, as delay() is generally a waste of MPU time. For ESP32-S2, I believe we still need delay() because of certain design mistakes in the still-immature esp32-s2 core, especially as this is just a 1-core MPU. Having just rechecked this ESPAsync_WiFiManager library code, I find it bad and inefficient in how it uses the AsyncWebServer. I'll rewrite the library soon to make it more efficient and less needlessly time-consuming, because I'm now even having issues converting it to support ESP32-S2. The unnecessary time-hogging and inefficiency are the reasons you have an issue with the Config Portal. Try the better ESPAsync_WiFiManager_Lite and see if your issue disappears. |
Yep, I think we see the same thing. Even my suggestion of a delay() is a patch on top of that spin-loop, which appears to be an artifact from the non-async version. Appreciate the link, the help, and what you're doing for the community! I'll stop adding to the thread here. 😊 |
Hi @russelljahn, you can test the new release v1.5.0, which I think fixes the time-hogging issue when scanning networks. If you still need delay(), you can change it to whatever value (ms) is necessary.
Major Releases v1.5.0
|
Thanks for the update! I'll check out the update when I'm able. |
Thanks for the feedback. I've just tested and got these preliminary results
Are you running ESP32 core v1.0.5? If so, try core v1.0.4. I'm not sure that will finally fix your |
@khoih-prog In my build toolchain, I use the master branch of the ESP32 core to be able to use the latest C++ capabilities: https://github.com/espressif/arduino-esp32.git#master (if you use PlatformIO). I understand your sentiment about the complexity of diagnosing priority/watchdog issues. For now this isn't a blocker, as I'm sticking with v1.4.3 plus my patch, which appears to be the most stable and reliable solution for my project. I can't afford more development time to troubleshoot these issues, but as always, I appreciate your responses. |
I've already found the problem and will fix it and release a new version today, which combines the best working code of v1.4.3 and v1.5.0. |
The new ESPAsync_WiFiManager v1.6.0 has just been released to fix the mentioned WiFi Scanning. The following
Releases v1.6.0
|
Hi, first I want to say thank you for this library. It's fantastically well documented. I've also compared the old version of this library to this newer async version, and the improved snappiness of the experience is night & day.
I'm working on fairly complex firmware for a sensor-based project, and can consistently repro a watchdog timeout crash that only happens when startConfigPortal() is called from a task with a priority above the default of 0.
Debugging my firmware, I've simplified the code down to these 2 tasks: [...] ESPAsync_WiFiManager::startConfigPortal() inside of a task.
I noted that if I decrease the Network Task's priority to 0, the watchdog crash will not happen. This is intuitive if I examine the source code:
In ESPAsync_WiFiManager.cpp, there's a yield() statement in the spin loop waiting for the ESPAsync_WiFiManager config portal to time out or be closed.
Unlike delay(milliseconds)/vTaskDelay(ticks), which suspends the currently running Task and triggers the Task scheduler to schedule the next available Task, yield() doesn't necessarily suspend the current Task.
yield() actually moves the current Task to the bottom of the current priority group, and triggers the Task scheduler to run again. If the currently running Task is the highest priority one, yield() will reschedule this same Task again. This is the expected FreeRTOS behaviour: https://www.freertos.org/FreeRTOS_Support_Forum_Archive/February_2007/freertos_taskYIELD_from_highest_priority_task._1663277.html
[...] yield() in startConfigPortal(). This starves the watchdog and eventually triggers a crash.
Replacing yield() with delay() fixes the crash(!)
I'd suggest using delay() in line 887 instead, as that will allow yielding processing time to lower priority tasks. This allows using the library without modifications for more complex projects like mine, which require different task priorities. If you'd like to keep the same default behaviour for existing users, this would be easy to #ifdef with a flag like your other opt-in features.
I'm more than happy to create a PR with the fix if you'd review it. 😃
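To make the yield() vs delay() difference concrete, here is a rough, host-side sketch of a fixed-priority scheduler. It is a toy model, not FreeRTOS and not code from this library: the names SimTask, Pause and simulate are made up for illustration, one "tick" stands for one scheduling decision, and a delay of N ticks is assumed to block the task for the N ticks after the one it ran in.

```cpp
#include <cassert>
#include <vector>

// What a simulated task calls at the end of each pass through its loop.
enum class Pause { Yield, Delay };

// Hypothetical task model (not a FreeRTOS type).
struct SimTask {
    int priority;    // larger number = higher priority, as in FreeRTOS
    Pause pause;     // how the task gives up the CPU after each pass
    int delayTicks;  // only used when pause == Pause::Delay
    int wakeTick;    // first tick at which the task is ready again
    int runs;        // how many times the scheduler picked this task
    int lastRun;     // tick of the most recent run, for round-robin ties

    SimTask(int prio, Pause p, int delay)
        : priority(prio), pause(p), delayTicks(delay),
          wakeTick(0), runs(0), lastRun(-1) {}
};

// Runs the toy scheduler for `ticks` ticks and returns each task's run count,
// in the same order as the input vector.
std::vector<int> simulate(std::vector<SimTask> tasks, int ticks) {
    assert(ticks >= 0);
    for (int now = 0; now < ticks; ++now) {
        SimTask* next = nullptr;
        for (SimTask& t : tasks) {
            if (t.wakeTick > now) continue;  // still blocked in a delay
            // Highest priority wins; equal priorities take turns
            // (the least recently run task goes first).
            if (next == nullptr || t.priority > next->priority ||
                (t.priority == next->priority && t.lastRun < next->lastRun)) {
                next = &t;
            }
        }
        if (next == nullptr) continue;  // every task is blocked this tick
        ++next->runs;
        next->lastRun = now;
        if (next->pause == Pause::Delay) {
            // Blocked for the following delayTicks ticks.
            next->wakeTick = now + 1 + next->delayTicks;
        }
        // Pause::Yield leaves the task ready, so a higher-priority task
        // that only yields is simply picked again on the next tick.
    }
    std::vector<int> counts;
    for (const SimTask& t : tasks) counts.push_back(t.runs);
    return counts;
}
```

In this model, over 10 ticks, a priority-1 task that only yields runs all 10 times and a priority-0 task never runs. Switching the priority-1 task to a 1-tick delay gives each task 5 runs, and so does leaving it on yield but dropping it to priority 0, which lines up with the behaviour described above.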