Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue realted ESP32-S2 CDC to the Serial.setDebugOutput(true); #6766

Closed
SuGlider opened this issue May 17, 2022 · 17 comments · Fixed by #7284
Closed

Issue realted ESP32-S2 CDC to the Serial.setDebugOutput(true); #6766

SuGlider opened this issue May 17, 2022 · 17 comments · Fixed by #7284
Assignees
Labels
Area: Peripherals API Relates to peripheral's APIs. Status: In Progress Issue is in progress
Milestone

Comments

@SuGlider
Copy link
Collaborator

SuGlider commented May 17, 2022

I am starting to think that the current cause of the problem (on v2.0.3 stable) is some kind of race/lockup between two or more tasks, all of which want to write to the USB CDC. In my case, this happens due to concurrent use of log_X statements in more than one task. This scenario works correctly when using native serial pins.

Consider the following sketch:

#include <Arduino.h>

void setup()
{
  Serial.begin(115200);
  // setDebugOutput() call commented out for test
  //Serial.setDebugOutput(true);
  delay(2000);
}

void loop()
{
  static uint32_t counter = 1;

  Serial.printf("Using Serial.printf(), this is line %u of output, at msec %u...\r\n", counter, millis());
  //log_i("Using log_i(), this is line %u of output, at msec %u...", counter, millis());
  counter++;

  // No delay at all, blast port at full speed!
}

This is a single task using the serial port. When using USB CDC as the serial port, no combination of use or non-use of setDebugOutput and Serial.printf/log_i manages to lockup the board.

However, my sketches use WiFi, which starts a couple of tasks, each outputting via log_X. Also, AsyncTCPSock, which starts its own task. With this, I have noticed that the startup process, where all of these tasks display debugging strings, is the more likely place where the sketch hangs on output. After this, only the main loop keeps outputting strings, and it runs more or less normally.

Originally posted by @avillacis in #6221 (comment)

I have been reviewing the code of USBCDC.cpp , which handles the USB serial port in ESP32-S2. So far I have noticed the following:

  • The setDebugOutput() call installs a function, cdc0_write_char as a character output for debugging with ets_install_putc1.
  • The cdc0_write_char function, in turn, calls USBCDC::write(uint8_t), and through it, USBCDC::write(const uint8_t *, size_t).
  • The USBCDC::write(const uint8_t *, size_t) appears to acknowledge that it might be called from within an ISR context, as it checks with xPortInIsrContext(), and uses xSemaphoreTakeFromISR() or xSemaphoreTake() depending on the result.
  • According to the Espressif documentation, xSemaphoreTakeFromISR() outputs a flag, pxHigherPriorityTaskWoken, that is supposed to inform whether a higher-priority task was awoken as a result of calling the ISR-specific function. And if so, the function is supposed to request a context switch by calling portYIELD_FROM_ISR().
  • HOWEVER... the current implementation of USBCDC::write(const uint8_t *, size_t) collects this flag, only to discard it without taking any action based on its value, and certainly without calling portYIELD_FROM_ISR() anywhere in the function body.

So, what possible consequences might arise from failing to invokeportYIELD_FROM_ISR() as indicated in the documentation?

#6221 (comment)

@SuGlider SuGlider self-assigned this May 17, 2022
@SuGlider
Copy link
Collaborator Author

@avillacis
Let's follow the debug CDC issue here.

@VojtechBartoska VojtechBartoska moved this to Under investigation in Arduino ESP32 Core Project Roadmap May 18, 2022
@VojtechBartoska VojtechBartoska added Area: Peripherals API Relates to peripheral's APIs. Status: Needs investigation We need to do some research before taking next steps on this issue labels May 18, 2022
@avillacis
Copy link

avillacis commented May 18, 2022

At last I have a minimal reproducer sketch that reliably locks up the ESP32S2 board when using USB CDC as debug output:

#include <Arduino.h>

static void test_taskfunc(void * p)
{
    uint32_t tasknum = (uint32_t)p;

    while (true) {
        log_i("This is task instance %u running...", tasknum);
        
        delay(10 * random(1, 10));
    }
}

void setup()
{
    Serial.begin(115200);
    Serial.setDebugOutput(true);
    delay(2000);

    for (auto i = 0; i < 5; i++) {
        TaskHandle_t th;

        // Created tasks all need to have different priorities in order to trigger lockup
        UBaseType_t taskPrio = 1 + 2 + i;

        if (pdPASS == xTaskCreate(test_taskfunc, "test_taskfunc", 4096, (void *)(i + 1), taskPrio, &th)) {
            log_i("Task instance %u created with priority %u", i+1, 3 + i);
        } else {
            log_e("Task instance %u creation FAILED", i+1);
        }
    }
}

void loop()
{
    log_i("This is the main loop running...");

    delay(10 * random(1, 10));
}

Note that the sketch creates multiple tasks with different priorities. If all created tasks share the same priority, the output gets garbled but the sketch keeps running along. However, the sketch as shown above locks up the board as soon as the /dev/ttyACM0 (Linux) character device is opened for monitoring using any terminal program:

[  2758][I][test-esp32s2-multiple-tasks.ino:8] test_taskfunc(): This is task instance 1 running...
[  2764][I][test-esp32s2-multiple-tasks.ino:27] setup(): Task instance 1 created with priority 3
[  2770][I][test-esp32s2-multiple-tasks.ino:8] test_taskfunc(): This is task instance 2 running...
[  2776][I][test-esp32s2-multiple-tasks.ino:27] setup(): Task instance 2 created with priority 4
[  2783][I][test-esp32s2-multiple-tasks.ino:8] test_taskfunc(): This is task instance 3 running...
[  2785][I][test-esp32s2-multiple-tasks.ino:27] setup(): Task instance 3 created with priority 5
[  2791][I][

Also note that task creation itself without monitoring does NOT trigger the lockup. If the device is left alone without opening the character device, all tasks are created and run normally. But if the character device is then opened, the device locks up again:

$ miniterm.py /dev/ttyACM0 115200
--- Miniterm on /dev/ttyACM0  115200,8,N,1 ---
--- Quit: Ctrl+] | Menu: Ctrl+T | Help: Ctrl+T followed by Ctrl+H ---
[  8361][I][test-esp32s2-multiple-tasks.ino:8] test_taskfunc(): This is task instance 2 running...
[  8368][I][test-esp32s2-multiple-tasks.ino:36] loop(): Thi[  837

@avillacis
Copy link

Also note, the lockup requires the use of the log_X functions. If every log_X is replaced by direct Serial.printf calls, the lockup does not happen, even when monitored.

@SuGlider
Copy link
Collaborator Author

OK, I'll look into it and try to find out what is the issue here.

@avillacis
Copy link

Still happening on just-released Arduino-ESP32 2.0.4

@VojtechBartoska VojtechBartoska added this to the 2.1.0 milestone Jul 12, 2022
@tablatronix
Copy link
Contributor

tablatronix commented Sep 9, 2022

Ditto, Have to throttle USBSerial tx or else serial stops and or board freezes.

Have to disable debugoutput altogether or I get 2 lines and it stops

@avillacis
Copy link

Still happening on just-released Arduino-ESP32 2.0.5

@SuGlider
Copy link
Collaborator Author

I'll investigate it...

@SuGlider SuGlider moved this from Under investigation to In Progress in Arduino ESP32 Core Project Roadmap Sep 21, 2022
@SuGlider
Copy link
Collaborator Author

@avillacis @tablatronix
Please try the Fix on PR #7284

@SuGlider SuGlider moved this from In Progress to In Review in Arduino ESP32 Core Project Roadmap Sep 21, 2022
@avillacis
Copy link

@avillacis @tablatronix Please try the Fix on PR #7284

The lockup I was experiencing is technically "fixed" with the PR. The strings themselves appear truncated, but the rest appears when the program outputs more data using log_X functions.

@tablatronix
Copy link
Contributor

Looks much better, no lockup , wifi works, I see all logging as far as I can tell.
Not seeing anything truncated, ill try turning on more debugging

@tablatronix
Copy link
Contributor

tablatronix commented Oct 4, 2022

BUMP @me-no-dev @SuGlider we need traction on this issue

@me-no-dev
Copy link
Member

@tablatronix still into investigation of the cause. @SuGlider is on vacation currently and will get back on it next week.

@tablatronix
Copy link
Contributor

Bumpity bump bump

@KishCom
Copy link

KishCom commented Nov 13, 2022

Hit this bug on a project this weekend. Can confirm #7284 fixes things.

@VojtechBartoska VojtechBartoska removed this from the 2.0.6 milestone Nov 30, 2022
@VojtechBartoska VojtechBartoska added Status: In Progress Issue is in progress and removed Status: Needs investigation We need to do some research before taking next steps on this issue labels Nov 30, 2022
@paclema
Copy link
Contributor

paclema commented May 19, 2023

I can confirm it too!

@VojtechBartoska VojtechBartoska added this to the 2.0.15 milestone Dec 20, 2023
@VojtechBartoska
Copy link
Contributor

applied for both, v2.X and also v3.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: Peripherals API Relates to peripheral's APIs. Status: In Progress Issue is in progress
Projects
Development

Successfully merging a pull request may close this issue.

7 participants