
Scheduler: Unexpected behaviour on 'sched_yield' when task priority is 0 (minimum) and 'configIDLE_SHOULD_YIELD = 1' (IDFGH-5528) #7256

Closed
carlessole opened this issue Jul 9, 2021 · 15 comments


carlessole commented Jul 9, 2021

  • Development Kit: ESP32-Wrover-Kit
  • Kit version: v4
  • Module or chip used: ESP32-WROVER
  • IDF version: v4.3
  • Build System: CMake/idf.py (using the Eclipse plugin)
  • Compiler version: esp-2020r3-8.4.0
  • Operating System: Windows
  • (Windows only) environment type: -.
  • Using an IDE?: Yes. Eclipse + ESP-IDF plugin
  • Power Supply: USB

I have found these 3 configuration parameters, which I consider very important in deciding the scheduler behaviour. Their default values in ESP-IDF are:

configUSE_PREEMPTION = 1
configUSE_TIME_SLICING = 1
configIDLE_SHOULD_YIELD = 0

With that default configuration, we create 3 tasks with minimum priority (0) that just perform a sched_yield (yes, there is a lot of code duplication across the 'test_scheduler_task_X' functions, but I want it this way to make debugging easier):

#include <stdio.h>
#include <sched.h>              // sched_yield()
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"
#include "esp_timer.h"          // esp_timer_get_time()

#include "intesis_main.h"

TaskHandle_t pTestSchedulerTaskAHandle = NULL;
TaskHandle_t pTestSchedulerTaskBHandle = NULL;
TaskHandle_t pTestSchedulerTaskCHandle = NULL;

void test_scheduler_task_A(void *arg)
{
    int64_t us_since_boot_previous = esp_timer_get_time();
    uint64_t task_executions_counter = 0;

    sched_yield();
    while (1)
    {
        task_executions_counter++;
        if (task_executions_counter >= 1000)
        {
            int64_t us_since_boot_current = esp_timer_get_time();

            printf("'test_scheduler_task_A' has been called %llu times in %lldus. Period = %lldus\r\n",
                   task_executions_counter, (us_since_boot_current - us_since_boot_previous),
                   (us_since_boot_current - us_since_boot_previous) / task_executions_counter);

            task_executions_counter = 0;
            us_since_boot_previous = esp_timer_get_time();
        }

        sched_yield();
    }

    // This code should never be reached
    vTaskDelete(pTestSchedulerTaskAHandle);
}

void test_scheduler_task_B(void *arg)
{
    int64_t us_since_boot_previous = esp_timer_get_time();
    uint64_t task_executions_counter = 0;

    sched_yield();
    while (1)
    {
        task_executions_counter++;
        if (task_executions_counter >= 1000)
        {
            int64_t us_since_boot_current = esp_timer_get_time();

            printf("'test_scheduler_task_B' has been called %llu times in %lldus. Period = %lldus\r\n",
                   task_executions_counter, (us_since_boot_current - us_since_boot_previous),
                   (us_since_boot_current - us_since_boot_previous) / task_executions_counter);

            task_executions_counter = 0;
            us_since_boot_previous = esp_timer_get_time();
        }

        sched_yield();
    }

    // This code should never be reached
    vTaskDelete(pTestSchedulerTaskBHandle);
}

void test_scheduler_task_C(void *arg)
{
    int64_t us_since_boot_previous = esp_timer_get_time();
    uint64_t task_executions_counter = 0;

    sched_yield();
    while (1)
    {
        task_executions_counter++;
        if (task_executions_counter >= 1000)
        {
            int64_t us_since_boot_current = esp_timer_get_time();

            printf("'test_scheduler_task_C' has been called %llu times in %lldus. Period = %lldus\r\n",
                   task_executions_counter, (us_since_boot_current - us_since_boot_previous),
                   (us_since_boot_current - us_since_boot_previous) / task_executions_counter);

            task_executions_counter = 0;
            us_since_boot_previous = esp_timer_get_time();
        }

        sched_yield();
    }

    // This code should never be reached
    vTaskDelete(pTestSchedulerTaskCHandle);
}

void app_main(void)
{
    BaseType_t rsp = pdPASS;

    rsp = xTaskCreate(test_scheduler_task_A, "test_scheduler_task_A", 4096, &pTestSchedulerTaskAHandle, 0,
                      &pTestSchedulerTaskAHandle);
    if (rsp != pdPASS)
    {
        printf("'test_scheduler_task_A' can not be created\r\n");
    }
    vTaskDelay(500 / portTICK_PERIOD_MS);
    rsp = xTaskCreate(test_scheduler_task_B, "test_scheduler_task_B", 4096, &pTestSchedulerTaskBHandle, 0,
                      &pTestSchedulerTaskBHandle);
    if (rsp != pdPASS)
    {
        printf("'test_scheduler_task_B' can not be created\r\n");
    }
    vTaskDelay(500 / portTICK_PERIOD_MS);
    rsp = xTaskCreate(test_scheduler_task_C, "test_scheduler_task_C", 4096, &pTestSchedulerTaskCHandle, 0,
                      &pTestSchedulerTaskCHandle);
    if (rsp != pdPASS)
    {
        printf("'test_scheduler_task_C' can not be created\r\n");
    }

    while (1)
    {
        vTaskDelay(10000 / portTICK_PERIOD_MS);
    }
}

With the FreeRTOS configuration above and this code, these are the results:

'test_scheduler_task_A' has been called 1000 times in 9999992us. Period = 9999us
'test_scheduler_task_B' has been called 1000 times in 10000000us. Period = 10000us
'test_scheduler_task_C' has been called 1000 times in 10000000us. Period = 10000us
'test_scheduler_task_A' has been called 1000 times in 9999345us. Period = 9999us
'test_scheduler_task_B' has been called 1000 times in 9999865us. Period = 9999us
'test_scheduler_task_C' has been called 1000 times in 9999869us. Period = 9999us
'test_scheduler_task_A' has been called 1000 times in 9999872us. Period = 9999us
'test_scheduler_task_B' has been called 1000 times in 9999872us. Period = 9999us
'test_scheduler_task_C' has been called 1000 times in 9999872us. Period = 9999us

As can be seen, our tasks are executed every 10 ms. We don't like that, because we need to return to the tasks faster. I assume this is happening because

configIDLE_SHOULD_YIELD = 0

and so the IDLE tasks do not YIELD until a tick happens. And as the FreeRTOS tick frequency is 100 Hz, that is where the 10 ms comes from. I know I can configure the FreeRTOS tick up to 1000 Hz (period = 1 ms), but I would like to avoid being dependent on the FreeRTOS tick.
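
(For reference, in ESP-IDF the tick rate comes from menuconfig and ends up in FreeRTOSConfig.h roughly as the line below; this is only a sketch based on ESP-IDF v4.x, the exact spelling may differ.)

#define configTICK_RATE_HZ    ( CONFIG_FREERTOS_HZ )   /* menuconfig: Component config -> FreeRTOS -> "Tick rate (Hz)"; e.g. 1000 gives a 1 ms tick */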

So, as explained here (https://www.freertos.org/a00110.html#configIDLE_SHOULD_YIELD), I tried to set configIDLE_SHOULD_YIELD = 1 (in esp-idf\components\freertos\port\xtensa\include\freertos\FreeRTOSConfig.h). These are my current configuration parameters:

configUSE_PREEMPTION = 1
configUSE_TIME_SLICING = 1
configIDLE_SHOULD_YIELD = 1

Just by making this change to configIDLE_SHOULD_YIELD, I see that task A is active every 20 us (or less), which is great! But it seems that only task A is executed with that latency (20 us). The other two tasks are still running every 10 ms.

'test_scheduler_task_A' has been called 50000 times in 1008372us. Period = 20us
'test_scheduler_task_A' has been called 50000 times in 1013367us. Period = 20us
'test_scheduler_task_A' has been called 50000 times in 1008321us. Period = 20us
'test_scheduler_task_A' has been called 50000 times in 1013367us. Period = 20us
'test_scheduler_task_A' has been called 50000 times in 1013333us. Period = 20us
'test_scheduler_task_A' has been called 50000 times in 1008370us. Period = 20us
'test_scheduler_task_A' has been called 50000 times in 1013333us. Period = 20us
'test_scheduler_task_B' has been called 1000 times in 9449897us. Period = 9449us
'test_scheduler_task_A' has been called 50000 times in 1013155us. Period = 20us
'test_scheduler_task_C' has been called 1000 times in 9989909us. Period = 9989us
'test_scheduler_task_A' has been called 50000 times in 1013355us. Period = 20us
'test_scheduler_task_A' has been called 50000 times in 1008370us. Period = 20us
'test_scheduler_task_A' has been called 50000 times in 1013334us. Period = 20us
'test_scheduler_task_A' has been called 50000 times in 1008372us. Period = 20us
'test_scheduler_task_A' has been called 50000 times in 1013331us. Period = 20us
'test_scheduler_task_A' has been called 50000 times in 1008372us. Period = 20us
'test_scheduler_task_A' has been called 50000 times in 1013366us. Period = 20us

...

With the help of the debugger and this plugin (https://mcuoneclipse.com/2016/07/06/freertos-kernel-awareness-for-eclipse-from-nxp/), I can also get this information:

[screenshot: FreeRTOS kernel-awareness task list from the Eclipse plugin]

Is this the normal behaviour?

Am I missing something?

Thanks

espressif-bot added the Status: Opened label Jul 9, 2021
github-actions bot appended "(IDFGH-5528)" to the issue title Jul 9, 2021
carlessole (Author) commented:

Hello again,
I have set configIDLE_SHOULD_YIELD = 0 again (the default).
But this time I actively force a sched_yield from the IDLE task by registering an IDLE hook:

#include "esp_freertos_hooks.h"   // esp_register_freertos_idle_hook_for_cpu()

bool my_idle_hook(void)
{
    sched_yield();
    return true;
}

/* Registered (e.g. from app_main) for both cores: */
    esp_err_t err = esp_register_freertos_idle_hook_for_cpu(my_idle_hook, 0);
    if (err != ESP_OK)
    {
        printf("'esp_register_freertos_idle_hook_for_cpu' on coreID 0 failed\r\n");
    }
    err = esp_register_freertos_idle_hook_for_cpu(my_idle_hook, 1);
    if (err != ESP_OK)
    {
        printf("'esp_register_freertos_idle_hook_for_cpu' on coreID 1 failed\r\n");
    }

Doing that, the behaviour is the same as setting configIDLE_SHOULD_YIELD = 1 (unexpected and undesired).

Reading the documentation, I see this on "esp_register_freertos_idle_hook_for_cpu":

  * @warning Idle callbacks MUST NOT, UNDER ANY CIRCUMSTANCES, CALL
  *          A FUNCTION THAT MIGHT BLOCK.

So maybe ESP-IDF is not prepared for the IDLE tasks to YIELD?

Thanks

espressif-bot added the Status: In Progress label and removed the Status: Opened label Oct 5, 2021

Dazza0 commented Oct 13, 2021

@carlessole

We don't support modifying FreeRTOSConfig.h directly (otherwise we'd have to support every possible combination of FreeRTOS configurations). Any FreeRTOS related configuration that we do support changing is exposed via menuconfig. But if this is a feature request, I think adding support for yielding IDLE shouldn't be too difficult (the only blocker I can see is the DFS or sleep code that's embedded into the IDLE task).

But I'll double check the scheduling behavior. Is the code you posted above run with both cores enabled (i.e., is CONFIG_FREERTOS_UNICORE left unset)?

Dazza0 self-assigned this Oct 15, 2021
carlessole (Author) commented:

Thanks @Dazza0,
Both cores are enabled!
[screenshot attached]

Carles


Dazza0 commented Oct 15, 2021

@carlessole We've located the bug: under certain conditions, the same task can be repeatedly scheduled. We have an MR in review to fix the "skipping" behaviour of the round-robin scheduling.


AxelLin commented Oct 18, 2021

@Dazza0
Is it possible to post the fix here for testing?


Dazza0 commented Oct 19, 2021

@AxelLin Regarding temporary code for testing: based on the current master, replace the vTaskSwitchContext() block in components/freertos/tasks.c with the following:

static void taskSelectHighestPriorityTaskSMP( void )
{
    /* This function is called from a critical section. So some optimizations are made */
    BaseType_t uxCurPriority;
    BaseType_t xTaskScheduled = pdFALSE;
    BaseType_t xNewTopPrioritySet = pdFALSE;
    BaseType_t xCoreID = xPortGetCoreID();  /* Optimization: Read once */

    /* Search for tasks, starting from the highest ready priority. If nothing is
     * found, we eventually default to the IDLE tasks at priority 0 */
    for ( uxCurPriority = uxTopReadyPriority; uxCurPriority >= 0 && xTaskScheduled == pdFALSE; uxCurPriority-- )
    {
        /* Check if current priority has one or more ready tasks. Skip if none */
        if( listLIST_IS_EMPTY( &( pxReadyTasksLists[ uxCurPriority ] ) ) )
        {
            continue;
        }

        /* Save a copy of highest priority that has a ready state task */
        if( xNewTopPrioritySet == pdFALSE )
        {
            xNewTopPrioritySet = pdTRUE;
            uxTopReadyPriority = uxCurPriority;
        }

        /* We now search this priority's ready task list for a runnable task */

        /* We always start searching from the head of the list, so we reset
         * pxIndex to point to the tail so that we start walking the list from
         * the first item */
        pxReadyTasksLists[ uxCurPriority ].pxIndex = ( ListItem_t * ) &( pxReadyTasksLists[ uxCurPriority ].xListEnd );

        /* Get the first item on the list */
        TCB_t * pxTCBCur;
        TCB_t * pxTCBFirst;
        listGET_OWNER_OF_NEXT_ENTRY( pxTCBCur, &( pxReadyTasksLists[ uxCurPriority ] ) );
        pxTCBFirst = pxTCBCur;
        do
        {
            /* Check if the current task is currently being executed. However, if
             * it's being executed by the current core, we can still schedule it.
             * Todo: Each task can store a xTaskRunState, instead of needing to
             *       check each core */
            UBaseType_t ux;
            for( ux = 0; ux < ( UBaseType_t )configNUM_CORES; ux++ )
            {
                if ( ux == xCoreID )
                {
                    continue;
                }
                else if ( pxCurrentTCB[ux] == pxTCBCur )
                {
                    /* Current task is already being executed. Get the next task */
                    goto get_next_task;
                }
            }

            /* Check if the current task has a compatible affinity */
            if ( ( pxTCBCur->xCoreID != xCoreID ) && ( pxTCBCur->xCoreID != tskNO_AFFINITY ) )
            {
                goto get_next_task;
            }

            /* The current task is runnable. Schedule it */
            pxCurrentTCB[ xCoreID ] = pxTCBCur;
            xTaskScheduled = pdTRUE;

            /* Move the current tasks list item to the back of the list in order
             * to implement best effort round robin. To do this, we need to reset
             * the pxIndex to point to the tail again. */
            pxReadyTasksLists[ uxCurPriority ].pxIndex = ( ListItem_t * ) &( pxReadyTasksLists[ uxCurPriority ].xListEnd );
            uxListRemove( &( pxTCBCur->xStateListItem ) );
            vListInsertEnd( &( pxReadyTasksLists[ uxCurPriority ] ), &( pxTCBCur->xStateListItem ) );
            break;

get_next_task:
            /* The current task cannot be scheduled. Get the next task in the list */
            listGET_OWNER_OF_NEXT_ENTRY( pxTCBCur, &( pxReadyTasksLists[ uxCurPriority ] ) );
        } while( pxTCBCur != pxTCBFirst);  /* Check to see if we've walked the entire list */
    }

    assert( xTaskScheduled == pdTRUE ); /* At this point, a task MUST have been scheduled */
}

void vTaskSwitchContext( void )
{
    //Theoretically, this is only called from either the tick interrupt or the crosscore interrupt, so disabling
    //interrupts shouldn't be necessary anymore. Still, for safety we'll leave it in for now.
    int irqstate=portENTER_CRITICAL_NESTED();

    if( uxSchedulerSuspended[ xPortGetCoreID() ] != ( UBaseType_t ) pdFALSE )
    {
        /* The scheduler is currently suspended - do not allow a context
         * switch. */
        xYieldPending[ xPortGetCoreID() ] = pdTRUE;
    }
    else
    {
        xYieldPending[ xPortGetCoreID() ] = pdFALSE;
        xSwitchingContext[ xPortGetCoreID() ] = pdTRUE;
        traceTASK_SWITCHED_OUT();

        #if ( configGENERATE_RUN_TIME_STATS == 1 )
            {
                #ifdef portALT_GET_RUN_TIME_COUNTER_VALUE
                    portALT_GET_RUN_TIME_COUNTER_VALUE( ulTotalRunTime );
                #else
                    ulTotalRunTime = portGET_RUN_TIME_COUNTER_VALUE();
                #endif

                /* Add the amount of time the task has been running to the
                 * accumulated time so far.  The time the task started running was
                 * stored in ulTaskSwitchedInTime.  Note that there is no overflow
                 * protection here so count values are only valid until the timer
                 * overflows.  The guard against negative values is to protect
                 * against suspect run time stat counter implementations - which
                 * are provided by the application, not the kernel. */
                taskENTER_CRITICAL_ISR();
                if( ulTotalRunTime > ulTaskSwitchedInTime[ xPortGetCoreID() ] )
                {
                    pxCurrentTCB[ xPortGetCoreID() ]->ulRunTimeCounter += ( ulTotalRunTime - ulTaskSwitchedInTime[ xPortGetCoreID() ] );
                }
                else
                {
                    mtCOVERAGE_TEST_MARKER();
                }
                taskEXIT_CRITICAL_ISR();
                ulTaskSwitchedInTime[ xPortGetCoreID() ] = ulTotalRunTime;
            }
        #endif /* configGENERATE_RUN_TIME_STATS */

        /* Check for stack overflow, if configured. */
        taskFIRST_CHECK_FOR_STACK_OVERFLOW();
        taskSECOND_CHECK_FOR_STACK_OVERFLOW();

        /* Select a new task to run */

        /*
         We cannot do taskENTER_CRITICAL_ISR(); here because it saves the interrupt context to the task tcb, and we're
         swapping that out here. Instead, we're going to do the work here ourselves. Because interrupts are already disabled, we only
         need to acquire the mutex.
        */
        vPortCPUAcquireMutex( &xTaskQueueMutex );

        #if !configUSE_PORT_OPTIMISED_TASK_SELECTION

        taskSelectHighestPriorityTaskSMP();

        #else
        //For Unicore targets we can keep the current FreeRTOS O(1)
        //scheduler. I hope to optimize the scheduler further for
        //multicore settings -- this will involve creating a per-affinity
        //ready task list, which will hugely impact the tasks module
        taskSELECT_HIGHEST_PRIORITY_TASK();
        #endif

        traceTASK_SWITCHED_IN();
        xSwitchingContext[ xPortGetCoreID() ] = pdFALSE;

#ifdef ESP_PLATFORM
        //Exit critical region manually as well: release the mux now, interrupts will be re-enabled when we
        //exit the function.
        vPortCPUReleaseMutex( &xTaskQueueMutex );
#endif // ESP_PLATFORM

        #if CONFIG_FREERTOS_WATCHPOINT_END_OF_STACK
        vPortSetStackWatchpoint(pxCurrentTCB[xPortGetCoreID()]->pxStack);
        #endif

    }
    portEXIT_CRITICAL_NESTED(irqstate);
}

The new algorithm basically inserts the scheduled task at the back of the list, thus making sure that unscheduled tasks are selected first on the next tick.
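
A minimal sketch of the idea (a toy model using a plain array rather than the kernel's linked lists; task X stands in for a task the current core cannot run):

#include <stdbool.h>
#include <stdio.h>

#define NUM_TOY_TASKS 3

typedef struct { char name; bool runnable; } toy_task_t;

static toy_task_t ready_list[NUM_TOY_TASKS] = { { 'A', true }, { 'X', false }, { 'B', true } };

/* Pick the first runnable entry, then rotate it to the back of the list so
 * the remaining entries are considered first on the next selection. */
static char select_and_rotate(void)
{
    for (int i = 0; i < NUM_TOY_TASKS; i++) {
        if (!ready_list[i].runnable) {
            continue;   /* e.g. already running on the other core */
        }
        toy_task_t picked = ready_list[i];
        for (int j = i; j < NUM_TOY_TASKS - 1; j++) {
            ready_list[j] = ready_list[j + 1];   /* shift the rest forward */
        }
        ready_list[NUM_TOY_TASKS - 1] = picked;  /* selected task goes to the back */
        return picked.name;
    }
    return '?';   /* nothing runnable */
}

int main(void)
{
    for (int tick = 0; tick < 4; tick++) {
        printf("tick %d -> %c\n", tick, select_and_rotate());   /* prints A, B, A, B */
    }
    return 0;
}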


AxelLin commented Oct 19, 2021

@Dazza0

Just curious, is #7351 also a solution?


Dazza0 commented Oct 19, 2021

@AxelLin Yes, that is also a solution. But it retains the "Skipping Behavior" described in the documentation.

Edit: Had another look. Looks like it does the same thing as the current algorithm.


AxelLin commented Oct 19, 2021

@Dazza0
To prepare for testing the existing issue vs. the solutions, I need to know:
Under what conditions will the original code always select the same task to schedule?

(I do have multiple tasks with the same priority in my own project, so I'm nervous about the impact of this issue.)


Dazza0 commented Oct 19, 2021

@AxelLin

1. Assume A and B can be scheduled by the current CPU, and X cannot
     idx
   [ A , X , B ]
2. listGET_OWNER_OF_NEXT_ENTRY() increments the index, but the returned task cannot be scheduled. Keep searching
         idx
   [ A , X , B ]
3. listGET_OWNER_OF_NEXT_ENTRY() increments the index. This time we can schedule: B is selected to run
             idx
   [ A , X , B ]
4. The buggy code resets pxIndex even though Task B was selected to be scheduled
     idx
   [ A , X , B ]
5. On the next tick, repeat from step 1, resulting in Task B being scheduled again (see the sketch below)

Edit: In summary, most application scenarios won't run into this issue, as we don't expect applications to have two tasks of the same priority that run forever (the task watchdog will trigger). So as long as those tasks block, the pattern will eventually break.
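
The same toy list as in the sketch from my earlier comment, but with the pre-fix index handling described in the steps above: the walk always steps past the stored index before testing an entry, and the index is reset to the head after every pick. This is a simplified model (not the kernel code) that reproduces the "B, B, B, ..." pattern:

#include <stdbool.h>
#include <stdio.h>

#define NUM_TOY_TASKS 3

typedef struct { char name; bool runnable; } toy_task_t;

static toy_task_t ready_list[NUM_TOY_TASKS] = { { 'A', true }, { 'X', false }, { 'B', true } };
static int idx = 0;   /* list index, reset to the head (A's position) after each pick */

static char buggy_select(void)
{
    for (int steps = 0; steps < NUM_TOY_TASKS; steps++) {
        idx = (idx + 1) % NUM_TOY_TASKS;   /* step first, then test: the entry at idx is skipped */
        if (ready_list[idx].runnable) {
            char picked = ready_list[idx].name;
            idx = 0;                       /* buggy reset back to the head */
            return picked;
        }
    }
    return '?';   /* nothing runnable */
}

int main(void)
{
    for (int tick = 0; tick < 4; tick++) {
        printf("tick %d -> %c\n", tick, buggy_select());   /* prints B, B, B, B -- A never runs */
    }
    return 0;
}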

espressif-bot added the Resolution: Done and Status: Done labels and removed the Status: In Progress label Nov 2, 2021

AxelLin commented Nov 4, 2021

@Dazza0

I notice your commit removed the taskENTER_CRITICAL_ISR(); and taskEXIT_CRITICAL_ISR(); calls around the code calculating ulRunTimeCounter:
8babb3b#diff-dc94871a7cae5e76d227db279a7b26d27a9a648faf63359143d43ae3997281abL3394
This is not mentioned in the commit log, so I'm not sure whether it's intentional.

Also, FYI, ulTotalRunTime is now volatile in the upstream code:
FreeRTOS/FreeRTOS-Kernel@e4e4fb0
https://forums.freertos.org/t/make-runtime-stats-working-with-compiler-optimization-enabled/9846
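
(For reference, the upstream change just adds a volatile qualifier to the counter; roughly as sketched below -- the exact type and surrounding macros in upstream may differ.)

PRIVILEGED_DATA static volatile uint32_t ulTotalRunTime = 0UL;   /* total run time, as measured by the run time counter clock */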


Dazza0 commented Nov 4, 2021

@AxelLin

  • Critical sections are still present: vTaskSwitchContext() calls taskSelectHighestPriorityTaskSMP() while already inside a critical section.
  • We haven't run into ulTotalRunTime being optimized out yet, but I don't see any harm in making it volatile. Could you file a separate GH issue?

espressif-bot pushed a commit that referenced this issue Nov 16, 2021
The previous SMP freertos round robin would skip over tasks when
time slicing. This commit implements a Best Effort Round Robin
where selected tasks are put to the back of the list, thus
makes the time slicing more fair.

- Documentation has been updated accordingly.
- Tidy up vTaskSwitchContext() to match v10.4.3 more
- Increased esp_ipc task stack size to avoid overflow

Closes #7256

AxelLin commented Nov 17, 2021

@Dazza0
Just a reminder: this report is against v4.3, so the fix should be backported to release/v4.3.


AxelLin commented Dec 24, 2021

The v4.3 branch still does not include this fix.

Alvin1Zhang (Collaborator) commented:

@AxelLin Thanks for the attention. We have backported this fix to 4.3, but it is still under internal review; we will sync up here once the fix is available on GitHub.

espressif-bot pushed a commit that referenced this issue Jan 3, 2022
(Same commit message as the Nov 16 commit above.)