Incorrect processing in Engine::ScheduleRun() #14596
Comments
…Layer->ScheduleWork() returns. (project-chip#14596) - setup mRunScheduled = true before execute ScheduleWork()
…mLayer->ScheduleWork() returns. (project-chip#14596) - setup mRunScheduled = true before execute ScheduleWork()
This all assumes that This specific call gives the impression that it's ok to call from a different threading context but in fact, it actually isn't. It accesses a number of SDK state that can change on it during a shutdown (like the |
@lnikulin can you provide more details on the original observed failure and how that translates to an unsafe call to |
Specifically, what would be most useful here is a stack to the |
@lnikulin Do we have the stack dump when the issue occurs? wondering which thread other than CHIP main thread is calling Engine::ScheduleRun() |
Call Stack looks like following:
on MIPS platform
|
Where are the "SchedWork() begin" and "Signal begin" prints coming from? That is, what code locations? |
File
|
OK, but that's not what we were asking. We are asking: what code is calling Someone is calling |
Write attribute emberAfWriteAttribute() from application thread (Write UniqueId attribute of Basic Cluster EP0)
Main application thread = 2012425860
Main application thread callstack:
|
Ok, this is the thing that's not allowed. This needs to happen off a task queued on the Matter event loop. Otherwise you can race not only with Engine::ScheduleRun but also with writes to attributes happening as a result of Matter message processing, so can end up with a corrupted attribute store. |
@bzbarsky-apple |
Yes, that is what you should do.
This is after it has already done a bunch of work, like writing to the attribute store, and the remaining (async) work is sending subscription updates as needed. Writing to the attribute store should only be happening from the Matter event loop. If you do it from a random thread, that write will race with writes due to incoming messages, and you will get corrupted data. Does that make sense?
It's not thread agnostic at all. |
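The "queue it on the event loop" pattern described above can be sketched in plain C++. This is a toy model only: ToyEventLoop, Post, Stop and LoopThreadId are hypothetical names invented for the illustration and are not SDK APIs. The point is that the application thread only enqueues the work, and the write itself runs on the single loop thread, so it cannot race with writes triggered by incoming messages.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical single-threaded event loop standing in for the Matter event loop.
class ToyEventLoop {
public:
    ToyEventLoop() : worker_([this] { Loop(); }) {}
    ~ToyEventLoop() { Stop(); }

    // Safe to call from any thread: it only enqueues; the work runs on the loop thread.
    void Post(std::function<void()> work) {
        assert(work != nullptr);
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(work));
        }
        cv_.notify_one();
    }

    // Lets the loop drain any queued work, then joins the loop thread.
    void Stop() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (stopping_) return;  // already stopped
            stopping_ = true;
        }
        cv_.notify_one();
        if (worker_.joinable()) worker_.join();
    }

    // Only meaningful while the loop is running (before Stop()).
    std::thread::id LoopThreadId() const { return worker_.get_id(); }

private:
    void Loop() {
        for (;;) {
            std::function<void()> work;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
                if (queue_.empty()) return;  // stopping and nothing left to run
                work = std::move(queue_.front());
                queue_.pop();
            }
            work();  // runs outside the lock, always on the loop thread
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> queue_;
    bool stopping_ = false;
    std::thread worker_;  // declared last so the other members exist before Loop() starts
};
```

In an application, the attribute write would go inside the posted callable instead of being called directly from the application thread. How this maps onto the SDK's own work-scheduling call is not shown here.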
In case, we use wrapper for |
Problem
Any attempt to read or write any attribute results in a TIMEOUT error.
Platform
MIPS24k
Root-cause and Proposed Solution
It looks like there is a race between threads, introduced in PR #13093.
In the function Engine::ScheduleRun() (/src/app/reporting/Engine.cpp), the mRunScheduled flag is set to true after calling systemLayer->ScheduleWork(Run, this); and is intended to be set back to false in the callback function (Engine::Run()).
However, if the callback finishes before ScheduleWork returns, mRunScheduled ends up set to true even though the scheduled task has already completed.
There is the following statement in src/include/platform/PlatformManager.h which is violated:
History of the issue
The original commit (#11155, ea68796ba9) introduced the mRunScheduled flag; it sometimes crashes, see #13093.
This was fixed in the following commit (#13093, 2ddefedca8), which introduced the race issue described above.
Proposed solution
Set mRunScheduled to true before calling ScheduleWork().
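The effect of the two orderings can be illustrated with a toy model. Everything here is hypothetical and simplified: ToySystemLayer runs the callback before ScheduleWork() returns (the worst-case interleaving), and ToyEngine assumes that ScheduleRun() returns early when the flag is already set. Neither is the actual SDK code.

```cpp
#include <functional>

// Hypothetical stand-in for the system layer. This toy version runs the
// callback before ScheduleWork() returns, modelling the case where the
// callback finishes first.
struct ToySystemLayer {
    void ScheduleWork(const std::function<void()>& work) { work(); }
};

// Hypothetical miniature of the reporting engine; only the flag handling is modelled.
struct ToyEngine {
    ToySystemLayer layer;
    bool runScheduled = false;
    int runs = 0;

    // Callback: clears the flag and counts the run.
    void Run() {
        runScheduled = false;
        ++runs;
    }

    // Racy ordering: the flag is set after ScheduleWork() returns.
    void ScheduleRunFlagAfter() {
        if (runScheduled) return;  // assumed early-out when a run is pending
        layer.ScheduleWork([this] { Run(); });
        runScheduled = true;  // Run() already cleared the flag; it is now stuck at true
    }

    // Proposed ordering: the flag is set before ScheduleWork() is called.
    void ScheduleRunFlagBefore() {
        if (runScheduled) return;  // assumed early-out when a run is pending
        runScheduled = true;
        layer.ScheduleWork([this] { Run(); });
    }
};
```

With the first ordering, a second call to ScheduleRunFlagAfter() does nothing because the flag is still true, so no further run is ever scheduled. With the second ordering, each call schedules a run and the flag is left false afterwards.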