async_work: worker callback is called twice #992
Comments
I have to take a closer look. But normally
These functions should only be called once, I think. That is more realistic for real app usage, and each test should leave a clean state
I think this only hides possible problems. |
Thinking generally: If the last Regarding Edit: I forgot: The debugging showed that the worker thread processed the |
So the job entry is reused after |
I guess both are true. Test 1 does:
Test 2:
This is my theory, derived from the valgrind call stack. I will try to verify it with debugging or with a retest. |
The debugging commits of branch show that the async work callback is called twice, so there is no interference from the previous test case. It seems to be a different problem with async_work or mqueue. I have updated the title of the issue.
|
Just a quick guess, but can you try this? It should not be needed, but maybe there is a strange race condition:
diff --git a/src/async/async.c b/src/async/async.c
index 7d37ac7f..35fb0814 100644
--- a/src/async/async.c
+++ b/src/async/async.c
@@ -70,6 +70,7 @@ static int worker_thread(void *arg)
 		mtx_lock(work->mtx);
 		if (work->workh)
 			work->err = work->workh(work->arg);
+		work->workh = NULL;
 		mtx_unlock(work->mtx);
 		mtx_lock(&a->mtx);
@@ -151,6 +152,7 @@ static void queueh(int id, void *data, void *arg)
 	mtx_unlock(work->mtx);
 	mtx_lock(&async->mtx);
+	work->cb = NULL;
 	list_move(&work->le, &async->freel);
 	mtx_unlock(&async->mtx);
 }
|
I tried ~10 times. This solves the crash with double
 	mtx_lock(work->mtx);
-	if (work->cb)
+	if (work->cb) {
 		work->cb(work->err, work->arg);
+		work->cb = NULL;
+	}
 	mtx_unlock(work->mtx);
Do you have an idea why the same object is read twice from the |
Yes looks fine.
Is |
One
Can be reproduced with branch
|
This solves a strange race condition that results in `cb` being called twice for one async work object.
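The idea behind the workaround (clear the callback pointer right after invoking it, while still holding the work item's lock) can be sketched in isolation. This is only an illustration under stated assumptions: `struct work_item`, `work_item_init`, `work_item_deliver` and `count_cb` are hypothetical names, not libre's types or API, and a POSIX `pthread_mutex_t` stands in for the `mtx_*` calls used in `async.c`.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an async work item; not libre's real struct. */
typedef void (work_cb)(int err, void *arg);

struct work_item {
	pthread_mutex_t mtx;  /* protects cb and err */
	work_cb *cb;          /* result callback, NULL once it has been consumed */
	int err;              /* result of the work handler */
	void *arg;            /* user argument passed to the callback */
};

static void work_item_init(struct work_item *w, work_cb *cb, void *arg)
{
	assert(w != NULL);

	pthread_mutex_init(&w->mtx, NULL);
	w->cb  = cb;
	w->err = 0;
	w->arg = arg;
}

/* Deliver the result at most once.
 * Returns 1 if the callback ran, 0 if it had already been delivered. */
static int work_item_deliver(struct work_item *w)
{
	int called = 0;

	pthread_mutex_lock(&w->mtx);
	if (w->cb) {
		/* As in the diff above, the callback runs with the lock held. */
		w->cb(w->err, w->arg);
		w->cb = NULL;  /* a second delivery becomes a no-op */
		called = 1;
	}
	pthread_mutex_unlock(&w->mtx);

	return called;
}

/* Example callback: counts how often it was invoked. */
static void count_cb(int err, void *arg)
{
	(void)err;
	++*(int *)arg;
}
```

If `work_item_deliver()` is called twice on the same item, `count_cb` runs only on the first call. As noted in this thread, such a guard only masks the symptom: it does not explain why the same work object is delivered twice in the first place.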
Workaround: #993 The workaround does not solve the root of the problem which seems to be in |
I will try to reproduce this today; maybe there is a strange bug in the event loop handling. |
Thanks! Maybe the |
I think I have found the root cause. Looks like both epoll loops are Will try to fix this. edit: wrong assumption, it's executed by the same main thread. Very strange. |
mqueue: ... [DATA Pointer] [Thread ID]
Now we are getting closer: there is another mqueue push from a different thread before the execute, which is hard to see at first glance. |
Observed this with baresip selftest in branch https://github.com/cspiel1/baresip/tree/rx_thread_activate, e.g. when executing these two tests in a row
then, not always but often I get:
Which states that `test_call_rtcp` creates the async work and the callback (for the result) is called during next test `test_call_rtp_timeout`. Are there multiple bugs?
- `mqueue` of `re_async` is not cleared by `re_async_cancel(async, id)`. A simple clear would not be correct, because the `id` has to be respected.
- `libre_init()`, `libre_close()` resp. `re_thread_async_init()`, `re_thread_async_close()` in order to have each test independent from previous?
- `libre_flush(void)` that flushes the complete mainloop, clears all async worker and clears all file descriptor handlers?
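To illustrate why a cancel has to respect the `id` rather than simply clear the queue, here is a minimal sketch of an id-aware cancel on a pending-result queue. Everything in it is an assumption made for illustration: the fixed-size array, the `intptr_t` id type and the names `struct result_queue`, `queue_push` and `queue_cancel` are hypothetical and do not reflect how libre's `mqueue` or `re_async` are implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { QUEUE_MAX = 8 };  /* arbitrary capacity for this sketch */

/* One pending result: the id it was queued under plus its payload. */
struct entry {
	intptr_t id;
	void *data;
};

/* Hypothetical queue of results waiting to be handled by the main loop. */
struct result_queue {
	struct entry e[QUEUE_MAX];
	size_t n;  /* number of valid entries */
};

/* Append a pending result. Returns 0 on success, -1 if the queue is full. */
static int queue_push(struct result_queue *q, intptr_t id, void *data)
{
	assert(q != NULL);

	if (q->n >= QUEUE_MAX)
		return -1;

	q->e[q->n].id   = id;
	q->e[q->n].data = data;
	q->n++;

	return 0;
}

/* Drop only the entries queued under `id`, keeping the others in order.
 * Returns the number of entries removed. */
static size_t queue_cancel(struct result_queue *q, intptr_t id)
{
	size_t kept = 0;
	size_t removed;
	size_t i;

	assert(q != NULL);

	for (i = 0; i < q->n; i++) {
		if (q->e[i].id != id)
			q->e[kept++] = q->e[i];
	}

	removed = q->n - kept;
	q->n = kept;

	return removed;
}
```

With entries queued under ids 1, 2 and 1, `queue_cancel(&q, 1)` removes two entries and leaves the one queued under id 2 untouched. A plain clear would also have dropped that unrelated entry.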