[BUG] (Risk of a) deadlock in ros::Timer impl? #1980
Comments
Any ideas on this? I am facing this issue again in a different situation, which is really annoying. Any help is (still) highly appreciated.
@CodeFinder2, the problem is that the timer cannot be stopped while a callback associated with that timer is being executed. Combined with the external lock, this can currently cause a deadlock. We encountered this problem in a production system. I've reproduced it with the attached minimal-non-working example here. I also have a proposed fix / mitigation for this which I'll be filing shortly.
Thanks for your reply and for providing a fix! I also came up with a temporary solution (by creating a new thread in the
Resolved in #2121
Hi all,
I've already described my issue here in detail: https://answers.ros.org/question/355644/possible-risk-of-a-deadlock-in-rostimer-impl/
But since I didn't get any reply, I would like to post it here as well since it may be a bug.
Please refer to my linked post above on ROS Answers for the details.
TL;DR: I am using async spinners with multiple threads and a ROS timer. All my callbacks have to lock a specific mutex before doing anything else. What happened was that my `timerCallback()` was invoked while that mutex was already locked by another thread, which is perfectly fine so far. However, that other thread was calling `timer.stop()`, and since my `timerCallback()` did not return (and could not, because it was still passively waiting for the mutex to be released), `timer.stop()` waited indefinitely -> deadlock.
In my opinion, this should not happen. When I issue a `timer.stop()` but `timerCallback()` was already invoked slightly before, the callback should be allowed to continue without any locking. `timer.stop()`, once called, should prevent any new callbacks from being invoked but shouldn't care about a callback / event that has already been triggered.
Please take a look and let me know what you think!
This was happening on Ubuntu 20.04 LTS with ROS Noetic. Unfortunately, because this is a race condition that occurs only very rarely, I may not be able to reproduce it easily.
Thanks for taking a look!
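Since the race is hard to trigger on demand, the interleaving described above can be modelled without ROS. The following is only a minimal sketch using the C++ standard library: `ToyTimer`, `fire()`, `app_mutex`, and `run_scenario()` are hypothetical names invented for this illustration, and `ToyTimer::stop()` merely imitates the reported behaviour (waiting for a running callback); it is not the roscpp implementation. To keep the example from hanging, the callback uses a timed lock attempt instead of a plain `lock()`, which is an assumption of this sketch and not something taken from the original code.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for a timer whose stop() waits for a running callback.
// It only models the behaviour described in this issue; it is NOT roscpp code.
class ToyTimer {
public:
  explicit ToyTimer(std::function<void()> cb) : cb_(std::move(cb)) {}

  ~ToyTimer() {
    if (worker_.joinable()) worker_.join();
  }

  // Run the callback once on a worker thread (stands in for a spinner thread).
  void fire() {
    {
      std::lock_guard<std::mutex> lk(state_mutex_);
      running_ = true;
    }
    worker_ = std::thread([this] {
      cb_();
      {
        std::lock_guard<std::mutex> lk(state_mutex_);
        running_ = false;
      }
      done_cv_.notify_all();
    });
  }

  // Blocks until the in-flight callback (if any) has returned.
  void stop() {
    std::unique_lock<std::mutex> lk(state_mutex_);
    done_cv_.wait(lk, [this] { return !running_; });
  }

private:
  std::function<void()> cb_;
  std::mutex state_mutex_;
  std::condition_variable done_cv_;
  bool running_ = false;
  std::thread worker_;
};

// Replays the interleaving from the report.
// Returns true if the callback obtained the application mutex, false if it gave up.
bool run_scenario() {
  std::timed_mutex app_mutex;  // the mutex every callback locks first
  std::atomic<bool> callback_started{false};
  std::atomic<bool> got_lock{false};

  ToyTimer timer([&] {
    callback_started = true;
    // A plain app_mutex.lock() here would block forever, because the thread
    // holding app_mutex is itself blocked in stop() waiting for this callback.
    if (app_mutex.try_lock_for(std::chrono::milliseconds(200))) {
      got_lock = true;
      app_mutex.unlock();
    }
  });

  // The "other thread" of the report: it holds the mutex and then stops the timer.
  std::lock_guard<std::timed_mutex> guard(app_mutex);
  timer.fire();
  while (!callback_started) std::this_thread::yield();
  timer.stop();  // returns only after the callback has given up and returned
  return got_lock;
}
```

In this model `run_scenario()` returns `false`: the callback can never obtain the mutex while `stop()` is pending, and only the timeout lets both threads make progress. In a real node, one possible way around the problem is to avoid holding the shared mutex while calling `timer.stop()`, though whether that is feasible depends on the application's locking scheme.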