roscpp multithreaded spinners eat up CPU when callbacks take too long #1545
Comments
The MWE:
Play with the duration of the sleep in the callback to see the effects of this bug.
I found the busy-wait loop. It's caused by returning
The loops in
I verified two workarounds:
I'm thinking of (and I'll try to write) a solution with something like a condition variable that'd be used in case
If somebody has a better idea, please help :)
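To illustrate the general idea outside of roscpp: here is a minimal sketch contrasting a retry loop with a condition-variable wait. Everything in it (`Gate`, `pollUntilFree`, `waitUntilFree`, `measure`) is a hypothetical toy, not roscpp code and not the fix from the PRs; `busy` just stands in for "the callback cannot run yet because another thread is still executing the previous one".

```cpp
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>

// Toy stand-in (hypothetical, not a roscpp type) for a callback that is
// still being executed by another thread.
struct Gate {
  std::mutex mutex;
  std::condition_variable cv;
  bool busy = true;  // true while the long-running callback has not finished

  // Mark the long-running callback as finished and wake all waiters.
  void release() {
    {
      std::lock_guard<std::mutex> lock(mutex);
      busy = false;
    }
    cv.notify_all();
  }
};

// Busy-wait variant: retry immediately until the gate is free.
// Returns how many attempts were needed.
std::uint64_t pollUntilFree(Gate& gate) {
  std::uint64_t attempts = 0;
  for (;;) {
    ++attempts;
    std::lock_guard<std::mutex> lock(gate.mutex);
    if (!gate.busy) return attempts;
  }
}

// Blocking variant: sleep on the condition variable until notified.
// Returns how many times the predicate had to be checked.
std::uint64_t waitUntilFree(Gate& gate) {
  std::uint64_t checks = 0;
  std::unique_lock<std::mutex> lock(gate.mutex);
  gate.cv.wait(lock, [&] {
    ++checks;
    return !gate.busy;
  });
  return checks;
}

// Run `waiter` in a worker thread while the simulated callback keeps the
// gate busy for `holdTime`, then return the waiter's counter.
template <typename Waiter>
std::uint64_t measure(Waiter waiter, std::chrono::milliseconds holdTime) {
  Gate gate;
  std::uint64_t result = 0;
  std::thread worker([&] { result = waiter(gate); });
  std::this_thread::sleep_for(holdTime);  // simulated long callback
  gate.release();
  worker.join();
  return result;
}
```

Calling `measure(pollUntilFree, std::chrono::milliseconds(50))` should report a very large number of attempts (each one takes the mutex, which matches a profile dominated by sync primitives), while `measure(waitUntilFree, std::chrono::milliseconds(50))` should report only a handful of predicate checks because the thread sleeps until `release()` notifies it.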
@peci1
No, I'm on indigo. However, I looked through the changes since then till melodic and none seem to be affecting this behavior (at least from looking at the code and commit messages).
@fujitatomoya I tested in melodic now and the behavior is the same.
@fujitatomoya See PR #1602, I have the tests and a solution for this issue.
PR #1608 against melodic-devel.
Fixed by #1684.
When debugging a performance problem with my node, I created an MWE in C++ which shows interesting behavior with a serious performance hit. If you use MultiThreadedSpinner or AsyncSpinner with more than one thread and one of your callbacks takes too long (messages arrive faster than they are processed), the spinner eats up a lot of CPU.
I'll post the MWE on Monday, so for now just a textual description. I have a normal node with the MT/Async spinner, and I subscribe to one 50 Hz PointCloud2 topic with a single subscriber. The callback contains just `ros::Duration::sleep(time)`.
When running the code with a single thread (either with SingleThreadedSpinner, or even the MT spinner with 1 thread), the node takes up about 5% of CPU and just throws away the messages it did not have time to process.
As soon as there are at least 2 threads, CPU consumption goes up (40-400% of CPU) and scales (linearly?) with the duration I wait in the callback and the number of threads.
My research ended in `spinner.cpp`, where I see that in case of 1 thread, `callAvailable()` is used, whereas with more threads, `callOne()` is used. However, I can't see through all the multi-threading magic in there. I ran a profiler on the node, and most of the time is spent in mutexes and other sync primitives.
I haven't tried yet with `SubscribeOptions.allow_concurrent_callbacks`.
Possibly related: