Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multithreaded executor concurency problem #702

Closed
guillaumeautran opened this issue Apr 18, 2019 · 1 comment
Closed

multithreaded executor concurency problem #702

guillaumeautran opened this issue Apr 18, 2019 · 1 comment

Comments

@guillaumeautran
Copy link
Contributor

When processing a MutuallyExclusive callback_group, the multi-threaded executor ends up getting into a race condition with the can_be_taken_from_ atomic boolean. The problem is that the Executor::execute_any_executable resets the MutuallyExclusive callback_group can_be_taken_from_ atomic boolean and the AnyExecutable destructor also does reset the calllback group flag as well.
These two action cause the can_be_taken_from_ exclusion flag to get out of sync with reality causing a concurrency issue.

The resulting observable behaviour could be a re-ordering of messages within the same topic subscription (ie: older messages being delivered after newer messages).

Bug report

Required Info:

  • Operating System:
    • Ubuntu 16.04
  • Installation type:
    • From source
  • Version or commit hash:
    -0.6.4
  • DDS implementation:
    • Fast-RTPS
  • Client library (if applicable):
    • rclcpp

Steps to reproduce issue

Subscriber node (where the issue will be seen):

  • Create a ROS2 rclcpp node with the multi-threaded executor (default arguments are fine)
  • Register to a topic caring some basic ordered data (a topic of Header type with a properly field timestamp would be fine)
  • In the topic callback, verify that the message timestamps are in the proper order (newer after older)
    Publisher Node:
  • Create a ROS2 node with the topic the subscriber node subscribed to (as stated earlier)
  • Publish Header messages with properly filled timestamps (as fast as possible)

Expected behavior

Every Header messages are received in order

Actual behavior

Header messages are randomly re-ordered (older message arriving last)

guillaumeautran added a commit to clearpathrobotics/rclcpp that referenced this issue Apr 18, 2019
Both, the `Executor::execute_any_executable` and the destructor for the `AnyExecutable` object used by the multithreaded executor, reset the `can_be_taken_from_` flag on a MutuallyExclusive group. This cause the variable to get out of sync and threads to process executables out of sequence.

This fix clears the callback group variable of the `AnyExecutable` instance effectively preventing its destructor from modifying the variable at the wrong time.

Issue: ros2#702
guillaumeautran added a commit to clearpathrobotics/rclcpp that referenced this issue Apr 18, 2019
Both, the `Executor::execute_any_executable` and the destructor for the `AnyExecutable` object used by the multithreaded executor, reset the `can_be_taken_from_` flag on a MutuallyExclusive group. This cause the variable to get out of sync and threads to process executables out of sequence.

This fix clears the callback group variable of the `AnyExecutable` instance effectively preventing its destructor from modifying the variable at the wrong time.

Issue: ros2#702
Signed-off-by: Guillaume Autran <[email protected]>
guillaumeautran added a commit to clearpathrobotics/rclcpp that referenced this issue Apr 22, 2019
Both, the `Executor::execute_any_executable` and the destructor for the `AnyExecutable` object used by the multithreaded executor, reset the `can_be_taken_from_` flag on a MutuallyExclusive group. This cause the variable to get out of sync and threads to process executables out of sequence.

This fix clears the callback group variable of the `AnyExecutable` instance effectively preventing its destructor from modifying the variable at the wrong time.

Issue: ros2#702
Signed-off-by: Guillaume Autran <[email protected]>
sloretz pushed a commit that referenced this issue Apr 22, 2019
Both, the `Executor::execute_any_executable` and the destructor for the `AnyExecutable` object used by the multithreaded executor, reset the `can_be_taken_from_` flag on a MutuallyExclusive group. This cause the variable to get out of sync and threads to process executables out of sequence.

This fix clears the callback group variable of the `AnyExecutable` instance effectively preventing its destructor from modifying the variable at the wrong time.

Issue: #702
Signed-off-by: Guillaume Autran <[email protected]>
@sloretz
Copy link
Contributor

sloretz commented Apr 22, 2019

Fixed by #703

@sloretz sloretz closed this as completed Apr 22, 2019
guillaumeautran added a commit to clearpathrobotics/rclcpp that referenced this issue Apr 22, 2019
Reimplement the fix to correct a concurrency problem in the multithreaded executor.
This time, the AnyExecutable class has a boolean flag to reset / not reset the callback group `can_be_taken_from_` variable on destruction.
The multithreaded executor initializes the executor with that flag set to false.

Issue: ros2#702
guillaumeautran added a commit to clearpathrobotics/rclcpp that referenced this issue Apr 22, 2019
Reimplement the fix to correct a concurrency problem in the multithreaded executor.
This time, the AnyExecutable class has a boolean flag to reset / not reset the callback group `can_be_taken_from_` variable on destruction.
The multithreaded executor initializes the executor with that flag set to false.

Issue: ros2#702
Signed-off-by: Guillaume Autran <[email protected]>
cho3 pushed a commit to cho3/rclcpp that referenced this issue Jun 3, 2019
Both, the `Executor::execute_any_executable` and the destructor for the `AnyExecutable` object used by the multithreaded executor, reset the `can_be_taken_from_` flag on a MutuallyExclusive group. This cause the variable to get out of sync and threads to process executables out of sequence.

This fix clears the callback group variable of the `AnyExecutable` instance effectively preventing its destructor from modifying the variable at the wrong time.

Issue: ros2#702
Signed-off-by: Guillaume Autran <[email protected]>
DensoADAS pushed a commit to DensoADAS/rclcpp that referenced this issue Aug 5, 2022
…os2#702)

* rosbag2_py pybind wrapper for "record" verb

Signed-off-by: Emerson Knapp <[email protected]>
DensoADAS pushed a commit to DensoADAS/rclcpp that referenced this issue Aug 5, 2022
* Fix rosbag2_py transport test for py capsule

Change introduced by ros2/rclpy#741 conflicted with parallel work in ros2#702, the combination causing nightly test failure for `rosbag2_py`'s `_transport.cpp` tests. Fix to use the new `py::object` for `rmw_qos_profile_t` to fix the tests.

Signed-off-by: Emerson Knapp <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants