Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

service introspection event does not have unique GID during transaction #357

Open
fujitatomoya opened this issue Aug 25, 2023 · 7 comments
Assignees
Labels
bug Something isn't working

Comments

@fujitatomoya
Copy link
Collaborator

Bug report

Required Info:

  • Operating System:
    • Ubuntu 22.04
  • Installation type:
    • source build
  • Version or commit hash:
  • DDS implementation:
    • rmw_fastrtps, rmw_cyclonedds (but rmw_connextdds)
  • Client library (if applicable):
    • rcl

Steps to reproduce issue

### Terminal-1 start example
root@tomoyafujita:~/ros2_ws/colcon_ws# ros2 launch demo_nodes_cpp introspect_services_launch.py

### Terminal-2 enable service introspection and echo the events
root@tomoyafujita:~/ros2_ws/colcon_ws# ros2 param set /introspection_client client_configure_introspection "contents"
Set parameter successful
root@tomoyafujita:~/ros2_ws/colcon_ws# ros2 param set /introspection_service service_configure_introspection "contents"
Set parameter successful
root@tomoyafujita:~/ros2_ws/colcon_ws# ros2 topic echo /add_two_ints/_service_event --flow-style
...<snip>
---
info:
  event_type: 0
  stamp:
    sec: 1692984132
    nanosec: 971129256
  client_gid: [1, 15, 96, 219, 167, 4, 119, 93, 0, 0, 0, 0, 0, 0, 21, 3]
  sequence_number: 53
request: [{a: 2, b: 3}]
response: []
---
info:
  event_type: 1
  stamp:
    sec: 1692984132
    nanosec: 971575657
  client_gid: [1, 15, 96, 219, 167, 4, 119, 93, 0, 0, 0, 0, 0, 0, 20, 4]
  sequence_number: 53
request: [{a: 2, b: 3}]
response: []
---
info:
  event_type: 2
  stamp:
    sec: 1692984132
    nanosec: 971884575
  client_gid: [1, 15, 96, 219, 167, 4, 119, 93, 0, 0, 0, 0, 0, 0, 20, 4]
  sequence_number: 53
request: []
response: [{sum: 5}]
---
info:
  event_type: 3
  stamp:
    sec: 1692984132
    nanosec: 972157308
  client_gid: [1, 15, 96, 219, 167, 4, 119, 93, 0, 0, 0, 0, 0, 0, 21, 3]
  sequence_number: 53
request: []
response: [{sum: 5}]
---

Expected behavior

service introspection event message requires gid must be unique during transaction.
which uniquely identify a single request-reponse pair.
see more details for https://github.com/ros-infrastructure/rep/pull/360/files#diff-7739b7ffccd3276874ea9014c4cd07872efa4e2d60d7c31896e56929ad424455R144-R149.

Actual behavior

service introspection event message IS NOT unique so it does not identify a single request-reponse pair.

Additional information

Related Issue/PR: ros2/rosbag2#1414 (rosbag2 service record/play feature enhancement.)

@fujitatomoya
Copy link
Collaborator Author

@fujitatomoya
Copy link
Collaborator Author

@MiguelCompany
CC: @iuhilnehc-ynos

https://github.com/ros2/rmw_fastrtps/blob/b45e73500bc0a2056fdc31197a6f94434b327d55/rmw_fastrtps_shared_cpp/src/rmw_request.cpp#L116 replaces original writer (request publisher guid) into reader (response subscription guid) since it will be copies here https://github.com/ros2/rmw_fastrtps/blob/b45e73500bc0a2056fdc31197a6f94434b327d55/rmw_fastrtps_shared_cpp/src/rmw_request.cpp#L132-L134, and this guid is used to send the response.

this generates this issue since response subscription guid is not the same guid with request publisher guid, so that client_gid in service_event cannot be identical for client-server pair.

i believe that we need original request publisher guid for rcl_send_service_event_message to address this issue.
rmw interfaces expect that request guid can be used to send the response directly, so there is no such interfaces to get original request publisher guid. unless rcl can ask the client endpoint mapping clients_endpoints_ to get it when rcl calls rcl_send_service_event_message.

i do not really think good idea how to address this issue... any thoughts?

@wjwwood
Copy link
Member

wjwwood commented Jan 6, 2024

This definitely seems like an issue, but I haven't investigating it myself yet. It would be good to have a new test (maybe in rcl?) that checks that this condition is met, along with any clarification in the rmw docs to require it of the middleware.

That way we could take the new docs and test to the rmw implementation maintainers to ask them to fix it.

@fujitatomoya
Copy link
Collaborator Author

@wjwwood if you can take a look at ros2/rmw_fastrtps#779 which fixes this issue with rmw_fastrtps, that would be really appreciated.

note: i am not sure how to fix the rmw_cyclonedds...

@fujitatomoya
Copy link
Collaborator Author

ros2/rmw_fastrtps#779 is merged, so only the problem left with rmw_cyclonedds. and after that, it would be probably better to add test case either rcl or rmw_implementation to make sure service event client gid is unique.

@fujitatomoya
Copy link
Collaborator Author

Note: ros2/rmw_fastrtps#780 is going to be merged (revert ros2/rmw_fastrtps#779) because that generates the instability CI issue especially RHEL Debug job. see more details for ros2/rmw_fastrtps#780 (comment)

@fujitatomoya
Copy link
Collaborator Author

with ros2/rmw_fastrtps#780, i was unable to reproduce the hang-up issue reported with ros2/rmw_fastrtps#780... even with full source build and test... (including debug flag.) i cannot debug this problem with my local environment... i think we need to come up with another solution...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants