Allow enabling syscalls through ros2 trace or the Trace action #137

Merged
christophebedard merged 2 commits into rolling from christophebedard/ros2-trace-syscalls
Oct 8, 2024


christophebedard
Member

@christophebedard christophebedard commented Oct 5, 2024

This adds support for enabling syscall tracing with the ros2 trace command or the Trace launch action.

While system calls are separate from kernel events, both can be traced using the LTTng kernel tracer: https://lttng.org/docs/v2.13/#doc-tracing-the-linux-kernel. Kernel events are "tracepoints" while system calls are their own kind of trace event. This explains why this PR adds a separate option for syscalls instead of just using the kernel events option. See the changes to lttngpy::enable_events in lttngpy/src/lttngpy/event.cpp.

How to use:

  1. ros2 trace command:
    $ ros2 trace --syscall openat  # interactive
    $ ros2 trace start my-session-name --syscall openat  # non-interactive
    
  2. Launch files (Python, XML, and YAML): syscalls arg

Note that, when enabling a syscall, e.g., openat, the resulting trace has syscall entry and exit events: syscall_entry_openat and syscall_exit_openat.
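
To illustrate the naming convention in the note above, here is a minimal sketch. The helper `syscall_event_names` is hypothetical and is not part of ros2_tracing or lttngpy; it only shows which event names to look for in the resulting trace for a given list of enabled syscalls.

```python
def syscall_event_names(syscalls):
    """Return the entry/exit event names expected in a trace for each syscall.

    Hypothetical helper for illustration only; not an API of ros2_tracing.
    """
    names = []
    for syscall in syscalls:
        # Each enabled syscall produces one entry event and one exit event
        names.append(f'syscall_entry_{syscall}')
        names.append(f'syscall_exit_{syscall}')
    return names


print(syscall_event_names(['openat']))
# prints ['syscall_entry_openat', 'syscall_exit_openat']
```

This can be handy when filtering a trace, since the names to filter on differ from the name passed to `--syscall`.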

@christophebedard christophebedard added the enhancement label Oct 5, 2024
@christophebedard christophebedard self-assigned this Oct 5, 2024
@christophebedard christophebedard force-pushed the christophebedard/ros2-trace-syscalls branch from 11d4ab5 to c105689 Compare October 6, 2024 22:40
@christophebedard
Member Author

@mjcarroll would you be willing to review this?

@christophebedard
Member Author

There's something wrong with the GitHub CI binary jobs: https://github.com/ros2/ros2_tracing/actions/runs/11205771796/job/31145663869?pr=137#step:4:1989. I see the same failures in the nightly GitHub CI binary jobs. Not really related to these changes.

Contributor

@fujitatomoya fujitatomoya left a comment


overall, lgtm with green CI. had a couple of comments.

README.md (review thread resolved)
@christophebedard
Member Author

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Linux-rhel Build Status
  • Windows Build Status

@christophebedard christophebedard force-pushed the christophebedard/ros2-trace-syscalls branch from 485dca3 to e12b804 Compare October 7, 2024 22:51
@christophebedard
Member Author

Sorry, I realized that the way I had written the decorator meant it was evaluated once during test collection, before any test runs. As a result, if no session daemon was already running at that point, the test never got the chance to try to spawn one and was always skipped. I modified the decorator so that it's evaluated right before running the test: https://github.com/ros2/ros2_tracing/compare/485dca3ac871e2a26d603b2916ac3aa6281ebf28..e12b8044bfd1b8a15209df44c10e0f1e47bcf5b0.

Can you take another look?

@christophebedard
Member Author

Argh this isn't even completely working.

@christophebedard christophebedard force-pushed the christophebedard/ros2-trace-syscalls branch from e12b804 to 2f0c39a Compare October 8, 2024 00:14
@christophebedard
Member Author

OK, I fixed the decorator and I think this is all I can really do. Implementing the decorator this way ensures it's evaluated right before running the test, after the test class's @unittest.skipIf has been evaluated. Otherwise, running on Windows fails, since lttngpy is empty there. See the previous Windows CI job: Windows Build Status.
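
For readers following along, the difference between a skip condition checked at collection time and one checked at run time can be sketched as follows. This is a minimal illustration, not the decorator from this PR: `skip_if_at_run_time`, `daemon_state`, and `daemon_not_running` are hypothetical names, and the real check against the session daemon is replaced by a plain dictionary lookup.

```python
import functools
import unittest


def skip_if_at_run_time(condition, reason):
    """Skip the decorated test if condition() is true when the test runs."""
    def decorator(test_func):
        @functools.wraps(test_func)
        def wrapper(*args, **kwargs):
            # Checked at call time, not when the module is imported/collected
            if condition():
                raise unittest.SkipTest(reason)
            return test_func(*args, **kwargs)
        return wrapper
    return decorator


# Hypothetical stand-in for "is a session daemon available?"
daemon_state = {'running': False}


def daemon_not_running():
    return not daemon_state['running']


class ExampleTest(unittest.TestCase):

    @skip_if_at_run_time(daemon_not_running, 'no session daemon')
    def test_needs_daemon(self):
        self.assertTrue(daemon_state['running'])


# The state changes after the class was defined, i.e., after "collection"
daemon_state['running'] = True
result = unittest.TestResult()
ExampleTest('test_needs_daemon').run(result)
print(result.testsRun, len(result.skipped))
```

With a plain `@unittest.skipIf(daemon_not_running(), ...)`, the condition would be computed while the class body is executed, so the later state change would have no effect. Because the wrapper only runs if the test itself runs, a class-level `@unittest.skipIf` that skips the whole class also keeps the condition from being evaluated at all.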

Also, when running with colcon test on my system, test_kernel_tracing runs, because I have a root session daemon running and my user is part of the tracing group. However, if you try to run the test using a build system that does sandboxing, it fails because there's no root session daemon inside the sandbox, and spawning one would be trickier. I think this is fine for now.

New CI:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Linux-rhel Build Status
  • Windows Build Status

Contributor

@fujitatomoya fujitatomoya left a comment


lgtm, thanks for the explanation. lgtm with green CI.

@christophebedard
Member Author

Thanks for the review @fujitatomoya!

CI looks good.

@christophebedard christophebedard merged commit 57e6ede into rolling Oct 8, 2024
6 of 9 checks passed
@christophebedard christophebedard deleted the christophebedard/ros2-trace-syscalls branch October 8, 2024 15:00