Timer that can be triggered a finite number of times. #2006
base: rolling
Conversation
This implementation adds overhead to existing timers.
I agree. I thought about doing the implementation in another class; however, that option would imply copying all the code from an existing class and modifying just a couple of lines, and it would also reduce the discoverability of the feature. What would you recommend in this case so that we can make sure the feature doesn't affect the existing timers?
With adding generality there is always going to be overhead. There are different dimensions of overhead: pure runtime, memory footprint, development effort, and API complexity, all of which are side effects of design decisions. Core C++ libraries can often avoid runtime overhead through template meta-programming and other advanced techniques. Another alternative would be to change this API into a factory pattern with generators that leverage different potential backend implementations; I don't think that's something we want to get into.

If I look at this for what the effect on the user will be: it increases the memory footprint by two unsigned integers, and in each timer cycle there is one if statement checking whether an unsigned integer is non-zero. That is very low overhead for having a generic API available. It keeps the API simple and compact, and it does not increase the code complexity significantly. If we were to add another class with a parallel implementation, it would likely have a higher memory impact, as both classes would sit in parallel in the shared libraries and then potentially be loaded into memory too, with a total extra footprint of much more than two unsigned ints.

Looking a little closer, the user's callback is invoked before the if statement, so to the user the only potential slowdown is that the CPU has to do approximately one extra cycle prior to recomputing the timeout and enqueuing itself back into the waitset. As such the user will see no effect on their timing, neither delaying their callback nor delaying their next callback, with the exception that if they overran their last callback time buffer it will be approximately one CPU clock cycle later, assuming the compiler has not optimized out a check on a code path that clearly won't be taken.

I don't think there is a way to do this more efficiently that is both simple and easy for the user to leverage, follows our established API and design patterns, and avoids adding significant extra lines of code for ongoing maintenance and testing. As such I would suggest going with this current quite simple solution.
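For concreteness, here is a minimal sketch of the per-cycle check being described, in plain C++ (the class and member names are illustrative only, not the actual code proposed in this PR):

```cpp
#include <cstdint>
#include <functional>
#include <utility>

// Illustrative sketch only: names and structure are assumptions, not the
// implementation proposed in this PR.
class CountedTimer
{
public:
  CountedTimer(std::function<void()> callback, uint32_t number_of_callbacks = 0)
  : callback_(std::move(callback)),
    number_of_callbacks_(number_of_callbacks),  // 0 means "fire until cancelled"
    callbacks_executed_(0),
    canceled_(false)
  {}

  // Called once per timer cycle by the executor.
  void execute_callback()
  {
    callback_();  // the user's callback runs first, so it is never delayed

    // The only added per-cycle work: one branch on an unsigned integer.
    if (number_of_callbacks_ != 0 && ++callbacks_executed_ >= number_of_callbacks_) {
      canceled_ = true;  // stand-in for cancelling the timer
    }
  }

  bool is_canceled() const {return canceled_;}

private:
  std::function<void()> callback_;
  uint32_t number_of_callbacks_;  // the two extra unsigned integers of state
  uint32_t callbacks_executed_;
  bool canceled_;
};
```

The extra state is two `uint32_t` members, and the extra work per firing is the single `if` after the user callback returns, which matches the overhead estimate above.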
@tfoote IMO the ROS 2 design should be rigorous about how new features are added. This "callback counter" is a convenience utility and, as we discussed in the issue, it's something that users could very easily do from the application code. This particular PR is adding an overhead that is probably negligible, but making an exception for this PR today, then for another PR tomorrow, and so on, is not good. Performance is one of the main problems of ROS 2, and there are a lot of ongoing efforts to make ROS 2 more efficient, but this is possible only if performance is taken seriously into consideration at all the layers. My recommendation is to handle this via a new timer class.
@tfoote @Voldivh sorry for the short notice, but would you be available to have a discussion on this in the ROS 2 Client Library Working Group? It's today (August 31st) at 8 AM Pacific Time: https://meet.google.com/pxe-ohqa-cjm
Saying that there might be another PR tomorrow that might slow down performance, as a reason not to take this one, is not really a valid critique of this PR.
Indeed, have there ever been any metrics, at issue or proposed, which this change will adversely affect? What is a metric on the performance of this PR that separates acceptable from unacceptable? As this is, in the non-zero case, an unchanging variable, I suspect the compiler will likely even detect it as such and the if statement will never even get evaluated. Are there any known or perceived cases where this will ever become a bottleneck to computation that we will need to optimize? If not, this seems like a case of premature optimization for runtime overhead at the cost of many other aspects of the system. Forcing things into separate classes and implementations has notable deleterious effects of its own, such as slowing down developer/user velocity and increasing our support burden with the number of lines of code, tests, and documentation to maintain. These are all real costs which will be borne by both us and users in the community, and they trade one form of overhead for another.
This is a great design goal, but it cannot be applied in the abstract to a single metric. The costs I've outlined above also affect the entire community.
I think that this is one of the fundamental areas where we're looking at this differently. I see the rclcpp API as a user-facing API that should be designed to optimize the development of the end user. To the end user there is absolutely no difference whether the code is implemented under the hood using rcl or locally implemented in rclcpp. If under the hood in the implementation you want to make a distinction as to how it's implemented, that's great, but it should not be something that the user sees or cares about. If we were to extend the functionality that is implemented in rcl and change the backend of that feature in rclcpp, an rclcpp API user should not notice or care. rclcpp's public API is an abstraction which should hide all of those underlying implementation details.
So I was thinking about this a bit last night, and I think there is a better solution that can both keep the default API generic and fully featured, and separately maintain a highly optimized path. More concretely, we would have the default Timer with all the features, which will have slightly higher overhead, and separately a LowOverheadTimer (naming TBD, using this for now for clarity) which is highly optimized and has a less user-centric API; developers who are worried about the runtime overhead can opt into the more optimized one. If we're worried about a single if statement, I suspect there are more optimizations to be gained by stripping out more of the layers that already exist. This will allow the majority of users, who are not going to be affected by the overhead, to have the familiar API with all the bells and whistles, and the small fraction of users who want or need to optimize things can choose the lower-level interface, which does not need to be as cleanly presented or fully featured and therefore allows even more runtime optimization than we already offer.
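A hypothetical sketch of the shape of that split (neither class exists in rclcpp; "LowOverheadTimer" is just the placeholder name from the comment above):

```cpp
#include <chrono>
#include <cstdint>
#include <functional>
#include <utility>

// Default, fully featured timer: carries the optional bookkeeping (e.g. a
// callback count) and keeps the familiar, convenient API.
class Timer
{
public:
  Timer(
    std::chrono::nanoseconds period,
    std::function<void()> callback,
    uint32_t number_of_callbacks = 0)  // 0 = fire until cancelled
  : period_(period),
    callback_(std::move(callback)),
    number_of_callbacks_(number_of_callbacks) {}

private:
  std::chrono::nanoseconds period_;
  std::function<void()> callback_;
  uint32_t number_of_callbacks_;
};

// Opt-in minimal timer: no optional features and no per-cycle branch, for
// users who need the lowest possible runtime overhead.
class LowOverheadTimer
{
public:
  LowOverheadTimer(std::chrono::nanoseconds period, std::function<void()> callback)
  : period_(period),
    callback_(std::move(callback)) {}

private:
  std::chrono::nanoseconds period_;
  std::function<void()> callback_;
};
```

The point of the sketch is only the division of responsibilities: convenience features live in the default class, while the opt-in class stays free of them.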
I like this proposal @tfoote! I really want to apologize if my earlier comments sounded too strong; in hindsight I shouldn't have started this discussion on this very PR, but rather started it separately. One of the features developed by iRobot is the Events Executor.
@tfoote @alsora thanks for sharing your thoughts. IMO, this feature is basic control for a timer; it is acceptable and reasonable.
This is so true. Is this something we really want to do as a design driver? I am just confused about the user client library design driver… It will be really difficult to support new features in … Just a bit confused about where this is going to land. Thanks!
CC: @ros2/team
So given where we are in the lifecycle of ROS 2, my opinion here is that performance doesn't need to be the first priority, but it definitely needs to be a priority. We've heard a lot of grumbling from the community about how the performance of ROS 2 is worse than ROS 1 in a lot of ways. While we have a lot of knobs that people can use to make the system more performant, the fact of the matter is that out of the box, ROS 2 currently isn't great at this. So any new features that go in need to be weighed against this fact. That said, for this particular feature, I agree that the cost of an extra branch is likely going to be lost in the noise. There's an easy way to test: do some performance testing before and after this PR with timers. If we can't measure the overhead (as I suspect), then I would be for this PR going in more or less as-is.
Our analysis showed that one of the major CPU bottlenecks of ROS 2 is caused by how its abstraction layers are implemented. While developing the events executor, we did some tests where we invoked the DDS APIs as close as possible to the application layer. The "fast path", i.e. the sequence of operations that is invoked at very high frequency while an application is running, has to be cleaned of all operations that aren't required. This PR is not a problem per se; merging it will have no impact on performance for 99.9% of the users (me included) and its overhead would be hard to measure. However, it's a clear example of how the ROS 2 design does not follow the "you don't pay for what you don't use" principle in its fast path. User-friendly, feature-rich APIs and efficient primitives are in my opinion both equally important requirements for ROS 2, but they should be built one on top of the other rather than be mixed together.
After reading all the discussion and the common ground reached regarding the software design, I ran some simple tests on my local host to check the overhead this PR would introduce. I basically created a timer with a callback and checked whether the timer was being fired on time (according to the period). As for the timer being triggered on time, I only ran it for a few callbacks and all of them were triggered within the time period.

As for the workload on the CPU and the memory usage, take a look at the yellow row in the attached screenshots (Feature Branch and Rolling). As we expected, the CPU/memory consumption is basically the same in both cases. I know this was a simple test that is not an actual benchmark of the feature, but I believe it shows that the overhead is small enough not to affect the behavior of the timers.
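A rough reconstruction of the kind of check described above (the original test script was not posted, so the period, sample count, and logging here are assumptions):

```cpp
#include <chrono>
#include <cstdio>
#include <memory>

#include "rclcpp/rclcpp.hpp"

int main(int argc, char ** argv)
{
  rclcpp::init(argc, argv);
  auto node = std::make_shared<rclcpp::Node>("timer_period_check");

  auto last = std::chrono::steady_clock::now();
  int fired = 0;

  // 100 ms wall timer: print how far apart consecutive callbacks actually land.
  auto timer = node->create_wall_timer(
    std::chrono::milliseconds(100),
    [&last, &fired]() {
      auto now = std::chrono::steady_clock::now();
      auto delta = std::chrono::duration_cast<std::chrono::milliseconds>(now - last);
      last = now;
      std::printf("callback %d fired after %ld ms\n", ++fired, static_cast<long>(delta.count()));
      if (fired >= 20) {
        rclcpp::shutdown();  // a handful of callbacks is enough for this sanity check
      }
    });

  rclcpp::spin(node);
  return 0;
}
```

Running the same node against rolling and against this branch, while watching CPU and memory in a tool such as `htop`, is the before/after comparison suggested earlier in the thread.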
rclcpp/include/rclcpp/node.hpp
Outdated
```diff
   */
  template<typename DurationRepT = int64_t, typename DurationT = std::milli, typename CallbackT>
  typename rclcpp::WallTimer<CallbackT>::SharedPtr
  create_wall_timer(
    std::chrono::duration<DurationRepT, DurationT> period,
    CallbackT callback,
-   rclcpp::CallbackGroup::SharedPtr group = nullptr);
+   rclcpp::CallbackGroup::SharedPtr group = nullptr,
+   unsigned int amount_of_callbacks = 0);
```
This variable is of type `uint32_t` in most of the other instances; please use the same type here and in `Node::create_wall_timer()` as well, for consistency.
Sure thing. Done.
lgtm
@Voldivh could you address the conflicts?
Sure thing, will do.
Same here, the build failed: https://build.ros2.org/job/Rpr__rclcpp__ubuntu_jammy_amd64/906/console

```
10:51:52 /tmp/ws/src/rclcpp/rclcpp/include/rclcpp/subscription.hpp:208:5: error: there are no arguments to ‘TRACETOOLS_TRACEPOINT’ that depend on a template parameter, so a declaration of ‘TRACETOOLS_TRACEPOINT’ must be available [-fpermissive]
10:51:52   208 |     TRACETOOLS_TRACEPOINT(
10:51:52       |     ^~~~~~~~~~~~~~~~~~~~~
10:51:52 gmake[2]: *** [CMakeFiles/rclcpp.dir/build.make:181: CMakeFiles/rclcpp.dir/src/rclcpp/any_executable.cpp.o] Error 1
10:51:52 gmake[1]: *** [CMakeFiles/Makefile2:138: CMakeFiles/rclcpp.dir/all] Error 2
```
Same question here with this commit.
@alsora could you review your comment?
```diff
@@ -91,7 +91,8 @@ create_timer(
   rclcpp::Duration period,
   CallbackT && callback,
   rclcpp::CallbackGroup::SharedPtr group = nullptr,
-  bool autostart = true)
+  bool autostart = true,
+  uint32_t amount_of_callbacks = 0)
```
Suggested change:
```diff
-  uint32_t amount_of_callbacks = 0)
+  uint32_t number_of_callbacks = 0)
```
```diff
@@ -100,7 +101,8 @@ create_timer(
     group,
     node_base.get(),
     node_timers.get(),
-    autostart);
+    autostart,
+    amount_of_callbacks);
```
Suggested change:
```diff
-    amount_of_callbacks);
+    number_of_callbacks);
```
```diff
@@ -112,7 +114,8 @@ create_timer(
   rclcpp::Duration period,
   CallbackT && callback,
   rclcpp::CallbackGroup::SharedPtr group = nullptr,
-  bool autostart = true)
+  bool autostart = true,
+  uint32_t amount_of_callbacks = 0)
```
Suggested change:
```diff
-  uint32_t amount_of_callbacks = 0)
+  uint32_t number_of_callbacks = 0)
```
```diff
@@ -121,7 +124,8 @@ create_timer(
     group,
     rclcpp::node_interfaces::get_node_base_interface(node).get(),
     rclcpp::node_interfaces::get_node_timers_interface(node).get(),
-    autostart);
+    autostart,
+    amount_of_callbacks);
```
Suggested change:
```diff
-    amount_of_callbacks);
+    number_of_callbacks);
```
```diff
+ * \param amount_of_callbacks Quantity of times the callback will be triggered. The default value is 0
+ *   that means an infinite amount of callbacks i.e. the clock's default behavior.
```
Suggested change:
```diff
- * \param amount_of_callbacks Quantity of times the callback will be triggered. The default value is 0
- *   that means an infinite amount of callbacks i.e. the clock's default behavior.
+ * \param number_of_callbacks number of times the callback will be triggered. The default value is 0
+ *   that means an infinite amount of callbacks i.e. the timer's default behavior.
```
rclcpp/include/rclcpp/timer.hpp
Outdated
```diff
@@ -317,6 +335,7 @@ class GenericTimer : public TimerBase
   RCLCPP_DISABLE_COPY(GenericTimer)

   FunctorT callback_;
+  uint32_t amount_of_callbacks_;
```
Suggested change:
```diff
-  uint32_t amount_of_callbacks_;
+  uint32_t number_of_callbacks_;
```
rclcpp/include/rclcpp/timer.hpp
Outdated
```diff
@@ -337,14 +356,17 @@ class WallTimer : public GenericTimer<FunctorT>
   * \param callback The callback function to execute every interval
   * \param context node context
   * \param autostart timer state on initialization
+  * \param amount_of_callbacks Quantity of times the callback will be triggered.
```
Suggested change:
```diff
-  * \param amount_of_callbacks Quantity of times the callback will be triggered.
+  * \param number_of_callbacks Number of times the callback will be triggered.
```
rclcpp/include/rclcpp/timer.hpp
Outdated
```diff
   */
  WallTimer(
    std::chrono::nanoseconds period,
    FunctorT && callback,
    rclcpp::Context::SharedPtr context,
-   bool autostart = true)
+   bool autostart = true,
+   uint32_t amount_of_callbacks = 0)
```
Suggested change:
```diff
-   uint32_t amount_of_callbacks = 0)
+   uint32_t number_of_callbacks = 0)
```
rclcpp/include/rclcpp/timer.hpp
Outdated
```diff
  : GenericTimer<FunctorT>(
-     std::make_shared<Clock>(RCL_STEADY_TIME), period, std::move(callback), context, autostart)
+     std::make_shared<Clock>(RCL_STEADY_TIME), period, std::move(callback), context, autostart,
+     amount_of_callbacks)
```
Suggested change:
```diff
-     amount_of_callbacks)
+     number_of_callbacks)
```
Unnecessary whitespace change.
As we talked about before, this makes more sense if the canceled timer is removed from the waitset. IMO, I think this PR is still useful as syntax sugar for the application.
I took a closer look at the implementation of the cancel method and you are right, a canceled timer is not removed from the waitset. Taking that into account, I do feel that the PR would make more sense if it removes the timer from the waitset.
I don't think we should destroy the timer object. In particular, I can imagine that the user may want to retrigger the same timer multiple times, which they should be able to do by calling the timer's `reset()` method. In order to remove it from the waitset, I think we'd need to do something more clever where we track inside of the timer object whether we are currently in the waitset. However, it's even more complicated than that.
Ok I see, as I feared, there is no easy way around it then. I'll take some time to look into it and get back to you.
This PR addresses #1990. Basically, it gives the timers API the possibility to select the number of times the timer will be triggered, after which it is canceled, via a new variable. This variable (`amount_of_callbacks`) has a default value of `0`, which indicates that the timer will be triggered indefinitely until it is 'manually' cancelled, i.e., the same behavior timers have at the moment. For example, initializing the timer with `amount_of_callbacks = 1` would create a one-shot timer, which would be triggered once and then canceled.
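Based on that description, a usage sketch of the proposed API might look like the following (the review comments above suggest renaming the parameter to `number_of_callbacks`, so the final name may differ):

```cpp
#include <chrono>
#include <memory>

#include "rclcpp/rclcpp.hpp"

int main(int argc, char ** argv)
{
  rclcpp::init(argc, argv);
  auto node = std::make_shared<rclcpp::Node>("one_shot_example");

  // Default behavior: fires every second until cancelled manually.
  auto periodic = node->create_wall_timer(
    std::chrono::seconds(1),
    []() {RCLCPP_INFO(rclcpp::get_logger("one_shot_example"), "periodic tick");});

  // Proposed behavior: fires exactly once, then cancels itself.
  auto one_shot = node->create_wall_timer(
    std::chrono::seconds(1),
    []() {RCLCPP_INFO(rclcpp::get_logger("one_shot_example"), "one-shot tick");},
    nullptr,  // default callback group
    1u);      // amount_of_callbacks = 1 -> one-shot timer

  rclcpp::spin(node);
  rclcpp::shutdown();
  return 0;
}
```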