Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regulated Purepursuit is slow #2332

Closed
vimalrajayyappan opened this issue May 10, 2021 · 21 comments
Closed

Regulated Purepursuit is slow #2332

vimalrajayyappan opened this issue May 10, 2021 · 21 comments
Labels
bug Something isn't working in progress

Comments

@vimalrajayyappan
Copy link

FollowPath:
plugin: "nav2_regulated_pure_pursuit_controller::RegulatedPurePursuitController"
desired_linear_vel: 7.2
max_linear_accel: 2.5
max_linear_decel: 2.5
lookahead_dist: 1.8 #0.6
min_lookahead_dist: 1.0 #0.3
max_lookahead_dist: 2.0 #0.9
lookahead_time: 3.0 #1.5
rotate_to_heading_angular_vel: 1.8
transform_tolerance: 0.1
use_velocity_scaled_lookahead_dist: false
min_approach_linear_velocity: 0.05
use_approach_linear_velocity_scaling: true
max_allowed_time_to_collision: 1.0
use_regulated_linear_velocity_scaling: true
use_cost_regulated_linear_velocity_scaling: false
regulated_linear_scaling_min_radius: 0.9
regulated_linear_scaling_min_speed: 0.25
use_rotate_to_heading: true
rotate_to_heading_min_angle: 0.785
max_angular_accel: 3.2

The tuning parameters mentioned above are the one I'm using. It may look wierd, but I tried giving the max values possible to check any reflecting changes. But I couldnt find any. The purepursuit is running at its slow speed as it used to be. Am I missing anything/ Can you please help me on that?

@SteveMacenski
Copy link
Member

SteveMacenski commented May 10, 2021

What do you mean "runs too slowly"? I can get this to run at 1khz easily on my 7th gen i5. Please provide much more detail about exactly what your issue is, what you have tried to solve it, and any relevant compute / robot platform details. Also provide your actual config file.

@SteveMacenski
Copy link
Member

@vimalrajayyappan can you give more context / info?

@SteveMacenski
Copy link
Member

@vimalrajayyappan we'll have to close this ticket in about a week if we don't receive a response

@fmrico
Copy link
Contributor

fmrico commented May 18, 2021

Hi,

While running the experiments with the pure pursuit in January, I experienced an effect that made the robot move slow, even if I configured the params to move faster. Maybe it is what @vimalrajayyappan describes.

I will try to remember the reason and the fix. I remember something related to the existence of the odom topic. Do you remember @jginesclavero?

Best

@vimalrajayyappan
Copy link
Author

Yes, exactly friend :). Kindly help @fmrico and @SteveMacenski :).

@SteveMacenski
Copy link
Member

SteveMacenski commented May 20, 2021

Happy to fix an issue if you guys can narrow it down a bit - so far I'm not seeing enough to really start debugging

Edit; Things to make sure that you've set your acceleration profiles so you can accelerate enough to get up to your maximum desired speed in a reasonable period of time. If you disable use_regulated_linear_velocity_scaling or use_cost_regulated_linear_velocity_scaling or use_velocity_scaled_lookahead_dist does it meet your speed requests (trying to see where the issue lies)?

To @fmrico comment, if your odometry in is slow, then it will cause problems because we use the current speed to determine the kinematic max accel / decel to use. OR if it didn't get your controller_frequency to be the same as the controller server is using, that will cause a critical issue in computing those limits as well. Good to know you're seeing it too, that means there's something wrong 😉

@SteveMacenski SteveMacenski added the bug Something isn't working label May 25, 2021
@padhupradheep
Copy link
Member

padhupradheep commented May 26, 2021

I did a short testing by changing the acceleration limits. I think the point made by @SteveMacenski seems to be vaild. Various acceleration limits are showing different behaviors. I have made a small plot showing the difference

The first image on the top is the RPP with acceleration limit set to 2.5 and the second image is when the acceleration limit is set to around 7.0. There is a clear difference.

low_acc

hig_acc

Hope this helps.

Edit: I did the test using the robot MPO-700 found in neo_simulation-2 package.

@SteveMacenski
Copy link
Member

SteveMacenski commented May 26, 2021

Yeah, if your acceleration limits are too small, then the RPP will cap the speed updates by the kinematic feasibility via (v_new = v_old + a * dt) which will ask the robot to execute a slower velocity -- thusly your odometry would be slower as the execution of that task.

But eventually you should be able to achieve that speed, it would just take longer. If you're seeing that you can't at all even achieve that speed, that's a different issue.

@padhupradheep did you try with the params @vimalrajayyappan provided? His max speed is something like desired_linear_vel: 7.2


@vimalrajayyappan, Pradheep is interested in working on this problem on your behalf, but some more information about what exactly you're seeing or reproduction instructions are necessary to make effective use of his limited time. Please answer the questions in my first comment in this ticket to provide the details needed to analyze the problem.

@padhupradheep
Copy link
Member

Some more findings:

  1. Increase the velocity higher than 2.5 m/s does not work, meaning the controller crashes with the below error at some point in the path.

[ERROR] [controller_server-4]: process has died [pid 21214, exit code -11, cmd '/opt/ros/foxy/lib/nav2_controller/controller_server --ros-args --params-file /tmp/tmps7dkfreq -r /tf:=tf -r /tf_static:=tf_static'].

  1. Tried to increase the rate of the controller. More than 20 Hz the controller does not produce any more velocity commands. The output shown in the screen are
    [controller_server-4] [INFO] [1622108386.041161415] [controller_server]: Passing new path to controller.
    [controller_server-4] [ERROR] [1622108386.401186248] [controller_server]: Failed to make progress

This above causes irrespective to the maximum acceleration or the velocity that has been set (Note, we have a good processor and neo_local_planner works well at 100 Hz). Also we have odom running at 100 Hz

FYI, I used the same parameters given above by @vimalrajayyappan.

@SteveMacenski
Copy link
Member

Increase the velocity higher than 2.5 m/s does not work, meaning the controller crashes with the below error at some point in the path.

Really? Now that is interesting. Did you try to get a traceback (this tutorial would help)?

I've definitely tested at > 20hz, that's how I found I could get it to run at 1 khz so maybe your param isn't being read in?

@padhupradheep
Copy link
Member

padhupradheep commented Jul 5, 2021

Really? Now that is interesting. Did you try to get a traceback (this tutorial would help)?

#0 0x00007ffff7c6d59f in nav2_costmap_2d::Costmap2D::getCost(unsigned int, unsigned int) const () from /home/pradheep/new2/install/nav2_costmap_2d/lib/libnav2_costmap_2d_core.so #1 0x00007ffff025a60d in nav2_regulated_pure_pursuit_controller::RegulatedPurePursuitController::inCollision(double const&, double const&) () from /home/pradheep/new2/install/nav2_regulated_pure_pursuit_controller/lib/libnav2_regulated_pure_pursuit_controller.so #2 0x00007ffff025c826 in nav2_regulated_pure_pursuit_controller::RegulatedPurePursuitController::isCollisionImminent(geometry_msgs::msg::PoseStamped_<std::allocator<void> > const&, double const&, double const&) () from /home/pradheep/new2/install/nav2_regulated_pure_pursuit_controller/lib/libnav2_regulated_pure_pursuit_controller.so #3 0x00007ffff025ce80 in nav2_regulated_pure_pursuit_controller::RegulatedPurePursuitController::computeVelocityCommands(geometry_msgs::msg::PoseStamped_<std::allocator<void> > const&, geometry_msgs::msg::Twist_<std::allocator<void> > const&) () from /home/pradheep/new2/install/nav2_regulated_pure_pursuit_controller/lib/libnav2_regulated_pure_pursuit_controller.so #4 0x000055555559748d in nav2_controller::ControllerServer::computeAndPublishVelocity() () #5 0x0000555555598166 in nav2_controller::ControllerServer::computeControl() () --Type <RET> for more, q to quit, c to continue without paging-- #6 0x00005555555bf3b6 in nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath, rclcpp::Node>::work() () #7 0x00005555555c0380 in std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::thread::_Invoker<std::tuple<nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath, rclcpp::Node>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::{lambda()#1}> >, void> >::_M_invoke(std::_Any_data const&) () #8 0x000055555559ef2d in std::__future_base::_State_baseV2::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*) () #9 0x00007ffff7a7f47f in __pthread_once_slow (once_control=0x7fffd8005e88, init_routine=0x7ffff795ac20 <__once_proxy>) at pthread_once.c:116 #10 0x00005555555a6978 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath, rclcpp::Node>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::{lambda()#1}> >, void>::_Async_state_impl(std::tuple<nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath, rclcpp::Node>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::{lambda()#1}>&&)::{lambda()#1}> > >::_M_run() () #11 0x00007ffff795bde4 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6 --Type <RET> for more, q to quit, c to continue without paging-- #12 0x00007ffff7a76609 in start_thread (arg=<optimized out>) at pthread_create.c:477 #13 0x00007ffff7649293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

@SteveMacenski
Copy link
Member

SteveMacenski commented Jul 6, 2021

That's really odd, #2437 is seeing that too -- which may or not be related. But that doesn't necessarily feel related to the initial ticket that if you set the param of speed > 2.5 m/s it doesn't reach those speeds, not that it crashes. That appears to be an unrelated issue that also should be addressed.

On what branch are you testing this? The only recent change I can find that makes me even somewhat suspect is c616cf0, maybe worth testing reverting that if you're on the main branch.

@padhupradheep
Copy link
Member

But that doesn't necessarily feel related to the initial ticket that if you set the param of speed > 2.5 m/s it doesn't reach those speeds, not that it crashes.

Note that the crash happens exactly at the point when the robot reaches 2.5 m/s. Also, @vimalrajayyappan stated that:

The purepursuit is running at its slow speed as it used to be

But he did not exactly give the details on what speed it was running at, for the given parameters.

On what branch are you testing this? The only recent change I can find that makes me even somewhat suspect is c616cf0, maybe worth testing reverting that if you're on the main branch.

I'm testing it on Foxy but I will do checkout the suspicious commit.

@SteveMacenski
Copy link
Member

SteveMacenski commented Jul 7, 2021

Well both the ticket issuer and Francisco mention that it just didn't go as fast as they wanted, they would have mentioned the crash if the crash was the issue. Again, its semantics, they're both problems and honestly the crash is a more serious problem anyway.

Given Galactic binaries are available, I'd be curious if you saw it there. You can also run that on 20.04 so it should be an easy transition to test. #2437 (comment) tested without those commits already and it was still occurring. It might be good to sanity check them, but I think we can tentatively assume its neither of those commits. It might be safe to try to address this problem assuming that its a standalone issue in RPP that we're trying to access a cell that is out of bounds for some reason.

Edit: Oooooh I think I know what this might be. To verify: print what the mx/my are and what the costmap size is https://github.com/ros-planning/navigation2/blob/main/nav2_regulated_pure_pursuit_controller/src/regulated_pure_pursuit_controller.cpp#L407. While its running, visualize the collision check arc points in rviz. What I think is happening is that since the costmap is a rolling costmap of finite size and if you set the robot's max speed to 10m/s (lets say) then if you set the look ahead collision checking time to 1 second, then it will try to look for collisions 10m away at the most. But if your rolling costmap size is only 5m, then it would go out of bounds and seg fault.

What I think we should do about that (assuming that I am correct) is to print a throttled warning to every 30 seconds that you've configured your costmap too small to safely collision check the full distance away at your high speeds, proceed at your own caution. We can prevent the seg fault by simply checking if mx/my are larger than the rolling costmap size.

That still doesn't answer the question of why it couldn't go faster than 2.5 m/s in particular, unless that's the magic number of roughly the lookahead time * rolling costmap half size.

@padhupradheep
Copy link
Member

padhupradheep commented Jul 8, 2021

Costmap size was indeed causing the problem. I updated the height and the width of the costmap. Now, there is no more issue. I gave it a run by setting the desired velocity to 8.0 m/s, you can see the plot below on how it works:

rrp_Bug

We just need to do the the costmap size checks and leave out a warning to avoid a Seg fault.

@SteveMacenski
Copy link
Member

Awesome!! Is that something you can open a PR on? I think it should be separate of your reversing PR (and that way you get credit for 2 merge commits 😉 ) to keep things isolated. I can totally understand why that caused the crash, not entirely sure why that caused the speed limit issues, but I won't argue with it 😄

@padhupradheep
Copy link
Member

Awesome!! Is that something you can open a PR on?

I will gladly do it !

I can totally understand why that caused the crash, not entirely sure why that caused the speed limit issues, but I won't argue with it smile

That's for another day! :D

@padhupradheep
Copy link
Member

padhupradheep commented Jul 16, 2021

Why don't we just have this as a feature to the Costmaps rather than having it "local" to the RPP ? By that way, if we or someone else plan to write a local planner in the future, this functionality might be already handy. They could just call this function from the costmap to check for it.

Edit: I believe, we already have this functionality in the Costmap API. While we convert the robots pose to map frame here. On further investigating the function here, we can just have a boolean that could check if the robot goes beyond the costmap bounds. By this way, we can see if the robot is going beyond the dimensions of the given cost map and just leave out the warning at that point.

@padhupradheep
Copy link
Member

padhupradheep commented Jul 16, 2021

      FollowPath:
      plugin: "nav2_regulated_pure_pursuit_controller::RegulatedPurePursuitController"

      desired_linear_vel: 10.0

      max_linear_accel: 3.0

      max_linear_decel: 3.0

      lookahead_dist: 1.8

      min_lookahead_dist: 1.8 #0.3

      max_lookahead_dist: 2.0 #0.9

      lookahead_time: 1.5 #1.5

      rotate_to_heading_angular_vel: 0.9

      transform_tolerance: 0.1

      use_velocity_scaled_lookahead_dist: false

      min_approach_linear_velocity: 0.01

      use_approach_linear_velocity_scaling: true

      max_allowed_time_to_collision: 1.0

      use_regulated_linear_velocity_scaling: true

      use_cost_regulated_linear_velocity_scaling: false

      regulated_linear_scaling_min_radius:  0.9

      regulated_linear_scaling_min_speed: 0.25

      use_rotate_to_heading: true

      rotate_to_heading_min_angle: 0.785

      max_angular_accel: 10.0

      allow_reversing: false

Here is a another scenario, where the controller tends to seg fault. The height and the width of my local costmap is 1x1. Even if we take care of the collision check scenario (as explained above). Here I have a lookahead distance certainly greater than that of the local costmap. This is another part which needs further investigation and needs to be taken care of as well!

Note that the above parameters works well, if the size of the local costmap was something greater than that of the lookahead distance.

Edit: Also we need to know that, irrespective of the Lookahead Distance, the carrot pose will be at the edge of the local costmap. With these parameters the segfault happens at the point when the robot goes beyond the carrot pose. The reason for the robot going beyond the carrot could be caused because the carrot continues to respect the local costmap, which apparently gets updated at a slower rate and also because of the fact that it moves relatively slower than that of the robot.

Edit-2: Okay, now I have to disagree with the above edit, for the point where I say:

Also we need to know that, irrespective of the Lookahead Distance, the carrot pose will be at the edge of the local costmap.

The rest of the points in the Edit still stays valid. It is not necessary for the pose of the carrot to be at the edge of the given local costmap. Sorry for the confusion!

@SteveMacenski
Copy link
Member

SteveMacenski commented Jul 16, 2021

By this way, we can see if the robot is going beyond the dimensions of the given cost map and just leave out the warning at that point.

That could be one way to do it, as we're querying points just checking bounds. Typically I don't use that API because bound-checking has a non-trivial performance hit when you're querying ALOT of nodes, but in this context I think that's the best solution. This controller was running in excess of 1 Khz in my testing, we can certainly handle a little performance hit for the sake of code readability and we're not querying on the order of thousands or millions of cells 😄

The second problem you brought up could be handled the same way, using the bound respecting API for getting info from the costmap

@SteveMacenski
Copy link
Member

Merged!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working in progress
Projects
None yet
Development

No branches or pull requests

4 participants