removed the deep copy for the costmap_raw_ data using memcpy in prepareCostmap under costmap_2d_publisher.cpp #4919
Signed-off-by: doublebrackets <[email protected]>
Codecov Report: All modified and coverable lines are covered by tests ✅
Per our email, please get some metrics. You can write a simple program that creates the Costmap2DPublisher object, gives it a costmap of realistic size, and calls prepareCostmap on it in a loop 10,000 times. Before and after the loop, use std::chrono to take the current time, then print how long the operation took. Change to using memcpy and run it again to see the performance change.
For the local costmap, a realistic size might be 10x10m at 0.05m resolution. For a global costmap, maybe 100x100m at 0.05m resolution.
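For reference, the suggested sizes translate into cell counts as follows. This is a minimal sketch; the helper names `cells_per_side` and `total_cells` are hypothetical and not part of nav2_costmap_2d. It assumes a square costmap, which matches the sizes suggested above.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical helper: number of cells along one side of a square costmap.
// std::lround guards against floating-point error in side_m / resolution_m.
inline std::size_t cells_per_side(double side_m, double resolution_m) {
  return static_cast<std::size_t>(std::lround(side_m / resolution_m));
}

// Hypothetical helper: total number of cells in a square costmap.
inline std::size_t total_cells(double side_m, double resolution_m) {
  const std::size_t n = cells_per_side(side_m, resolution_m);
  return n * n;
}
```

With these, `cells_per_side(10.0, 0.05)` gives 200 (40,000 cells in total) for the local case and `cells_per_side(100.0, 0.05)` gives 2000 (4,000,000 cells in total) for the global case, so the global copy moves 100 times as much data per call.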
Thanks for the review @SteveMacenski, I will do so and share results here from before and after using memcpy!
I have added results from benchmarking the prepareCostmap() method over 10,000 iterations, before and after switching to memcpy.

For a local costmap of dimensions 10x10m:
- Original method: ~287ms
- After memcpy implementation: ~24ms

For a global costmap of dimensions 100x100m:
- Original method: ~33,888ms
- After memcpy implementation: ~5,607ms

I used the following code for benchmarking these results:

    #include <chrono>
    #include <iostream>
    #include <memory>
    #include <random>
    #include "rclcpp/rclcpp.hpp"
    #include "rclcpp_lifecycle/lifecycle_node.hpp"
    #include "nav2_costmap_2d/costmap_2d_publisher.hpp"
    #include "nav2_costmap_2d/costmap_2d.hpp"

    int main() {
      rclcpp::init(0, nullptr);
      auto node = std::make_shared<rclcpp_lifecycle::LifecycleNode>("costmap_bench");
      node->configure();
      node->activate();

      // Global costmap: 2000x2000 cells at 0.05m resolution (100x100m).
      nav2_costmap_2d::Costmap2D cmap(2000, 2000, 0.05, 0.0, 0.0);  // or 200, 200, 0.05 for local costmap

      // Fill every cell with a random cost in [0, 255].
      std::mt19937 gen(std::random_device{}());
      std::uniform_int_distribution<> dist(0, 255);
      for (unsigned int i = 0; i < cmap.getSizeInCellsX(); i++) {
        for (unsigned int j = 0; j < cmap.getSizeInCellsY(); j++) {
          cmap.setCost(i, j, dist(gen));
        }
      }

      nav2_costmap_2d::Costmap2DPublisher pub(node, &cmap, "map", "costmap", true, 0.0);

      // Time 10,000 calls to prepareCostmap().
      auto t1 = std::chrono::high_resolution_clock::now();
      for (int i = 0; i < 10000; i++) {
        pub.prepareCostmap();
      }
      auto t2 = std::chrono::high_resolution_clock::now();
      std::cout << "Time taken with memcpy: "
                << std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1).count()
                << "ms\n";

      node->deactivate();
      node->cleanup();
      node->shutdown();
      rclcpp::spin_some(node->get_node_base_interface());
      rclcpp::shutdown();
      return 0;
    }
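To illustrate the kind of change being measured, here is a minimal standalone sketch of a per-cell copy versus a single `std::memcpy`. It is not the actual diff from this PR: the function names `copy_elementwise` and `copy_bulk` are hypothetical, and it assumes the original copy was a per-cell loop from an `unsigned char` buffer into a `std::vector<std::uint8_t>`.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical "before": one indexed assignment per cell.
inline void copy_elementwise(
  const unsigned char * src, std::size_t n, std::vector<std::uint8_t> & dst)
{
  dst.resize(n);
  for (std::size_t i = 0; i < n; ++i) {
    dst[i] = src[i];
  }
}

// Hypothetical "after": a single bulk copy over the whole buffer.
// This is valid because source and destination are both contiguous,
// byte-sized, trivially copyable elements.
inline void copy_bulk(
  const unsigned char * src, std::size_t n, std::vector<std::uint8_t> & dst)
{
  dst.resize(n);
  if (n > 0) {
    std::memcpy(dst.data(), src, n);
  }
}
```

Both functions produce the same destination contents; only the copy mechanism differs. How much faster the bulk copy is will depend on the compiler, optimization level, and buffer size.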
Ah very nice and perfect, a good 5-10x speed up!
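For reference, the speed-up factors implied by the reported timings work out as follows (rounded; the per-call figures simply divide each total by the 10,000 iterations):

```math
\begin{aligned}
\text{local (10x10m):}\quad & \frac{287\ \text{ms}}{24\ \text{ms}} \approx 12.0\times, & 28.7\ \mu\text{s} &\to 2.4\ \mu\text{s per call} \\
\text{global (100x100m):}\quad & \frac{33{,}888\ \text{ms}}{5{,}607\ \text{ms}} \approx 6.0\times, & 3.39\ \text{ms} &\to 0.56\ \text{ms per call}
\end{aligned}
```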
@glitchhopcore I'm curious: is there anywhere else in this class where we can optimize performance? These are the single largest data structures nav2 publishes internally, so improvements in the Costmap2DPublisher and Costmap2DSubscriber are significant. You might also want to look at how the subscriber copies data around 😉 https://github.com/ros-navigation/navigation2/blob/main/nav2_costmap_2d/src/costmap_subscriber.cpp
Signed-off-by: doublebrackets <[email protected]> (cherry picked from commit 4694386)
…4920) Signed-off-by: doublebrackets <[email protected]> (cherry picked from commit 4694386) Co-authored-by: doublebrackets <[email protected]>