Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use monotonic clock for triggering read-update-write cycles + fix for overruns #2046

Open
wants to merge 18 commits into
base: master
Choose a base branch
from

Conversation

saikishor
Copy link
Member

@saikishor saikishor commented Feb 10, 2025

For real-time applications and for certain hardware monotonic clock is pretty much needed and in certain occasions, it is also needed for the safety. This PR addressed this by having a monotonic clock for non simulation cases.

Apart from that, there is a bug in the CM statistics in the first loop, but then apart from that, when there is a controller that takes longer in the loop, this causes an overrun in the system. In order to fix it, I've added an overrun check and adjust the time properly to trigger the loops.

Earlier, the triggered overruns are the cause for such fast loops, resulting in high periodicity reports

Tested on real hardware TIAGo at 1kHz

Makes #2042 obselete
closes #2042

Copy link
Contributor

@christophfroehlich christophfroehlich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this fix. LGTM as far as I can tell.

controller_manager/src/ros2_control_node.cpp Show resolved Hide resolved
Copy link
Contributor

@christophfroehlich christophfroehlich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh

Terminate called after throwing an instance of 'std::runtime_error'
      what():  can't compare times with different time sources

@saikishor
Copy link
Member Author

oh

Terminate called after throwing an instance of 'std::runtime_error'
      what():  can't compare times with different time sources

on it

Copy link

codecov bot commented Feb 10, 2025

Codecov Report

Attention: Patch coverage is 71.79487% with 33 lines in your changes missing coverage. Please review.

Project coverage is 89.14%. Comparing base (254b6e8) to head (9cf4584).

Files with missing lines Patch % Lines
hardware_interface/src/resource_manager.cpp 46.42% 11 Missing and 4 partials ⚠️
controller_manager/src/ros2_control_node.cpp 52.94% 7 Missing and 1 partial ⚠️
...chainable_controller/test_chainable_controller.cpp 0.00% 2 Missing and 2 partials ⚠️
controller_manager/src/controller_manager.cpp 80.00% 2 Missing ⚠️
...r_manager/test/test_controller/test_controller.cpp 0.00% 1 Missing and 1 partial ⚠️
...ith_interfaces/test_controller_with_interfaces.cpp 0.00% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2046      +/-   ##
==========================================
- Coverage   89.34%   89.14%   -0.21%     
==========================================
  Files         139      139              
  Lines       15000    15046      +46     
  Branches     1291     1301      +10     
==========================================
+ Hits        13402    13413      +11     
- Misses       1113     1140      +27     
- Partials      485      493       +8     
Flag Coverage Δ
unittests 89.14% <71.79%> (-0.21%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
.../include/controller_manager/controller_manager.hpp 100.00% <ø> (ø)
...er_manager/test/controller_manager_test_common.hpp 89.83% <100.00%> (ø)
...ontroller_manager/test/test_controller_manager.cpp 96.10% <100.00%> (ø)
...t_controllers_chaining_with_controller_manager.cpp 99.23% <100.00%> (+<0.01%) ⬆️
..._interface/include/hardware_interface/actuator.hpp 100.00% <ø> (ø)
.../include/hardware_interface/actuator_interface.hpp 91.45% <100.00%> (+2.46%) ⬆️
...re_interface/include/hardware_interface/sensor.hpp 100.00% <ø> (ø)
...ce/include/hardware_interface/sensor_interface.hpp 91.42% <100.00%> (ø)
...re_interface/include/hardware_interface/system.hpp 100.00% <ø> (ø)
...ce/include/hardware_interface/system_interface.hpp 79.71% <100.00%> (ø)
... and 9 more

... and 3 files with indirect coverage changes

@saikishor
Copy link
Member Author

@christophfroehlich I changed my approach to the original one, as this way we have the tests running with the Monotonic clock, and any changes to the code can be early tested with the existing test. If not, we will have to test the changes every time on the real robot

@ileniap
Copy link

ileniap commented Feb 12, 2025

I tested this fix on TIAGo PRo that is running at 1khz, and it works perfectly.
Thank you Sai for the great work!

bmagyar
bmagyar previously approved these changes Feb 12, 2025
@saikishor saikishor dismissed stale reviews from bmagyar and christophfroehlich via cecf3c3 February 13, 2025 01:16
@saikishor
Copy link
Member Author

@bmagyar as you raised the point of possible discrepancies of time in the controllers. I've added a test to make sure that the controllers always receive the ROS time instead of the steady clock time

@mamueluth
Copy link
Member

@saikishor i have a question, is it not possible anymore to pass the desired clock type e.g. RCL_STEADY_TIME to a node? I know this was possible as i did this some time ago. And maybe i am mistaken but if we could simply set the clock type of the node we could then use the rclcpp interface for clock handling (now(), get_clock(), ...). This would probably eliminate the need for a get_trigger_clock function. But maybe i am mistaken.

@saikishor
Copy link
Member Author

@saikishor i have a question, is it not possible anymore to pass the desired clock type e.g. RCL_STEADY_TIME to a node? I know this was possible as i did this some time ago. And maybe i am mistaken but if we could simply set the clock type of the node we could then use the rclcpp interface for clock handling (now(), get_clock(), ...). This would probably eliminate the need for a get_trigger_clock function. But maybe i am mistaken.

@mamueluth Yes, you are right. We can also approach it that way. Indeed, that was my first approach to the problem. Look here: 5b4cb98

With the above approach of setting the CM's deafult clock with Steady clock, the time that we are going to pass to the update methods of the controllers will be of type Steady clock and then when we compare it with the time they receive from the messages will be in ROS clock and this comparision will result in mismatch and it will throw an exception about comarision

https://github.com/ros2/rclcpp/blob/605251bd71da0427ab2cdca26a9e737ca93e73d9/rclcpp/src/rclcpp/time.cpp#L114-L121
https://github.com/ros2/rclcpp/blob/605251bd71da0427ab2cdca26a9e737ca93e73d9/rclcpp/src/rclcpp/time.cpp#L182-L189

For instance, this part of code in diff_drive_controller will be affected and in someother places:

https://github.com/ros-controls/ros2_controllers/blob/c83cfb38b228c47fab4bf937b4cb5517c6fe4c81/diff_drive_controller/src/diff_drive_controller.cpp#L122

There could be many other controllers that this might be affected. I wanted to avoid that for the controllers. If we want to have it functional, then we will need a separate clock anyway. I think it is better to have a separate monotic steady clock rather than a separate ROS clock. This way all messages that we publish and receive from the controller_manager or any further integrations we do, can use the ROS clock and we can use the new steady clock only for the RT loop operations.

What do you think?

@mamueluth
Copy link
Member

Yes, absolutely I had a similar issue with some controllers about the different time sources. I think i got what you try to avoid by creating a seperate clock for the real-time loop, but I did not quite catch why we need a separate clock.

An idea I just had what about we pass the clock type of the ControllerManager to the controller? We could do this here:

rclcpp::NodeOptions ControllerManager::determine_controller_node_options(

Right before the controller is initialized. Do you think this would then solve the comparison of different time sources since a controller should still have its own clock, but they use the same clock type?

@saikishor
Copy link
Member Author

saikishor commented Feb 13, 2025

Right before the controller is initialized. Do you think this would then solve the comparison of different time sources since a controller should still have its own clock, but they use the same clock type?

@mamueluth Not really, because the clock received from the messages in ROS layer is in ROS time and when you set the steady clock to the controller as well, then you cannot compare the time at any instance. The problem is the steady clock doesn't follow the same format of the system clock, so we will have issues.

To give more context, the following is output from the steady clock (monotonic) and system clock printing their nanoseconds (now().nanoseconds())

[ros2_control_node-1] [ERROR] [1739446438.028262018] [controller_manager]: The steady clock time is : 10709869267971
[ros2_control_node-1] [ERROR] [1739446438.028307388] [controller_manager]: The ROS clock time is : 1739446438028306878

I couldn't use the node options, also for the fact that that option doesn't exist in Humble and moreover, if we want the CM to be able to publish topics like this #2006, we would definitely need another clock.

@mamueluth
Copy link
Member

Ah ok now i got it, thanks for the clarification!:)

@mamueluth
Copy link
Member

@saikishor Thanks for the deep dive here and the great work!

LGTM

@saikishor
Copy link
Member Author

@saikishor Thanks for the deep dive here and the great work!

LGTM

Thank you for the discussion. Discussions always bring new points of view.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants