Component Container crash #296

Open
2 tasks done
shawnhanna opened this issue Jan 16, 2025 · 6 comments
Labels
bug Something isn't working

Comments

@shawnhanna

Preliminary Checks

  • This issue is not a duplicate. Before opening a new issue, please search existing issues.
  • This issue is not a question, feature request, or anything other than a bug report directly related to this project.

Description

While running a single ZED X camera with positional tracking enabled, this crash happened:

[component_container_mt-14] terminate called after throwing an instance of 'std::invalid_argument'
[component_container_mt-14]   what():  The parent and child can not be the same
[ERROR] [component_container_mt-14]: process has died [pid 53921, exit code -6, cmd '/opt/ros/humble/lib/rclcpp_components/component_container_mt --ros-args --log-level info --ros-args -r __node:=front_container -r __ns:=/front -p use_sim_time:=False'].

Steps to Reproduce

I cannot reproduce it consistently, but it eventually happens, apparently at random.

NOTE: I also see seg faults (exited with code -9) on occasion
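
For readers decoding the exit codes in these logs: a minimal sketch, assuming the launcher follows the Python subprocess convention of reporting a process terminated by signal N as exit code -N. The helper name `describe_exit_code` is hypothetical. Under that assumption, -6 corresponds to SIGABRT, which fits the `terminate called` message above, while -9 would be SIGKILL rather than the SIGSEGV (-11) normally associated with a segmentation fault.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Translate a launcher-style exit code into a signal name when negative."""
    if code < 0:
        try:
            # Assumption: a negative code -N means "terminated by signal N".
            return signal.Signals(-code).name
        except ValueError:
            return f"unknown signal {-code}"
    return f"exited normally with status {code}"


if __name__ == "__main__":
    for code in (-6, -9, -11):
        print(code, describe_exit_code(code))
```

If the -9 exits are real SIGKILLs, something outside the process terminated the container, so checking the kernel log on the Jetson may be more informative than a core dump.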

Expected Result

The container should not crash.

Actual Result

The container crashes.

ZED Camera model

ZED

Environment

OS: Ubuntu 22.04, running in a custom docker image
Jetson Orin AGX
ZED SDK version: 4.2.2
ZED Driver: 1.2.1

Using the quad capture card with all 4 cameras connected, but running the ZED ROS wrapper on only one camera.

Anything else?

If there's anything else I can provide, let me know.

@shawnhanna shawnhanna added the bug Something isn't working label Jan 16, 2025
@Myzhar
Member

Myzhar commented Jan 17, 2025

Hi @shawnhanna
Can you please upgrade the SDK to the latest v4.2.3 and let me know if the problem persists?

If possible, please also change the positional tracking mode to GEN_1 so we can determine whether the problem is caused by the GEN_2 algorithm.

@WorstCodeWay

Hi @Myzhar, I am hitting this crash too. I have been in contact with your technical support by email for more than two weeks, with no solution and no reply for the last 4 days. Here are my latest findings.

The call stack when the core dump happens:

 Thread 8 "component_conta" received signal SIGSEGV, Segmentation fault.
   [Switching to Thread 1666943.1667341]
   0x0000fffff79a0610 in ___pthread_mutex_lock (mutex=0x1bf8) at ./nptl/pthread_mutex_lock.c:80
   80    ./nptl/pthread_mutex_lock.c: No such file or directory.
   (gdb) bt
   #0  0x0000fffff79a0610 in ___pthread_mutex_lock (mutex=0x1bf8) at ./nptl/pthread_mutex_lock.c:80
   #1  0x0000ffffd9864600 in IOFTRACKER::KeyFrame::isBad() () from target:/usr/local/zed/lib/libsl_zed.so
   #2  0x0000ffffd98caa20 in IOFTRACKER::MiddleEnd::UpdateFrameIMU(float, IOFTRACKER::IMU::Bias const&, std::shared_ptr<IOFTRACKER::KeyFrame>) () from target:/usr/local/zed/lib/libsl_zed.so
   #3  0x0000ffffd98832b0 in IOFTRACKER::LocalMapping::InitializeIMU(float, float, bool) () from target:/usr/local/zed/lib/libsl_zed.so
   #4  0x0000ffffd9885418 in IOFTRACKER::LocalMapping::Run() () from target:/usr/local/zed/lib/libsl_zed.so
   #5  0x0000fffff7bd31fc in ?? () from target:/lib/aarch64-linux-gnu/libstdc++.so.6
   #6  0x0000fffff799d5c8 in start_thread (arg=0x0) at ./nptl/pthread_create.c:442
   #7  0x0000fffff7a05edc in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:79
   (gdb) 
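
To see at a glance which binary the crashing thread was executing, the frames can be grouped by shared library. This is a small sketch using frames #1 to #5 copied from the trace above; the helper name `libraries_by_frame` is hypothetical, and the regular expression only assumes gdb's usual `from <path>` suffix.

```python
import re
from collections import Counter

# Frames #1-#5 copied verbatim from the backtrace above.
BACKTRACE = """\
#1  0x0000ffffd9864600 in IOFTRACKER::KeyFrame::isBad() () from target:/usr/local/zed/lib/libsl_zed.so
#2  0x0000ffffd98caa20 in IOFTRACKER::MiddleEnd::UpdateFrameIMU(float, IOFTRACKER::IMU::Bias const&, std::shared_ptr<IOFTRACKER::KeyFrame>) () from target:/usr/local/zed/lib/libsl_zed.so
#3  0x0000ffffd98832b0 in IOFTRACKER::LocalMapping::InitializeIMU(float, float, bool) () from target:/usr/local/zed/lib/libsl_zed.so
#4  0x0000ffffd9885418 in IOFTRACKER::LocalMapping::Run() () from target:/usr/local/zed/lib/libsl_zed.so
#5  0x0000fffff7bd31fc in ?? () from target:/lib/aarch64-linux-gnu/libstdc++.so.6
"""

# Matches the trailing "from [target:]<path>" that gdb prints for library frames.
LIB_RE = re.compile(r" from (?:target:)?(\S+)\s*$")


def libraries_by_frame(trace: str) -> Counter:
    """Count how many frames belong to each shared library (by file name)."""
    counts = Counter()
    for line in trace.splitlines():
        match = LIB_RE.search(line)
        if match:
            counts[match.group(1).rsplit("/", 1)[-1]] += 1
    return counts


if __name__ == "__main__":
    print(libraries_by_frame(BACKTRACE))
```

Four of the five library frames sit in `libsl_zed.so`, inside the `IOFTRACKER` local-mapping thread, which suggests the fault is in the SDK's tracking code rather than in the ROS 2 wrapper.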

And the crash log from the ZED component nodes:

zed_wrapper.log

@Myzhar
Member

Myzhar commented Jan 17, 2025

It seems that the cause of the problem is positional tracking GEN_2.
Please use GEN_1 until the SDK team releases all the stability fixes for GEN_2 (expected with the next patch release).
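
For anyone applying this workaround, here is a minimal sketch that writes a parameter override file selecting GEN_1. It assumes the wrapper reads the mode from a `pos_tracking.pos_tracking_mode` parameter; check the common configuration file shipped with your installed wrapper version before relying on that name. The helper `write_override` and the file name `gen1_override.yaml` are hypothetical.

```python
import tempfile
from pathlib import Path

# Assumption: the tracking mode lives under pos_tracking.pos_tracking_mode.
# Verify this against the common YAML file of your installed zed_wrapper.
OVERRIDE_YAML = """\
/**:
  ros__parameters:
    pos_tracking:
      pos_tracking_mode: 'GEN_1'
"""


def write_override(path: str) -> Path:
    """Write the GEN_1 override parameters to `path` and return it."""
    out = Path(path)
    out.write_text(OVERRIDE_YAML)
    return out


if __name__ == "__main__":
    # Hypothetical output location used only for this example.
    target = Path(tempfile.gettempdir()) / "gen1_override.yaml"
    print(write_override(str(target)))
```

How the file is loaded depends on the wrapper version: some releases of `zed_camera.launch.py` accept an override-file launch argument (an assumption to verify with `ros2 launch zed_wrapper zed_camera.launch.py --show-args`); otherwise the same key can be edited directly in the common configuration file.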

@WorstCodeWay

> It seems that the cause of the problem is positional tracking GEN_2.
> Please use GEN_1 until the SDK team releases all the stability fixes for GEN_2 (expected with the next patch release).

Thanks. I'll try GEN_1 config.

> Hi @shawnhanna
> Can you please upgrade the SDK to the latest v4.2.3 and let me know if the problem persists?

Upgrading does not seem to help.
BTW, what do you mean by "next patch release"? Is it 4.2.4 (the latest version is currently 4.2.3)? I ask because I tried GEN_2 with the latest SDK release, 4.2.3, just now, and the crash is still there.

@Myzhar
Member

Myzhar commented Jan 22, 2025

Hopefully, the cumulative fixes will be released with the ZED SDK v4.2.4

@leondavi

leondavi commented Feb 19, 2025

Container crash on Orin AGX docker image zed_ros2_l4t_36.3.0_sdk_4.2.3 when running zed_wrapper.
Camera: zed2i
OS: Ubuntu 22.04.5 LTS Jetpack 6.2
GPU: Orin AGX Jetson
CPU: ARM Cortex-A78AE
ZED-SDK version: 4.2.3

ZED ROS2 Docker Image
---------------------
ROS distro:  humble
DDS middleware:  rmw_fastrtps_cpp
ROS 2 Workspaces: /root/ros2_ws/install:/opt/ros/humble/install
ROS 2 Domain ID: 0
 * Note: Host and Docker image Domain ID must match to allow communication
Local IPs: 192.168.0.72 172.17.0.1 fd1a:7613:5aa3:2941:4aaa:9428:9784:d769 fd1a:7613:5aa3:2941:937e:9c3c:d814:48df
---
Available ZED packages:
zed_components
zed_msgs
zed_ros2
zed_wrapper
---------------------
To start a ZED camera node:
  ros2 launch zed_wrapper zed_camera.launch.py camera_model:=<zed|zedm|zed2|zed2i|zedx|zedxm|zedxonegs|zedxone4k>
---------------------
[INFO] [launch]: All log files can be found below /root/.ros/log/2025-02-18-21-49-49-140695-adilk-orin-ai3-219864
[INFO] [launch]: Default logging verbosity is set to INFO
Using common configuration file: /root/ros2_ws/install/zed_wrapper/share/zed_wrapper/config/common_stereo.yaml
[INFO] [launch.user]: * Loading ZED node: zed_node in container: /zed/zed_container
[INFO] [robot_state_publisher-1]: process started with pid [219950]
[INFO] [component_container_isolated-2]: process started with pid [219952]
[robot_state_publisher-1] [INFO] [1739911789.662497414] [zed.zed_state_publisher]: got segment zed_camera_center
[robot_state_publisher-1] [INFO] [1739911789.662704135] [zed.zed_state_publisher]: got segment zed_camera_link
[robot_state_publisher-1] [INFO] [1739911789.662725832] [zed.zed_state_publisher]: got segment zed_left_camera_frame
[robot_state_publisher-1] [INFO] [1739911789.662740872] [zed.zed_state_publisher]: got segment zed_left_camera_optical_frame
[robot_state_publisher-1] [INFO] [1739911789.662750472] [zed.zed_state_publisher]: got segment zed_right_camera_frame
[robot_state_publisher-1] [INFO] [1739911789.662757960] [zed.zed_state_publisher]: got segment zed_right_camera_optical_frame
[component_container_isolated-2] [INFO] [1739911789.852359050] [zed.zed_container]: Load Library: /root/ros2_ws/install/zed_components/lib/libzed_camera_component.so
[component_container_isolated-2] Authorization required, but no authorization protocol specified
[component_container_isolated-2] Authorization required, but no authorization protocol specified
[component_container_isolated-2] Authorization required, but no authorization protocol specified
[component_container_isolated-2] Authorization required, but no authorization protocol specified
[component_container_isolated-2] [INFO] [1739911789.934500780] [zed.zed_container]: Found class: rclcpp_components::NodeFactoryTemplate<stereolabs::ZedCamera>
[component_container_isolated-2] [INFO] [1739911789.934638893] [zed.zed_container]: Instantiate class: rclcpp_components::NodeFactoryTemplate<stereolabs::ZedCamera>
[component_container_isolated-2] double free or corruption (out)
[ERROR] [component_container_isolated-2]: process has died [pid 219952, exit code -6, cmd '/opt/ros/humble/install/lib/rclcpp_components/component_container_isolated --use_multi_threaded_executor --ros-args --log-level info --ros-args -r __node:=zed_container -r __ns:=/zed'].

Steps to reproduce:
The container was generated with the script:
https://github.com/stereolabs/zed-ros2-wrapper/blob/master/docker/jetson_build_dockerfile_from_sdk_and_l4T_version.sh
I changed the URL source for protobuf so that it is downloaded from the index https://pypi.org/simple

The container was pulled onto an Orin AGX with JetPack 6.2 and Ubuntu 22.04.
Camera type: zed2i
The following command was executed to start the container with zed_wrapper:

docker  run --runtime=nvidia -it  --privileged  -e DISPLAY=$DISPLAY --network=host --ipc=host --pid=host -e NVIDIA_DRIVER_CAPABILITIES=all -v /tmp/.X11-unix/:/tmp/.X11-unix   -v /dev:/dev   -v /dev/shm:/dev/shm   -v /usr/local/zed/resources/:/usr/local/zed/resources/   -v /usr/local/zed/settings/:/usr/local/zed/settings/  zed_ros2_l4t_36.3.0_sdk_4.2.3 ros2 launch zed_wrapper zed_camera.launch.py camera_model:=zed2i


4 participants