This folder contains the nodes that perform person identification.
This node adds to people.json
an averaged face descriptor for the person in front of the robot.
Use the following command to launch the nodes.
ros2 launch person_identification capture_face.launch.xml name:=<person_name> neural_network_inference_type:=<cpu, torch_gpu or trt_gpu>
name
(string): The person name.
mean_size
(int): How many descriptors to average. The default value is 10.
face_sharpness_score_threshold
(double): The threshold to consider the face sharp enough. The default value is 0.5.
video_analysis
(perception_msgs/VideoAnalysis): The video analysis containing the detected objects.
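The way mean_size and face_sharpness_score_threshold interact can be sketched as follows. This is a minimal illustration, not the node's actual code: the function name `average_descriptors`, the `(descriptor, sharpness_score)` input format, and the choice to skip faces scoring strictly below the threshold are all assumptions.

```python
import numpy as np


def average_descriptors(samples, mean_size=10, sharpness_threshold=0.5):
    """Average the first `mean_size` descriptors that are sharp enough.

    `samples` is an iterable of (descriptor, sharpness_score) pairs
    (hypothetical format, chosen for this illustration only).
    Returns None if fewer than `mean_size` sharp descriptors were seen.
    """
    kept = []
    for descriptor, sharpness in samples:
        # Assumption: faces below the threshold are simply skipped.
        if sharpness < sharpness_threshold:
            continue
        kept.append(np.asarray(descriptor, dtype=float))
        if len(kept) == mean_size:
            # Element-wise mean over the collected descriptors.
            return np.mean(kept, axis=0)
    return None
```

A larger mean_size makes the stored descriptor less sensitive to a single poor detection, at the cost of a longer capture.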
This node adds to people.json
an averaged voice descriptor for the person talking near the robot.
Use the following command to launch the nodes.
ros2 launch person_identification capture_voice.launch.xml name:=<person_name> neural_network_inference_type:=<cpu, torch_gpu or trt_gpu>
name
(string): The person name.
mean_size
(int): How many descriptors to average. The default value is 10.
audio_analysis
(perception_msgs/AudioAnalysis): The audio analysis containing the audio classes, general audio embedding, voice embedding and the sound direction.
This node performs person identification. The people must already have been added to people.json
with the previous nodes.
face_sharpness_score_threshold
(double): The threshold to consider the face sharp enough. The default value is 0.5.
face_descriptor_threshold
(double): The maximum distance between two face descriptors to be considered the same person. The default value is 0.7.
voice_descriptor_threshold
(double): The maximum distance between two voice descriptors to be considered the same person. The default value is 1.266.
face_voice_descriptor_threshold
(double): The maximum distance between two merged descriptors to be considered the same person. The default value is 1.5092.
nose_confidence_threshold
(double): The confidence threshold for the nose keypoint. The default value is 0.4.
direction_frame_id
(string): The audio analysis frame id. The default value is odas.
direction_angle_threshold_rad
(double): The maximum angle between the face and voice directions to be considered the same person. The default value is 0.15.
ignore_direction_z
(bool): Indicates if the angle between the face and voice directions ignores the z-axis. The default value is true.
search_frequency
(double): The frequency at which the search occurs. The default value is 2.
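As an illustration of direction_angle_threshold_rad and ignore_direction_z, the sketch below compares a face direction and a voice direction given as 3D vectors. The function names and the vector representation are assumptions made for this example. The node's own computation may differ.

```python
import math


def direction_angle(face_direction, voice_direction, ignore_z=True):
    """Return the angle in radians between two 3D directions, or None
    if either vector has zero length after the optional z removal."""
    fx, fy, fz = face_direction
    vx, vy, vz = voice_direction
    if ignore_z:
        # Project both directions onto the xy-plane.
        fz = vz = 0.0
    dot = fx * vx + fy * vy + fz * vz
    norm = math.sqrt(fx * fx + fy * fy + fz * fz) * math.sqrt(vx * vx + vy * vy + vz * vz)
    if norm == 0.0:
        return None
    # Clamp to [-1, 1] to guard against floating-point rounding.
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def same_direction(face_direction, voice_direction, threshold_rad=0.15, ignore_z=True):
    """True when the two directions are within `threshold_rad` of each other."""
    angle = direction_angle(face_direction, voice_direction, ignore_z)
    return angle is not None and angle <= threshold_rad
```

Ignoring the z-axis means a face and a voice at the same azimuth still match even if their estimated elevations differ.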
video_analysis
(perception_msgs/VideoAnalysis): The video analysis containing the detected objects.
audio_analysis
(perception_msgs/AudioAnalysis): The audio analysis containing the audio classes, general audio embedding, voice embedding and the sound direction.
person_names
(perception_msgs/PersonNames): The person names.
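The descriptor thresholds above can be read as a nearest-neighbour test: an observed descriptor is attributed to the closest known person only if the distance stays within the threshold. The sketch below assumes a Euclidean distance and a plain `{name: descriptor}` dictionary. Neither the metric nor the layout of people.json is stated here, so both are hypothetical, as is the function name `identify`.

```python
import numpy as np


def identify(descriptor, known_people, threshold):
    """Return the name of the closest known person, or None if no
    known descriptor is within `threshold` of `descriptor`.

    `known_people` maps a name to its stored descriptor
    (hypothetical structure, not the actual people.json schema).
    """
    observed = np.asarray(descriptor, dtype=float)
    best_name, best_distance = None, float("inf")
    for name, known in known_people.items():
        # Assumption: Euclidean distance between descriptors.
        distance = float(np.linalg.norm(observed - np.asarray(known, dtype=float)))
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name if best_distance <= threshold else None
```

For example, with the default face_descriptor_threshold of 0.7, a descriptor farther than 0.7 from every stored descriptor would yield no name.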