This is a Viam module providing a model of vision service for tracking object using ReID.
To use this module, follow these instructions to add a module from the Viam Registry and select the viam:vision:re-id-object-tracker
model from the re-id-object-tracker
module.
This module implements the following methods of the vision service API:
GetDetections()
: returns the bounding boxes with the unique id as label and the object detection confidence as confidence.GetClassifications()
: returns the labelnew_object_detected
for an image when a new object enters the scene.CaptureAllFromCamera()
: returns the next image and detections or classifications all together, given a camera name.
in progress
Note
Before configuring your vision service, you must create a robot.
Navigate to the CONFIGURE tab of your machine in the Viam app. Add vision / re-id-object-tracker to your machine.
The following attributes are required to configure your re-id-object-tracker
module:
{
"camera_name": "camera-1",
"path_to_database": "/path/to/database.db" # the file doesn't need to exist
}
In addition to the vision service API, the re-id-object-tracking
module supports some model-specific commands that allow you to add, delete, relabel and list people.
You can invoke these commands by passing appropriately keyed JSON documents to the DoCommand()
method using one of Viam's SDKs.
The list_current
doCommand is used to get all the information of the currently detected tracks.
Input:
"list_current": true
returns:
{
"list_current": {
track_id": {
"manual_label": str,
"face_id_label": str,
"face_id_conf": float,
"re_id_label": str,
"re_id_conf": float
}
}
}
The object tracker generates by default a unique ID string in the format "<category>_N_YYYYMMDD_HHMMSS"
. Given this unique id, the user can add a label to track (attached to the manual_label
field in the output of list_current
).
"relabel": {"person_N_20241126_190034": "Known Person"}
returns:
{
"relabel": {
"person_N_20241126_190034": "success: changed label to 'Known Person' "
}
Recomputes embeddings.
"recompute_embeddings": true
Name | Type | Inclusion | Default | Description |
---|---|---|---|---|
camera_name |
string | Required | Camera name to be used as input for tracking. | |
path_to_database |
string | Required | Path to the database where tracking information is stored. | |
lambda_value |
float | Optional | 0.95 |
The lambda value is meant to adjust the contribution of the re-id and the IoU matchings. The distance between two tracks equals: λ * feature_dist + (1 - λ) * (1 - IoU_score). |
max_age_track |
int | Optional | 1e3 |
Maximum age (in frames) for a track to be considered active. Ranges from 0 to 1e5. |
min_distance_threshold |
float | Optional | 0.3 |
Minimum distance threshold for considering two tracks as distinct. Values range from 0 to 5. |
feature_distance_metric |
string | Optional | 'cosine' |
Metric used for calculating feature distance. Options include cosine and euclidean . Refer to torch-re-id model zoo to select the metric that matches your model. |
cooldown_period_s |
float | Optional | 5 |
Duration for which the trigger is on.new_object_detected . |
re_id_threshold |
float | Optional | 0.3 |
Threshold for determining whether two persons match based on body features similarity. |
min_track_persistence |
int | Optional | 10 |
Minimum number of frames a track candidate must persist before beinfg promoted to a track. |
max_frequency_hz |
float | Optional | 10 |
Frequency at which the tracking steps are performed. |
save_to_db |
bool | Optional | True |
Indicates whether tracks should be saved to the database. |
save_period |
int | Optional | 20 |
Interval (in number of tracking steps) when tracks are saved to the database. |
start_fresh |
bool | Optional | False |
Whether or not to load the tracks from the database at reconfigure() . |
path_to_known_persons |
string | Optional | None |
Path to the database containing pictures of entire persons. If the directory does not exist it will be created at reconfigure() . Refer example directory tree to see how to add pictures of known persons and associate labels with the persons. |
Name | Type | Inclusion | Default | Description |
---|---|---|---|---|
detector_model_name |
string | Optional | 'fasterrcnn_mobilenet_v3_large_320_fpn' |
Name of the model used for detection. Only option at the moment. |
detection_threshold |
float | Optional | 0.8 |
Confidence threshold for detecting objects, with values ranging from 0.0 to 1.0. |
detector_device |
string | Optional | 'cpu' |
Device on which the detection model will run. Options are cpu and gpu . |
Name | Type | Inclusion | Default | Description |
---|---|---|---|---|
feature_extractor_model |
string | Optional | 'osnet_ain_x1_0' |
Name of the model used for feature extraction. Only option at the moment. |
feature_encoder_device |
string | Optional | 'cuda' |
Device on which the feature encoder will run. Options are cpu and cuda . |
Name | Type | Inclusion | Default | Description |
---|---|---|---|---|
path_to_known_faces |
string | Optional | None |
Path to a file or database containing images or embeddings of known faces. If the directory does not exist it will be created at reconfigure() . Refer example directory tree to see how to add pictures of known faces and associate labels with the faces. |
face_detector_device |
string | Optional | 'cpu' |
Device on which the face detector will run. Options are cpu and cuda . |
face_detector_model |
string | Optional | 'ultraface_version-RFB-320-int8' |
Name of the model used for face detection. Only option at the moment. |
face_detection_threshold |
float | Optional | 0.9 |
Confidence threshold for detecting faces, with values ranging from 0.0 to 1.0. |
face_feature_extractor_model |
string | Optional | 'facenet' |
Model used for extracting features from detected faces for identification. Only option at the moment. |
cosine_id_threshold |
float | Optional | 0.3 |
Threshold for determining face identity matches using cosine similarity. Both cosine and euclidean distances should be under threshold for faces to be considered as match. |
euclidean_id_threshold |
float | Optional | 0.9 |
Threshold for determining face identity matches using Euclidean distance. |
In the example below, all persons (or faces) detected in any pictures within the directory French_Team
will have an embedding associated with the label French_Team
. The supported image formats for known faces are PNG and JPEG.
path
└── to
└── known_faces
└── Zinedine_Zidane
│ └── zz_1.png
│ └── zz_2.jpeg
│ └── zz_3.jpeg
│
└── Jacques_Chirac
│ └── jacques_1.jpeg
│
└── French_Team
| └── ribery.jpeg
| └── vieira.png
| └── thuram.jpeg
| └── group_picture.jpeg
│
└── Italian_Team
└── another_group_picture.png