This is a ROS2 wrapper for DPT (Vision Transformers for Dense Prediction), applied to image segmentation. We use Hugging Face and the `transformers` library as the source of the algorithm. The main idea is for this container to act as a standalone interface and node, removing the need to integrate separate packages and resolve numerous dependency issues.
From the paper: DPT assembles tokens from various stages of the vision transformer into image-like representations at various resolutions and progressively combines them into full-resolution predictions using a convolutional decoder. The transformer backbone processes representations at a constant and relatively high resolution and has a global receptive field at every stage. These properties allow the dense vision transformer to provide finer-grained and more globally coherent predictions when compared to fully-convolutional networks.
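
To make the "tokens into image-like representations" step more concrete, here is a minimal sketch in plain Python. It only illustrates two ideas from the paper's description, dropping the readout (class) token and laying the remaining patch tokens out on their spatial grid. The function names `read_ignore` and `concatenate` are illustrative, not taken from this repo or from `transformers`, and the resampling and convolutional fusion stages are deliberately left out.

```python
def read_ignore(tokens):
    """Drop the leading readout (class) token and keep only patch tokens."""
    return tokens[1:]


def concatenate(patch_tokens, grid_h, grid_w):
    """Lay the patch tokens out row by row on a grid_h x grid_w grid."""
    if len(patch_tokens) != grid_h * grid_w:
        raise ValueError("token count does not match the patch grid")
    return [patch_tokens[r * grid_w:(r + 1) * grid_w] for r in range(grid_h)]


# Toy input: one readout token followed by 6 patch tokens (a 2x3 grid),
# each token being a 2-dimensional feature vector.
tokens = [[0.0, 0.0]] + [[float(i), float(-i)] for i in range(1, 7)]
feature_map = concatenate(read_ignore(tokens), grid_h=2, grid_w=3)

# Shape of the image-like representation: rows, columns, feature channels.
print(len(feature_map), len(feature_map[0]), len(feature_map[0][0]))  # 2 3 2
```

In the real model this grid would then be resampled to several resolutions and fused by the convolutional decoder; the sketch stops before that point.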
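To install from the prebuilt image:
- Install Docker and ensure the Docker daemon is running in the background.
- Run `docker pull shaderobotics/dpt:${ROS2_DISTRO}`, with `${ROS2_DISTRO}` set to the ROS2 distribution you are targeting.
- Follow the run commands in the usage section below.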

To build the image from source instead:
- Install Docker and ensure the Docker daemon is running in the background.
- Clone this repo with `git clone https://github.com/open-shade/dpt.git`
- Build the container with `docker build . -t [name]`. This will take a while. We have also provided associated `cloudbuild.sh` scripts to build all of the associated versions on GCP.
- Follow the run commands in the usage section below.

Run `docker run -t --net=host shaderobotics/dpt:${ROS2_DISTRO}`, using the same tag you pulled (or the `[name]` you built). Your node should now be running. Then, by running `ros2 topic list`, you should see all of the available pub and sub routes.
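
As a quick sanity check, the output of `ros2 topic list` can be compared against the topic names documented in this README. The sketch below is a hypothetical helper, not part of this repo. It assumes the topics show up under the root namespace (for example `/dpt/image`), so it strips the leading slash before comparing.

```python
# Topic names as documented in this README.
EXPECTED_TOPICS = [
    "dpt/image_raw",
    "dpt/image",
    "dpt/pixels",
    "dpt/detections",
    "dpt/masks",
]


def missing_topics(topic_list_output, expected=EXPECTED_TOPICS):
    """Return the expected topics absent from `ros2 topic list` output."""
    seen = {
        line.strip().lstrip("/")  # assumption: topics live in the root namespace
        for line in topic_list_output.splitlines()
        if line.strip()
    }
    return [topic for topic in expected if topic not in seen]


# Sample output, made up for illustration only.
sample = "/dpt/image\n/dpt/image_raw\n/dpt/pixels\n/parameter_events\n/rosout\n"
print(missing_topics(sample))  # ['dpt/detections', 'dpt/masks']
```

To check a live system, the text passed in could come from `subprocess.run(["ros2", "topic", "list"], capture_output=True, text=True).stdout`.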
For more details on how to run Docker images, see the official Docker documentation. For additional information on how ROS2 communicates with an external environment or between multiple Docker containers, see the official ROS2 docs.
This wrapper provides 4 optional parameters that modify the data coming out of the published topics as well as the dataset DPT utilizes for comparison. Most parameters can be modified during runtime. However, if you wish to use your own dataset, you must pass that parameter in before runtime. If you are unsure how to pass or update parameters before or during runtime, see the official ROS2 docs.
The supported optional parameters are:
Name | Type | Default | Use |
---|---|---|---|
pub_image | Boolean | True | Enable or disable the pub of the processed image (with the segmentation overlaid) |
pub_pixels | Boolean | True | Enable or disable the pub of the pixels with associated classification IDs (8-bit image stream) |
pub_detections | Boolean | True | Enable or disable the publishing of detections (whether or not to send back a string with all detections found) |
pub_masks | Boolean | True | Enable or disable the publishing of masks (whether or not to send back an image with all masks colorized by class) |
You do not need to specify any parameters, unless you wish to modify the defaults.
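
For changing parameters at runtime, the standard `ros2 param set <node> <parameter> <value>` command applies. The sketch below builds those commands from a dictionary of the parameters in the table above. The node name `/dpt` is an assumption made for illustration (check `ros2 node list` for the actual name), and `param_set_commands` is a hypothetical helper, not part of this repo.

```python
def param_set_commands(node_name, params):
    """Build one `ros2 param set` argument list per parameter."""
    commands = []
    for name, value in params.items():
        if isinstance(value, bool):
            # ROS2 parameter values are parsed as YAML, so use lowercase booleans.
            value_text = "true" if value else "false"
        else:
            value_text = str(value)
        commands.append(["ros2", "param", "set", node_name, name, value_text])
    return commands


# "/dpt" is an assumed node name; replace it with the one from `ros2 node list`.
for command in param_set_commands("/dpt", {"pub_image": False, "pub_masks": True}):
    print(" ".join(command))
# ros2 param set /dpt pub_image false
# ros2 param set /dpt pub_masks true
```

Each argument list can be passed directly to `subprocess.run` from an environment where ROS2 is sourced, such as inside the running container.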
Name | IO | Type | Use |
---|---|---|---|
dpt/image_raw | sub | sensor_msgs.msg.Image | Takes the raw camera output to be processed |
dpt/image | pub | sensor_msgs.msg.Image | Outputs the processed image with segmentation on top of the image |
dpt/pixels | pub | sensor_msgs.msg.Image | Outputs each pixel classified with the associated class ID as an 8-bit stream |
dpt/detections | pub | std_msgs.msg.String | Outputs all detected classes in the image |
dpt/masks | pub | sensor_msgs.msg.Image | Outputs the masks all in one image colorized based on class |
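
As an illustration of consuming `dpt/pixels`, the sketch below counts how many pixels carry each class ID. It works on the `height`, `width`, `step`, and `data` fields that every `sensor_msgs.msg.Image` carries, and it assumes a single-channel 8-bit layout (one byte per pixel, as in a `mono8` encoding). That layout is inferred from the "8-bit stream" description above rather than confirmed, so check the message's `encoding` field first. `class_histogram` is a hypothetical helper, not part of this repo.

```python
from collections import Counter


def class_histogram(height, width, step, data):
    """Count pixels per class ID in a single-channel 8-bit image buffer.

    `step` is the row length in bytes and may exceed `width` when rows
    are padded, so only the first `width` bytes of each row are counted.
    """
    counts = Counter()
    for row in range(height):
        start = row * step
        counts.update(data[start:start + width])
    return dict(counts)


# Toy 2x3 image with one byte of padding (255) at the end of each row.
data = bytes([0, 0, 5, 255,
              5, 5, 12, 255])
print(class_histogram(height=2, width=3, step=4, data=data))  # {0: 2, 5: 3, 12: 1}
```

Inside a subscriber callback this would be called as `class_histogram(msg.height, msg.width, msg.step, msg.data)`. Mapping class IDs to names depends on the dataset the model was trained on.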
To test and ensure that this package is properly installed, replace the Dockerfile in the root of this repo with the one in the demo folder. The demo image includes a camera stream emulator by klintan, which publishes images directly to the DPT node so that you can observe the processed outputs.
To run this, run `docker build . -t [name]`, then `docker run -t --net=host [name]`. Observing the logs will show you what is occurring within the container. If you wish to enter the running container and perform other activities, run `docker ps`, find the ID of the running container, then run `docker exec -it [containerId] /bin/bash`.