This example demonstrates how to run a policy on a real robot using an ALOHA setup. See here for instructions on how to load checkpoints and run inference. We list the relevant checkpoint paths for each provided fine-tuned model below.
This repo uses a fork of the ALOHA repo, with very minor modifications to use RealSense cameras.
- Follow the hardware installation instructions in the ALOHA repo.
- Modify the `third_party/aloha/aloha_scripts/realsense_publisher.py` file to use the serial numbers of your cameras.
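To find the serial numbers to enter, the connected cameras can be queried directly. The following is a minimal sketch, assuming the `pyrealsense2` Python bindings are installed. The helper names `list_realsense_serials` and `check_camera_serials` are hypothetical and are not part of `realsense_publisher.py`.

```python
from typing import Dict, List


def list_realsense_serials() -> List[str]:
    """Return the serial numbers of all connected RealSense devices."""
    # Imported lazily so the rest of this sketch works without the SDK installed.
    import pyrealsense2 as rs

    return [
        device.get_info(rs.camera_info.serial_number)
        for device in rs.context().query_devices()
    ]


def check_camera_serials(camera_serials: Dict[str, str]) -> None:
    """Raise ValueError if a serial is empty or assigned to two cameras."""
    seen: Dict[str, str] = {}
    for name, serial in camera_serials.items():
        if not serial:
            raise ValueError(f"camera {name!r} has no serial number")
        if serial in seen:
            raise ValueError(
                f"serial {serial!r} is used by both {seen[serial]!r} and {name!r}"
            )
        seen[serial] = name
```

Calling `check_camera_serials` on the name-to-serial mapping you intend to use (the camera names are whatever your setup defines) catches an empty or duplicated entry before the publisher is launched.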
export SERVER_ARGS="--env ALOHA --default_prompt='take the toast out of the toaster'"
docker compose -f examples/aloha_real/compose.yml up --build
Terminal window 1:
# Create virtual environment
uv venv --python 3.10 examples/aloha_real/.venv
source examples/aloha_real/.venv/bin/activate
uv pip sync examples/aloha_real/requirements.txt
uv pip install -e packages/openpi-client
# Run the robot
python examples/aloha_real/main.py
Terminal window 2:
roslaunch --wait aloha ros_nodes.launch
Terminal window 3:
uv run scripts/serve_policy.py --env ALOHA --default_prompt='take the toast out of the toaster'
The `pi0_base` model can be used zero-shot for a simple task on the ALOHA platform, and we additionally provide two example fine-tuned checkpoints, “fold the towel” and “open the tupperware and put the food on the plate,” which can perform more advanced tasks on the ALOHA platform.
While we’ve found the policies to work in unseen conditions across multiple ALOHA stations, we provide some pointers here on how best to set up scenes to maximize the chance of policy success. We cover the prompts to use for the policies, objects we’ve seen them work well on, and well-represented initial state distributions. Running these policies zero-shot is still a very experimental feature, and there is no guarantee that they will work on your robot. The recommended way to use `pi0_base` is by fine-tuning with data from the target robot.
This task involves the robot taking two pieces of toast out of a toaster and placing them on a plate.
- Checkpoint path: `s3://openpi-assets/checkpoints/pi0_base`
- Prompt: "take the toast out of the toaster"
- Objects needed: Two pieces of toast, a plate, and a standard toaster.
- Object Distribution:
- Works on both real toast and rubber fake toast
- Compatible with standard 2-slice toasters
- Works with plates of varying colors
![Screenshot 2025-01-31 at 10 06 02 PM](https://private-user-images.githubusercontent.com/23410440/408779204-3d043d95-9d1c-4dda-9991-e63cae61e02e.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkyNjY5NjgsIm5iZiI6MTczOTI2NjY2OCwicGF0aCI6Ii8yMzQxMDQ0MC80MDg3NzkyMDQtM2QwNDNkOTUtOWQxYy00ZGRhLTk5OTEtZTYzY2FlNjFlMDJlLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTElMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjExVDA5Mzc0OFomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTMzZDU0MDFhYzU1NTJmNTQyZWUyZGJmOTdkNzVkYTI1NDliYmY2NzlmODgxMDRmNzcyOGM2MmM1MDNiOWMzMTQmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.vIXDtUv2JUIZRktV6J114tfThtV98mtb3UuLrq3zfEM)
- The toaster should be positioned in the top-left quadrant of the workspace.
- Both pieces of toast should start inside the toaster, with at least 1 cm of bread sticking out from the top.
- The plate should be placed roughly in the lower-center of the workspace.
- Works with both natural and artificial lighting, but avoid making the scene too dark (e.g., don't place the setup inside an enclosed space or under a curtain).
This task involves folding a small towel (e.g., roughly the size of a hand towel) into eighths.
- Checkpoint path: `s3://openpi-assets/checkpoints/pi0_aloha_towel`
- Prompt: "fold the towel"
- Object Distribution:
- Works on towels of varying solid colors
- Performance is worse on heavily textured or striped towels
![Screenshot 2025-01-31 at 10 01 15 PM](https://private-user-images.githubusercontent.com/23410440/408778995-9410090c-467d-4a9c-ac76-96e5b4d00943.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkyNjY5NjgsIm5iZiI6MTczOTI2NjY2OCwicGF0aCI6Ii8yMzQxMDQ0MC80MDg3Nzg5OTUtOTQxMDA5MGMtNDY3ZC00YTljLWFjNzYtOTZlNWI0ZDAwOTQzLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTElMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjExVDA5Mzc0OFomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWZmYjdlMmNkMTQxMWIxNTY2YzRhN2ViYmZiZDFlYTNmMGMyODEwODMxYzU1ZmRmMzcyYjUwYTRmNWVkMDg2NWYmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.gST9og_EgYx8W1jwpQfMy81UrMgtx9oVEOdMhf3aOK8)
- The towel should be flattened and roughly centered on the table.
- Choose a towel that does not blend in with the table surface.
This task involves opening a tupperware filled with food and pouring the contents onto a plate.
- Checkpoint path: `s3://openpi-assets/checkpoints/pi0_aloha_tupperware`
- Prompt: "open the tupperware and put the food on the plate"
- Objects needed: Tupperware, food (or food-like items), and a plate.
- Object Distribution:
- Works on various types of fake food (e.g., fake chicken nuggets, fries, and fried chicken).
- Compatible with tupperware of different lid colors and shapes, with best performance on square tupperware with a corner flap (see images below).
- The policy has seen plates of varying solid colors.
![Screenshot 2025-01-31 at 10 02 27 PM](https://private-user-images.githubusercontent.com/23410440/408779042-60fc1de0-2d64-4076-b903-f427e5e9d1bf.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkyNjY5NjgsIm5iZiI6MTczOTI2NjY2OCwicGF0aCI6Ii8yMzQxMDQ0MC80MDg3NzkwNDItNjBmYzFkZTAtMmQ2NC00MDc2LWI5MDMtZjQyN2U1ZTlkMWJmLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTElMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjExVDA5Mzc0OFomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWMyNWZjMjI5MjE1OGQ3NmY2MmRhOGE1ZWU4NmMzNWVmYjc1YjA2YmRmOGIzNmY3YzIzMWFjZTVkOGY3NDA1MTQmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.FklY6xtkdsU5XjkIu8Kq6qiOB7YZTuOEby_5lDG7MJw)
- Best performance observed when both the tupperware and plate are roughly centered in the workspace.
- Positioning:
- Tupperware should be on the left.
- Plate should be on the right or bottom.
- The tupperware flap should point toward the plate.
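
The checkpoint paths and prompts for the three tasks above can be kept in one lookup table so the prompt passed to the policy server always matches the task. The following is a minimal sketch: the dictionary keys, the `TaskSetting` class, and the `default_prompt_args` helper are hypothetical names. It only builds the `--env` and `--default_prompt` part of the arguments shown earlier. How a fine-tuned checkpoint is selected is covered by the inference instructions and is not reproduced here.

```python
import shlex
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskSetting:
    checkpoint_path: str
    prompt: str


# Values copied from the task descriptions above; the keys are hypothetical labels.
TASKS = {
    "toast": TaskSetting(
        "s3://openpi-assets/checkpoints/pi0_base",
        "take the toast out of the toaster",
    ),
    "towel": TaskSetting(
        "s3://openpi-assets/checkpoints/pi0_aloha_towel",
        "fold the towel",
    ),
    "tupperware": TaskSetting(
        "s3://openpi-assets/checkpoints/pi0_aloha_tupperware",
        "open the tupperware and put the food on the plate",
    ),
}


def default_prompt_args(task: str) -> str:
    """Build the --env/--default_prompt argument string for a task."""
    setting = TASKS[task]
    # shlex.quote wraps the prompt in single quotes so the shell keeps it as one argument.
    return f"--env ALOHA --default_prompt={shlex.quote(setting.prompt)}"


print(default_prompt_args("toast"))
# prints: --env ALOHA --default_prompt='take the toast out of the toaster'
```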
- Convert the dataset to the LeRobot dataset v2.0 format.
  We provide a script `convert_aloha_data_to_lerobot.py` that converts the dataset to the LeRobot dataset v2.0 format. As an example, we have converted the `aloha_pen_uncap_diverse_raw` dataset from the BiPlay repo and uploaded it to the HuggingFace Hub as physical-intelligence/aloha_pen_uncap_diverse.
- Define a training config that uses the custom dataset.
  We provide the pi0_aloha_pen_uncap config as an example. You should refer to the root README for how to run training with the new config.
IMPORTANT: Our base checkpoint includes normalization stats from various common robot configurations. When fine-tuning a base checkpoint with a custom dataset from one of these configurations, we recommend using the corresponding normalization stats provided in the base checkpoint. In the example, this is done by specifying the trossen asset_id and a path to the pretrained checkpoint’s asset directory within the AssetsConfig.
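
The idea of pairing an asset ID with the pretrained checkpoint's asset directory can be illustrated with a self-contained sketch. `AssetsConfigSketch` below is a hypothetical stand-in, not the repository's `AssetsConfig`. Its field names, the `norm_stats_dir` helper, and the `assets` subfolder under the base checkpoint path are all assumptions made for illustration. Consult the `pi0_aloha_pen_uncap` config for the actual definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssetsConfigSketch:
    """Hypothetical stand-in for AssetsConfig; field names are assumptions."""

    assets_dir: str  # directory holding the pretrained checkpoint's assets
    asset_id: str  # which robot configuration's normalization stats to use


def norm_stats_dir(assets: AssetsConfigSketch) -> str:
    """Join the assets directory and asset ID into one location string."""
    return f"{assets.assets_dir.rstrip('/')}/{assets.asset_id}"


# Assumed layout: an "assets" folder directly under the base checkpoint path.
assets = AssetsConfigSketch(
    assets_dir="s3://openpi-assets/checkpoints/pi0_base/assets",
    asset_id="trossen",
)
print(norm_stats_dir(assets))
```

The point of the sketch is that the normalization stats are looked up by asset ID inside the base checkpoint's assets rather than recomputed from the custom dataset.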