Commit 0f96078: init

HeegerGao committed Dec 11, 2024 (1 parent: b5597dd)

Showing 1,236 changed files with 2,970,285 additions and 2 deletions.
200 changes: 200 additions & 0 deletions .gitignore
@@ -0,0 +1,200 @@
log
*.egg-info
wandb
data/

# pycharm configs
.idea/

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class


.DS_Store

# C extensions
*.so

# Distribution / packaging
.Python
develop-eggs/
dist/
eggs/
.eggs/
lib/
lib64/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3

# Flask stuff:
.webassets-cache

# Scrapy stuff:
.scrapy


# pyenv
.python-version

# SageMath parsed files
*.sage.py

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mypy
.mypy_cache/
*.pth

# Archives
*.zip

# celery beat schedule file
celerybeat-schedule

experiments/
results/
dynamics_results/
resnet50/
finetune_liv/
Meta-Llama-3.1-8B/
fine_tuned_vae_decoder/
output/
/models/
cvae_results/
policy_results/
policy_eval/
eval/
vis/
bar.png
cloth_policy/
cloth_video_output/
config/cloth/
config/tea/
scripts/train_cloth_policy.py
scripts/preprocess_data_cloth.py
scripts/test.py
flip/network/cloth_policy.py
flip/dataloader/cloth_policy_dataloader.py
config/libero_10/dp.yaml
config/libero_10/policy_eval.yaml
flip/dataloader/policy_dataloader.py
flip/network/policy.py
scripts/eval_policy.py
scripts/preprocess_data_sa.py
config/libero_10/preprocess_sa.yaml
scripts/train_policy.py
167 changes: 165 additions & 2 deletions README.md
@@ -1,2 +1,165 @@
# FLIP: Flow-Centric Generative Planning for General-Purpose Manipulation Tasks

Official Code Repository for **FLIP: Flow-Centric Generative Planning for General-Purpose Manipulation Tasks**.

[Chongkai Gao](https://chongkaigao.com/)<sup>1</sup>, [Haozhuo Zhang](https://haozhuo-zhang.github.io/)<sup>2</sup>, [Zhixuan Xu](https://ariszxxu.github.io/)<sup>1</sup>, Zhehao Cai<sup>1</sup>, [Lin Shao](https://linsats.github.io/)<sup>1</sup>

<sup>1</sup>National University of Singapore, <sup>2</sup>Peking University

<p align="center">
<a href='https://nus-lins-lab.github.io/flipweb/'>
<img src='https://img.shields.io/badge/Paper-arXiv-red?style=plastic&logo=arXiv&logoColor=red' alt='Paper arXiv'>
</a>
<a href='https://nus-lins-lab.github.io/flipweb/'>
<img src='https://img.shields.io/badge/Project-Page-66C0FF?style=plastic&logo=Google%20chrome&logoColor=66C0FF' alt='Project Page'>
</a>
</p>
<div align="center">
<img src="imgs/teaser.png" alt="main" width="95%">
</div>

In this paper, we present FLIP, a model-based planning algorithm in visual space that features three key modules: (1) a multi-modal flow generation model as the general-purpose action proposal module; (2) a flow-conditioned video generation model as the dynamics module; and (3) a vision-language representation learning model as the value module. Given an initial image and a language instruction as the goal, FLIP progressively searches for long-horizon flow and video plans that maximize the discounted return to accomplish the task. FLIP can synthesize long-horizon plans across objects, robots, and tasks with image flows as the general action representation, and the dense flow information also provides rich guidance for long-horizon video generation. In addition, the synthesized flow and video plans can guide the training of low-level control policies for robot execution.
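
At inference time, the three modules interact in a model-predictive search loop. Below is a minimal, hypothetical Python sketch of that loop; the module names and call signatures (`flow_model.sample`, `dynamics_model.predict`, `value_model.score`) are illustrative assumptions, not the repository's actual API:

```
# Hypothetical sketch of FLIP's flow-centric planning loop.
# All module names and signatures are illustrative assumptions.
def plan(image, instruction, flow_model, dynamics_model, value_model,
         horizon=10, num_candidates=8):
    plan_flows, plan_frames = [], [image]
    current = image
    for _ in range(horizon):
        best = None
        for _ in range(num_candidates):
            flow = flow_model.sample(current, instruction)  # action proposal
            frames = dynamics_model.predict(current, flow)  # video rollout
            value = value_model.score(frames, instruction)  # goal progress
            if best is None or value > best[2]:
                best = (flow, frames, value)
        flow, frames, _ = best
        plan_flows.append(flow)
        plan_frames.extend(frames)
        current = frames[-1]  # continue the search from the last predicted frame
    return plan_flows, plan_frames
```

The full method scores whole plans by their discounted return and keeps multiple hypotheses (hill climbing); the sketch keeps only the greedy step for brevity.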

## Installation

### 1. Create Python Environment

```
conda create -n flip python==3.8
conda activate flip
```

### 2. Install Dependencies

```
pip install -r requirements.txt
```

### 3. Download CoTracker V2 Checkpoint
```
cd flip/co_tracker
wget https://huggingface.co/facebook/cotracker/resolve/main/cotracker2.pth
```
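
To sanity-check the checkpoint, you can run the predictor on a dummy clip. The sketch below assumes the upstream `cotracker` package API (`CoTrackerPredictor`); the vendored copy under `flip/co_tracker` may expose it under a different import path:

```
import torch
from cotracker.predictor import CoTrackerPredictor

# Load the checkpoint downloaded above.
model = CoTrackerPredictor(checkpoint="cotracker2.pth")

# Dummy video: (batch, frames, channels, height, width), float pixel values.
video = torch.zeros(1, 16, 3, 128, 128)

# Track a regular grid of points across the clip.
pred_tracks, pred_visibility = model(video, grid_size=10)
print(pred_tracks.shape)      # (1, 16, num_points, 2): x/y per frame
print(pred_visibility.shape)  # (1, 16, num_points)
```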

### 4. Download Meta Llama 3.1 8B
1. Request download access at https://huggingface.co/meta-llama/Llama-3.1-8B.
2. Place the downloaded `Meta-Llama-3.1-8B` folder under `llama_models/`, so that you have a file structure like this:
```
...
- liv
- llama_models
- Meta-Llama-3.1-8B
- consolidated.00.pth
- params.json
- tokenizer.model
- scripts
...
```
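
To confirm the placement, a minimal check (file names taken from the layout above):

```
from pathlib import Path

root = Path("llama_models/Meta-Llama-3.1-8B")
required = ["consolidated.00.pth", "params.json", "tokenizer.model"]
missing = [name for name in required if not (root / name).exists()]
print("missing files:", missing or "none")
```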

### 5. Download LIV Pretrained Models

1. Download `model.pt` and `config.yaml` according to `https://github.com/penn-pal-lab/LIV/blob/main/liv/__init__.py#L33`.
2. `mkdir liv/resnet50`.
3. Put the `model.pt` and `config.yaml` under `liv/resnet50`. You should have a file structure like this:
```
...
- liv
- cfgs
- dataset
- examples
- models
- resnet50
- config.yaml
- model.pt
- utils
__init__.py
train_liv.py
trainer.py
- llama_models
...
```
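
Once the files are in place, you can check that the LIV package picks them up. The sketch below assumes the `load_liv` helper from the `liv/__init__.py` linked in step 1:

```
import torch
from liv import load_liv

# load_liv() reads the liv/resnet50/model.pt and config.yaml placed above.
liv = load_liv()
liv.eval()

# Embed a dummy RGB frame; LIV maps images and text into a shared space.
with torch.no_grad():
    frame = torch.zeros(1, 3, 224, 224)
    embedding = liv(input=frame, modality="vision")
print(embedding.shape)
```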

## Data Preparation

### 1. Download the LIBERO-LONG Dataset

1. `wget https://utexas.box.com/shared/static/cv73j8zschq8auh9npzt876fdc1akvmk.zip`

2. `mkdir data/libero_10`

3. Unzip the archive and put the ten LIBERO-10 HDF5 files into `data/libero_10`; a quick sanity check follows below.
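
To verify the download, you can list the contents of one of the HDF5 files. The sketch below only assumes `h5py` is installed; the filename is a placeholder for any of the ten files you unpacked:

```
import h5py

# Placeholder filename; use any of the unpacked LIBERO-10 files.
with h5py.File("data/libero_10/some_task_demo.hdf5", "r") as f:
    f.visit(print)  # print every group/dataset path in the file
```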

### 2. Replay

`python scripts/replay_libero_data_from_hdf5.py`

By default, the resolution is 128 $\times$ 128.

### 3. Flow Tracking

`python scripts/video_tracking.py`.

By default, we only track the agentview demos. To track the eye-in-hand demos, set `eye_in_hand` to `true` in `config/libero_10/tracking.yaml`.

### 4. Data Preprocessing

`python scripts/preprocess_data_to_hdf5.py`.

By default, we only preprocess the agentview demos. To preprocess the eye-in-hand demos, set `eye_in_hand` to `true` in `config/libero_10/preprocess.yaml`.


## Training

### 1. Train the Flow Generation Model (Action Module)

`torchrun --nnodes=1 --nproc_per_node=2 scripts/train_cvae.py`

You can change `config/libero_10/cvae.yaml` for custom training. The current config targets A100 40GB GPUs.
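
For orientation, the action module is a conditional VAE over image flows. The sketch below shows a generic CVAE training step (reconstruction plus KL); the `FlowCVAE` module, its dimensions, and the beta weight are placeholders, not the code in `scripts/train_cvae.py`:

```
import torch
import torch.nn as nn
import torch.nn.functional as F

class FlowCVAE(nn.Module):
    """Toy conditional VAE: reconstruct a flow given a context embedding."""
    def __init__(self, flow_dim=64, ctx_dim=512, latent_dim=32):
        super().__init__()
        self.enc = nn.Linear(flow_dim + ctx_dim, 2 * latent_dim)  # -> (mu, logvar)
        self.dec = nn.Linear(latent_dim + ctx_dim, flow_dim)

    def forward(self, flow, ctx):
        mu, logvar = self.enc(torch.cat([flow, ctx], -1)).chunk(2, -1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        return self.dec(torch.cat([z, ctx], -1)), mu, logvar

model = FlowCVAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

flow = torch.randn(8, 64)  # flattened dummy point flows
ctx = torch.randn(8, 512)  # dummy image + language embedding

recon, mu, logvar = model(flow, ctx)
kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
loss = F.mse_loss(recon, flow) + 1e-3 * kl  # beta-weighted ELBO
loss.backward()
opt.step()
```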

### 2. Train the Video Generation Model (Dynamics Module)

`torchrun --nnodes=1 --nproc_per_node=2 scripts/train_dynamics.py`

You can change `config/libero_10/dynamics.yaml` for custom training. The current config targets A100 40GB GPUs.
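
Conceptually, the dynamics module conditions video prediction on the generated flows. A generic way to express this conditioning is cross-attention from video latent tokens to flow tokens, as in the minimal sketch below; the block and its sizes are illustrative, not the architecture in `scripts/train_dynamics.py`:

```
import torch
import torch.nn as nn

class FlowConditionedBlock(nn.Module):
    """Toy block: video latent tokens attend to encoded flow tokens."""
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, video_tokens, flow_tokens):
        # Queries come from the video latents, keys/values from the flows.
        attended, _ = self.attn(video_tokens, flow_tokens, flow_tokens)
        return self.norm(video_tokens + attended)

block = FlowConditionedBlock()
video_tokens = torch.randn(2, 64, 256)  # dummy latent video patches
flow_tokens = torch.randn(2, 32, 256)   # dummy encoded flow trajectories
print(block(video_tokens, flow_tokens).shape)  # torch.Size([2, 64, 256])
```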

### 3. Finetune the LIV Model with Video Clips (Value Module)

`python scripts/finetune_liv.py`

This script first builds a LIV dataset and then trains on it.

You may adapt the configs in `config/libero_10/finetune_liv.yaml`, `liv/cfgs/dataset/libero_10.yaml`, and `liv/cfgs/training/finetune.yaml` to your own tasks.
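
As a reference for how a finetuned LIV model is typically used as a value function: score a frame by its similarity to the goal instruction in the shared embedding space. The sketch below assumes the `load_liv` interface from the installation step and the OpenAI `clip` tokenizer that LIV builds on; it is not the exact code in `scripts/finetune_liv.py`:

```
import clip
import torch
import torch.nn.functional as F
from liv import load_liv

liv = load_liv()
liv.eval()

with torch.no_grad():
    # Embed the goal instruction (LIV builds on CLIP's tokenizer).
    tokens = clip.tokenize(["put the moka pot on the stove"])
    goal_emb = liv(input=tokens, modality="text")

    # Embed a (dummy) current frame.
    frame = torch.zeros(1, 3, 224, 224)
    frame_emb = liv(input=frame, modality="vision")

# Higher cosine similarity = closer to the goal, usable as a value signal.
value = F.cosine_similarity(frame_emb, goal_emb).item()
print(value)
```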

### 4. Finetune the Pretrained VAE Encoder (for the Dynamics Module)

`torchrun --nnodes=1 --nproc_per_node=8 scripts/finetune_vae.py`

You can change `config/libero_10/finetune_vae.yaml` for custom training.
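
For orientation, finetuning a pretrained image VAE usually amounts to minimizing a reconstruction loss on your own frames. The sketch below uses the `diffusers` `AutoencoderKL` as a stand-in; the actual model and loss are defined by `scripts/finetune_vae.py` and `config/libero_10/finetune_vae.yaml`:

```
import torch
import torch.nn.functional as F
from diffusers import AutoencoderKL

# Stand-in pretrained VAE; the repo's checkpoint and config may differ.
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")
opt = torch.optim.Adam(vae.parameters(), lr=1e-5)

frames = torch.randn(2, 3, 128, 128)  # dummy robot frames scaled to [-1, 1]

posterior = vae.encode(frames).latent_dist
recon = vae.decode(posterior.sample()).sample
loss = F.mse_loss(recon, frames) + 1e-6 * posterior.kl().mean()
loss.backward()
opt.step()
```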


## Testing

1. `mkdir models/libero_10`

2. Put all the trained models (`agentview_dynamics.pt`, `cvae.pt`, `finetuned_vae.pt`, `reward.pt`) under `models/libero_10`; a quick loading check follows after this list.

3. `torchrun scripts/hill_climbing.py`
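
Before launching the search, you can confirm the checkpoints load (assuming standard PyTorch checkpoint files):

```
import torch
from pathlib import Path

root = Path("models/libero_10")
for name in ["agentview_dynamics", "cvae", "finetuned_vae", "reward"]:
    ckpt = torch.load(root / f"{name}.pt", map_location="cpu")
    print(f"{name}.pt loaded ({type(ckpt).__name__})")
```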


## Separate Testing of Action Module and Dynamics Module

1. Action Module: `python scripts/eval_cvae.py`
2. Dynamics Module: `scripts/train_dynamics.py` (the same script used for training)

## Citation

If you find our code or models useful in your work, please cite [our paper](https://nus-lins-lab.github.io/flipweb/):

```
TODO
```


## Contact

If you have any questions, feel free to contact me via email ([[email protected]]([email protected]))!