The implementation for MuJoCo environments is in mujoco/.
Acknowledgement
- This repo is based on Learning-Feasibility-Different-Dynamics [1].
- The contrastive clustering algorithm is based on Deep Clustering Network.
The implementation for the driving environment is in carlo/.
Acknowledgement
- This repo is based on https://github.com/Stanford-ILIAD/CARLO
The implementation for the simulated Franka Panda arm is in simulated_robot/.
Acknowledgement
- This repo is based on https://github.com/ku2482/gail-airl-ppo.pytorch
To log runs to your personal wandb account, export your wandb API key:
export WANDB_API_KEY=YOUR_WANDB_API_KEY
and run
wandb online
to turn on online synchronization.
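If a training script should fall back gracefully when no key has been exported, a small check like the one below may help. This is only a sketch and not part of this repo: the helper name `choose_wandb_mode` is hypothetical, and it relies on wandb reading the `WANDB_API_KEY` and `WANDB_MODE` environment variables.

```python
import os


def choose_wandb_mode(env=None):
    """Hypothetical helper: pick "online" if an API key is set, else "offline".

    `env` defaults to os.environ; a plain dict can be passed for testing.
    """
    env = os.environ if env is None else env
    key = env.get("WANDB_API_KEY", "").strip()
    return "online" if key else "offline"


def apply_wandb_mode():
    """Set WANDB_MODE only if the user has not already chosen a mode."""
    os.environ.setdefault("WANDB_MODE", choose_wandb_mode())
    return os.environ["WANDB_MODE"]
```

Calling `apply_wandb_mode()` before wandb is initialized would leave an explicitly exported `WANDB_MODE` untouched and otherwise select offline logging when no key is present.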
[1] Z. Cao, Y. Hao, M. Li, and D. Sadigh. Learning feasibility to imitate demonstrators with different dynamics. In CoRL, 2021.
[2] Schulman, John, et al. "Proximal policy optimization algorithms." arXiv preprint arXiv:1707.06347 (2017).
[3] Ho, Jonathan, and Stefano Ermon. "Generative adversarial imitation learning." Advances in neural information processing systems. 2016.
[4] Fu, Justin, Katie Luo, and Sergey Levine. "Learning robust rewards with adversarial inverse reinforcement learning." arXiv preprint arXiv:1710.11248 (2017).