The implementation for MuJoCo environments is in mujoco/.
- This repo is based on Learning-Feasibility-Different-Dynamics [1].
- The contrastive clustering algorithm is based on Deep Clustering Network.
The implementation for the driving environment is in carlo/.
- This repo is based on
The implementation for the simulated Franka Panda arm is in simulated_robot/.
- This repo is based on
To log runs to your personal wandb account, export your own wandb API key
and run
wandb online
to turn on online synchronization.
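The key export mentioned above might look like the following sketch. It assumes the standard `WANDB_API_KEY` environment variable that the wandb client reads for authentication; the value shown is a placeholder, not a real key.

```shell
# WANDB_API_KEY is the environment variable the wandb client reads for authentication.
# The value below is a placeholder (an assumption for illustration);
# substitute the API key from your own wandb account settings.
export WANDB_API_KEY="your_api_key_here"
```

Exporting the key in the shell (or in a shell profile) keeps it out of the training scripts, so it does not end up committed to the repository.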
[1] Cao, Z., Y. Hao, M. Li, and D. Sadigh. "Learning feasibility to imitate demonstrators with different dynamics." CoRL. 2021.
[2] Schulman, John, et al. "Proximal policy optimization algorithms." arXiv preprint arXiv:1707.06347 (2017).
[3] Ho, Jonathan, and Stefano Ermon. "Generative adversarial imitation learning." Advances in neural information processing systems. 2016.
[4] Fu, Justin, Katie Luo, and Sergey Levine. "Learning robust rewards with adversarial inverse reinforcement learning." arXiv preprint arXiv:1710.11248 (2017).