pytorch-a2c-ppo-acktr

Please use the hyperparameters from this README. With other hyperparameters, things might not work (it's RL after all)!

Original repository - Link

This is a PyTorch implementation of

  • Advantage Actor Critic (A2C), a synchronous deterministic version of A3C
  • Proximal Policy Optimization (PPO)
  • Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR)
  • Generative Adversarial Imitation Learning (GAIL)
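For readers new to PPO, the clipped surrogate objective it maximizes can be sketched for a single sample as below. This is a minimal illustration of the published PPO objective, not this repository's code; the function name `ppo_clipped_objective` and the default `clip_param=0.2` are assumptions chosen for the example.

```python
import math


def ppo_clipped_objective(new_log_prob, old_log_prob, advantage, clip_param=0.2):
    """Clipped surrogate value for one (state, action) sample.

    Hypothetical helper for illustration only; clip_param=0.2 is an assumed value.
    """
    # Probability ratio between the updated policy and the policy that collected the data.
    ratio = math.exp(new_log_prob - old_log_prob)
    unclipped = ratio * advantage
    # Restrict the ratio to [1 - clip_param, 1 + clip_param].
    clipped_ratio = max(1.0 - clip_param, min(ratio, 1.0 + clip_param))
    clipped = clipped_ratio * advantage
    # Taking the minimum gives a pessimistic bound, so large policy changes are not rewarded.
    return min(unclipped, clipped)


# A ratio of 1.5 with a positive advantage is capped at 1 + clip_param.
positive_case = ppo_clipped_objective(math.log(1.5), 0.0, advantage=1.0)
# A ratio of 0.5 with a negative advantage is held at 1 - clip_param.
negative_case = ppo_clipped_objective(math.log(0.5), 0.0, advantage=-1.0)
```

In training, the negative of this quantity, averaged over a mini-batch, would be used as the policy loss.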

Requirements

To install the requirements, run:

# PyTorch
conda install pytorch torchvision -c soumith

# Baselines for Atari preprocessing
git clone https://github.com/openai/baselines.git
cd baselines
pip install -e .

# Other requirements
pip install -r requirements.txt

Visualization

To visualize the results, use visualize.ipynb.

Training

PPO Single column

python main.py --env-name "PongNoFrameskip-v4" --use-pnn --use-gae --num-processes 8 --num-steps 128 --num-mini-batch 4 --use-linear-lr-decay
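To make the flags above more concrete, here is a rough sketch of the arithmetic they imply. It assumes the rollout of `num_processes × num_steps` transitions is split evenly into `num_mini_batch` parts, and that `--use-linear-lr-decay` anneals the learning rate linearly to zero over training. Both are common conventions rather than details confirmed by this README, and the helper names and the learning rate `1e-3` are hypothetical.

```python
def minibatch_size(num_processes, num_steps, num_mini_batch):
    """Transitions per mini-batch, assuming an even split of one rollout."""
    batch_size = num_processes * num_steps  # transitions collected per update
    return batch_size // num_mini_batch


def linear_lr(initial_lr, update_idx, total_updates):
    """Learning rate after update_idx of total_updates, decayed linearly to zero."""
    return initial_lr * (1.0 - update_idx / float(total_updates))


# Values taken from the command above: 8 processes, 128 steps, 4 mini-batches.
size = minibatch_size(num_processes=8, num_steps=128, num_mini_batch=4)
print(size)  # 256

# Halfway through training the rate is half of its (assumed) starting value.
halfway_lr = linear_lr(initial_lr=1e-3, update_idx=50, total_updates=100)
```

Under these assumptions each update sees 1024 transitions, processed as 4 mini-batches of 256.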

Progressive neural network with 2 columns

python main.py --env-name "PongNoFrameskip-v4" --use-pnn --n-columns 2 --pnn-paths "path_to_trained_model_from_previous_runs" --use-gae --num-processes 8 --num-steps 128 --num-mini-batch 4 --use-linear-lr-decay
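The idea behind a second column can be sketched in a few lines of plain Python. In a progressive network, a new column receives lateral connections from the hidden activations of previously trained columns, which stay frozen. The code below is a toy illustration of that general scheme and not this repository's model; the weight names `W1`, `W2`, `U2` and all numeric values are made up for the example.

```python
def relu(v):
    return [max(0.0, x) for x in v]


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def add(a, b):
    return [x + y for x, y in zip(a, b)]


def pnn_forward(x, col1, col2):
    """Output of column 2, which reuses column 1's hidden features."""
    # Column 1 was trained earlier; its weights are treated as frozen.
    h1_col1 = relu(matvec(col1["W1"], x))
    # The first layer of column 2 only sees the input.
    h1_col2 = relu(matvec(col2["W1"], x))
    # The second layer of column 2 combines its own features (W2)
    # with a lateral connection from column 1's features (U2).
    return add(matvec(col2["W2"], h1_col2), matvec(col2["U2"], h1_col1))


# Toy weights, hypothetical values for illustration only.
col1 = {"W1": [[1.0, 0.0], [0.0, 1.0]]}
col2 = {
    "W1": [[0.0, -1.0], [1.0, 1.0]],
    "W2": [[2.0, 0.0]],
    "U2": [[3.0, 5.0]],
}
out = pnn_forward([1.0, -1.0], col1, col2)
```

Setting `U2` to zeros removes the lateral path, so the output then depends on column 2 alone.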

MiniGrid environments are also supported. Pass 'MiniGrid-xyz' (replace this with the environment's actual name) as the argument to --env-name.