
Udacity's Deep RL Nanodegree - Project 3: Collaboration and Competition

Introduction

This repository contains my solution for the third project of the Deep Reinforcement Learning Nanodegree from Udacity. In this exercise, two RL agents control rackets to bounce a ball over a net, somewhat like playing ping-pong. The environment was built with Unity's ML-Agents framework.

[Animation: Trained Agent]

Environment Definition

Rewards

  • If an agent hits the ball over the net, it receives a reward of +0.1.
  • If an agent lets a ball hit the ground or hits the ball out of bounds, it receives a reward of -0.01.
  • No rewards are provided on a per-time-step basis.

State Space

The observation space consists of 8 variables corresponding to the position and velocity of the ball and racket. The environment returns 3 stacked observations at each time step, so the returned vector has dimension 24. The 8 variables are:
[Racket Pos X, Racket Pos Y, Racket Vel X, Racket Vel Y, Ball Pos X, Ball Pos Y, Racket Vel X, Racket Vel Y]

(For some reason, the last two elements appear to be a repeat of the third and fourth elements.)
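
A quick way to confirm the stacking is to print one raw observation. Below is a minimal sketch, assuming the unityagents package used throughout this Nanodegree; the file_name is an assumption for a Linux build, so adjust it to the build you decompressed (see Getting Started below).

from unityagents import UnityEnvironment

# file_name is an assumption; point it at the decompressed Unity build.
env = UnityEnvironment(file_name="Tennis_Linux/Tennis.x86_64")
brain_name = env.brain_names[0]            # one brain controls both rackets

env_info = env.reset(train_mode=True)[brain_name]
states = env_info.vector_observations      # shape (2, 24): 2 agents

print(states.shape)                        # 3 stacked frames x 8 variables
print(states[0, -8:])                      # most recent frame, first agent
env.close()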

Action Space

Two continuous actions are available, corresponding to movement toward (or away from) the net, and jumping. The action vector looks like this:

[Racket Movement, Racket Jump]

  • Racket Movement: positive values move the racket at a constant speed towards the net; negative values move it away from the net.
  • Racket Jump: values larger than 0.5 trigger a jump; values less than or equal to 0.5 do nothing.

Every entry in the action vector should be a number between -1 and +1.
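
To see this convention in practice, the snippet below steps the environment once with random actions. It is a minimal sketch that assumes env and brain_name were created as in the observation example above.

import numpy as np

num_agents = 2                             # one action vector per racket
action_size = 2                            # [Racket Movement, Racket Jump]

# Sample random actions and clip every entry to the valid [-1, +1] range.
actions = np.clip(np.random.randn(num_agents, action_size), -1, 1)

env_info = env.step(actions)[brain_name]   # both actions are sent at once
rewards = env_info.rewards                 # list with one reward per agent
dones = env_info.local_done                # True entries mark episode end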

Task Goal

The goal of each agent is to keep the ball in play for as many time steps as possible. The environment is considered solved when the agents achieve an average score of +0.5 over 100 consecutive episodes, where the score of an episode is the maximum of the two agents' scores.
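
This scoring rule translates directly into code. The sketch below shows the solved-environment check; the function and variable names are illustrative, not taken from the notebook.

from collections import deque
import numpy as np

scores_window = deque(maxlen=100)          # scores of the last 100 episodes

def record_episode(agent_scores):
    # agent_scores: total reward accumulated by each of the 2 agents.
    episode_score = np.max(agent_scores)   # episode score = max over agents
    scores_window.append(episode_score)
    # Solved once the 100-episode moving average reaches +0.5.
    return len(scores_window) == 100 and np.mean(scores_window) >= 0.5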

Getting Started

Most of these instructions were borrowed from the installation instructions in Udacity's Deep Reinforcement Learning Repository.

Prepare the Anaconda Environment

  1. Create (and activate) a new environment with Python 3.6.

    • Linux or Mac:
    conda create --name drlnd python=3.6
    source activate drlnd
    • Windows:
    conda create --name drlnd python=3.6 
    activate drlnd
  2. Clone the repository (if you haven't already!), and navigate to the python/ folder. Then, install several dependencies.

git clone https://github.com/SaidAlvarado/Udacity-DeepRL-Project_3_Collaboration_and_Competition.git
cd Udacity-DeepRL-Project_3_Collaboration_and_Competition/python
pip install .
  3. Create an IPython kernel for the drlnd environment.
python -m ipykernel install --user --name drlnd --display-name "drlnd"
  4. Before running code in the notebook, change the kernel to match the drlnd environment by using the drop-down Kernel menu.


Download the Unity Environment

  1. Download the environment from one of the links below. You need only select the environment that matches your operating system:

    • Linux: https://s3-us-west-1.amazonaws.com/udacity-drlnd/P3/Tennis/Tennis_Linux.zip
    • Mac OSX: https://s3-us-west-1.amazonaws.com/udacity-drlnd/P3/Tennis/Tennis.app.zip
    • Windows (32-bit): https://s3-us-west-1.amazonaws.com/udacity-drlnd/P3/Tennis/Tennis_Windows_x86.zip
    • Windows (64-bit): https://s3-us-west-1.amazonaws.com/udacity-drlnd/P3/Tennis/Tennis_Windows_x86_64.zip

  2. Decompress the file into the root directory of the repository.
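
To verify the download, the environment can be loaded and queried from Python. A minimal sketch, again assuming a Linux build name; swap file_name for e.g. Tennis.app on macOS or Tennis_Windows_x86_64/Tennis.exe on Windows.

from unityagents import UnityEnvironment

env = UnityEnvironment(file_name="Tennis_Linux/Tennis.x86_64")
brain_name = env.brain_names[0]
brain = env.brains[brain_name]

env_info = env.reset(train_mode=True)[brain_name]
print("Number of agents:", len(env_info.agents))            # 2
print("Action size:", brain.vector_action_space_size)       # 2
print("State shape:", env_info.vector_observations.shape)   # (2, 24)
env.close()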

Run the Code

  1. Follow the instructions in DDPG_Collaboration_and_Competition.ipynb to train and see the agent in action.
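
For example, from the repository root with the drlnd environment active:

jupyter notebook DDPG_Collaboration_and_Competition.ipynb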

Understanding the Algorithm

For more information regarding the algorithm used to solve this environment, please refer to the technical report Report.md included in the repository.
