The goal here is to use the tiny-dnn library for reinforcement learning. The environment is handled by OpenAI Gym, and the interface is the OpenAI Gym HTTP server. The agent communicates with the environment through the C++ OpenAI Gym HTTP client.
The agent currently learns and performs much better than a random agent, but its performance plateaus at a merely decent level, and I'm not sure why.
- Python (tested on 3.6.3)
- OpenAI's gym
- a C++ compiler
- Boost (needed by the gym HTTP C++ client)
The code for the gym HTTP client in this repo is (somewhat heavily) modified from here.
I used the code from PyTorch's REINFORCE example as a template for the agent in this repo.
- Ensure you have Python on your machine and can run the OpenAI Gym HTTP server. Directions HERE.
- you can use URLs in your browser to test that the gym server is running; with the server's default settings, opening http://127.0.0.1:5000/v1/envs/ should return a JSON list of the active environment instances
- Modify the Makefile to suit your system and compiler; you will need to update the include and library paths (see the sketch after this list).
- If make runs cleanly and the compiled 'agent' binary is created, you are golden.
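For reference, a minimal Makefile might look something like the sketch below. Every path here is a placeholder, `agent.cpp` stands in for the repo's actual source file(s), and the Boost link line is an assumption (tiny-dnn itself is header-only and needs C++14); adjust all of it to your system.

```make
# All paths are placeholders -- point them at your own installs.
TINY_DNN := /path/to/tiny-dnn
BOOST    := /path/to/boost

CXX      := g++
CXXFLAGS := -std=c++14 -O2 -I$(TINY_DNN) -I$(BOOST)/include
LDFLAGS  := -L$(BOOST)/lib -lboost_system -lpthread   # assumed; match your Boost build

agent: agent.cpp
	$(CXX) $(CXXFLAGS) -o $@ $^ $(LDFLAGS)
```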
Once you run the agent, you should see the gym render the simulation; at least, that is the case on a Mac.
The part I'm least confident about is the desired_outs section, which corresponds to this method.
Forever do
    state = env.reset()
    states = actions = rewards = []
    while in_episode?
        action_probs = net.predict(state)             // softmax output (probabilities sum to 1)
        action = weighted_random_choice(action_probs) // action is 0 or 1, for left or right
        states.push(state); actions.push(action)      // store the pre-step state and the action taken in it
        state, reward, in_episode? = env.step(action) // reward is always 1
        rewards.push(reward)
    rewards = normalize(rewards)                      // mean = 0, std = 1
    desired_outs = one_hot_using_rewards(actions, rewards) // e.g. action 1, reward 1.3 => {0, 1.3}; action 0, reward 0.7 => {0.7, 0}
    for (int i = 0; i < actions.size; i++)
        net.train(states[i], desired_outs[i])
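To make the sampling step concrete, here is a minimal C++ sketch of one way to implement weighted_random_choice() using std::discrete_distribution. The function name comes from the pseudocode above, but the body is my illustration, not necessarily what the repo does.

```cpp
#include <random>
#include "tiny_dnn/tiny_dnn.h"

// Sample an action index with probability proportional to the network's
// softmax output (e.g. {0.7, 0.3} picks action 0 about 70% of the time).
int weighted_random_choice(const tiny_dnn::vec_t& probs) {
  static std::mt19937 rng{std::random_device{}()};
  std::discrete_distribution<int> dist(probs.begin(), probs.end());
  return dist(rng);  // 0 or 1 for the left/right actions
}
```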
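And here is a rough sketch of the normalize / one_hot_using_rewards / training steps. The helper bodies are mine, and I'm assuming tiny-dnn's fit<mse> with an adam optimizer as a stand-in for the per-sample net.train() loop; treat it as an illustration of the idea rather than the repo's actual code.

```cpp
#include <cmath>
#include <vector>
#include "tiny_dnn/tiny_dnn.h"

using tiny_dnn::vec_t;

// normalize(): shift/scale the episode's rewards to mean 0, std 1.
std::vector<float> normalize(std::vector<float> rewards) {
  float mean = 0.0f;
  for (float r : rewards) mean += r;
  mean /= rewards.size();
  float var = 0.0f;
  for (float r : rewards) var += (r - mean) * (r - mean);
  float stddev = std::sqrt(var / rewards.size()) + 1e-8f;  // guard against std = 0
  for (float& r : rewards) r = (r - mean) / stddev;
  return rewards;
}

// one_hot_using_rewards(): the chosen action's slot gets that step's
// normalized reward, the other slot stays 0.
std::vector<vec_t> one_hot_using_rewards(const std::vector<int>& actions,
                                         const std::vector<float>& rewards) {
  std::vector<vec_t> outs;
  outs.reserve(actions.size());
  for (size_t i = 0; i < actions.size(); ++i) {
    vec_t target(2, 0.0f);          // two outputs: left / right
    target[actions[i]] = rewards[i];
    outs.push_back(target);
  }
  return outs;
}

// One training pass over a finished episode.
void train_on_episode(tiny_dnn::network<tiny_dnn::sequential>& net,
                      const std::vector<vec_t>& states,
                      const std::vector<int>& actions,
                      std::vector<float> rewards) {
  rewards = normalize(rewards);
  std::vector<vec_t> desired_outs = one_hot_using_rewards(actions, rewards);
  tiny_dnn::adam opt;
  net.fit<tiny_dnn::mse>(opt, states, desired_outs, /*batch_size=*/1, /*epochs=*/1);
}
```

With batch_size 1 and a single epoch, fit ends up taking one gradient step per timestep, which matches the per-sample net.train() loop in the pseudocode.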