(Work in progress) readme
The source code and data for Twitch Plays Robotics (TPR), a crowdsourcing project to teach robots natural language. This project is funded by the National Science Foundation and is run by a research team at the University of Vermont.
For our experiment we used a slightly modified version of Pyrosim (https://github.com/jbongard/pyrosim), a Python wrapper around the Open Dynamics Engine (ODE).
To install Pyrosim, open a terminal window and navigate into the pyrosim directory. For example:
$ cd ~/Desktop/TPR-1.0/pyrosim
Then run:
$ make
- This script displays a window prompting users to type in reinforcement for the robot currently being displayed.
- It also maintains the primary population of robots and evolves the robots in that population.
This script generates a secondary population of robots that feeds into the primary one. To do so, it uses a hill climbing approach and novelty search to create diverse behaviours.
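As a rough illustration of how hill climbing can be combined with novelty search, here is a minimal sketch. It is not the project's implementation: the function names, the parameters (`k`, `sigma`, `steps`), the representation of a genome as a list of floats, and the idea that `evaluate` returns a numeric behaviour vector are all assumptions made for the example.

```python
import math
import random


def novelty(behaviour, archive, k=3):
    """Mean Euclidean distance to the k nearest behaviours in the archive."""
    if not archive:
        return float("inf")
    distances = sorted(math.dist(behaviour, other) for other in archive)
    nearest = distances[:k]
    return sum(nearest) / len(nearest)


def mutate(genome, rng, sigma=0.1):
    """Copy the genome and perturb one randomly chosen gene with Gaussian noise."""
    child = list(genome)
    index = rng.randrange(len(child))
    child[index] += rng.gauss(0.0, sigma)
    return child


def novelty_hill_climb(genome, evaluate, steps=100, seed=0):
    """Hill climber that keeps a mutant only if it is at least as novel as its parent."""
    rng = random.Random(seed)
    archive = []  # behaviours of every individual that was replaced or rejected
    parent_behaviour = evaluate(genome)
    for _ in range(steps):
        child = mutate(genome, rng)
        child_behaviour = evaluate(child)
        if novelty(child_behaviour, archive) >= novelty(parent_behaviour, archive):
            # The child replaces the parent; the old parent joins the archive.
            archive.append(parent_behaviour)
            genome, parent_behaviour = child, child_behaviour
        else:
            archive.append(child_behaviour)
    return genome, archive
```

In this sketch the hill climber optimises novelty rather than task performance, which is one simple way to push a population towards diverse behaviours.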
This script listens to a Twitch channel and records every incoming message in the chat table.
This script reads all the unprocessed messages from the chat table and moves each one into the appropriate table.
This script picks an unprocessed help message from the help table and sends a response to the user in the chat session.
This script wakes up every 10 seconds and updates the scores of all users and commands.
This script displays a window prompting users to vote for the next command.
This script displays a window showing the top 5 commands by score (learnability).
This script displays a window showing the top 5 users by score.
https://github.com/zmahoor/TPR-minimal helps you get started with using a Twitch server to broadcast and to send and receive messages on a Twitch channel.
Each robot was displayed under a given command and a color on the broadcasting computer for 30 seconds. This 30 second period is called a robot evaluation. Sensor data for a robot evaluated at a specific time is stored in a file named "robot_id_Year-month-day-hour-minute-second.dat". In the file name, id represents the robot's id, and "Year-month-day-hour-minute-second" shows the start time of the evaluation.
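The file naming scheme above can be parsed with a short helper. This is a sketch under assumptions: the function name `parse_sensor_filename` is hypothetical, the robot id is assumed to be an integer, and the time stamp is assumed to use a four-digit year. The file name used in the usage note is made up for illustration.

```python
import os
import re
from datetime import datetime

# Matches "robot_<id>_<Year-month-day-hour-minute-second>.dat".
FILENAME_PATTERN = re.compile(r"^robot_(\d+)_(\d+-\d+-\d+-\d+-\d+-\d+)\.dat$")


def parse_sensor_filename(path):
    """Return (robot_id, start_time) parsed from a sensor data file name."""
    filename = os.path.basename(path)
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        raise ValueError("unexpected file name: %r" % filename)
    robot_id = int(match.group(1))
    start_time = datetime.strptime(match.group(2), "%Y-%m-%d-%H-%M-%S")
    return robot_id, start_time
```

For example, `parse_sensor_filename("robot_12_2018-03-05-14-30-07.dat")` would return the robot id 12 and the evaluation start time 14:30:07 on 5 March 2018.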
Every sensor data file contains a pickled Python dictionary. Each entry holds the values of one sensor over the robot's evaluation period: the key is a string that encodes the sensor type, and the value is a NumPy array of length 1800 containing the corresponding sensor readings. The keys are interpreted as follows.
A key starting with 'T' denotes a touch sensor; a key starting with 'P' and ending with 'X', 'Y', or 'Z' denotes a position sensor; a key starting with 'R' denotes a ray (distance) sensor; and a key starting with 'P' and ending with an integer denotes a proprioceptive (joint) sensor.
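The key rules above can be applied when loading a file, for example to group the sensor arrays by type. The following is a minimal sketch: the function names are hypothetical, and `encoding="latin1"` is an assumption that only matters if the files were pickled under Python 2. NumPy must be installed for the arrays to unpickle.

```python
import pickle


def classify_sensor_key(key):
    """Map a dictionary key to a sensor type using the naming rules above."""
    if key.startswith("T"):
        return "touch"
    if key.startswith("R"):
        return "ray"
    if key.startswith("P"):
        if key[-1] in ("X", "Y", "Z"):
            return "position"
        if key[-1].isdigit():
            return "proprioceptive"
    return "unknown"


def load_sensor_data(path):
    """Unpickle one sensor data file and group its entries by sensor type."""
    with open(path, "rb") as handle:
        # Assumption: latin1 lets Python 3 read pickles written by Python 2.
        data = pickle.load(handle, encoding="latin1")
    grouped = {}
    for key, values in data.items():
        grouped.setdefault(classify_sensor_key(key), {})[key] = values
    return grouped
```

Calling `load_sensor_data` on a data file returns a dictionary such as `{"touch": {...}, "position": {...}, ...}`, where each inner dictionary maps the original keys to their sensor arrays.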
The controller of each robot displayed during the experiment is stored in this directory. This directory contains 10 subdirectories (one subdirectory per robot type). The name of a subdirectory maps to a robot type as follows: 1: twigbot, 2: stickbot, 3: branchbot, 4: treebot, shinbot: tablebot, starfishbot: starfishbot, crabbot: crabbot, quadruped: quadruped, snakebot: snakebot, spherebot: spherebot.
We used MySQL to store and retrieve information about users, incoming messages, and displayed robots in our experiments. The following sections explain the general schemas of the tables in our database.
- chatID: The message number
- timeArrival: What time the message was stored
- username: Who typed it
- txt: What was in the message
- processed: Was it organized by library_bot? (0 for no, 1 for yes)
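To show how the processed flag can be used, here is a small sketch that fetches unprocessed messages and marks them as handled. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server. The table name `chat`, the column types, the helper names, and the sample row are assumptions. With a MySQL driver the placeholder style would typically be `%s` rather than `?`.

```python
import sqlite3


def fetch_unprocessed(connection):
    """Return (chatID, username, txt) for unprocessed messages, in chatID order."""
    cursor = connection.execute(
        "SELECT chatID, username, txt FROM chat WHERE processed = 0 ORDER BY chatID"
    )
    return cursor.fetchall()


def mark_processed(connection, chat_id):
    """Set the processed flag of one message to 1."""
    connection.execute("UPDATE chat SET processed = 1 WHERE chatID = ?", (chat_id,))
    connection.commit()


# Build a throwaway in-memory table with the columns listed above.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE chat (chatID INTEGER PRIMARY KEY, timeArrival TEXT,"
    " username TEXT, txt TEXT, processed INTEGER DEFAULT 0)"
)
connection.execute(
    "INSERT INTO chat (timeArrival, username, txt) VALUES (?, ?, ?)",
    ("2018-03-05 14:30:07", "example_user", "hello"),
)

# Handle each pending message, then flag it so it is not picked up again.
for chat_id, username, txt in fetch_unprocessed(connection):
    mark_processed(connection, chat_id)
```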
- cmdLogID: The command order
- userName: Who typed the message
- cmdTxt: What the command is
- timeArrival: When it was stored
- processed: Flag put something here
- animationFlag: Flag put something here
- displayID: Display order (unique number identifying each evaluation)
- robotID: What robot (robot table.1)
- cmdTxt: The command input
- color: What color the robot was displayed as
- startTime: When it started evaluation
- numYes: Number of positive reinforcements
- numNo: Number of negative reinforcements
- numLike: Number of likes
- numDislike: Number of dislikes
- helpID: Help message number (unique number)
- txt: What the message was
- userName: Who asked for help
- timeArrival: When the message was stored
- processed: Did help_bot respond? (0 for no, 1 for yes)
- rewardLogID: Reward number (unique number)
- userName: Who typed the reward message
- reward: What it was (Y/N/L/D)
- color: What robot it was given to
- timeArrival: When it was received
- processed: Flag put something here
- displayID: the evaluation id that this reward was given for
- robotID: What number robot it is (unique number for each robot)
- type: The morphology
- numEvals: How many times it was evaluated
- dead: Is the robot dead (0 for no, 1 for yes)
- totalFitness: Robot's fitness
- totalLikeability: Robot's likeability
- birthDate: When the robot was created
- parentID: Robot it was mutated from (0 for randomly generated)
- deathDate: When the robot died
- sumYes: total yes's this robot has received
- sumNo: total no's this robot has received
- sumLike: total likes this robot has received
- sumDislike: total dislikes this robot has received
- cmdTxt: What the command is
- timeAdded: When it was first typed
- wordToVec: [-1,1] using wordToVec algorithm
- totalLearnability: How obedient the robots are to the command
- active: Are the robots currently acting on it (0 for no, 1 for yes)
- ID: a unique number assigned to each user
- userName: The twitch username
- timeAdded: When they first typed a message
- parentName: Who referred them
- score: The user's score (sum of all messages starting with a !)
- ban: Banned from the channel (0 for no, 1 for yes)