Add munching
tasuki committed Mar 6, 2018
1 parent a3a0a6e commit 671c970
Showing 4 changed files with 44 additions and 0 deletions.
3 changes: 3 additions & 0 deletions munch/README.md
@@ -0,0 +1,3 @@
Reads the game records from JSON, chooses which ones to use, and splits them into training, validation, and test sets.

Just cd here and run `bash process_games.sh`
8 changes: 8 additions & 0 deletions munch/get_records.py
@@ -0,0 +1,8 @@
import json
import sys

fname = sys.argv[1]
data = json.load(open(fname))

for game in data:
    print(game["record"])
12 changes: 12 additions & 0 deletions munch/process_games.sh
@@ -0,0 +1,12 @@
#!/bin/bash

DATADIR="../data/"

# get game records
python get_records.py "$DATADIR"/bga-games-info.json > "$DATADIR"/records-all.txt

# remove rows that appear more than once
cat "$DATADIR"/records-all.txt | sort | uniq -u > "$DATADIR"/records-clean.txt

# create train/valid/test data sets
python split_records.py "$DATADIR"
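
Note on the deduplication step: `sort | uniq -u` keeps only the rows that occur exactly once, so every copy of a duplicated record is dropped rather than collapsed to a single row. A minimal Python sketch of the same filter follows. The helper name `keep_unique` and the sample records are made up for illustration and are not part of this commit; Python's `sorted` orders by code point, which may differ from the locale-dependent order of `sort`.

```python
from collections import Counter


def keep_unique(lines):
    """Return, in sorted order, the lines that occur exactly once.

    Hypothetical helper mirroring `sort | uniq -u`: a line that appears
    two or more times is removed entirely, not reduced to one copy.
    """
    counts = Counter(lines)
    return sorted(line for line, n in counts.items() if n == 1)


# Made-up sample records; the real record format is not shown in the diff.
records = ["b2 c3", "a1 b2", "b2 c3", "d4 e5"]
print(keep_unique(records))  # ['a1 b2', 'd4 e5']
```

If the intent were to keep one copy of each duplicated record instead, plain `uniq` (or `sort -u`) would be the shell equivalent.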
21 changes: 21 additions & 0 deletions munch/split_records.py
@@ -0,0 +1,21 @@
import sys

data_dir = sys.argv[1]

train = open(data_dir + "/records-1-train.txt", "w")
valid = open(data_dir + "/records-2-valid.txt", "w")
test = open(data_dir + "/records-3-test.txt", "w")

with open(data_dir + "/records-clean.txt") as f:
    records = f.readlines()

counter = 0
for record in records:
    counter += 1
    fifth = counter % 5 + 1
    if fifth == 5:
        test.write(record)
    elif fifth == 4:
        valid.write(record)
    else:
        train.write(record)
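
Note on the split: with `counter` starting at 1, `counter % 5 + 1` cycles through 2, 3, 4, 5, 1, so three of every five records go to train and one each to valid and test (roughly 60/20/20). Because `records-clean.txt` is already sorted, the assignment is deterministic and interleaved rather than random. The sketch below reproduces only the bucket arithmetic; the function name `bucket` is hypothetical and not part of this commit.

```python
def bucket(counter):
    """Map a 1-based record counter to a data set name.

    Hypothetical helper using the same arithmetic as split_records.py.
    """
    fifth = counter % 5 + 1
    if fifth == 5:
        return "test"
    if fifth == 4:
        return "valid"
    return "train"


# The first five records land in these sets, and the pattern then repeats.
print([bucket(i) for i in range(1, 6)])
# ['train', 'train', 'valid', 'test', 'train']

# Over 100 records the split is 60 train, 20 valid, 20 test.
names = [bucket(i) for i in range(1, 101)]
print(names.count("train"), names.count("valid"), names.count("test"))
# 60 20 20
```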
