TensorFlow implementation of "Natural Language Processing (Almost) from Scratch" (https://arxiv.org/abs/1103.0398)
- Clone the repo and install the requirements (mostly pandas)
git clone https://github.com/nrlewis/nlpfromscratch-tf.git
cd nlpfromscratch-tf
pip install -r requirements.txt
- Prep CoNLL data for POS tagging. Creates 'caps' features and sentences of up to 40 tokens, and splits off 30% as a validation set
mkdir -p conlldata/sent # create data directory
python prep_scripts/prep_conll.py data/conll2000/train.txt conlldata/sent -s 40 -v .3
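The prep step above can be pictured with a minimal Python sketch. It is not the code in prep_conll.py: the function names, the four label strings, dropping (rather than truncating) sentences longer than the limit, and the shuffled split are all assumptions made for illustration. The four capitalization classes follow the paper's description of its 'caps' feature.

```python
import random


def caps_feature(token):
    """Map a token to one of four capitalization classes (hypothetical labels)."""
    if token.isupper():
        return "allcaps"   # every cased character is uppercase, e.g. "IBM"
    if token[:1].isupper():
        return "initcap"   # only the first letter is checked, e.g. "The"
    if any(ch.isupper() for ch in token[1:]):
        return "hascap"    # a non-initial capital, e.g. "iPhone"
    return "nocaps"        # lowercase words, digits, punctuation


def filter_and_split(sentences, max_len=40, valid_frac=0.3, seed=0):
    """Keep sentences of at most max_len tokens, then split off a validation set.

    Assumption: over-long sentences are dropped, and the split is a seeded shuffle.
    """
    kept = [s for s in sentences if len(s) <= max_len]
    rng = random.Random(seed)
    rng.shuffle(kept)
    n_valid = int(len(kept) * valid_frac)
    return kept[n_valid:], kept[:n_valid]  # (train, valid)
```

Here max_len=40 and valid_frac=0.3 mirror the -s 40 and -v .3 arguments in the command above.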
- Run the training
mkdir -p sent_conv_log # create logging directory
python train_from_scratch.py \
sent_conv_log \
conlldata/sent/train_pos_sent.csv \
conlldata/sent/valid_pos_sent.csv \
conlldata/sent/vocab.json
- Check out TensorBoard (run from the repo root so the later steps' relative paths still work)
tensorboard --logdir sent_conv_log
- Prep CoNLL data for POS tagging. Creates 'caps' features and windows of size 3, and splits off 30% as a validation set
mkdir -p conlldata/win # create data directory
python prep_scripts/prep_conll.py data/conll2000/train.txt conlldata/win -w 3 -v .3
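As a rough illustration of the windowing above, the following Python sketch builds one fixed-size window per token. It assumes that -w 3 means a total window of 3 tokens (the target plus one neighbour on each side) and that sentence edges are padded; the function name and the "PADDING" token are hypothetical and may not match prep_conll.py.

```python
def windows(tokens, size=3, pad="PADDING"):
    """Return one window of `size` tokens centred on each token in `tokens`."""
    if size % 2 == 0:
        raise ValueError("size must be odd so the target token sits in the centre")
    half = size // 2
    # Pad both ends so the first and last tokens also get full windows.
    padded = [pad] * half + list(tokens) + [pad] * half
    return [padded[i:i + size] for i in range(len(tokens))]
```

For a three-token sentence this yields three windows, the first and last of which contain one padding token each.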
- Run the training
mkdir -p win_log # create logging directory
python train_from_scratch.py \
win_log \
conlldata/win/train_pos_w3.csv \
conlldata/win/valid_pos_w3.csv \
conlldata/win/vocab.json
- Check out TensorBoard (run from the repo root)
tensorboard --logdir win_log