A simple convolutional neural network for word recognition
Put your dataset into ./dataset/, then run:
python train.py --words one,two,three
See the argparser function for more arguments.
You can see the expected dataset directory structure in ./tests/dataset/ (the testing model was trained on the TensorFlow Speech Commands dataset).
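Before training, it can help to check what is actually in the dataset directory. The sketch below is not part of this repository. It assumes the layout used by the TensorFlow Speech Commands dataset, with one subdirectory per word and the .wav files inside it; check ./tests/dataset/ to confirm the layout expected here. The function name count_wavs_per_label is hypothetical.

```python
from pathlib import Path


def count_wavs_per_label(dataset_dir):
    """Return {label: number of .wav files}.

    Assumption: each immediate subdirectory of dataset_dir is one label
    (word) and holds that word's .wav recordings directly inside it.
    """
    counts = {}
    for entry in sorted(Path(dataset_dir).iterdir()):
        if entry.is_dir():
            counts[entry.name] = sum(1 for _ in entry.glob("*.wav"))
    return counts
```

Calling count_wavs_per_label("./dataset") returns a dictionary keyed by directory name, which makes an empty or misnamed word directory easy to spot before starting a long training run.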
After training, you can find the trained models in the ./train/ directory.
To recognize words in a WAV file, run:
python recognize.py \
--wav_file audio_to_recognize.wav \
--labels_file ./train/labels.txt \
--model_file ./train/model.ckpt-10000
It will print something like this:
350 two
1000 one
1400 _unknown_
...
where the number is the offset in milliseconds and the word is the recognized label.
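If you want to post-process these results, the output is easy to parse. The sketch below is not part of this repository. It assumes each result line has exactly the form shown above (an integer offset, whitespace, then the label) and skips anything else. The function name parse_recognition_output is hypothetical.

```python
def parse_recognition_output(text):
    """Parse lines like '350 two' into (offset_ms, label) tuples."""
    results = []
    for line in text.splitlines():
        parts = line.split(maxsplit=1)
        # Skip blank lines and lines that do not start with an integer
        # offset (for example a literal '...').
        if len(parts) != 2 or not parts[0].isdecimal():
            continue
        results.append((int(parts[0]), parts[1].strip()))
    return results


# Example: keep only recognized words and convert offsets to seconds.
sample = "350 two\n1000 one\n1400 _unknown_\n"
for offset_ms, label in parse_recognition_output(sample):
    if label != "_unknown_":
        print(f"{offset_ms / 1000:.2f}s {label}")
```

The text to parse could come from redirecting the output of recognize.py to a file, or from capturing it with subprocess.run.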