This project is a simplified Faster R-CNN implementation based on chainercv and other projects. It aims to:
- Simplify the code (Simple is better than complex)
- Make the code more straight forward (Flat is better than nested)
- Match the performance reported in the original paper (Speed Counts and mAP Matters)
- mAP

VGG16, trained on the `trainval` split and tested on the `test` split.
Note: training shows considerable randomness; you may need to train for more epochs to reach the highest mAP. However, it should be easy to reach the lower bound.
Implementation | mAP |
---|---|
original paper | 0.699 |
using caffe pretrained model (enable with `--caffe-pretrain`) | 0.702-0.712 |
using torchvision pretrained model | 0.693-0.701 |
model converted from chainercv (reported 0.706) | 0.7053 |
- Speed
Implementation | GPU | Inference | Training |
---|---|---|---|
original paper | K40 | 5 fps | NA |
This | TITAN Xp | 12 fps\* | 5-6 fps |
pytorch-faster-rcnn | TITAN Xp | NA | 5-6 fps\*\* |
\* includes reading images from disk, preprocessing, etc. See `eval` in `train.py` for more detail.

\*\* depends on the environment.
**NOTE**: you should make sure cupy is installed correctly to reach the benchmark.
- Install PyTorch >=0.3 with GPU support (the code is GPU-only); refer to the official website.
- Install cupy. You can install it via `pip install cupy`, but it's better to read the docs and make sure the environment is set up correctly (a quick sanity check is sketched after this list).
- Install other dependencies: `pip install -r requirements.txt`
- Build `nms_gpu_post`: `cd model/utils/nms/; python3 build.py build_ext --inplace`
- Start visdom for visualization: `nohup python3 -m visdom.server &`

  If you're in China and encounter problems with visdom (e.g. timeout, blank screen), you may refer to the visdom issue and the temporary solution I provided.
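Since the code is GPU-only, it can save time to verify up front that both PyTorch and cupy actually see the GPU. A minimal sanity-check sketch (device index 0 is an assumption; adjust for your machine):

```python
# quick sanity check that PyTorch and cupy both see the GPU (sketch)
import torch
import cupy as cp

# PyTorch should report at least one CUDA device
assert torch.cuda.is_available(), "PyTorch cannot find a CUDA device"
print("PyTorch CUDA device:", torch.cuda.get_device_name(0))

# a trivial cupy computation; this fails loudly if cupy/CUDA are misconfigured
x = cp.arange(5)
assert int(x.sum()) == 10
print("cupy OK")
```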
Download the pretrained model from Google Drive. See `demo.ipynb` for more detail.
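If you prefer a plain script over the notebook, the gist of the demo is to build the model, load the checkpoint, and call `predict`. A minimal sketch, assuming this repo's module layout (`FasterRCNNVGG16`, `FasterRCNNTrainer`, `data.util.read_image`); the checkpoint path is a placeholder:

```python
# demo sketch: load a checkpoint and run inference on one image.
# Module names assume this repo's layout (see demo.ipynb for the
# authoritative version); the checkpoint path is a placeholder.
import torch as t

from data.util import read_image
from model import FasterRCNNVGG16
from trainer import FasterRCNNTrainer

img = read_image('misc/demo.jpg')   # CHW float32 numpy array
img = t.from_numpy(img)[None]       # add a batch dimension

faster_rcnn = FasterRCNNVGG16()
trainer = FasterRCNNTrainer(faster_rcnn).cuda()
trainer.load('/path/to/downloaded/checkpoint.pth')

# visualize=True makes predict() apply the same preprocessing as training
bboxes, labels, scores = trainer.faster_rcnn.predict(img, visualize=True)
print(bboxes[0], labels[0], scores[0])
```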
- Download the training, validation, and test data and the VOCdevkit

```bash
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
```
- Extract all of these tars into one directory named `VOCdevkit`

```bash
tar xvf VOCtrainval_06-Nov-2007.tar
tar xvf VOCtest_06-Nov-2007.tar
tar xvf VOCdevkit_08-Jun-2007.tar
```
- It should have this basic structure

```
$VOCdevkit/                    # development kit
$VOCdevkit/VOCcode/            # VOC utility code
$VOCdevkit/VOC2007             # image sets, annotations, etc.
# ... and several other directories ...
```
- Specify `voc_data_dir` in `config.py`, or pass it to the program with an argument like `--voc-data-dir=/path/to/VOCdevkit/VOC2007/` (the in-file route is sketched below).
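If you go the `config.py` route, it amounts to editing one attribute. A minimal sketch (only the attribute name comes from the text above; the surrounding `Config` class layout is an assumption):

```python
# in config.py (sketch; the class layout is an assumption)
class Config:
    # point this at your extracted VOC2007 directory
    voc_data_dir = '/path/to/VOCdevkit/VOC2007/'
```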
TBD
If you want to use the caffe-pretrained model as initial weights, you can run the command below to get the vgg16 weights converted from caffe, which is what the original paper uses:

```bash
python misc/convert_caffe_pretrain.py
```

Then specify where the caffe-pretrained model `vgg16_caffe.pth` is stored in `config.py` by setting `caffe_pretrain_path`. If you want to use the torchvision pretrained model, you may skip this step.

NOTE: the caffe-pretrained model has shown slightly better performance.
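A quick way to confirm the conversion worked is to load the saved file and inspect it. A minimal sketch, assuming `convert_caffe_pretrain.py` writes a plain state dict; the path is whatever you configured above:

```python
# verify the converted weights load (sketch; path is a placeholder,
# and we assume convert_caffe_pretrain.py saves a plain state dict)
import torch

state_dict = torch.load('/path/to/vgg16_caffe.pth')
print(len(state_dict), 'tensors saved')
for name in list(state_dict)[:5]:
    print(name, tuple(state_dict[name].shape))
```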
```bash
mkdir checkpoints/  # make a dir for storing snapshots
python3 train.py train --env='fasterrcnn-caffe' --plot-every=100 --caffe-pretrain
```

You may refer to `config.py` for more arguments.
Some key arguments (a sketch of how they are parsed follows this list):

- `--caffe-pretrain=True`: use the caffe pretrained model instead of the torchvision one (default: torchvision)
- `--plot-every=n`: visualize predictions, losses, etc. every n batches
- `--env`: visdom env for visualization
- `--voc_data_dir`: where the VOC data is stored
- `--use-drop`: use dropout in the RoI head (default: no dropout)
- `--use-adam`: use Adam instead of SGD (default: SGD; you need to set a very low `lr` for Adam)
- `--load-path`: pretrained model path (default: `None`); if specified, the pretrained model will be loaded
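These flags end up as attributes on the config object defined in `config.py`. A minimal sketch of that kwargs-to-attributes pattern (illustrative only, not the repo's exact code):

```python
# sketch of the flag-parsing pattern behind config.py (illustrative,
# not the repo's exact code)
class Config:
    caffe_pretrain = False
    plot_every = 40
    use_drop = False

    def _parse(self, kwargs):
        # copy each recognized CLI option onto the config object
        for k, v in kwargs.items():
            if not hasattr(self, k):
                raise ValueError('unknown option: --%s' % k)
            setattr(self, k, v)

opt = Config()
opt._parse({'plot_every': 100, 'caffe_pretrain': True})
print(opt.plot_every, opt.caffe_pretrain)  # -> 100 True
```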
TODO: clarify the following:
- visdom
- dataloader/ulimit
- cupy
- vgg
- [ ] training on coco
- [ ] resnet
- [ ] replace cupy with THTensor+cffi?
This work builds on many excellent works, which include:
- Yusuke Niitani's ChainerCV (mainly)
- Ruotian Luo's pytorch-faster-rcnn, which is based on Xinlei Chen's tf-faster-rcnn
- faster-rcnn.pytorch by Jianwei Yang and Jiasen Lu, which is mainly based on longcw's faster_rcnn_pytorch
- All of the above repositories refer to py-faster-rcnn by Ross Girshick and Sean Bell, either directly or indirectly.
MIT; see the LICENSE file for more detail.