A Pythonic, Extensible and Minimal Implementation of Faster R-CNN Without Harming Performance
This project is a simplified Faster R-CNN implementation based mostly on chainercv and other projects. It aims to:
- Simplify the code (Simple is better than complex)
- Make the code more straight forward (Flat is better than nested)
- Match the performance reported in the original paper (Speed Counts and mAP Matters)
- mAP
VGG16 trained on trainval and tested on test. Note: training shows considerable randomness; you may need to train for more epochs to reach the highest mAP, but it should be easy to reach the lower bound. It's also reported that training for more epochs may further improve the result.
Implementation | mAP |
---|---|
original paper | 0.699 |
using caffe pretrained model (enable with `--caffe-pretrain`) | 0.700-0.708 |
using torchvision pretrained model | 0.690-0.701 |
model converted from chainercv (reported 0.706) | 0.7053 |
the best I've seen so far (ruotian's) | 0.711 |
- Speed
Implementation | GPU | Inference | Training |
---|---|---|---|
original paper | K40 | 5 fps | NA |
This | TITAN Xp | 12 fps\* | 5-6 fps |
pytorch-faster-rcnn | TITAN Xp | NA | 5-6 fps\*\* |
\* includes reading images from disk, preprocessing, etc.; see `eval` in `train.py` for more detail.

\*\* depends on the environment.
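For a rough look at the inference number, a loop like the following can be timed. This is a sketch only, not the repo's benchmark code: `FasterRCNNVGG16`, `read_image`, and `predict` are this repo's names, the image path is an assumption, and unlike `eval` in `train.py` this loop reads the image from disk only once, so the numbers will differ.

```python
# rough inference-speed check (a sketch; not the repo's benchmark)
import time
from model import FasterRCNNVGG16
from data.util import read_image

faster_rcnn = FasterRCNNVGG16().cuda()
img = read_image('misc/demo.jpg')  # any test image; path is an assumption

start = time.time()
for _ in range(100):
    # visualize=True makes predict() preprocess the raw image itself
    faster_rcnn.predict([img], visualize=True)
print('fps:', 100 / (time.time() - start))
```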
NOTE: make sure cupy is installed correctly to reach the benchmark speed.
- install PyTorch >= 0.3 with GPU support (the code is GPU-only); refer to the official website
- install cupy; you can install it via pip, but it's better to read the docs and make sure the environment is set up correctly
- install other dependencies: `pip install -r requirements.txt`
- optional but recommended: build `nms_gpu_post`:

  ```
  cd model/utils/nms
  python3 build.py build_ext --inplace
  ```
- start visdom for visualization:

  ```
  nohup python3 -m visdom.server &
  ```

  If you're in China and encounter problems with visdom (e.g. timeout or a blank screen), you may refer to the visdom issue and the temporary solution I provided.
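To verify that cupy and visdom are both working before moving on, a minimal check along these lines can help. This is a sketch, not part of this repo; it only uses the standard `cupy` and `visdom` APIs and assumes the server started above is running on the default port.

```python
# minimal environment check for cupy + visdom (hypothetical helper script)
import cupy as cp
import visdom

# cupy: run a trivial computation on the GPU
x = cp.arange(1000, dtype=cp.float32)
assert float(x.sum()) == 499500.0
print('cupy OK:', cp.cuda.runtime.getDeviceCount(), 'CUDA device(s) found')

# visdom: the server started above listens on localhost:8097 by default
vis = visdom.Visdom(env='setup-check')
assert vis.check_connection(), 'visdom server not reachable'
print('visdom OK')
```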
Download the pretrained model from [..............................................] and see `demo.ipynb` for details.
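In outline, the demo does something like the following. This is a sketch based on this repo's public names (`FasterRCNNVGG16`, `FasterRCNNTrainer`, `read_image`, `vis_bbox`); the checkpoint path is a placeholder, and `demo.ipynb` remains the authoritative version.

```python
# minimal inference sketch in the spirit of demo.ipynb (paths are placeholders)
from model import FasterRCNNVGG16
from trainer import FasterRCNNTrainer
from data.util import read_image
from utils.vis_tool import vis_bbox

img = read_image('misc/demo.jpg')                 # CHW float32 numpy array
faster_rcnn = FasterRCNNVGG16()
trainer = FasterRCNNTrainer(faster_rcnn).cuda()
trainer.load('checkpoints/your_checkpoint.pth')   # downloaded or trained weights

# visualize=True lets predict() preprocess the raw image internally
bboxes, labels, scores = trainer.faster_rcnn.predict([img], visualize=True)
vis_bbox(img, bboxes[0], labels[0], scores[0])    # draws boxes with matplotlib
```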
- Download the training, validation, test data and VOCdevkit:

  ```
  wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
  wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
  wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
  ```
- Extract all of these tars into one directory named `VOCdevkit`:

  ```
  tar xvf VOCtrainval_06-Nov-2007.tar
  tar xvf VOCtest_06-Nov-2007.tar
  tar xvf VOCdevkit_08-Jun-2007.tar
  ```
- It should have this basic structure:

  ```
  $VOCdevkit/                    # development kit
  $VOCdevkit/VOCcode/            # VOC utility code
  $VOCdevkit/VOC2007             # image sets, annotations, etc.
  # ... and several other directories ...
  ```
- Specify the `voc_data_dir` in `config.py`, or pass it to the program with an argument like `--voc-data-dir=/path/to/VOCdevkit/VOC2007/` (a sketch of the config entry follows).
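For reference, the relevant entry in `config.py` looks roughly like this. Only `voc_data_dir` comes from the text above; the surrounding class layout is an assumption.

```python
# config.py (sketch): point voc_data_dir at the extracted dataset
class Config:
    voc_data_dir = '/path/to/VOCdevkit/VOC2007/'  # edit to match your setup
```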
COCO: TBD
If you want to use the caffe-pretrained model, run:

```
python misc/convert_caffe_pretrain.py
```

then specify in `config.py` where the converted `vgg16_caffe.pth` is stored. If you want to use the torchvision pretrained model instead, you may skip this step.
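To confirm the conversion produced something loadable, a quick check like this can be run. This is a sketch: it assumes the script saves a plain state dict and that the file sits in the working directory.

```python
# sanity-check the converted caffe VGG16 weights (file location is an assumption)
import torch

state_dict = torch.load('vgg16_caffe.pth')
print(len(state_dict), 'tensors; first key:', next(iter(state_dict)))
```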
```
mkdir checkpoints/ # folder for storing snapshots
python3 train.py train --env='fasterrcnn-caffe' --plot-every=100 --caffe-pretrain
```
You may refer to `config.py` for more arguments.

Some key arguments (a programmatic equivalent of the training command follows this list):

- `--caffe-pretrain=True`: use the caffe pretrained model instead of the torchvision one (default: torchvision)
- `--plot-every=n`: visualize predictions, loss, etc. every n batches
- `--env`: visdom env for visualization
- `--voc_data_dir`: where the VOC data is stored
- `--use-drop`: use dropout in the RoI head (default: no dropout)
- `--use-adam`: use Adam instead of SGD (default: SGD)
- `--load-path`: pretrained model path (default: `None`); if specified, the pretrained model will be loaded
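Since `train.py` dispatches its command-line flags to a `train(**kwargs)` entry point (this repo uses python-fire style dispatch; the keyword names are assumed to mirror the flags above), the same run can also be started from Python, which is handy for scripted experiments:

```python
# programmatic equivalent of the CLI call above (a sketch)
from train import train

train(env='fasterrcnn-caffe', plot_every=100, caffe_pretrain=True)
```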
Troubleshooting (known problem areas):

- visdom
- dataloader/ulimit
- cupy
- vgg
TODO:

- [ ] training on coco
- [ ] resnet
- [ ] replace cupy with THTensor+cffi?
This work builds on many excellent works, including:

- Yusuke Niitani's ChainerCV
- Ruotian Luo's pytorch-faster-rcnn, which is based on Xinlei Chen's tf-faster-rcnn
- faster-rcnn.pytorch by Jianwei Yang and Jiasen Lu, which is mainly based on longcw's faster_rcnn_pytorch
- All of the above repositories refer to py-faster-rcnn by Ross Girshick and Sean Bell, either directly or indirectly.