# A Pythonic, Extensible and Minimal Implementation of Faster R-CNN Without Harming Performance

## Introduction

This project is a **simplified** Faster R-CNN implementation based mostly on [chainercv](https://github.com/chainer/chainercv) and other [projects](#acknowledgement). It aims to:

- Simplify the code (*Simple is better than complex*)
- Make the code more straightforward (*Flat is better than nested*)
- Match the performance reported in the [original paper](https://arxiv.org/abs/1506.01497) (*Speed counts and mAP matters*)

## Performance

- mAP

VGG16 trained on `trainval` and tested on `test`. Note that training shows considerable randomness; you may need to train for more epochs to reach the highest mAP, but the lower bound should be easy to reach.

| Implementation | mAP |
| :--------------------------------------: | :---------: |
| [original paper](https://arxiv.org/abs/1506.01497) | 0.699 |
| using caffe pretrained model (enable with `--caffe-pretrain`) | 0.700-0.708 |
| using torchvision pretrained model | 0.690-0.701 |
| model converted from [chainercv](https://github.com/chainer/chainercv/tree/master/examples/faster_rcnn) (reported 0.706) | 0.7053 |
| the best I have seen ([ruotianluo's](https://github.com/ruotianluo/pytorch-faster-rcnn)) | 0.711 |

- Speed

| Implementation | GPU | Inference | Training |
| ---------------------------------------- | -------- | --------- | -------- |
| [original paper](https://arxiv.org/abs/1506.01497) | K40 | 5 fps | NA |
| This | TITAN Xp | 12 fps\* | 5-6 fps |
| [pytorch-faster-rcnn](https://github.com/ruotianluo/pytorch-faster-rcnn) | TITAN Xp | NA | 5-6 fps\*\* |

\* includes reading images from disk, preprocessing, etc.; see `eval` in `train.py` for details.

\*\* depends on the environment.

**Note:** make sure you install cupy correctly to reach these benchmarks.

## Install Prerequisites

- install PyTorch >= 0.3 with GPU support (the code is GPU-only); refer to the [official website](http://pytorch.org)
- install cupy; you can install it via `pip install`, but it's better to read the [docs](https://docs-cupy.chainer.org/en/latest/install.html#install-cupy-with-cudnn-and-nccl) and make sure the environment is set up correctly (a quick sanity check is given at the end of this section)
- install other dependencies: `pip install -r requirements.txt`
- optional but recommended: build `nms_gpu_post`: `cd model/utils/nms; python3 build.py build_ext --inplace`
- start visdom for visualization:

```
nohup python3 -m visdom.server &
```
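
To confirm the client can actually reach the server before training starts, a quick check (assuming the default address `http://localhost:8097`):

```python
import visdom

# connect to the server started above (default: http://localhost:8097)
vis = visdom.Visdom()
assert vis.check_connection(), 'visdom server is not reachable'
```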

If you're in China and encounter problems with visdom (e.g. timeouts or a blank screen), you may refer to this [visdom issue](https://github.com/facebookresearch/visdom/issues/111#issuecomment-321743890) and the temporary solution I provided there.

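To verify that cupy is installed against the right CUDA toolkit (the speed benchmarks above depend on it), a minimal sanity check:

```python
import cupy as cp

# a trivial computation on the GPU; an error here usually means a
# CUDA/cupy version mismatch rather than a problem with this repo
x = cp.arange(6).reshape(2, 3).astype('f')
print(x.sum(axis=1))  # expected: [ 3. 12.]
```
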
## Demo

Download the pretrained model from [..............................................].

See `demo.ipynb` for details.

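For reference, a rough sketch of what the notebook does; the module paths and class names (`FasterRCNNVGG16`, `FasterRCNNTrainer`, `read_image`) are assumed from this repository's layout, and the checkpoint path is a placeholder:

```python
import torch as t

from utils.config import opt
from model import FasterRCNNVGG16
from trainer import FasterRCNNTrainer
from data.util import read_image

# read an image as a CHW float array and add a batch dimension
img = read_image('misc/demo.jpg')
img = t.from_numpy(img)[None]

faster_rcnn = FasterRCNNVGG16()
trainer = FasterRCNNTrainer(faster_rcnn).cuda()
trainer.load('/path/to/pretrained_model.pth')  # the checkpoint downloaded above
opt.caffe_pretrain = True  # set to match how the checkpoint was trained

# returns per-image lists of bounding boxes, class labels and scores
bboxes, labels, scores = trainer.faster_rcnn.predict(img, visualize=True)
```
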
## Train

### Data

#### Pascal VOC2007

1. Download the training, validation and test data, and the VOCdevkit:

   ```
   wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
   wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
   wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
   ```

2. Extract all of these tars into one directory named `VOCdevkit`:

   ```
   tar xvf VOCtrainval_06-Nov-2007.tar
   tar xvf VOCtest_06-Nov-2007.tar
   tar xvf VOCdevkit_08-Jun-2007.tar
   ```

3. It should have this basic structure:

   ```
   $VOCdevkit/                    # development kit
   $VOCdevkit/VOCcode/            # VOC utility code
   $VOCdevkit/VOC2007             # image sets, annotations, etc.
   # ... and several other directories ...
   ```

4. Specify the `voc_data_dir` in `config.py`, or pass it to the program as an argument like `--voc-data-dir=/path/to/VOCdevkit/VOC2007/` (see the sketch below).

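For illustration, the default can be set directly in `config.py`; the path here is a placeholder:

```python
# in config.py (sketch): the training script reads this default
# unless --voc-data-dir overrides it on the command line
voc_data_dir = '/path/to/VOCdevkit/VOC2007/'
```
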
#### COCO

TBD

### Prepare caffe-pretrained VGG16

If you want to use the caffe-pretrained model, run:

```
python misc/convert_caffe_pretrain.py
```

Then specify where the caffe-pretrained model `vgg16_caffe.pth` is stored in `config.py`, along the lines of the sketch below.

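A sketch of the corresponding `config.py` entry; the attribute name here is an assumption, so check `config.py` for the actual one:

```python
# in config.py (sketch; attribute name assumed): path to the converted caffe weights
caffe_pretrain_path = 'checkpoints/vgg16_caffe.pth'
```
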
If you want to use the torchvision pretrained model, you may skip this step.

### Begin training

```Bash
mkdir checkpoints/ # make dir for storing snapshots
```

```Bash
python3 train.py train --env='fasterrcnn-caffe' --plot-every=100 --caffe-pretrain
```

You may refer to `config.py` for more arguments.

Some key arguments:

- `--caffe-pretrain=True`: use the caffe pretrained model instead of the torchvision one (default: torchvision)
- `--plot-every=n`: visualize predictions, loss, etc. every n batches
- `--env`: visdom env for visualization
- `--voc-data-dir`: where the VOC data is stored
- `--use-drop`: use dropout in the RoI head (default: no dropout)
- `--use-adam`: use Adam instead of SGD (default: SGD)
- `--load-path`: pretrained model path (default: `None`); if specified, the pretrained model is loaded

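For reference, a hedged sketch of the programmatic equivalent, assuming `train.py` exposes a `train(**kwargs)` entry point whose keyword arguments match the flags above (an assumption; the CLI is the documented interface):

```python
# hypothetical equivalent of:
#   python3 train.py train --env='fasterrcnn-caffe' --plot-every=100 --caffe-pretrain
from train import train

train(env='fasterrcnn-caffe', plot_every=100, caffe_pretrain=True)
```
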
## Troubleshooting

- visdom
- dataloader/ulimit
- cupy
- vgg

## TODO

- [ ] training on COCO
- [ ] ResNet
- [ ] replace cupy with THTensor+cffi?

## Acknowledgement

This work builds on many excellent works, including:

- [Yusuke Niitani's ChainerCV](https://github.com/chainer/chainercv)
- [Ruotian Luo's pytorch-faster-rcnn](https://github.com/ruotianluo/pytorch-faster-rcnn), which is based on [Xinlei Chen's tf-faster-rcnn](https://github.com/endernewton/tf-faster-rcnn)
- [faster-rcnn.pytorch by Jianwei Yang and Jiasen Lu](https://github.com/jwyang/faster-rcnn.pytorch), which is mainly based on [longcw's faster_rcnn_pytorch](https://github.com/longcw/faster_rcnn_pytorch)
- All of the above repositories refer to [py-faster-rcnn by Ross Girshick and Sean Bell](https://github.com/rbgirshick/py-faster-rcnn) either directly or indirectly.