Scene text detection based on CTPN (Connectionist Text Proposal Network), implemented in TensorFlow. The original paper can be found [here](https://arxiv.org/abs/1609.03605), and the original Caffe repo can be found [here](https://github.com/tianzhi0549/CTPN). For more details about the paper and the code, see this [blog](http://slade-ruan.me/2017/10/22/text-detection-ctpn/). If you have any questions, please check the existing issues first; if your problem persists, open a new issue.
**NOTICE:**
- reconstruct the repo
- cython nms and bbox utils
- loss function as described in the paper
- oriented text connector
- BLSTM
The nms and bbox utils are written in cython, so you have to build the library first.
```shell
cd utils/bbox
chmod +x make.sh
./make.sh
```
It will generate nms.so and bbox.so in the current folder.
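For reference, the compiled nms extension performs standard greedy non-maximum suppression over scored proposals. Below is a minimal pure-Python sketch of that computation (illustrative only; the function name and array layout are assumptions, not the repo's actual cython API):

```python
import numpy as np

def nms(dets, thresh):
    """Greedy non-maximum suppression.

    dets: (N, 5) array of [x1, y1, x2, y2, score]; returns indices of kept boxes.
    """
    x1, y1, x2, y2, scores = dets[:, 0], dets[:, 1], dets[:, 2], dets[:, 3], dets[:, 4]
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = scores.argsort()[::-1]  # process boxes from highest to lowest score
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # intersection of the current box with all remaining boxes
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, xx2 - xx1 + 1) * np.maximum(0.0, yy2 - yy1 + 1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # keep only boxes whose overlap with the current box is below the threshold
        order = order[np.where(iou <= thresh)[0] + 1]
    return keep
```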
- follow the setup steps above to build the library
- download the ckpt file from google drive or baidu yun
- put checkpoints_mlt/ in text-detection-ctpn/
- put your images in data/demo; the results will be saved in data/res. Run the demo from the repo root:

```shell
python ./main/demo.py
```
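The visualization step of the demo amounts to drawing the detected boxes on each input image and writing it to data/res. A minimal sketch of that step, assuming boxes come back as [x_min, y_min, x_max, y_max] arrays (hypothetical helper and file names, not the actual code in main/demo.py):

```python
import os
import cv2
import numpy as np

def save_result(image_path, boxes, out_dir="data/res"):
    """Draw detected text boxes on the image and save it to out_dir."""
    os.makedirs(out_dir, exist_ok=True)
    img = cv2.imread(image_path)
    for box in boxes:
        x_min, y_min, x_max, y_max = map(int, box)
        cv2.rectangle(img, (x_min, y_min), (x_max, y_max), (0, 255, 0), 2)
    cv2.imwrite(os.path.join(out_dir, os.path.basename(image_path)), img)

# hypothetical usage with a dummy detection
save_result("data/demo/sample.jpg", np.array([[50, 30, 300, 80]]))
```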
- First, download the pre-trained VGG net model and put it in data/vgg_16.ckpt. You can download it from tensorflow/models.
- Second, download the dataset we prepared from google drive or baidu yun. Put the downloaded data in data/dataset/mlt, then start the training.
- Alternatively, you can prepare your own dataset according to the following steps.
- Modify DATA_FOLDER and OUTPUT in utils/prepare/split_label.py according to your dataset, then run split_label.py from the repo root:

```shell
python ./utils/prepare/split_label.py
```

- It will generate the prepared data in data/dataset/.
- A sample input file for split_label.py can be found in gt_img_859.txt, and the corresponding output file is img_859.txt. A demo image of the prepared data is shown below.
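Conceptually, split_label.py cuts each ground-truth text line into the fine-scale, fixed-width (16-pixel) proposals that CTPN's anchors predict, as described in the paper. A rough sketch of that splitting idea, assuming axis-aligned boxes (the real script also handles polygon ground truth and image rescaling):

```python
def split_text_line(x_min, y_min, x_max, y_max, stride=16):
    """Split an axis-aligned text-line box into stride-wide sub-boxes."""
    proposals = []
    # align split positions to the 16-pixel grid of the backbone feature map
    start = (int(x_min) // stride) * stride
    for x in range(start, int(x_max), stride):
        left, right = max(x, x_min), min(x + stride, x_max)
        if right > left:
            proposals.append((left, y_min, right, y_max))
    return proposals

# e.g. a text line spanning x = 35..135 is cut into 16-pixel-wide slices
print(split_text_line(35, 20, 135, 52))
```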
Simply run

```shell
python ./main/train.py
```
- The model provided in checkpoints_mlt was trained on a GTX 1070 for 50k iterations. Each iteration takes about 0.25 s, so the 50k iterations take roughly 3.5 hours.
NOTICE:
All the photos used below are collected from the internet. If they affect you, please contact me to remove them.