
text detection mainly based on ctpn model in tensorflow, id card detect, connectionist text proposal network

awaymeet/text-detection-ctpn

 
 


text-detection-ctpn

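Scene text detection based on CTPN (Connectionist Text Proposal Network), implemented in TensorFlow. The original paper can be found [here](https://arxiv.org/abs/1609.03605). The original Caffe repo can be found [here](https://github.com/tianzhi0549/CTPN). For more details about the paper and the code, see this [blog](http://slade-ruan.me/2017/10/22/text-detection-ctpn/). If you have any questions, please check the existing issues first; if the problem persists, open a new issue.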


** NOTICE: **


Roadmap

  • reconstruct the repo
  • cython nms and bbox utils
  • loss function as referred in paper
  • oriented text connector
  • BLSTM

Setup

The nms and bbox utils are written in Cython, so you have to build the library first.

cd utils/bbox
chmod +x make.sh
./make.sh

It will generate nms.so and bbox.so in the current folder.
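
For readers unfamiliar with what the nms library computes, here is a minimal pure-Python sketch of greedy non-maximum suppression. It is an illustration only, not the Cython code built above: the function names, the (x1, y1, x2, y2) box layout, and the 0.5 threshold are assumptions made for this example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, thresh=0.5):
    """Return indices of the boxes to keep, highest score first.

    A box is dropped when its IoU with an already kept box exceeds thresh.
    The threshold value is an assumption for this sketch.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= thresh for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)  # the second box overlaps the first and is suppressed
```

The quadratic loop is fine for a handful of boxes; compiled code matters when suppressing thousands of proposals per image, which is presumably why the repo ships these utilities in Cython.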


Demo

  • follow setup to build the library
  • download the ckpt file from Google Drive or Baidu Yun
  • put checkpoints_mlt/ in text-detection-ctpn/
  • put your images in data/demo and run the demo from the repo root; the results will be saved in data/res
python ./main/demo.py
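
Before running the demo, it can help to confirm that the pieces listed above are in place. The following is a hypothetical preflight helper, not part of the repo. It assumes the directory layout described in this README, that the built libraries match `nms*.so` and `bbox*.so` under utils/bbox, and that demo images use one of the listed extensions.

```python
from pathlib import Path

# Assumption: the image extensions the demo is expected to pick up.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def missing_demo_inputs(root="."):
    """Return a list of human-readable problems; an empty list means ready."""
    root = Path(root)
    problems = []

    if not (root / "checkpoints_mlt").is_dir():
        problems.append("checkpoints_mlt/ not found in the repo root")

    bbox_dir = root / "utils" / "bbox"
    for name in ("nms", "bbox"):
        built = list(bbox_dir.glob(name + "*.so")) if bbox_dir.is_dir() else []
        if not built:
            problems.append(name + " library not built in utils/bbox")

    demo_dir = root / "data" / "demo"
    images = []
    if demo_dir.is_dir():
        images = [p for p in demo_dir.iterdir()
                  if p.suffix.lower() in IMAGE_SUFFIXES]
    if not images:
        problems.append("no images found in data/demo")

    return problems
```

Calling `missing_demo_inputs()` from the repo root and printing each returned line gives a quick checklist of what is still missing.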

Training

Prepare data

  • First, download the pre-trained VGG net model and put it in data/vgg_16.ckpt. You can download it from tensorflow/models.
  • Second, download the dataset we prepared from Google Drive or Baidu Yun. Put the downloaded data in data/dataset/mlt, then start the training.
  • Alternatively, you can prepare your own dataset with the following steps.
  • Modify DATA_FOLDER and OUTPUT in utils/prepare/split_label.py according to your dataset, then run split_label.py from the repo root
python ./utils/prepare/split_label.py
  • It will generate the prepared data in data/dataset/
  • A sample input file for split_label.py can be found in gt_img_859.txt, and the corresponding output file is img_859.txt. A demo image of the prepared data is shown below.
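
To give a feel for what "splitting" a label means, here is a rough sketch that cuts one axis-aligned text box into fixed-width vertical strips. It is not the logic of split_label.py. The function name is hypothetical, the input is simplified to an axis-aligned box, and the 16-pixel width is an assumption taken from the fixed-width proposals described in the CTPN paper.

```python
def split_into_strips(x1, y1, x2, y2, stride=16):
    """Split an axis-aligned box into strips aligned to a stride-wide grid.

    Coordinates are integer pixels. Each strip is (x1, y1, x2, y2) with an
    inclusive right edge, so a strip covers exactly `stride` columns.
    The stride of 16 is an assumption based on the CTPN paper.
    """
    strips = []
    x = (x1 // stride) * stride  # snap the start to the grid
    while x < x2:
        strips.append((x, y1, x + stride - 1, y2))
        x += stride
    return strips


# A box spanning x = 20..50 touches three grid columns: 16, 32 and 48.
strips = split_into_strips(20, 5, 50, 25)
```

Snapping to the grid means the outer strips can extend past the original box; whether to clip them is a design choice this sketch leaves open.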


Train

Simply run

python ./main/train.py
  • The model provided in checkpoints_mlt was trained on a GTX 1070 for 50k iterations. Each iteration takes about 0.25 s, so the 50k iterations take about 3.5 hours.
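
The 3.5 hour figure follows directly from the two numbers above, and the same arithmetic gives a rough estimate for other hardware or iteration counts. A small sketch (the function name is made up for illustration):

```python
def training_hours(iterations, seconds_per_iteration):
    """Estimated wall-clock training time in hours."""
    return iterations * seconds_per_iteration / 3600.0


# 50,000 iterations at 0.25 s each is 12,500 s.
print(round(training_hours(50_000, 0.25), 2))  # 3.47
```

The estimate ignores checkpoint saving and data loading stalls, so real runs may take somewhat longer.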

Some results

NOTICE: All the photos used below were collected from the internet. If any of them affects you, please contact me to have it removed.


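Oriented text connector

  • An oriented text connector has been implemented. It works, but still needs further improvement.
  • The left figure is the result for DETECT_MODE H, the right figure for DETECT_MODE O.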
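
The difference between the two modes can be sketched as follows, assuming H stands for horizontal and O for oriented. This is an illustration of the idea, not the repo's connector: all function names are hypothetical, and fitting the top and bottom edges by least squares is one plausible approach chosen for this example.

```python
def horizontal_box(proposals):
    """Axis-aligned box enclosing all proposals (x1, y1, x2, y2)."""
    xs1, ys1, xs2, ys2 = zip(*proposals)
    return (min(xs1), min(ys1), max(xs2), max(ys2))


def fit_line(xs, ys):
    """Least-squares fit of y = k * x + b; returns (k, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    var = sum((x - mx) ** 2 for x in xs)
    if var == 0:  # all points share one x: fall back to a flat line
        return 0.0, my
    k = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var
    return k, my - k * mx


def oriented_quad(proposals):
    """Quadrilateral whose top and bottom edges follow the text slope.

    Corners are returned clockwise starting from the top-left.
    """
    x_left = min(p[0] for p in proposals)
    x_right = max(p[2] for p in proposals)
    centers = [(p[0] + p[2]) / 2 for p in proposals]
    k_top, b_top = fit_line(centers, [p[1] for p in proposals])
    k_bot, b_bot = fit_line(centers, [p[3] for p in proposals])
    return [
        (x_left, k_top * x_left + b_top),
        (x_right, k_top * x_right + b_top),
        (x_right, k_bot * x_right + b_bot),
        (x_left, k_bot * x_left + b_bot),
    ]


# Three proposals stepping downward by 2 px each, as for slightly tilted text.
proposals = [(0, 0, 16, 10), (16, 2, 32, 12), (32, 4, 48, 14)]
h_box = horizontal_box(proposals)
o_quad = oriented_quad(proposals)
```

For tilted text the horizontal box has to grow vertically to enclose every proposal, while the oriented quadrilateral stays tight to the line.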
