Replies: 3 comments
-
Problem solved. Just specify the dictionary at inference time: `python tools/infer/predict_rec.py --image_dir='E:/BaiduNetdiskDownload/DataSet/Chinese_dataset/images/img_0000001.jpg' --rec_model_dir='./inference/rec_svtrv2_ch' --rec_char_dict_path='E:/BaiduNetdiskDownload/DataSet/Chinese_dataset/labels_.txt'`
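The model outputs class indices into the dictionary it was trained with, so decoding against a different dictionary produces garbage. A minimal sketch (paths taken from this thread, run from the PaddleOCR repo root) to confirm the custom dictionary differs from the default one that `predict_rec.py` falls back to:

```python
# Sketch: compare the custom training dictionary with PaddleOCR's default.
# If they differ, predict_rec.py must be given --rec_char_dict_path explicitly,
# otherwise the predicted indices are decoded against the wrong characters.
custom_dict_path = "E:/BaiduNetdiskDownload/DataSet/Chinese_dataset/labels_.txt"
default_dict_path = "ppocr/utils/ppocr_keys_v1.txt"  # predict_rec.py's default

def load_dict(path):
    with open(path, "r", encoding="utf-8") as f:
        return [line.rstrip("\r\n") for line in f]

custom = load_dict(custom_dict_path)
default = load_dict(default_dict_path)
print(f"custom: {len(custom)} characters, default: {len(default)} characters")
print("dictionaries identical:", custom == default)
```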
-
Which version of PaddleOCR is needed to train this?
-
What environment does this model need in order to run? I keep getting all kinds of errors and can't get it running. Is there a tutorial?
-
I used the SVTRv2 recognition model architecture together with the SVTRv2 pretrained model. The training data is synthetic, about 1,000,000 images. labels_.txt is not PaddleOCR's built-in character dictionary; it was regenerated from the characters that appear in the training text. I trained for five epochs; the third epoch was the best, with an accuracy of 98%.
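A dictionary like labels_.txt can be rebuilt from the training labels in a few lines; a minimal sketch, assuming the paths from the config below and PaddleOCR's usual tab-separated `image_path<TAB>label` label format:

```python
# Minimal sketch: rebuild a character dictionary from a PaddleOCR label file.
# Paths are the ones from the config below; adjust as needed.
label_file = "E:/BaiduNetdiskDownload/DataSet/Chinese_dataset/train_list.txt"
dict_file = "E:/BaiduNetdiskDownload/DataSet/Chinese_dataset/labels_.txt"

chars = set()
with open(label_file, "r", encoding="utf-8") as f:
    for line in f:
        # Each line is "relative/image/path\ttranscription".
        _, text = line.rstrip("\n").split("\t", 1)
        chars.update(text)

chars.discard(" ")  # the space character is handled by use_space_char: true

with open(dict_file, "w", encoding="utf-8") as f:
    f.write("\n".join(sorted(chars)) + "\n")
```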
The config file is as follows:

```yaml
Global:
  debug: false
  use_gpu: true
  epoch_num: 20 # 200
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec_svtrv2_ch
  save_epoch_step: 1
  eval_batch_step: [0, 500]
  cal_metric_during_train: False
  pretrained_model: ./pretrain_models/openatom_rec_svtrv2_ch_train/best_accuracy
  checkpoints: ./output/rec_svtrv2_ch/best_accuracy
  save_inference_dir:
  use_visualdl: false
  infer_img: doc/imgs_words/ch/
  character_dict_path: E:/BaiduNetdiskDownload/DataSet/Chinese_dataset/labels_.txt #ppocr/utils/ppocr_keys_v1.txt
  max_text_length: &max_text_length 25
  infer_mode: false
  use_space_char: true
  distributed: true
  save_res_path: ./output/rec/predicts_svrtv2.txt

Optimizer:
  name: AdamW
  beta1: 0.9
  beta2: 0.999
  epsilon: 1.e-8
  weight_decay: 0.05
  no_weight_decay_name: norm
  one_dim_param_no_weight_decay: True
  lr:
    name: Cosine
    learning_rate: 0.001 # 8gpus 192bs
    warmup_epoch: 5

Architecture:
  model_type: rec
  algorithm: SVTR_HGNet
  Transform:
  Backbone:
    name: SVTRv2
    use_pos_embed: False
    dims: [128, 256, 384]
    depths: [6, 6, 6]
    num_heads: [4, 8, 12]
    mixer: [['Conv','Conv','Conv','Conv','Conv','Conv'],['Conv','Conv','Global','Global','Global','Global'],['Global','Global','Global','Global','Global','Global']]
    local_k: [[5, 5], [5, 5], [-1, -1]]
    sub_k: [[2, 1], [2, 1], [-1, -1]]
    last_stage: False
    use_pool: True
  Head:
    name: MultiHead
    head_list:
      - CTCHead:
          Neck:
            name: svtr
            dims: 256
            depth: 2
            hidden_dims: 256
            kernel_size: [1, 3]
            use_guide: True
          Head:
            fc_decay: 0.00001
      - NRTRHead:
          nrtr_dim: 384
          max_text_length: *max_text_length
          num_decoder_layers: 2

Loss:
  name: MultiLoss
  loss_config_list:
    - CTCLoss:
    - NRTRLoss:

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc

Train:
  dataset:
    name: MultiScaleDataSet
    ds_width: false
    data_dir: E:/BaiduNetdiskDownload/DataSet/Chinese_dataset/images
    ext_op_transform_idx: 1
    label_file_list:
      - E:/BaiduNetdiskDownload/DataSet/Chinese_dataset/train_list.txt
    transforms:
      - DecodeImage:
          img_mode: BGR
          channel_first: false
      - RecAug:
      - MultiLabelEncode:
          gtc_encode: NRTRLabelEncode
      - KeepKeys:
          keep_keys:
            - image
            - label_ctc
            - label_gtc
            - length
            - valid_ratio
  sampler:
    name: MultiScaleSampler
    scales: [[320, 32], [320, 48], [320, 64]]
    first_bs: &bs 192
    fix_bs: false
    divided_factor: [8, 16] # w, h
    is_training: True
  loader:
    shuffle: true
    batch_size_per_card: *bs
    drop_last: true
    num_workers: 8

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: E:/BaiduNetdiskDownload/DataSet/Chinese_dataset/images
    label_file_list:
      - E:/BaiduNetdiskDownload/DataSet/Chinese_dataset/val_list.txt
    transforms:
      - DecodeImage:
          img_mode: BGR
          channel_first: false
      - MultiLabelEncode:
          gtc_encode: NRTRLabelEncode
      - RecResizeImg:
          image_shape: [3, 48, 320]
      - KeepKeys:
          keep_keys:
            - image
            - label_ctc
            - label_gtc
            - length
            - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 128
    num_workers: 4
```
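For completeness, a minimal sketch of launching training with this config from the PaddleOCR repo root (the config path matches the export command below; the single-GPU form is shown, the multi-GPU form goes through `paddle.distributed.launch`):

```python
# Sketch: launch training with the config above from the PaddleOCR repo root.
# configs/rec/SVTRv2/rec_svtrv2_ch.yml is assumed to hold that YAML.
import subprocess

# Single-GPU run; for multi-GPU, wrap with
# "python -m paddle.distributed.launch --gpus '0,1,...' tools/train.py -c ...".
subprocess.run([
    "python", "tools/train.py",
    "-c", "configs/rec/SVTRv2/rec_svtrv2_ch.yml",
], check=True)
```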
I converted the trained model to an inference model with the command `python tools/export_model.py -c configs/rec/SVTRv2/rec_svtrv2_ch.yml`.
The final recognition results are garbled.
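As the first reply in this thread points out, the garbled output is most likely a dictionary mismatch at inference time: `tools/infer/predict_rec.py` falls back to `ppocr/utils/ppocr_keys_v1.txt` unless `--rec_char_dict_path` is given. A minimal sketch of the export-then-infer sequence with the custom dictionary (the `./inference/rec_svtrv2_ch` output directory is an assumption matching that reply):

```python
# Sketch: export the trained model, then run inference with the same custom
# dictionary used for training. "./inference/rec_svtrv2_ch" is an assumed
# output directory, consistent with the first reply in this thread.
import subprocess

dict_path = "E:/BaiduNetdiskDownload/DataSet/Chinese_dataset/labels_.txt"

# Export the trained checkpoint to an inference model.
subprocess.run([
    "python", "tools/export_model.py",
    "-c", "configs/rec/SVTRv2/rec_svtrv2_ch.yml",
    "-o", "Global.save_inference_dir=./inference/rec_svtrv2_ch",
], check=True)

# Run recognition; without --rec_char_dict_path the default dictionary is used
# and the decoded text comes out garbled.
subprocess.run([
    "python", "tools/infer/predict_rec.py",
    "--image_dir=E:/BaiduNetdiskDownload/DataSet/Chinese_dataset/images/img_0000001.jpg",
    "--rec_model_dir=./inference/rec_svtrv2_ch",
    f"--rec_char_dict_path={dict_path}",
], check=True)
```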