Translation use case? #12060
Tejaswgupta
started this conversation in
General
Replies: 3 comments
-
Please refer to the tutorials: https://github.com/PaddlePaddle/PaddleOCR/blob/main/README_en.md#-tutorials
-
@GreatV I've gone through the docs. The only relevant thing was PP-Structure, but that's overkill and would require extra work to pull out just the components we need for our use case.
-
@Tejaswgupta Sorry, as far as I know, PaddleOCR doesn't do direct paragraph-level detection and recognition.
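One possible workaround, given that only line-level results are available, is to group the recognized lines into paragraphs by vertical spacing. The snippet below is only a rough sketch under stated assumptions: each line has already been reduced to an axis-aligned rectangle `(x0, y0, x1, y1, text)`, the page is single-column, and a paragraph break shows up as a vertical gap larger than some fraction of the typical line height. The function name `group_lines_into_paragraphs` and the `gap_ratio` parameter are made up for illustration and are not part of PaddleOCR.

```python
from statistics import median


def group_lines_into_paragraphs(lines, gap_ratio=0.6):
    """Group line-level OCR results into paragraph strings.

    lines: list of (x0, y0, x1, y1, text) tuples, one per recognized line
           (hypothetical format; adapt it to whatever your OCR step returns).
    gap_ratio: a vertical gap larger than gap_ratio * median line height
               starts a new paragraph (assumed heuristic, tune per document).
    """
    if not lines:
        return []

    # Read top to bottom, then left to right (single-column assumption).
    ordered = sorted(lines, key=lambda line: (line[1], line[0]))

    # Use the median line height as the reference scale for gaps.
    line_height = median(line[3] - line[1] for line in ordered)

    paragraphs = [[ordered[0]]]
    for prev, cur in zip(ordered, ordered[1:]):
        gap = cur[1] - prev[3]  # top of current line minus bottom of previous
        if gap > gap_ratio * line_height:
            paragraphs.append([cur])    # large gap: start a new paragraph
        else:
            paragraphs[-1].append(cur)  # small gap: same paragraph

    return [" ".join(line[4] for line in para) for para in paragraphs]
```

Each returned string could then be passed to the translation model as one unit. Multi-column layouts, tables, or indentation-only paragraph breaks would need something more robust than this heuristic.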
-
We've fine-tuned a transformer-based translation engine that works on whole paragraphs to ensure contextual translation. We had been using Tesseract/hOCR for paragraph-level extraction, but the hOCR library we used is now obsolete. PP-OCR seems like a promising solution, but I couldn't find any resources on paragraph-level extraction.
Can someone shed some light on this? Thanks!