overlapping boxes displayed for Devanagari script #27

Open
Shreeshrii opened this issue Aug 25, 2016 · 3 comments

Comments

Shreeshrii commented Aug 25, 2016

Please see the attached image: multiple boxes are displayed for Devanagari script even though there is only one entry in the box file.

[attached image]

Shreeshrii commented Aug 26, 2016

Sample box/TIFF pairs for Sanskrit (language code san, Devanagari script) can be found at
https://github.com/Shreeshrii/imagessan/tree/master/san95-box-tiff

The corresponding traineddata is at https://github.com/Shreeshrii/imagessan/blob/master/tessdata/san95.traineddata

You could use them to test tesseract4java's performance with complex scripts.

Thanks!

@Shreeshrii

I am wondering whether the overlap is caused by multi-page TIFFs and their corresponding box files.
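
To illustrate what I mean, here is a rough Python sketch, not tesseract4java's actual code. A Tesseract box file line has six fields: the symbol, then left, bottom, right, top, and a page number. If an editor ignored the page field, boxes from every page would be drawn on the same image and could overlap. The function names and the sample lines below are made up for illustration, and whether tesseract4java really ignores the page field is only my guess.

```python
from collections import defaultdict


def parse_box_line(line):
    """Parse one box-file line: '<symbol> <left> <bottom> <right> <top> <page>'."""
    # Split from the right so the symbol may hold several code points.
    parts = line.rstrip("\n").rsplit(" ", 5)
    if len(parts) != 6:
        raise ValueError(f"malformed box line: {line!r}")
    symbol = parts[0]
    left, bottom, right, top, page = (int(p) for p in parts[1:])
    return symbol, (left, bottom, right, top), page


def boxes_by_page(lines):
    """Group (symbol, rect) pairs by the page index in the last field."""
    pages = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        symbol, rect, page = parse_box_line(line)
        pages[page].append((symbol, rect))
    return dict(pages)


def rects_overlap(a, b):
    """True if two (left, bottom, right, top) rectangles intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


# Hypothetical sample: two boxes on page 0, one box on page 1.
sample = [
    "श्री 10 20 40 60 0",
    "रा 50 20 70 60 0",
    "म 12 22 38 58 1",
]
pages = boxes_by_page(sample)
```

In this sample the box on page 1 sits on top of the first box of page 0 once the page index is dropped, which is the kind of overlap I suspect. Drawing only `pages[n]` for the page currently shown would avoid it.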

@Shreeshrii

https://sourceforge.net/projects/vietocr/files/jTessBoxEditor/ has an implementation of an editor for multi-page TIFFs.

pvorb removed the todo label Sep 14, 2016