Very good overall performance, but this one fails? #44
Comments
@jbarth-ubhd weird, I have not seen any segmentation results like this coming from the tool. Can you attach the PAGE-XML as well plz? (cc @vahidrezanezhad)
I'll run it through a second time, just to be sure...
Yes, second try, same result. Complete OCR-D workflow results:
Thanks for providing the test data. I can also confirm this via Aletheia. The issue seems to be with the region segmentation: where regions are detected (see e.g. the marginalia on the left-hand side), the textline segmentation actually works ok-ish. We will have a look at what's wrong here! Btw, just in case you missed this recent announcement in the OCR-D Chat:
I'm looking forward to this! Thanks for inspecting.
Dear @jbarth-ubhd, I found some time to investigate this further and with the current version of
Using binarized image (regions)
I will also try again with our new (but still work-in-progress) segmentation tool, which prefers non-binarized images as input, and post the results here.
Here is the original image:
https://digi.ub.uni-heidelberg.de/diglitData/v/blaeu1655bd6_-_00_129.tif
Here is the image fed into sbb-textline (binarized etc.):
https://digi.ub.uni-heidelberg.de/diglitData/v/blaeu1655bd6_-_00_129-binarized.png
And here are the detected segments:
model used: