Noise removal for grayscale images #289

Open
zuphilip opened this issue Dec 29, 2017 · 1 comment

@zuphilip
Collaborator

Page segmentation includes a noise-removal step, but that step is skipped for grayscale line images (option --gray):

https://github.com/tmbdev/ocropy/blob/8cfce574dd0d3a3ad653494f604ed57d1c775241/ocropus-gpageseg#L444-L451

I think the function remove_noise can also accept grayscale images, but it will always output a cleaned binary image. How can we use that result to apply the same cleaning to the grayscale image?
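One possible way to carry the binary cleaning over to the grayscale image, as a rough sketch rather than ocropy code: find the small connected components in a thresholded copy, then overwrite only those pixels in the grayscale image with a flat background value. The function names, the `background` parameter, and the polarity (floats in [0, 1], bright ink on a dark background) are assumptions made for illustration.

```python
import numpy as np
from scipy import ndimage


def small_component_mask(gray, threshold=0.5, minsize=8):
    """Boolean mask of foreground blobs that have fewer than `minsize` pixels.

    Assumes `gray` is a 2-D float array in [0, 1] with bright ink.
    """
    binary = gray > threshold
    labels, n = ndimage.label(binary)  # 4-connected components by default
    # Pixel count per label; index 0 is the background.
    sizes = np.bincount(labels.ravel(), minlength=n + 1)
    small = sizes < minsize
    small[0] = False  # never treat the background as noise
    return small[labels]


def clean_gray(gray, threshold=0.5, minsize=8, background=0.0):
    """Return a copy of `gray` with small blobs painted over by `background`."""
    cleaned = gray.copy()
    cleaned[small_component_mask(gray, threshold, minsize)] = background
    return cleaned
```

Pixels outside the small blobs keep their original gray values, so only the specks change. Faint specks below the threshold are never detected, which is a limitation of reusing the binary mask.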

@zuphilip changed the title from "Noise removal is not done for grayscale images" to "Noise removal for grayscale images" on Dec 29, 2017
@mittagessen

remove_noise handles grayscale images by binarizing them at 0.5 and then removing every connected component smaller than 8 pixels. An unevenly lit image, or even just lightly colored print, will be unusable after that process.
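To illustrate the failure mode, here is a minimal sketch of the behavior described above, assuming a fixed 0.5 threshold and float images in [0, 1] with bright ink. It is not the ocropy implementation, and `binarize_and_despeckle` is a hypothetical name.

```python
import numpy as np
from scipy import ndimage


def binarize_and_despeckle(image, threshold=0.5, minsize=8):
    """Threshold the image, then drop connected components below `minsize` pixels."""
    binary = image > threshold
    labels, n = ndimage.label(binary)
    # Pixel count per label; index 0 is the background.
    sizes = np.bincount(labels.ravel(), minlength=n + 1)
    keep = sizes >= minsize
    keep[0] = False  # the background is never part of the output
    return keep[labels]


# A well-inked 8x8 glyph plus a single-pixel speck.
strong = np.zeros((12, 12))
strong[2:10, 2:10] = 0.9
strong[0, 11] = 0.9

# The same glyph in faint ink: every pixel falls below the threshold.
faint = np.zeros((12, 12))
faint[2:10, 2:10] = 0.4

strong_out = binarize_and_despeckle(strong)  # glyph kept, speck removed
faint_out = binarize_and_despeckle(faint)    # glyph lost entirely
```

The faint glyph disappears at the thresholding step, before component sizes are even considered, which is why light print does not survive this kind of cleaning.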

A quick look at the literature turns up many grayscale despeckling algorithms (mainly for ultrasound and SAR imagery) that might be more useful, although they are probably computationally expensive. Also, speckles seem to be mostly binarization artifacts, so I'm unsure whether cleaning grayscale images would improve accuracy.
