Improving OCR recognition #8

madmalkav · 2021-10-23T11:06:46Z

Are there any options I can tweak to try to improve recognition? For example, this text:

is read as:

「おしゝ ! 邊つかったか ? .

If I remove the furigana from the selection (not very convenient for multiline text), I get:

「おいしい ! 上幅つかったか ? 」

madmalkav · 2021-10-24T11:20:02Z

I have been testing alternatives to Tesseract, and EasyOCR seems to recognize text much better, although it garbles the output format a little when furigana is present (see JaidedAI/EasyOCR#575).

I have barely any coding experience, but I'm looking into forking the project to add support for backends other than Tesseract. I will report back if I manage to do anything useful.

kamui-fin · 2022-02-03T01:30:25Z

Improving the OCR accuracy is definitely an ongoing goal of this project. Including alternative backends does sound interesting; however, it seems like EasyOCR only supports Python. I think a better option would be to focus development efforts on fine-tuning Tesseract to recognize text better, along with some extra text processing. One of the first steps would be to implement a text-processing stage that replaces commonly misrecognized characters with the expected ones, similar to how Kaku does it. Another thing to look into is further training the models on fonts that are commonly misrecognized. I'm open to any contributions or ideas, so feel free to share your findings.
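To make the text-processing idea concrete, here is a minimal sketch of a replacement stage, written in Python for brevity rather than in the project's own language. The function name `postprocess` and every entry in the `CORRECTIONS` table are hypothetical illustrations; they are not taken from gazou or Kaku, and a real table would have to be built from observed OCR mistakes.

```python
# Hypothetical correction table: misrecognized text -> intended text.
# The entries are illustrative assumptions, not data from gazou or Kaku.
CORRECTIONS = {
    "|": "",     # stray vertical bars that OCR sometimes inserts
    " !": "！",  # ASCII punctuation with a leading space -> full-width form
    " ?": "？",
}


def postprocess(text, corrections=CORRECTIONS):
    """Apply every replacement in the table to the OCR output."""
    # Longer patterns go first so a short pattern cannot break up a longer one.
    for wrong in sorted(corrections, key=len, reverse=True):
        text = text.replace(wrong, corrections[wrong])
    return text


if __name__ == "__main__":
    print(postprocess("完全ではない|けど"))
```

A plain lookup table like this is easy to extend and audit, but it has no notion of context, so any entry that could also be correct text (for example a real vertical bar) would need a more careful rule.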

wildwestrom · 2022-04-06T00:37:19Z

So here's one problem I found while OCRing Steins;Gate.
As you can see, this is the image that comes out of processing. Kurisu's lab coat is visible within the image, which messes up the OCR.

Output text: 「しかも、完全ではないけけど、タイムトラへし老殿玖さぜてるってこと|になるわね、これ」盆「・世でアー

When I change the Otsu Score Fraction to anything greater than or equal to 0.1, this problem is nearly eliminated.
Here's what it looks like at 0.1.

Output text: 「しかも、完全ではない|けど、タイムトラベルを成功させてるってこと|こなるわね、これ」るを

Here's the relevant line of code:
https://github.com/kamui-fin/gazou/blob/master/src/ocr.cpp#L32
Raw image for reference:

Let me know your thoughts on this, what I should test this setting on, etc.
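For readers unfamiliar with the setting, the following is a rough pure-Python sketch of how a score fraction can modify Otsu thresholding. It assumes the parameter behaves the way Leptonica describes its `scorefract` argument (pick the lowest histogram bin among thresholds whose Otsu score is within that fraction of the maximum); whether the linked line in ocr.cpp uses it in exactly this way is an assumption, and `otsu_split` is a hypothetical name.

```python
def otsu_split(hist, score_fract=0.0):
    """Pick a threshold bin from a grayscale histogram.

    Bins 0..t form one class and bins t+1.. the other. The score is the
    (unnormalized) between-class variance used by Otsu's method.
    """
    n = len(hist)
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    scores = [0.0] * n
    w0 = 0
    sum0 = 0.0
    for t in range(n - 1):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, leave the score at 0
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        scores[t] = w0 * w1 * (mean0 - mean1) ** 2

    best = max(range(n), key=scores.__getitem__)
    min_score = (1.0 - score_fract) * scores[best]

    # Grow a window around the best score while scores stay close enough.
    lo = best
    while lo > 0 and scores[lo - 1] >= min_score:
        lo -= 1
    hi = best
    while hi < n - 1 and scores[hi + 1] >= min_score:
        hi += 1

    # Inside the window, prefer the emptiest bin (the valley between peaks);
    # ties go to the bin closest to the best score.
    return min(range(lo, hi + 1), key=lambda t: (hist[t], abs(t - best)))


if __name__ == "__main__":
    print(otsu_split([8, 4, 1, 0, 2, 8], score_fract=0.1))
```

Under this reading, a larger fraction widens the window and lets the threshold settle in a histogram valley instead of on the shoulder of a peak, which would be consistent with the lab coat dropping out at 0.1. That explanation is a guess, not something verified against the project.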

Yuri-K7 · 2024-01-08T11:18:35Z

I have a similar issue: a specific game background makes the text completely unreadable:

Changing the same Otsu score fraction to 0.7 removes most of the background, but there are a lot of errors. Then changing the USM fract to 1.5 makes it almost perfect (there's still one error):

Output: 僕は黙々とシャープペンシルの先を走らせ、青い野
線が刻まれた真新しい大学ノートに、ひとつの円を描
きだした。いつも描く馴染みのあるあの円だ。

そして僕ば、ねっとりとした夢の中へ落ちていく。

I don't know how well this applies to other content, or whether these are even the best values for this image.
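As background on the second setting, the sketch below shows what an unsharp-mask fraction typically controls, using a single row of pixels. It assumes the common formulation `sharpened = original + fract * (original - blurred)` with a simple box blur; gazou's actual image pipeline, its blur, and its edge handling may differ, and `unsharp_mask_row` is a hypothetical helper.

```python
def unsharp_mask_row(row, halfwidth=1, fract=1.5):
    """Sharpen one row of 8-bit grayscale pixels with an unsharp mask."""
    n = len(row)
    out = []
    for i, value in enumerate(row):
        # Box blur over the neighbourhood, truncated at the row edges.
        lo = max(0, i - halfwidth)
        hi = min(n, i + halfwidth + 1)
        blurred = sum(row[lo:hi]) / (hi - lo)
        # Add back a fraction of the detail removed by the blur.
        sharpened = value + fract * (value - blurred)
        # Clamp to the valid 8-bit range.
        out.append(max(0, min(255, round(sharpened))))
    return out


if __name__ == "__main__":
    # A soft dark-to-light edge becomes a harder one.
    print(unsharp_mask_row([50, 50, 50, 200, 200, 200], fract=1.5))
```

Flat regions are unchanged, while pixels next to an edge are pushed further apart, so a higher fraction increases the contrast between glyph strokes and their surroundings before thresholding. It can also amplify noise, which is one reason a value that works for one image may not transfer to another.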

kamui-fin added the "enhancement" (New feature or request) label on Mar 15, 2023
