Bangla OCR Correction

Project aim:

Correct noisy OCR output for Bangla-language text.

Workflow:

Create a Bangla corpus
Pre-process the corpus, splitting articles into lines
Convert the text lines to images
Run OCR on these images
Calculate the Word Error Rate (WER) of the OCR output against the ground-truth text lines
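
The final workflow step can be illustrated with a minimal sketch. It assumes WER is defined in the usual way, as the word-level edit distance between the OCR output and the ground-truth line divided by the number of ground-truth words. The function names `edit_distance` and `word_error_rate` are illustrative and are not taken from the repository.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (here, lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of reference words."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hyp_words) / len(ref_words)


if __name__ == "__main__":
    # One substituted word out of three ground-truth words.
    print(word_error_rate("আমি ভাত খাই", "আমি ভাব খাই"))
```

Splitting on whitespace is an assumption here; a different tokenization would give a different WER for the same pair of lines.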

Project tree

Preparing Text Dataset

No ready-made Bangla text corpus was available.
Wikipedia articles were scraped to prepare the corpus.
Find the scraped corpus here
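
As an illustration of turning scraped articles into text lines, here is a minimal sketch. It assumes that lines are split at the Bangla danda (।), question marks, exclamation marks, and newlines, and that very short fragments are discarded. The name `article_to_lines` and the `min_words` threshold are hypothetical and may not match what the Preprocessing folder actually does.

```python
import re
import unicodedata

# Split after a sentence terminator followed by whitespace, or at line breaks.
SENTENCE_END = re.compile(r"(?<=[।?!])\s+|\n+")


def article_to_lines(article, min_words=3):
    """Split one article into cleaned text lines (assumed rules, see above)."""
    text = unicodedata.normalize("NFC", article)
    lines = []
    for piece in SENTENCE_END.split(text):
        line = " ".join(piece.split())  # collapse runs of whitespace
        if len(line.split()) >= min_words:
            lines.append(line)
    return lines


if __name__ == "__main__":
    sample = "বাংলা একটি ইন্দো-আর্য ভাষা। এটি বাংলাদেশের রাষ্ট্রভাষা।\n\nছোট লাইন"
    for line in article_to_lines(sample):
        print(line)
```

Dropping short fragments keeps headings and stray captions out of the ground truth, at the cost of losing some genuine short sentences.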

Folders and files

OCR-Evaluation
OCR
Preprocessing
Text2Images-Pillow
README.md

Conversion Text2Images

Using pytesseract to convert images2text
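
A minimal sketch of this step is shown below. It assumes the Tesseract binary and its Bengali language data (`ben`) are installed, and that each image holds a single text line, which is why page segmentation mode 7 is used. The helper names `clean_ocr_text` and `ocr_line` are hypothetical.

```python
def clean_ocr_text(raw):
    """Collapse whitespace and drop the trailing form feed Tesseract may emit."""
    return " ".join(raw.split())


def ocr_line(image_path, lang="ben", psm=7):
    """Run Tesseract on one line image and return the cleaned text."""
    # Imported here so clean_ocr_text can be used without the OCR stack installed.
    import pytesseract

    raw = pytesseract.image_to_string(
        image_path,            # pytesseract accepts a file path or a PIL image
        lang=lang,             # "ben" is Tesseract's code for Bengali
        config=f"--psm {psm}", # 7 = treat the image as a single text line
    )
    return clean_ocr_text(raw)


if __name__ == "__main__":
    # Hypothetical file name produced by the text-to-image step.
    print(ocr_line("line_0001.png"))
```

Normalizing whitespace before scoring avoids counting layout differences as word errors.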

Spell Checker for correcting noisy output for OCR
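
One simple way to approach this is a dictionary-based corrector: build a word-frequency vocabulary from the clean corpus, then replace each unknown OCR word with the closest vocabulary word by edit distance. The sketch below is an assumption about how such a checker could look and is not the repository's implementation. All function names are hypothetical, and distance is measured over Unicode code points, which only approximates visual similarity in Bangla script.

```python
from collections import Counter


def edit_distance(a, b):
    """Levenshtein distance between two strings, counted in code points."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def build_vocabulary(lines):
    """Count word frequencies in the clean corpus lines."""
    return Counter(word for line in lines for word in line.split())


def correct_word(word, vocab, max_distance=2):
    """Return the nearest vocabulary word, or the word itself if none is close."""
    if word in vocab:
        return word
    best, best_key = word, None
    for candidate, freq in vocab.items():
        if abs(len(candidate) - len(word)) > max_distance:
            continue  # length gap alone already exceeds the limit
        distance = edit_distance(word, candidate)
        if distance <= max_distance:
            key = (distance, -freq)  # prefer closer, then more frequent words
            if best_key is None or key < best_key:
                best, best_key = candidate, key
    return best


def correct_line(line, vocab, max_distance=2):
    """Correct every word of an OCR output line independently."""
    return " ".join(correct_word(w, vocab, max_distance) for w in line.split())
```

For example, with a vocabulary built from the lines "আমি ভাত খাই" and "আমি বই কিনি", `correct_line("আমি ভাব খাই", vocab)` maps the unknown word "ভাব" to "ভাত", which is one substitution away.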

Relevant Link to blogs/repositories

umairanis03/Noise-Correction-OCR-Bangla
