readbot

A delightful Python OCR module

This is a simple module for adding quick and easy OCR processing to your Python program. In addition to wrapping an OCR engine (currently only Tesseract), it handles file input quite liberally: valid input may be a string containing a file's path, a file object, or even a URL to a file on the web. As an added bonus, if you need to perform OCR on a PDF, this module will use GhostScript to convert the file to a PNG for Tesseract.
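
To illustrate what "liberal" file input can look like, here is a minimal sketch of normalizing the three input kinds to a local path. It is not readbot's actual code: the helper name `to_local_path` and the temp-file strategy are assumptions made for this example.

```python
import os
import tempfile
import urllib.request


def to_local_path(source):
    """Hypothetical helper: return a local file path for a path string,
    an http(s) URL, or a readable file object."""
    if isinstance(source, str):
        if source.startswith(("http://", "https://")):
            # Remote file: download it into a temporary file.
            suffix = os.path.splitext(source.split("?")[0])[1]
            with urllib.request.urlopen(source) as response:
                data = response.read()
            return _write_temp(data, suffix)
        # Plain string: treat it as a path that is already local.
        return source

    if hasattr(source, "read"):
        # File object: copy its contents into a temporary file.
        data = source.read()
        if isinstance(data, str):
            data = data.encode("utf-8")
        name = getattr(source, "name", "")
        if not isinstance(name, str):
            name = ""  # e.g. objects opened from a file descriptor
        return _write_temp(data, os.path.splitext(name)[1])

    raise TypeError("expected a path, URL, or file object")


def _write_temp(data, suffix):
    """Write bytes to a named temporary file and return its path."""
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmp:
        tmp.write(data)
    return tmp.name
```

The caller is responsible for deleting any temporary file this sketch creates.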

This module is intended for very basic OCR use and is in no way comprehensive. Right now, it is less than 100 LOC and simply calls a Tesseract subprocess.
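
For a sense of what calling Tesseract as a subprocess involves, here is a rough sketch under stated assumptions. The function names are hypothetical and this is not readbot's source. It assumes the `tesseract` and `gs` executables are on your PATH, that your Tesseract build accepts `stdout` as the output base, and that the PDF has a single page (with a fixed output file name, GhostScript overwrites the PNG for each page).

```python
import subprocess


def ghostscript_command(pdf_path, png_path, dpi=300):
    """Build a GhostScript command that renders a PDF to a 24-bit PNG."""
    return [
        "gs", "-q", "-dSAFER", "-dBATCH", "-dNOPAUSE",
        "-sDEVICE=png16m",
        "-r{}".format(dpi),
        "-sOutputFile=" + png_path,
        pdf_path,
    ]


def tesseract_command(image_path):
    """Build a Tesseract command that prints recognized text to stdout."""
    return ["tesseract", image_path, "stdout"]


def ocr(path):
    """Hypothetical entry point: convert PDFs first, then run Tesseract."""
    if path.lower().endswith(".pdf"):
        png_path = path[:-4] + ".png"
        subprocess.check_call(ghostscript_command(path, png_path))
        path = png_path
    output = subprocess.check_output(tesseract_command(path))
    return output.decode("utf-8")
```

Keeping the command builders separate from `ocr` makes the argument lists easy to inspect without either tool installed.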

Installation

It's on PyPI!

$ pip install readbot

Tesseract

You need Tesseract to use this module.

(mac)

$ brew install tesseract

(linux)

$ sudo apt-get install tesseract-ocr

Usage

from readbot import ReadBot

rb = ReadBot()

print(rb.interpret('/path/to/file/Hello_World.png'))
