What steps will reproduce the problem?
Trying to set a character whitelist for Tesseract with code like the following:
ocr = tesseract.TessBaseAPI()
ocr.SetVariable("tessedit_char_whitelist", "0123456789;")
ocr.SetPageSegMode(tesseract.PSM_AUTO)
ocr.Init("C:\\Program Files (x86)\\Tesseract-OCR\\","eng",tesseract.OEM_DEFAULT)
What is the expected output? What do you see instead?
The intended output is for only the characters "0123456789;" to be recognized
when using the image_to_string() function. With code like the above,
image_to_string() ignores the whitelist and returns whatever characters it finds.
What version of the product are you using? On what operating system?
pytesseract-0.1, Python 2.7, Windows 8.1
Please provide any additional information below.
I've been trying everything people use for Tesseract-OCR, but none of it works
with pytesseract. I haven't been able to find any solution or method for
whitelisting with the image_to_string() function anywhere. Such a method would
be immensely helpful in improving the accuracy of the function.
Thanks in advance for any help on the matter.
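For the whitelisting request above, here is a minimal sketch of one possible approach. It is not a confirmed fix for pytesseract-0.1. It assumes a pytesseract version whose image_to_string() accepts a config string that is passed through to the tesseract command line, and a Tesseract build that honors "-c name=value" there. The helper names (whitelist_config, keep_allowed, read_digits) are hypothetical and exist only for illustration. Note that the TessBaseAPI calls in the report belong to a different binding (the tesseract module), so variables set there would not be expected to affect pytesseract, which runs the tesseract executable.

```python
def whitelist_config(allowed_chars):
    """Build a command-line config string that sets a character whitelist.

    Assumption: the installed Tesseract accepts "-c name=value" on its CLI.
    """
    return "-c tessedit_char_whitelist=" + allowed_chars


def keep_allowed(text, allowed_chars):
    """Fallback filter: drop characters outside the whitelist, keep whitespace."""
    return "".join(ch for ch in text if ch in allowed_chars or ch.isspace())


def read_digits(image_path, allowed_chars="0123456789;"):
    """Run OCR with the whitelist, then filter the result as a safety net."""
    # Imported here so the helpers above work without the OCR stack installed.
    from PIL import Image
    import pytesseract

    # Assumption: this pytesseract version supports the config argument.
    text = pytesseract.image_to_string(
        Image.open(image_path), config=whitelist_config(allowed_chars)
    )
    return keep_allowed(text, allowed_chars)
```

Calling read_digits("scan.png") (a placeholder file name) would then return only digits, semicolons, and whitespace. The post-filter cannot correct misread characters; it only removes anything outside the whitelist if the engine ignores the setting.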
Original issue reported on code.google.com by [email protected] on 9 Jun 2015 at 6:58