A Java Spring Boot example showing how to use Tesseract to extract text from image files through Optical Character Recognition (OCR).
Requirements to run this code:
- JDK 17
- tesseract-ocr / tessdata: https://github.com/tesseract-ocr/tessdata
- Environment variable "TESSDATA_PREFIX" pointing to the tessdata folder
- gradle
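A missing or wrong "TESSDATA_PREFIX" is a common reason for OCR to fail at startup, so it can help to check the variable before running the application. The following is a minimal sketch and not part of this project: the class name `TessdataCheck` and the method `findLanguages` are hypothetical, and it only assumes that the tessdata folder contains files ending in `.traineddata`, as in the tessdata repository linked above.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of this project: checks that TESSDATA_PREFIX
// points to a folder that contains Tesseract trained data files.
public class TessdataCheck {

    // Returns the sorted names of all "*.traineddata" files in the folder,
    // with the extension removed (for example "eng" for "eng.traineddata").
    // Returns an empty list if the path is not an existing directory.
    static List<String> findLanguages(Path tessdataDir) throws IOException {
        List<String> languages = new ArrayList<>();
        if (!Files.isDirectory(tessdataDir)) {
            return languages;
        }
        String suffix = ".traineddata";
        try (DirectoryStream<Path> stream =
                 Files.newDirectoryStream(tessdataDir, "*" + suffix)) {
            for (Path file : stream) {
                String name = file.getFileName().toString();
                languages.add(name.substring(0, name.length() - suffix.length()));
            }
        }
        Collections.sort(languages);
        return languages;
    }

    public static void main(String[] args) throws IOException {
        String prefix = System.getenv("TESSDATA_PREFIX");
        if (prefix == null || prefix.isBlank()) {
            System.err.println("TESSDATA_PREFIX is not set.");
            return;
        }
        List<String> languages = findLanguages(Path.of(prefix));
        if (languages.isEmpty()) {
            System.err.println("No .traineddata files found in " + prefix);
        } else {
            System.out.println("Trained data found: " + languages);
        }
    }
}
```

With JDK 17 this single file can be run directly with `java TessdataCheck.java`, without a separate compile step.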
To start the application, run:
gradle.bat bootRun
Then open your browser at http://localhost:8080/
The page should look like this: