LATIS-DocumentAI-Team/ocr-microservice

OCR Microservice

This microservice standardizes the usage of Optical Character Recognition (OCR) engines, providing a unified interface to access multiple OCR engines. It currently supports three main OCR engines:

  • PaddleOCR: A deep learning-based OCR engine capable of handling various tasks such as text detection, recognition, and structure analysis.
  • Tesseract: An open-source OCR engine that provides accurate text recognition from images.
  • EasyOCR: Another deep learning-based OCR engine known for its simplicity and ease of use.

The microservice returns OCR output as a standardized Document, following the structure defined by DocumentAI-std. An example response is shown below:

{
  "ocr_result": {
    "filename": "0c5d743d-d936-40ae-9642-c9db27c6155c.png",
    "elements": [
      {
        "x": 48,
        "y": 45,
        "w": 47,
        "h": 20,
        "content_type": 1,
        "content": "STE"
      },
      {
        "x": 104,
        "y": 47,
        "w": 97,
        "h": 20,
        "content_type": 1,
        "content": "SIDMAC"
      }
    ]
  },
  "code": 200,
  "message": "success"
}
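
As a rough illustration of how a client might consume this response, the sketch below parses the example JSON into a small Python dataclass. The names `Element` and `parse_ocr_response` are hypothetical and are not part of the service or of DocumentAI-std. Reading `x`, `y` as the top-left corner and `w`, `h` as the width and height is an assumption based on the field names.

```python
import json
from dataclasses import dataclass


@dataclass
class Element:
    """One recognized element (hypothetical client-side type)."""
    x: int
    y: int
    w: int
    h: int
    content_type: int
    content: str

    @property
    def box(self) -> tuple:
        # Assumption: (x, y) is the top-left corner and (w, h) the size,
        # so the box is returned as (left, top, right, bottom).
        return (self.x, self.y, self.x + self.w, self.y + self.h)


def parse_ocr_response(payload: str) -> list:
    """Parse a response body shaped like the example above."""
    data = json.loads(payload)
    if data.get("code") != 200:
        raise ValueError(f"OCR request failed: {data.get('message')}")
    # Element(**item) raises TypeError if the service adds unknown fields.
    return [Element(**item) for item in data["ocr_result"]["elements"]]


SAMPLE = """
{
  "ocr_result": {
    "filename": "0c5d743d-d936-40ae-9642-c9db27c6155c.png",
    "elements": [
      {"x": 48, "y": 45, "w": 47, "h": 20, "content_type": 1, "content": "STE"},
      {"x": 104, "y": 47, "w": 97, "h": 20, "content_type": 1, "content": "SIDMAC"}
    ]
  },
  "code": 200,
  "message": "success"
}
"""

elements = parse_ocr_response(SAMPLE)
for element in elements:
    print(element.content, element.box)
```

Keeping the parsing in one function makes it easy to swap in the real DocumentAI-std types later, if the client already depends on that package.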

Built with

  • Python 3.11: The microservice is developed using Python 3.11, providing a robust and efficient runtime environment.
  • FastAPI: Used to build the RESTful API endpoints, offering high performance and easy-to-use tools for API development.

Usage

Locally:

  1. Download the repository:
git clone https://github.com/LATIS-DocumentAI-Group/ocr-microservice.git
cd ocr-microservice
  2. Install the requirements:
pip install -r requirements.txt
  3. Run the main file:
python main.py

Using Docker

  1. Pull the Docker image:
docker pull hamzagbada18/ocr-microservice:latest
  2. Run the Docker container:
docker run -p 8000:8000 --name ocr-api hamzagbada18/ocr-microservice:latest
  3. Access the OpenAPI documentation, which is available at localhost:8000/docs once the container is running.
  4. Access the service through the REST API:
  • POST /applyOcr/: Apply OCR

Params:

| Name | Description |
|------|-------------|
| ocr_method | The OCR engine to apply: paddle (PaddleOCR), tesseract (Tesseract), or easy (EasyOCR). |
| languages | List of languages to use. Supported values are fr (French) and en (English). Note: PaddleOCR accepts only one language. |

  5. Example usage with curl:
curl -X 'POST' \
  'http://localhost:8000/applyOcr/?ocr_method=tesseract&languages=en' \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -F '[email protected];type=image/png'
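
The same request can be sketched in Python using only the standard library. This is an illustration under assumptions, not the project's client. The helper name `build_ocr_request` is hypothetical. The multipart field name is a required argument because the curl example above does not show it legibly, and the `"file"` value used below is only a placeholder (check localhost:8000/docs for the real name). Sending several languages as repeated `languages` query parameters is a guess based on how FastAPI usually handles list parameters.

```python
import mimetypes
import urllib.parse
import urllib.request
import uuid

OCR_METHODS = {"paddle", "tesseract", "easy"}  # values listed under Params
LANGUAGES = {"fr", "en"}


def build_ocr_request(base_url, field_name, filename, image_bytes,
                      ocr_method, languages):
    """Build (but do not send) a POST /applyOcr/ request."""
    if ocr_method not in OCR_METHODS:
        raise ValueError(f"unknown ocr_method: {ocr_method!r}")
    languages = list(languages)
    if not languages or any(lang not in LANGUAGES for lang in languages):
        raise ValueError(f"languages must be drawn from {sorted(LANGUAGES)}")
    if ocr_method == "paddle" and len(languages) != 1:
        raise ValueError("Paddle OCR accepts only one language")

    # Assumption: list values are sent as repeated query parameters.
    query = urllib.parse.urlencode(
        {"ocr_method": ocr_method, "languages": languages}, doseq=True
    )
    url = f"{base_url.rstrip('/')}/applyOcr/?{query}"

    # Encode a single-file multipart/form-data body by hand.
    boundary = uuid.uuid4().hex
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode()
    body = head + image_bytes + f"\r\n--{boundary}--\r\n".encode()

    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "accept": "application/json",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


# "file" is a placeholder field name and b"..." stands in for real image bytes.
request = build_ocr_request(
    "http://localhost:8000", "file", "page.png", b"...", "tesseract", ["en"]
)
print(request.get_method(), request.full_url)

# Sending the request requires a running service:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode())
```

Validating the parameters before building the request surfaces the single-language restriction for Paddle OCR on the client side, rather than relying on whatever error the service returns.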