Skip to content

Latest commit

 

History

History
43 lines (24 loc) · 2.73 KB

Readme.md

File metadata and controls

43 lines (24 loc) · 2.73 KB

CloudPhy OCR Pipeline

This is a Python script for Optical Character Recognition (OCR) of images that are captured by a camera. The script first imports the required libraries, installs the necessary packages, and mounts the Google Drive. It then defines the functions for segmenting, cropping, classifying, object detection, and OCR tasks.

Importing Libraries

The script starts by importing the following libraries:

  • tesseract-ocr and pytesseract for OCR.
  • PIL, matplotlib, cv2, and numpy for image processing.
  • os and google.colab to mount the Google Drive.
  • roboflow for image segmentation.

Segmenting Images

The segment function is used for image segmentation, which is done using Roboflow. The function first imports the required libraries, including base64, cv2, matplotlib, and numpy. It then reads the input image, resizes it, and passes it to the Roboflow model for segmentation. The function then extracts the segmentation mask from the resulting JSON and converts it to a NumPy array. Finally, it applies the mask to the original image and returns the segmented image.

Segmentation

Cropping Images

The cropping function is used to crop the segmented image. The function takes as input the segmented image and the points for the region of interest (ROI). It first imports cv2 and matplotlib, and then performs perspective transformation to warp the image and extract the ROI. The function returns the cropped image.

Classifying Images

The classify function is used for image classification. The function first imports tensorflow, numpy, and keras.utils. It then loads the pre-trained model for image classification from the Google Drive, preprocesses the input image, and passes it to the model for prediction. The function returns the class label.

Object Detection

The inference_obj function is used for object detection. The function first imports the required libraries, including glob, matplotlib, cv2, requests, and numpy. It then performs object detection on the input images using the pre-trained YOLOv5 model. The function returns the directory where the inference results are stored.

Object Detection

OCR Task

The ocr_task function is used for OCR. The function takes as input an array of cropped images, reads each image, applies Gaussian blur and sharpening filters, and passes it to the OCR engine for text recognition. The function then returns the OCR result for each image.

Cropping

Google Drive Integration

The script mounts the Google Drive at the beginning using the drive.mount() function to store the pre-trained models and other necessary files.

Results