OCR Search Engine

Introduction

This repository contains the source code for an OCR (Optical Character Recognition) Search Engine, an attempt to build an Information Retrieval and Extraction (IRE) system that replicates current state-of-the-art methods using IRE and basic Natural Language Processing (NLP) techniques. The project demonstrates the methods used for search and retrieval tasks. We also give short descriptions of the functionality our system supports, along with statistics of the dataset. We use Indic-OCR, developed at CVIT, to generate the text for the OCR Search Engine.

![Watch the video]

Collaborators

Developed at: Centre for Visual Information Technology (CVIT)

Thanks to these organisations for providing the data:

* National Digital Library of India, IIT Kharagpur
* IIIT Hyderabad
* British Library, UK

Fork this repo

Fork this repo by clicking the Fork button at the top of the repository page, which will create a copy in your GitHub account.

Clone this repository

After forking, open a terminal and clone your fork with the following git command, replacing username with your GitHub username:

git clone https://github.com/username/cvitsearch-se.git

Link to the project website details (temporary)

Link: http://preon.iiit.ac.in:3000/ (temporarily public)

Requirements

The following packages are required to run this repository:

* python3.x
* django
* numpy
* scipy
* psql
* elasticsearch-dsl
* elasticsearch
* jinja

All the requirements are listed in requirements.txt. To install them, create and activate a python3.x virtual environment and run:

pip install -r requirements.txt

This will install the packages in the environment.
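
After installing, it can be useful to confirm that the listed packages are importable from the active environment. The following is a minimal sketch, not part of the repository: the helper name `missing_modules` is hypothetical, and the import names are assumptions derived from the package list above (an installable package name and its importable module name can differ, for example elasticsearch-dsl versus `elasticsearch_dsl`). psql is omitted because it is PostgreSQL's command-line client rather than a Python module.

```python
import importlib.util

# Assumed import names for the packages listed under Requirements.
# These are guesses based on the package names, not taken from requirements.txt.
REQUIRED_MODULES = [
    "django",
    "numpy",
    "scipy",
    "elasticsearch",
    "elasticsearch_dsl",
    "jinja2",
]


def missing_modules(names):
    """Return the module names that cannot be located in this environment."""
    missing = []
    for name in names:
        try:
            # find_spec returns None when a top-level module is not installed.
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules are importable.")
```

Run the script with the virtual environment activated; any module it reports as missing can then be installed with pip.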
