This repository contains the second project I did at IIITH as a research fellow: a search engine built in collaboration with NDLI (IITKGP) and the British Library.

iriyagupta/cvitsearch-se

OCR Search Engine

Introduction

This is the source code for the OCR (Optical Character Recognition) Search Engine, an attempt to build an Information Retrieval and Extraction (IRE) system that replicates current state-of-the-art methods using IRE and basic Natural Language Processing (NLP) techniques. The project demonstrates our study of the methods used for search and retrieval tasks. We also briefly describe the functionalities our system supports, along with statistics of the dataset. The text for the OCR Search Engine is generated with Indic-OCR, developed at CVIT.

Collaborators

  • Developed at: Centre for Visual Information Technology (CVIT)

Thanks to these organisations for providing the data:

  • National Digital Library of India, IIT Kharagpur
  • IIIT Hyderabad
  • British Library, UK

Fork this repo

Fork this repo by clicking the Fork button at the top of the repository page, which creates a copy in your GitHub account.

Clone this repository

After forking, open a terminal and clone your copy of the repository with the following git command, replacing username with your GitHub username:

git clone https://github.com/username/cvitsearch-se.git

Link to the project website details (temporary)

Link: http://preon.iiit.ac.in:3000/ [temporarily public]

Requirements

These packages are required to run the code in this repository:

* python3.x
* django
* numpy
* scipy
* psql
* elasticsearch-dsl
* elasticsearch
* jinja

You can find all the requirements in requirements.txt. To install them, create a python3.x virtual environment and run:

pip install -r requirements.txt

This will install the packages in the environment.
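
To confirm that the installation worked, a short check like the one below can help. It is only a sketch and not part of the repository: the function name `missing_requirements` is hypothetical, and the import names are assumptions based on the package list above (for instance, jinja is usually imported as `jinja2`, and `psql` is looked up as a command-line executable rather than a Python module).

```python
import importlib.util
import shutil

# Assumed import names for the packages listed above; the entries in
# requirements.txt may differ (for example, jinja is imported as jinja2).
REQUIRED_MODULES = [
    "django",
    "numpy",
    "scipy",
    "elasticsearch_dsl",
    "elasticsearch",
    "jinja2",
]


def missing_requirements(modules=REQUIRED_MODULES, executables=("psql",)):
    """Return the names of modules and executables that cannot be found."""
    # find_spec returns None when a top-level module is not installed.
    missing = [name for name in modules if importlib.util.find_spec(name) is None]
    # shutil.which returns None when the executable is not on PATH.
    missing += [exe for exe in executables if shutil.which(exe) is None]
    return missing


if __name__ == "__main__":
    missing = missing_requirements()
    print("Missing:", ", ".join(missing) if missing else "nothing")
```

Run it inside the activated virtual environment. Anything it lists as missing can be installed again with pip or, in the case of psql, through your system's PostgreSQL packages.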
