Bachelor thesis in Computer Science at the Universidade Estadual de Santa Cruz. API for data extraction from pre-formatted PDFs.

pedrecal/PDF_Extractor-API

PDF Extractor

This API was developed for my bachelor thesis in Computer Science at the Universidade Estadual de Santa Cruz.

The objective of this work is to extract data from pre-formatted PDF files.

The dissertation for this work can be found here (in Brazilian Portuguese).

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.

Prerequisites

Most importantly, you will need Node.js v12 or greater.

You will also need Yarn.

Then you will need a running instance of MongoDB; it can be mLab, MongoDB Atlas, or a local installation.

Finally, configure the variables in variables.env.example. Remember to remove the .example suffix from the file name.
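Before installing, it can help to confirm that the prerequisites are in place. The following is a minimal sketch, not part of the project: the helper names (nodeMajor, parseEnv, preflight) are hypothetical, and it assumes variables.env uses plain KEY=VALUE lines. It makes no assumption about which keys the file must contain.

```javascript
// Preflight sketch (hypothetical helpers, not shipped with this project).
// Assumes variables.env holds simple KEY=VALUE lines.
const fs = require('fs');

// Return the major version from a string such as "12.22.1" or "v12.22.1".
function nodeMajor(version) {
  return parseInt(version.replace(/^v/, '').split('.')[0], 10);
}

// Parse KEY=VALUE lines, skipping blank lines and lines starting with '#'.
// Everything after the first '=' is kept as the value.
function parseEnv(text) {
  const vars = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === '' || line.startsWith('#')) continue;
    const eq = line.indexOf('=');
    if (eq === -1) continue;
    vars[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return vars;
}

// Collect human-readable problems instead of throwing, so all of them
// can be reported at once.
function preflight(envPath) {
  const problems = [];
  if (nodeMajor(process.versions.node) < 12) {
    problems.push('Node.js v12 or greater is required');
  }
  if (!fs.existsSync(envPath)) {
    problems.push(envPath + ' not found: copy variables.env.example and remove ".example"');
  }
  return problems;
}
```

Calling preflight('variables.env') from the project root returns an empty array when both checks pass. Once the file exists, parseEnv(fs.readFileSync('variables.env', 'utf8')) can be used to inspect which variables are set.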

Installing

With all the prerequisites in place, install the dependencies:

yarn

And to have the API up and running:

yarn run dev

This starts the development server.

And coding style

The architectural structure of this project

Built With

  • Node.JS - Node.js® is a JavaScript runtime built on Chrome's V8 JavaScript engine.
  • PDF.JS - A general-purpose, web standards-based platform for parsing and rendering PDFs.
  • Express - Fast, unopinionated, minimalist web framework for Node.js

Contributing

Feel free to submit pull requests and contact me!

Versioning

We use SemVer for versioning.

Authors

  • Alexandre Pedrecal - Initial work - Pedrecal

See also the list of contributors who participated in this project.

License

This project is licensed under the MIT License.
