Update README.md
bozyurt authored Aug 27, 2020
1 parent 3579bac commit 2a31134
Showing 1 changed file with 11 additions and 2 deletions.
13 changes: 11 additions & 2 deletions README.md
@@ -1,11 +1,20 @@
# paperpdf2xml

A set of Python CLI tools to convert scientific papers in PDF format to XML documents with sections and tables.
A set of Python 3 CLI tools to convert scientific papers in PDF format to XML documents with sections and tables.

## Prerequisites

* Make sure you have the `pdftotext` utility installed for the initial PDF-to-text conversion
* Make sure you have the `pdftotext` utility installed for the initial PDF-to-text conversion.

For Ubuntu/Debian
```
sudo apt-get install poppler-utils
```

For Red Hat/RHEL/Fedora/CentOS Linux
```
sudo yum install poppler-utils
```

* Install `spacy` NLP library and models
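
A quick way to confirm both prerequisites before running the tools is a small check script. This is a minimal sketch, not part of paperpdf2xml: the function name `missing_prerequisites` is hypothetical, and it only checks that `pdftotext` is on the `PATH` and that the `spacy` package is importable. It does not check for spaCy models, since the README does not say which models are required.

```python
import importlib.util
import shutil


def missing_prerequisites(tools=("pdftotext",), modules=("spacy",)):
    """Return the names of required tools and modules that cannot be found.

    Hypothetical helper for illustration; not part of paperpdf2xml.
    """
    # shutil.which returns None when the executable is not on PATH.
    missing = [tool for tool in tools if shutil.which(tool) is None]
    # find_spec returns None when a top-level module is not installed.
    missing += [mod for mod in modules if importlib.util.find_spec(mod) is None]
    return missing


if __name__ == "__main__":
    missing = missing_prerequisites()
    if missing:
        print("Missing prerequisites: " + ", ".join(missing))
    else:
        print("All prerequisites found.")
```

If `pdftotext` is reported missing, installing `poppler-utils` as described above should provide it.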
