- Python 3.10
- Tesseract binary should be installed first.
-
Install Python 3.10
-
Install Tesseract binary
-
Create venv for this project
python -m venv venv
-
Install packages using requirements.txt
- Activate virtual python environment
call venv/Scripts/activate.bat
- Install packages
pip install -r requirements.txt
-
Create pdfs, images and result directories
Yup, you are ready to go.
-
Copy pdf files into pdfs directory
-
Run the script
python main.py
-
Enter as required during the process
-
Output will be in result directory