Output images, clean up other output formats #281

Workflow file for this run

name: Integration test with benchmark
on: [push]
env:
  TORCH_DEVICE: "cpu"
  OCR_ENGINE: "surya"
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python 3.11
        uses: actions/setup-python@v4
        with:
          python-version: 3.11
      - name: Install python dependencies
        run: |
          pip install poetry
          poetry install
          poetry remove torch
          poetry run pip install torch --index-url https://download.pytorch.org/whl/cpu
      - name: Download benchmark data
        run: |
          wget -O benchmark_data.zip "https://drive.google.com/uc?export=download&id=1NHrdYatR1rtqs2gPVfdvO0BAvocH8CJi"
          unzip -o benchmark_data.zip
      - name: Run benchmark test
        run: |
          poetry run python benchmarks/overall.py benchmark_data/pdfs benchmark_data/references report.json
          poetry run python scripts/verify_benchmark_scores.py report.json --type marker