TrOCR: Transformer-based model for state-of-the-art optical character recognition (OCR) on both printed and handwritten text
TrOCR is an end-to-end text recognition approach that uses a pre-trained image transformer for image understanding and a pre-trained text transformer for wordpiece-level text generation.
This is based on the implementation of TrOCR found here. This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found here.
Sign up to start using Qualcomm AI Hub and run these models on a hosted Qualcomm® device.
Install the package via pip:
pip install "qai_hub_models[trocr]"
Once installed, run the following simple CLI demo:
python -m qai_hub_models.models.trocr.demo
More details on the CLI tool can be found with the --help option. See demo.py for sample usage of the model, including pre/post-processing scripts. Please refer to our general instructions on using models for more usage instructions.
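As a minimal sketch, the demo's --help output can also be requested from Python, with a check that the package is installed first. The helper names `demo_command` and `package_installed` are hypothetical and not part of qai_hub_models. Only the module path and the --help option come from the commands above.

```python
import importlib.util
import subprocess
import sys

# Module path taken from the CLI demo command above.
DEMO_MODULE = "qai_hub_models.models.trocr.demo"


def demo_command(extra_args=()):
    """Build the argv list for the demo, using the current interpreter.

    Hypothetical helper for illustration; not part of qai_hub_models.
    """
    return [sys.executable, "-m", DEMO_MODULE, *extra_args]


def package_installed(name="qai_hub_models"):
    """Return True if a top-level package with this name can be found."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if package_installed():
        # Equivalent to: python -m qai_hub_models.models.trocr.demo --help
        subprocess.run(demo_command(["--help"]), check=False)
    else:
        print('qai_hub_models not found; run: pip install "qai_hub_models[trocr]"')
```

Using `sys.executable` rather than a bare `python` keeps the call in the same environment where the package was installed.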
This repository contains export scripts that produce a model optimized for on-device deployment. This can be run as follows:
python -m qai_hub_models.models.trocr.export
Additional options are documented with the --help option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub.
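The export command above can be wrapped in a small script so that a failure, for example when access to Qualcomm® AI Hub is not configured, surfaces as a return code instead of going unnoticed. This is only a sketch. `run_export` and its `runner` parameter are hypothetical names introduced here, and no export options beyond --help are assumed.

```python
import subprocess
import sys

# Module path taken from the export command above.
EXPORT_MODULE = "qai_hub_models.models.trocr.export"


def run_export(extra_args=(), runner=subprocess.run):
    """Run the export module as a subprocess and return its exit code.

    Hypothetical wrapper for illustration. `runner` defaults to
    subprocess.run and can be swapped out for a stub in tests.
    """
    cmd = [sys.executable, "-m", EXPORT_MODULE, *extra_args]
    result = runner(cmd, check=False)
    return result.returncode


# Example usage (not executed here): list the available options.
# exit_code = run_export(["--help"])
# if exit_code != 0:
#     print(f"export exited with code {exit_code}")
```

Passing `check=False` leaves the decision about how to handle a non-zero exit code to the caller.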
- The license for the original implementation of TrOCR can be found here.
- The license for the compiled assets for on-device deployment can be found here.
- TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models
- Source Model Implementation
- Join our AI Hub Slack community to collaborate, post questions and learn more about on-device AI.
- For questions or feedback please reach out to us.