SAST

1. Introduction
2. Environment
3. Model Training / Evaluation / Prediction
4. Inference and Deployment
5. FAQ

1. Introduction

Paper:

A Single-Shot Arbitrarily-Shaped Text Detector based on Context Attended Multi-Task Learning
Wang, Pengfei and Zhang, Chengquan and Qi, Fei and Huang, Zuming and En, Mengyi and Han, Junyu and Liu, Jingtuo and Ding, Errui and Shi, Guangming
ACM MM, 2019

On the ICDAR2015 dataset, the text detection results are as follows:

Model	Backbone	Configuration	Precision	Recall	Hmean	Download
SAST	ResNet50_vd	configs/det/det_r50_vd_sast_icdar15.yml	91.39%	83.77%	87.42%	trained model

On the Total-Text dataset, the text detection results are as follows:

Model	Backbone	Configuration	Precision	Recall	Hmean	Download
SAST	ResNet50_vd	configs/det/det_r50_vd_sast_totaltext.yml	89.63%	78.44%	83.66%	trained model
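
The Hmean column in the tables above is the harmonic mean of precision and recall. As a minimal sketch (the function name `hmean` is illustrative and not part of PaddleOCR), the reported values can be approximately reproduced from the precision and recall columns. Because the inputs here are already rounded to two decimals, the recomputed value may differ slightly in the last digit from the figure in the table.

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when nothing is detected
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the two tables above.
icdar15 = hmean(0.9139, 0.8377)
totaltext = hmean(0.8963, 0.7844)

print(f"ICDAR2015:  {icdar15:.2%}")
print(f"Total-Text: {totaltext:.2%}")
```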

2. Environment

Please prepare your environment by following the "prepare the environment" and "clone the repo" guides.

3. Model Training / Evaluation / Prediction

Please refer to the text detection training tutorial. PaddleOCR's code is modular, so you only need to swap the configuration file to train a different detection model.
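
To illustrate what swapping the configuration file means in practice, here is a minimal sketch that builds a training command for each SAST configuration. The configuration paths come from the tables in the Introduction. The entry point `tools/train.py` and the helper name `build_train_command` are assumptions made for illustration; check the training tutorial for the exact command.

```python
# Configuration paths taken from the tables in the Introduction.
SAST_CONFIGS = {
    "icdar15": "configs/det/det_r50_vd_sast_icdar15.yml",
    "totaltext": "configs/det/det_r50_vd_sast_totaltext.yml",
}


def build_train_command(dataset, script="tools/train.py"):
    """Return the argument list for training SAST on the given dataset.

    `script` is an assumed entry point; only the config path changes
    between the two models.
    """
    return ["python3", script, "-c", SAST_CONFIGS[dataset]]


if __name__ == "__main__":
    for name in SAST_CONFIGS:
        print(" ".join(build_train_command(name)))
```

The returned list can be passed to `subprocess.run` from the repository root, or joined with spaces and pasted into a shell.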

4. Inference and Deployment

4.1 Python Inference

(1). Quadrangle text detection model (ICDAR2015)

First, convert the model saved during SAST text detection training into an inference model. Taking the model with the ResNet50_vd backbone trained on the ICDAR2015 English dataset as an example (model download link), you can convert it with the following command:

python3 tools/export_model.py -c configs/det/det_r50_vd_sast_icdar15.yml -o Global.pretrained_model=./det_r50_vd_sast_icdar15_v2.0_train/best_accuracy  Global.save_inference_dir=./inference/det_sast_ic15

For inference with the SAST quadrangle text detection model, set the parameter --det_algorithm="SAST" and run the following command:

python3 tools/infer/predict_det.py --det_algorithm="SAST" --image_dir="./doc/imgs_en/img_10.jpg" --det_model_dir="./inference/det_sast_ic15/"

The visualized text detection results are saved to the ./inference_results folder by default, and the name of the result file is prefixed with 'det_res'. Examples of results are as follows:

(2). Curved text detection model (Total-Text)

First, convert the model saved during SAST text detection training into an inference model. Taking the model with the ResNet50_vd backbone trained on the Total-Text English dataset as an example (model download link), you can convert it with the following command:

python3 tools/export_model.py -c configs/det/det_r50_vd_sast_totaltext.yml -o Global.pretrained_model=./det_r50_vd_sast_totaltext_v2.0_train/best_accuracy  Global.save_inference_dir=./inference/det_sast_tt

For inference with the SAST curved text detection model, set the parameters --det_algorithm="SAST" and --det_box_type='poly', then run the following command:

python3 tools/infer/predict_det.py --det_algorithm="SAST" --image_dir="./doc/imgs_en/img623.jpg" --det_model_dir="./inference/det_sast_tt/" --det_box_type='poly'

The visualized text detection results are saved to the ./inference_results folder by default, and the name of the result file is prefixed with 'det_res'. Examples of results are as follows:

Note: SAST post-processing uses locality-aware NMS, which has two implementations: Python and C++. The C++ version is significantly faster than the Python version. Because of a compilation version issue with the C++ NMS, the C++ version is called only in a Python 3.5 environment; the Python version is called in all other cases.
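
The version rule in the note can be sketched as a simple check on the interpreter version. This is an illustration only: the function name `select_nms_backend` and the return values "cpp" and "python" are hypothetical and do not reflect PaddleOCR's actual identifiers.

```python
import sys


def select_nms_backend(version_info=sys.version_info):
    """Pick the NMS implementation according to the rule in the note.

    Returns "cpp" only for Python 3.5 and "python" for every other version.
    """
    major, minor = version_info[0], version_info[1]
    return "cpp" if (major, minor) == (3, 5) else "python"


if __name__ == "__main__":
    # Shows which implementation the current interpreter would get.
    print(select_nms_backend())
```

On any interpreter other than Python 3.5, this check selects the slower Python implementation, so SAST post-processing time should be measured with that in mind.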

4.2 C++ Inference

Not supported

4.3 Serving

Not supported

4.4 More

Not supported

5. FAQ

Citation

@inproceedings{wang2019single,
  title={A Single-Shot Arbitrarily-Shaped Text Detector based on Context Attended Multi-Task Learning},
  author={Wang, Pengfei and Zhang, Chengquan and Qi, Fei and Huang, Zuming and En, Mengyi and Han, Junyu and Liu, Jingtuo and Ding, Errui and Shi, Guangming},
  booktitle={Proceedings of the 27th ACM International Conference on Multimedia},
  pages={1277--1285},
  year={2019}
}
