HHeracles/SEGR

SEGR: Semantic Enhancement and Graph Reasoning for Irregular Scene Text Recognition

Scene text recognition is an important area of visual understanding that requires cross-modal processing of visual and textual semantic information. Accurately recognizing irregular scene text, which suffers from problems such as low resolution, blur, deformation, and uneven illumination, remains a common challenge for existing scene text recognition methods. In this paper, we propose SEGR, a scene text recognition method based on text semantic enhancement and character graph reasoning, to improve the accuracy of irregular text recognition. SEGR consists of two branches: a visual recognition branch that performs preliminary recognition from visual features, and an iterative correction branch that corrects the preliminary result by mining semantic information and the relationships between characters. The iterative correction branch in turn consists of a transformer-based text semantic enhancement module and a relational reasoning module based on a character graph.

[Figure: SEGR framework]
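
The two-branch flow described above can be sketched in plain Python. This is only an illustration of the control flow, not the repository's implementation: the names `recognize`, `graph_reasoning_step`, `semantic_enhance`, `build_graph`, and the default `num_iters=3` are hypothetical, and the graph step is generic neighbour averaging standing in for the relational reasoning module, whose actual design is not described here.

```python
from typing import Callable, List, Sequence

Features = List[List[float]]  # one feature vector per character position


def graph_reasoning_step(nodes: Features,
                         adjacency: Sequence[Sequence[float]]) -> Features:
    """One round of weighted neighbour averaging over a character graph.

    Generic message passing, used only as a stand-in (assumption) for the
    relational reasoning module.
    """
    updated = []
    for i, row in enumerate(adjacency):
        total = sum(row)
        if total == 0:  # isolated node keeps its own features
            updated.append(list(nodes[i]))
            continue
        dim = len(nodes[i])
        updated.append([
            sum(w * nodes[j][d] for j, w in enumerate(row)) / total
            for d in range(dim)
        ])
    return updated


def recognize(image,
              visual_branch: Callable[[object], Features],
              semantic_enhance: Callable[[Features], Features],
              build_graph: Callable[[Features], List[List[float]]],
              num_iters: int = 3) -> Features:
    # Preliminary recognition from visual features alone.
    chars = visual_branch(image)
    # Iterative correction: semantic enhancement, then graph reasoning.
    for _ in range(num_iters):
        enhanced = semantic_enhance(chars)
        chars = graph_reasoning_step(enhanced, build_graph(enhanced))
    return chars


# Toy usage with stub branches (all hypothetical).
toy_visual = lambda img: [[1.0, 0.0], [0.0, 1.0]]
identity = lambda feats: feats
fully_connected = lambda feats: [[1.0] * len(feats) for _ in feats]
result = recognize(None, toy_visual, identity, fully_connected, num_iters=1)
```

With a fully connected graph and an identity enhancement stub, a single iteration averages the two character vectors, which makes it easy to see where each module plugs in.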

Requirements

pip install torch==1.7.1 torchvision==0.8.2 fastai==1.0.60 opencv-python tensorboardX lmdb pillow

Datasets

We use datasets in LMDB format for training and evaluation. The synthetic datasets MJSynth and SynthText and the text corpus WikiText are used for training, and three regular text datasets (IC13, SVT, IIIT) and three irregular text datasets (IC15, SVTP, CUTE) are used for evaluation.
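
To check that a downloaded LMDB dataset is readable, a single sample can be loaded as in the sketch below. It assumes the key layout commonly used by scene text recognition LMDB datasets (`num-samples`, `image-%09d`, `label-%09d`, with 1-based indices); this README does not confirm that layout, and the helper names are hypothetical.

```python
import io


def image_key(index: int) -> bytes:
    # Assumed key layout: "image-" followed by a 9-digit, 1-based index.
    return f"image-{index:09d}".encode()


def label_key(index: int) -> bytes:
    # Assumed key layout: "label-" followed by a 9-digit, 1-based index.
    return f"label-{index:09d}".encode()


def read_sample(lmdb_path: str, index: int):
    """Return (PIL image, label string) for one sample of an LMDB dataset."""
    # Imported here so the key helpers work without these packages.
    import lmdb
    from PIL import Image

    env = lmdb.open(lmdb_path, readonly=True, lock=False,
                    readahead=False, meminit=False)
    try:
        with env.begin(write=False) as txn:
            num_samples = int(txn.get(b"num-samples").decode())
            if not 1 <= index <= num_samples:
                raise IndexError(f"index {index} outside 1..{num_samples}")
            label = txn.get(label_key(index)).decode("utf-8")
            raw = txn.get(image_key(index))
            image = Image.open(io.BytesIO(raw)).convert("RGB")
    finally:
        env.close()
    return image, label
```

For example, `read_sample("path/to/dataset", 1)` would return the first image and its ground-truth text, where the path is a placeholder for wherever the dataset was extracted.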

Models

The pretrained SEGR model is available on BaiduNetdisk (password: ph33). Its recognition accuracy (%) on the evaluation datasets is shown in the following table:

Model IC13 SVT IIIT IC15 SVTP CUTE
SEGR 97.7 94.1 96.4 86.0 90.1 92.7
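
The README does not say how these scores are computed. A minimal sketch, assuming they are word-level accuracies in percent with case-insensitive comparison over alphanumeric characters (a common convention in scene text recognition benchmarks); the function names are hypothetical:

```python
import re
from typing import Sequence


def normalize(text: str) -> str:
    # Assumption: compare lowercase letters and digits only.
    return re.sub(r"[^0-9a-z]", "", text.lower())


def word_accuracy(predictions: Sequence[str], labels: Sequence[str]) -> float:
    """Percentage of words whose normalized prediction matches the label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    correct = sum(normalize(p) == normalize(g)
                  for p, g in zip(predictions, labels))
    return 100.0 * correct / len(labels)
```

A prediction counts as correct only if the whole word matches, so a single wrong character makes that sample a miss.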

Training

To train the model, run:

CUDA_VISIBLE_DEVICES=0,1 python main.py --config=configs/train_segr.yaml

Evaluation

To evaluate the model, run:

CUDA_VISIBLE_DEVICES=0,1 python main.py --config=configs/train_segr.yaml --phase test --image_only

Acknowledgements

This PyTorch implementation is based on ABINet.
