- (2022/04/22) Update PAN++ ICDAR 2015 joint training & post-processing with vocabulary & visualization code.
- (2021/11/03) A Paddle implementation of PAN is available; see Paddle-PANet. Thanks @simplify23.
- (2021/04/08) PSENet and PAN are included in MMOCR.
This repository contains the official implementations of PSENet, PAN, PAN++, and FAST [coming soon].
Text Detection
- PSENet (CVPR'2019)
- PAN (ICCV'2019)
- FAST (arXiv'2021) [coming soon]
Text Spotting
- PAN++ (TPAMI'2021)
First, clone the repository locally:
git clone https://github.com/whai362/pan_pp.pytorch.git
Then, install PyTorch 1.1.0+, torchvision 0.3.0+, and other requirements:
conda install pytorch torchvision -c pytorch
pip install -r requirement.txt
Finally, compile the post-processing code:
# build pse and pa algorithms
sh ./compile.sh
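The version requirements listed above (PyTorch 1.1.0+, torchvision 0.3.0+) can be checked before compiling. The following is a minimal sketch and not part of this repository: the helper names `parse_version` and `meets_minimum` are hypothetical, and it assumes version strings follow the usual `major.minor.patch` form, optionally with a local suffix such as `+cu117`.

```python
import importlib
import re

# Minimum versions stated in the installation instructions.
MINIMUMS = {"torch": "1.1.0", "torchvision": "0.3.0"}


def parse_version(version):
    """Turn '1.13.1+cu117' into (1, 13, 1); local and pre-release suffixes are ignored."""
    core = version.split("+")[0]
    parts = []
    for piece in core.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare versions numerically, so that '0.10.0' counts as newer than '0.3.0'."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    for name, minimum in MINIMUMS.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            print(f"{name}: not installed (need >= {minimum})")
            continue
        status = "ok" if meets_minimum(module.__version__, minimum) else "too old"
        print(f"{name} {module.__version__}: {status} (need >= {minimum})")
```

Comparing parsed tuples rather than raw strings matters here, because a plain string comparison would rank `0.10.0` below `0.3.0`.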
Please refer to dataset/README.md for dataset preparation.
To train a model, run:
CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py ${CONFIG_FILE}
For example:
CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py config/pan/pan_r18_ic15.py
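The command above selects GPUs through the `CUDA_VISIBLE_DEVICES` environment variable. As a rough sketch, the same invocation could be assembled from Python. The helper `build_train_command` is hypothetical and not part of this repository; it assumes only that `train.py` takes the config path as its single positional argument, as shown above.

```python
import os
import subprocess
import sys


def build_train_command(config_file, gpu_ids):
    """Return (argv, env) equivalent to:
    CUDA_VISIBLE_DEVICES=<ids> python train.py <config_file>
    """
    env = dict(os.environ)
    # CUDA reads this variable as a comma-separated list of device indices.
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    argv = [sys.executable, "train.py", config_file]
    return argv, env


if __name__ == "__main__":
    argv, env = build_train_command("config/pan/pan_r18_ic15.py", [0, 1, 2, 3])
    print("CUDA_VISIBLE_DEVICES=" + env["CUDA_VISIBLE_DEVICES"], " ".join(argv))
    # Uncomment to actually start training from the repository root:
    # subprocess.run(argv, env=env, check=True)
```

Passing a shorter list such as `[0]` restricts the process to a single GPU.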
To evaluate a trained model, run inference and then the evaluation script for the dataset:
python test.py ${CONFIG_FILE} ${CHECKPOINT_FILE}
cd eval/
./eval_${DATASET}.sh
For example:
python test.py config/pan/pan_r18_ic15.py checkpoints/pan_r18_ic15/checkpoint.pth.tar
cd eval/
./eval_ic15.sh
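In the example above, the checkpoint directory shares its name with the config file (`pan_r18_ic15`). Assuming that naming pattern holds for other configs, which this README does not state explicitly, the two evaluation steps could be assembled as follows. The helpers `default_checkpoint` and `eval_commands` are hypothetical, and the dataset tag (here `ic15`) has to be supplied by the caller.

```python
from pathlib import PurePosixPath


def default_checkpoint(config_file):
    """Guess the checkpoint path from the config name (assumed convention)."""
    name = PurePosixPath(config_file).stem  # e.g. 'pan_r18_ic15'
    return str(PurePosixPath("checkpoints") / name / "checkpoint.pth.tar")


def eval_commands(config_file, dataset):
    """Return the shell commands for the two evaluation steps, in order."""
    checkpoint = default_checkpoint(config_file)
    return [
        f"python test.py {config_file} {checkpoint}",
        f"cd eval/ && ./eval_{dataset}.sh",
    ]


if __name__ == "__main__":
    for command in eval_commands("config/pan/pan_r18_ic15.py", "ic15"):
        print(command)
```

If a checkpoint was saved elsewhere, pass its path to `test.py` directly instead of relying on the guessed location.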
To measure inference speed, add the --report_speed flag:
python test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --report_speed
For example:
python test.py config/pan/pan_r18_ic15.py checkpoints/pan_r18_ic15/checkpoint.pth.tar --report_speed
To visualize the results, add the --vis flag:
python test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --vis
For example:
python test.py config/pan/pan_r18_ic15.py checkpoints/pan_r18_ic15/checkpoint.pth.tar --vis
Please cite the related works in your publications if they help your research:
@inproceedings{wang2019shape,
title={Shape Robust Text Detection with Progressive Scale Expansion Network},
author={Wang, Wenhai and Xie, Enze and Li, Xiang and Hou, Wenbo and Lu, Tong and Yu, Gang and Shao, Shuai},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
pages={9336--9345},
year={2019}
}
@inproceedings{wang2019efficient,
title={Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network},
author={Wang, Wenhai and Xie, Enze and Song, Xiaoge and Zang, Yuhang and Wang, Wenjia and Lu, Tong and Yu, Gang and Shen, Chunhua},
booktitle={Proceedings of the IEEE International Conference on Computer Vision},
pages={8440--8449},
year={2019}
}
@article{wang2021pan++,
title={PAN++: Towards Efficient and Accurate End-to-End Spotting of Arbitrarily-Shaped Text},
author={Wang, Wenhai and Xie, Enze and Li, Xiang and Liu, Xuebo and Liang, Ding and Yang, Zhibo and Lu, Tong and Shen, Chunhua},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
year={2021},
publisher={IEEE}
}
@misc{chen2021fast,
title={FAST: Searching for a Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation},
author={Zhe Chen and Wenhai Wang and Enze Xie and ZhiBo Yang and Tong Lu and Ping Luo},
year={2021},
eprint={2111.02394},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
This project is developed and maintained by IMAGINE Lab@National Key Laboratory for Novel Software Technology, Nanjing University.
This project is released under the Apache 2.0 license.