-
Paper: Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
-
Official project: whai362/PVT
-
Model code: pvt.py
-
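Validation set preprocessing:

```python
# Image backend: pil
# Input image size: 224x224
# T is assumed to be the transforms module (for example paddle.vision.transforms).
transforms = T.Compose([
    T.Resize(248, interpolation='bicubic'),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])
```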
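To make the validation preprocessing concrete, the sketch below works through its geometry and arithmetic in plain Python. It assumes that `Resize(248)` scales the shorter side to 248 while keeping the aspect ratio, and that `CenterCrop(224)` takes the central window; the helper names (`resized_size`, `center_crop_box`, `normalize_pixel`) are hypothetical and not part of any library.

```python
# Sketch of the arithmetic behind the validation transforms.
# Assumption: Resize(248) scales the shorter side to 248 and keeps the aspect
# ratio; CenterCrop(224) then takes the central 224x224 window.
# All helper names here are hypothetical, for illustration only.

MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def resized_size(width, height, short_side=248):
    """Return (width, height) after scaling the shorter side to short_side."""
    if width <= height:
        return short_side, int(short_side * height / width)
    return int(short_side * width / height), short_side


def center_crop_box(width, height, crop=224):
    """Return (left, top, right, bottom) of the central crop x crop window."""
    left = int(round((width - crop) / 2.0))
    top = int(round((height - crop) / 2.0))
    return left, top, left + crop, top + crop


def normalize_pixel(rgb):
    """Map one 8-bit RGB pixel to the normalized values the model receives."""
    # ToTensor scales 0..255 to 0..1; Normalize then applies (x - mean) / std.
    return tuple((v / 255.0 - m) / s for v, m, s in zip(rgb, MEAN, STD))


# Example: a 500x375 (width x height) image.
w, h = resized_size(500, 375)
box = center_crop_box(w, h)
```

With these assumptions, the crop keeps 224/248 (about 90%) of the shorter side, which is a common evaluation setting for 224x224 ImageNet models.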
-
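Model details:

| Model | Model Name | Params (M) | FLOPs (G) | Top-1 (%) | Top-5 (%) | Pretrained Model |
| --- | --- | --- | --- | --- | --- | --- |
| PVT-Tiny | pvt_ti | 13.2 | 1.9 | 74.96 | 92.47 | Download |
| PVT-Small | pvt_s | 24.5 | 3.8 | 79.87 | 95.05 | Download |
| PVT-Medium | pvt_m | 44.2 | 6.7 | 81.48 | 95.75 | Download |
| PVT-Large | pvt_l | 61.4 | 9.8 | 81.74 | 95.87 | Download |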
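When choosing a variant, the figures in the model table can be treated as a small lookup. The sketch below copies the reported numbers into a dictionary and picks the variant with the highest Top-1 accuracy within a FLOPs budget; `PVT_VARIANTS` and `best_under_budget` are hypothetical names used only for illustration.

```python
# Figures copied from the model table; the dictionary layout and the helper
# below are hypothetical, not part of the model code.
PVT_VARIANTS = {
    "pvt_ti": {"params_m": 13.2, "flops_g": 1.9, "top1": 74.96, "top5": 92.47},
    "pvt_s": {"params_m": 24.5, "flops_g": 3.8, "top1": 79.87, "top5": 95.05},
    "pvt_m": {"params_m": 44.2, "flops_g": 6.7, "top1": 81.48, "top5": 95.75},
    "pvt_l": {"params_m": 61.4, "flops_g": 9.8, "top1": 81.74, "top5": 95.87},
}


def best_under_budget(max_flops_g):
    """Return the model name with the best Top-1 within the FLOPs budget."""
    candidates = {
        name: stats
        for name, stats in PVT_VARIANTS.items()
        if stats["flops_g"] <= max_flops_g
    }
    if not candidates:
        return None  # no variant fits the budget
    return max(candidates, key=lambda name: candidates[name]["top1"])
```

For instance, a budget of 4.0 GFLOPs admits `pvt_ti` and `pvt_s`, and the helper returns the more accurate of the two.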
-
Citation:

```bibtex
@misc{wang2021pyramid,
  title={Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions},
  author={Wenhai Wang and Enze Xie and Xiang Li and Deng-Ping Fan and Kaitao Song and Ding Liang and Tong Lu and Ping Luo and Ling Shao},
  year={2021},
  eprint={2102.12122},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```