This repo holds the PyTorch code and models for Self-supervised Video Hashing via Bidirectional Transformers, presented at CVPR 2021.
We build our model upon BERT.
Self-supervised Video Hashing via Bidirectional Transformers
Shuyan Li, Xiu Li, Jiwen Lu, Jie Zhou
PyTorch 0.4.1
The VGG features are kindly provided by the authors of SSVH. You can download them from Baidu Yun:
FCV: https://pan.baidu.com/s/1i65ccHv and YFCC: https://pan.baidu.com/s/1bqR8VCF
Please set data_root and home_root in ./utils/args.py, then place the downloaded features in data_root.
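If you are unsure what to change, below is a minimal sketch of how these two paths might be exposed in ./utils/args.py. The option names data_root and home_root come from this README; the argparse layout and the placeholder defaults are assumptions, not the repo's actual file.

```python
# Illustrative sketch of the two path options in ./utils/args.py.
# The real file may use different defaults and carry many more options.
import argparse

parser = argparse.ArgumentParser(description='BTH options (sketch)')
parser.add_argument('--data_root', type=str, default='/path/to/features',
                    help='directory holding the downloaded VGG feature files')
parser.add_argument('--home_root', type=str, default='/path/to/BTH',
                    help='repository root; ./data, ./models and ./results live under it')
args = parser.parse_args()
```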
The following data should be prepared before training. Some files for FCVID are already provided, and the generation scripts live in ./utils (a quick verification sketch follows the list):
- Latent features. We have uploaded them in ./data/latent_feats.h5. You can also generate this file yourself: first train the BTH model with only mask_loss, then use the save_nf function in eval.py.
- Anchor set. We have uploaded it in ./data/anchors.h5. You can also generate this file by running get_anchors.py.
- Pseudo labels. We have uploaded them in ./data/train_assit.h5. You can also generate this file by running prepare.py.
- Similarity matrix. Run apro_adj.py to generate sim_matrix.h5 and save it in data_root. We do not upload this file because it is very large.
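To sanity-check the prepared files before training, the sketch below simply lists the datasets stored in each .h5 file. The file names come from this README; data_root is a placeholder for the path you set in ./utils/args.py.

```python
# List the contents of each prepared .h5 file (verification sketch).
import os
import h5py

data_root = '/path/to/data_root'  # same value as data_root in ./utils/args.py

for path in ['./data/latent_feats.h5',
             './data/anchors.h5',
             './data/train_assit.h5',
             os.path.join(data_root, 'sim_matrix.h5')]:
    with h5py.File(path, 'r') as f:
        print(path)
        for key in f.keys():
            item = f[key]
            # top-level entries are usually datasets; groups have no shape
            print(' ', key, item.shape if hasattr(item, 'shape') else '(group)')
```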
After setting the paths correctly, run train.py to train the model. Models will be saved in ./models.
When training is done, run eval.py to test the model. mAP files will be saved in ./results.
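For reference, mAP over Hamming-ranked retrieval is conventionally computed as in the generic sketch below. This is not the exact code in eval.py, and the label-based relevance criterion is an assumption about how ground truth is defined.

```python
# Generic mean Average Precision over a Hamming-distance ranking (sketch).
import numpy as np

def mean_average_precision(query_codes, db_codes, query_labels, db_labels, topk=None):
    """query_codes/db_codes: binary codes in {0, 1}, shape (n, bits);
    query_labels/db_labels: integer class labels, shape (n,)."""
    aps = []
    for q, ql in zip(query_codes, query_labels):
        dist = np.count_nonzero(q != db_codes, axis=1)  # Hamming distance to every database item
        order = np.argsort(dist)                        # rank database by increasing distance
        if topk is not None:
            order = order[:topk]
        rel = (db_labels[order] == ql).astype(np.float64)  # 1 where the retrieved item is relevant
        if rel.sum() == 0:
            continue  # no relevant item retrieved for this query
        precision = np.cumsum(rel) / np.arange(1, len(rel) + 1)
        aps.append(float((precision * rel).sum() / rel.sum()))
    return float(np.mean(aps)) if aps else 0.0
```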
We have provided a model trained on FCVID for testing: ./models/fcv_bits_64/9288.pth.
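To inspect that checkpoint, the sketch below assumes the .pth file stores a state_dict (the common PyTorch convention); if the repo instead saves the full model object, torch.load returns that object directly.

```python
# Inspect the released FCVID checkpoint (sketch; assumes a state_dict inside).
import torch

ckpt = torch.load('./models/fcv_bits_64/9288.pth', map_location='cpu')

if isinstance(ckpt, dict):
    # a state_dict or a wrapper dict: list each stored entry
    for name, value in ckpt.items():
        info = tuple(value.shape) if torch.is_tensor(value) else type(value).__name__
        print(name, info)
else:
    # a pickled full model object
    print(ckpt)
```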
Please cite the following paper if you find BTH useful in your research:
@inproceedings{BTH2021CVPR,
  author    = {Shuyan Li and Xiu Li and Jiwen Lu and Jie Zhou},
  title     = {Self-supervised Video Hashing via Bidirectional Transformers},
  booktitle = {CVPR},
  year      = {2021},
}
For any questions, please file an issue or contact Lily by email: [email protected]