The models were trained with Python 3.6, PyTorch 1.10.1, torchvision 0.11.2, CUDA 10.1, opencv-python 4.5.1, and self-attention-cv 1.2.3 on Ubuntu 18.04.
- [Optional but recommended] Create a new conda environment:
conda create -n HOLT-Net python=3.6
Then activate the environment:
conda activate HOLT-Net
- Clone this repository:
git clone https://github.com/JackKoLing/HOLT-Net.git
- Install the necessary packages (install any other common packages as needed):
pip install torch==1.10.1 torchvision==0.11.2 opencv-python==4.5.1 self-attention-cv==1.2.3 tqdm numpy tensorboard tensorboardX pyyaml
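After installation, it can be useful to confirm that the environment matches the versions listed above. The following is a minimal sketch, not part of the repository: the helper names `find_mismatches` and `installed_versions` are hypothetical, and the prefix-matching rule (so that a build such as `1.10.1+cu102` counts as `1.10.1`) is an assumption about what "matching" should mean here.

```python
# Versions listed in this README. Prefix matching is an assumption of this sketch.
EXPECTED = {
    "torch": "1.10.1",
    "torchvision": "0.11.2",
    "opencv-python": "4.5.1",
    "self-attention-cv": "1.2.3",
}


def _release(version_string):
    """Split '1.10.1+cu102' into ['1', '10', '1'], dropping any local build tag."""
    return version_string.split("+")[0].split(".")


def find_mismatches(expected, installed):
    """Return {package: (expected, found)} for packages that differ.

    A found value of None means the package is not installed.
    """
    mismatches = {}
    for name, want in expected.items():
        have = installed.get(name)
        want_parts = _release(want)
        # Compare component by component so '1.10.10' does not pass as '1.10.1'.
        if have is None or _release(have)[:len(want_parts)] != want_parts:
            mismatches[name] = (want, have)
    return mismatches


def installed_versions(names):
    """Look up installed versions; packages that are not installed map to None."""
    found = {}
    try:
        from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
        for name in names:
            try:
                found[name] = version(name)
            except PackageNotFoundError:
                found[name] = None
    except ImportError:
        import pkg_resources  # fallback for Python 3.6/3.7 (ships with setuptools)
        for name in names:
            try:
                found[name] = pkg_resources.get_distribution(name).version
            except pkg_resources.DistributionNotFound:
                found[name] = None
    return found


if __name__ == "__main__":
    problems = find_mismatches(EXPECTED, installed_versions(EXPECTED))
    for name, (want, have) in problems.items():
        print(f"{name}: expected {want}, found {have or 'not installed'}")
    if not problems:
        print("All listed packages match.")
```

Run it inside the activated HOLT-Net environment; any line it prints names a package whose version differs from the one listed above.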
Download the SCAU-SD dataset from Google cloud or Baidu cloud. We converted the annotations of the SCAU-SD dataset to JSON format, following no_frills_hoi_det.
Following DIRV, we count the number of training samples in each category and store the counts in smoker-det_hoi_count.json and smoker-det_verb_count.json. These counts are used as per-category weights when calculating the loss.
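To illustrate how per-category sample counts can become loss weights, here is a minimal sketch using inverse-frequency weighting. This is an assumption for illustration only: the formula used by this repository (and by DIRV) may differ, the function names are hypothetical, and the sketch assumes each count file is a JSON object mapping a category identifier to an integer count.

```python
import json


def load_counts(path):
    """Load a count file, ASSUMING it is a JSON object of {category_id: count}."""
    with open(path, "r") as handle:
        return {key: int(value) for key, value in json.load(handle).items()}


def inverse_frequency_weights(counts, smoothing=1.0):
    """Turn sample counts into weights that favour rare categories.

    Each raw weight is total / (count + smoothing); the weights are then
    rescaled so their mean is 1, which keeps the overall loss scale stable.
    The smoothing term avoids division by zero for empty categories.
    """
    total = sum(counts.values())
    raw = {key: total / (count + smoothing) for key, count in counts.items()}
    mean = sum(raw.values()) / len(raw)
    return {key: value / mean for key, value in raw.items()}


if __name__ == "__main__":
    # Toy counts, not taken from the dataset.
    example = {"smoke_cigarette": 900, "no_interaction": 100}
    for key, weight in inverse_frequency_weights(example).items():
        print(f"{key}: {weight:.3f}")
```

Under this scheme a category with few training samples receives a larger weight than a frequent one, which is the usual motivation for count-based loss weighting.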
Make sure to put the files in the following structure:
|-- datasets
| |-- smoker_det
| | |-- images
| | | |-- trainval
| | | |-- test
| | |-- annotations
| | | |-- anno_list.json
| | | |-- hoi_list.json
| | | |-- object_list.json
| | | |-- smoker-det_hoi_count.json
| | | |-- smoker-det_verb_count.json
| | | |-- verb_list.json
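The layout above can be checked before training with a short script. This is a sketch rather than a tool shipped with the repository: the function name `missing_entries` is hypothetical, and the root path `datasets/smoker_det` is taken from the tree shown above.

```python
from pathlib import Path

# Entries expected under datasets/smoker_det, copied from the tree above.
EXPECTED_ENTRIES = [
    "images/trainval",
    "images/test",
    "annotations/anno_list.json",
    "annotations/hoi_list.json",
    "annotations/object_list.json",
    "annotations/smoker-det_hoi_count.json",
    "annotations/smoker-det_verb_count.json",
    "annotations/verb_list.json",
]


def missing_entries(root):
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in EXPECTED_ENTRIES if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_entries("datasets/smoker_det")
    if missing:
        print("Missing entries:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("Dataset layout looks complete.")
```

Run it from the repository root so that the relative path `datasets/smoker_det` resolves correctly.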
Download the pre-trained models, which are trained on EfficientDet, and the post-refinement model, which is trained on YOLOX, from Google cloud or Baidu cloud. Make sure to put them in the weights/ folder.
python train_smoker.py -c 1 --batch_size 8 --optim adamw --load_weights weights/efficientdet-d1_pretrained.pth
You may also adjust the saving directory and, if you have multiple GPUs, the number of GPUs in projects/smoker_det.yaml.
python test_smoker.py -c 1 -w $path to the checkpoint$
cd eval
python get_test_pred_yolox.py
python eval_smker.py
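The `-w` argument of test_smoker.py expects the path to a checkpoint. If several checkpoints have been saved, a small helper can pick the most recent one. This is a sketch under assumptions: the function name `latest_checkpoint` is hypothetical, checkpoints are assumed to be `.pth` files (like the pre-trained weights), and the directory to search is whatever saving directory is configured in projects/smoker_det.yaml.

```python
import sys
from pathlib import Path


def latest_checkpoint(directory, pattern="*.pth"):
    """Return the most recently modified file matching pattern, or None.

    ASSUMPTION: checkpoints are stored directly in `directory` as .pth files.
    """
    candidates = sorted(
        Path(directory).glob(pattern),
        key=lambda path: path.stat().st_mtime,
    )
    return candidates[-1] if candidates else None


if __name__ == "__main__":
    # Pass the saving directory as the first argument; defaults to the current one.
    search_dir = sys.argv[1] if len(sys.argv) > 1 else "."
    checkpoint = latest_checkpoint(search_dir)
    if checkpoint is None:
        print(f"No checkpoint found in {search_dir}")
    else:
        print(checkpoint)
```

The printed path can then be passed to `python test_smoker.py -c 1 -w` in place of the checkpoint placeholder.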
The code is built on the architectures of EfficientDet, DIRV, and YOLOX. We sincerely thank the authors for their excellent work!