Method | Arch | Pretraining epochs | Pretraining mode | val | test | Pretrained | Finetuned |
---|---|---|---|---|---|---|---|
MAE | ViT-B/16 | 1600 | SSL | 38.3 | 37.0 | model | model |
MAE | ViT-B/16 | 1600 | SSL+Sup | 61.0 | 60.2 | model | model |
SERE | ViT-S/16 | 100 | SSL | 41.0 | 40.2 | model | model |
SERE | ViT-S/16 | 100 | SSL+Sup | 58.9 | 57.8 | model | model |
Command for MAE (SSL+Sup)
python -m torch.distributed.launch --nproc_per_node=8 main_segfinetune.py \
--accum_iter 1 \
--batch_size 32 \
--model vit_base_patch16 \
--finetune mae_finetuned_vit_base.pth \
--epochs 100 \
--nb_classes 920 \
--blr 1e-4 --layer_decay 0.40 \
--weight_decay 0.05 --drop_path 0.1 \
--data_path ${IMAGENETS_DIR} \
--output_dir ${OUTPATH} \
--dist_eval
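The `--blr` and `--layer_decay` flags are easier to read with numbers attached. The sketch below assumes that main_segfinetune.py follows the MAE finetuning convention: the absolute learning rate is blr × effective batch size / 256, and each layer's learning rate is scaled by layer_decay raised to its distance from the head. This is an assumption about the script, not something stated in this README, and the helper names are hypothetical.

```python
def effective_lr(blr, batch_size, accum_iter, world_size):
    """Assumed MAE-style scaling: lr = blr * effective_batch_size / 256."""
    eff_batch_size = batch_size * accum_iter * world_size
    return blr * eff_batch_size / 256


def layer_scales(layer_decay, num_blocks):
    """Assumed MAE-style layer-wise learning rate decay.

    Index 0 is the patch embedding, indices 1..num_blocks are the
    transformer blocks, and the last index is the head, which keeps
    the full learning rate (scale 1.0).
    """
    num_layers = num_blocks + 1
    return [layer_decay ** (num_layers - i) for i in range(num_layers + 1)]


if __name__ == "__main__":
    # Values taken from the command above: 8 GPUs, batch size 32 per GPU.
    lr = effective_lr(blr=1e-4, batch_size=32, accum_iter=1, world_size=8)
    scales = layer_scales(layer_decay=0.40, num_blocks=12)  # ViT-B has 12 blocks
    print(f"absolute lr: {lr:.2e}")
    print(f"scale for first block: {scales[1]:.3e}, for head: {scales[-1]:.1f}")
```

With 8 GPUs, a per-GPU batch size of 32, and no gradient accumulation, the effective batch size is 256, so under this assumption the absolute learning rate equals the value passed to `--blr`.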
Command for MAE (SSL)
python -m torch.distributed.launch --nproc_per_node=8 main_segfinetune.py \
--accum_iter 1 \
--batch_size 32 \
--model vit_base_patch16 \
--finetune mae_pretrain_vit_base.pth \
--epochs 100 \
--nb_classes 920 \
--blr 5e-4 --layer_decay 0.60 \
--weight_decay 0.05 --drop_path 0.1 \
--data_path ${IMAGENETS_DIR} \
--output_dir ${OUTPATH} \
--dist_eval
Command for SERE (SSL+Sup)
python -m torch.distributed.launch --nproc_per_node=8 main_segfinetune.py \
--accum_iter 1 \
--batch_size 32 \
--model vit_small_patch16 \
--finetune sere_finetuned_vit_small_ep100.pth \
--epochs 100 \
--nb_classes 920 \
--blr 5e-4 --layer_decay 0.50 \
--weight_decay 0.05 --drop_path 0.1 \
--data_path ${IMAGENETS_DIR} \
--output_dir ${OUTPATH} \
--dist_eval
Command for SERE (SSL)
python -m torch.distributed.launch --nproc_per_node=8 main_segfinetune.py \
--accum_iter 1 \
--batch_size 32 \
--model vit_small_patch16 \
--finetune sere_pretrained_vit_small_ep100.pth \
--epochs 100 \
--nb_classes 920 \
--blr 5e-4 --layer_decay 0.50 \
--weight_decay 0.05 --drop_path 0.1 \
--data_path ${IMAGENETS_DIR} \
--output_dir ${OUTPATH} \
--dist_eval
Method | Arch | Pretraining epochs | Pretraining mode | val | test | Pretrained | Finetuned |
---|---|---|---|---|---|---|---|
PASS | ResNet-50 D32 | 100 | SSL | 21.0 | 20.3 | model | model |
PASS | ResNet-50 D16 | 100 | SSL | 21.6 | 20.8 | model | model |
D16 means the output stride is 16, with dilation=2 in the last stage. This result is better than those reported in the paper, thanks to the new training scripts.
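To make the D32/D16 distinction concrete, the following sketch computes the output stride of a standard ResNet when the stride of the last stages is replaced by dilation. The function name is hypothetical and the code only illustrates the arithmetic; it is not taken from the training scripts.

```python
def resnet_output_stride(dilate_last_n_stages=0):
    """Return (output_stride, per-stage dilations) for a standard ResNet.

    The stem (stride-2 conv plus stride-2 max-pool) downsamples by 4.
    Stages 2-4 normally downsample by 2 each; a dilated stage keeps the
    resolution and doubles the dilation instead.
    """
    stride, dilation = 4, 1
    dilations = [1]        # stage 1 has stride 1 and dilation 1
    strided_stages = 3     # stages 2, 3, and 4
    for idx in range(strided_stages):
        if idx >= strided_stages - dilate_last_n_stages:
            dilation *= 2  # trade the stride for dilation
        else:
            stride *= 2
        dilations.append(dilation)
    return stride, dilations


if __name__ == "__main__":
    for label, n in (("D32", 0), ("D16", 1)):
        stride, dilations = resnet_output_stride(n)
        print(f"{label}: output stride {stride}, stage dilations {dilations}")
```

In torchvision, the comparable switch is `replace_stride_with_dilation=[False, False, True]` on `resnet50`; whether `resnet50_d16` is built that way is not stated here.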
Command for SSL (ResNet-50 D32)
python -m torch.distributed.launch --nproc_per_node=8 main_segfinetune.py \
--accum_iter 1 \
--batch_size 32 \
--model resnet50 \
--finetune pass919_pretrained.pth.tar \
--epochs 100 \
--nb_classes 920 \
--blr 5e-4 --layer_decay 0.4 \
--weight_decay 0.0005 \
--data_path ${IMAGENETS_DIR} \
--output_dir ${OUTPATH} \
--dist_eval
Command for SSL (ResNet-50 D16)
python -m torch.distributed.launch --nproc_per_node=8 main_segfinetune.py \
--accum_iter 1 \
--batch_size 32 \
--model resnet50_d16 \
--finetune pass919_pretrained.pth.tar \
--epochs 100 \
--nb_classes 920 \
--blr 5e-4 --layer_decay 0.45 \
--weight_decay 0.0005 \
--data_path ${IMAGENETS_DIR} \
--output_dir ${OUTPATH} \
--dist_eval
Arch | Pretraining epochs | RF-Next mode | val | test | Pretrained | Searched | Finetuned |
---|---|---|---|---|---|---|---|
ConvNeXt-T | 300 | - | 48.7 | 48.8 | model | - | model |
RF-ConvNeXt-T | 300 | rfsingle | 50.7 | 50.5 | model | model | model |
RF-ConvNeXt-T | 300 | rfmultiple | 50.8 | 50.5 | model | model | model |
RF-ConvNeXt-T | 300 | rfmerge | 51.3 | 51.1 | model | model | model |
Command for ConvNeXt-T
python -m torch.distributed.launch --nproc_per_node=8 main_segfinetune.py \
--accum_iter 1 \
--batch_size 32 \
--model convnext_tiny \
--patch_size 4 \
--finetune convnext_tiny_1k_224_ema.pth \
--epochs 100 \
--nb_classes 920 \
--blr 2.5e-4 --layer_decay 0.6 \
--weight_decay 0.05 --drop_path 0.2 \
--data_path ${IMAGENETS_DIR} \
--output_dir ${OUTPATH} \
--dist_eval
Before training RF-ConvNeXt, search for dilation rates using the rfsearch mode.
For rfmultiple and rfsingle, set pretrained_rfnext to the weights trained in rfsearch.
For rfmerge, we initialize the model with the weights from rfmultiple and finetune only seg_norm, seg_head, and the rfconvs whose dilation rates have changed. The other parts of the network are frozen. Set pretrained_rfnext to the weights trained in rfmultiple.
Note that this freezing operation in rfmerge may not be required for other tasks.
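The freezing rule for rfmerge can be sketched as a filter over parameter names. This is a minimal illustration, assuming parameters are addressed by dotted names as in `model.named_parameters()`; the function name and the example parameter names are hypothetical and are not taken from the repository.

```python
def is_trainable_in_rfmerge(param_name, changed_rfconv_prefixes):
    """Decide whether a parameter stays trainable in the rfmerge stage.

    Trainable: seg_norm, seg_head, and the RF convolutions whose dilation
    rates changed. Every other parameter is frozen.
    """
    if param_name.startswith(("seg_norm.", "seg_head.")):
        return True
    return any(
        param_name.startswith(prefix + ".")
        for prefix in changed_rfconv_prefixes
    )


if __name__ == "__main__":
    # Hypothetical parameter names. With a real model, iterate over
    # model.named_parameters() and set param.requires_grad to the result.
    names = [
        "stages.0.0.dwconv.weight",
        "stages.2.3.dwconv.weight",
        "stages.2.3.norm.weight",
        "seg_norm.weight",
        "seg_head.bias",
    ]
    changed = ["stages.2.3.dwconv"]  # hypothetical rfconv with a new dilation rate
    for name in names:
        state = "train" if is_trainable_in_rfmerge(name, changed) else "frozen"
        print(f"{name}: {state}")
```

Keeping the rule in one predicate makes it easy to drop the freezing for other tasks where it may not be required.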
Command for RF-ConvNeXt-T (rfsearch)
python -m torch.distributed.launch --nproc_per_node=8 main_segfinetune.py \
--accum_iter 1 \
--batch_size 32 \
--model rfconvnext_tiny_rfsearch \
--patch_size 4 \
--finetune convnext_tiny_1k_224_ema.pth \
--epochs 100 \
--nb_classes 920 \
--blr 2.5e-4 --layer_decay 0.6 0.9 --layer_multiplier 1.0 10.0 \
--weight_decay 0.05 --drop_path 0.2 \
--data_path ${IMAGENETS_DIR} \
--output_dir ${OUTPATH} \
--dist_eval
Command for RF-ConvNeXt-T (rfsingle)
python -m torch.distributed.launch --nproc_per_node=8 main_segfinetune.py \
--accum_iter 1 \
--batch_size 32 \
--model rfconvnext_tiny_rfsingle \
--patch_size 4 \
--finetune convnext_tiny_1k_224_ema.pth \
--pretrained_rfnext ${OUTPATH_OF_RFSEARCH}/checkpoint-99.pth \
--epochs 100 \
--nb_classes 920 \
--blr 2.5e-4 --layer_decay 0.6 0.9 --layer_multiplier 1.0 10.0 \
--weight_decay 0.05 --drop_path 0.2 \
--data_path ${IMAGENETS_DIR} \
--output_dir ${OUTPATH} \
--dist_eval
Inference command for RF-ConvNeXt-T (rfsingle)
python inference.py --model rfconvnext_tiny_rfsingle \
--patch_size 4 \
--nb_classes 920 \
--output_dir ${OUTPATH}/predictions \
--data_path ${IMAGENETS_DIR} \
--pretrained_rfnext ${OUTPATH_OF_RFSEARCH}/checkpoint-99.pth \
--finetune ${OUTPATH}/checkpoint-99.pth \
--mode validation
Command for RF-ConvNeXt-T (rfmultiple)
python -m torch.distributed.launch --nproc_per_node=8 main_segfinetune.py \
--accum_iter 1 \
--batch_size 32 \
--model rfconvnext_tiny_rfmultiple \
--patch_size 4 \
--finetune convnext_tiny_1k_224_ema.pth \
--pretrained_rfnext ${OUTPATH_OF_RFSEARCH}/checkpoint-99.pth \
--epochs 100 \
--nb_classes 920 \
--blr 2.5e-4 --layer_decay 0.55 0.9 --layer_multiplier 1.0 10.0 \
--weight_decay 0.05 --drop_path 0.1 \
--data_path ${IMAGENETS_DIR} \
--output_dir ${OUTPATH} \
--dist_eval
Inference command for RF-ConvNeXt-T (rfmultiple)
python inference.py --model rfconvnext_tiny_rfmultiple \
--patch_size 4 \
--nb_classes 920 \
--output_dir ${OUTPATH}/predictions \
--data_path ${IMAGENETS_DIR} \
--pretrained_rfnext ${OUTPATH_OF_RFSEARCH}/checkpoint-99.pth \
--finetune ${OUTPATH}/checkpoint-99.pth \
--mode validation
Command for RF-ConvNeXt-T (rfmerge)
python -m torch.distributed.launch --nproc_per_node=8 main_segfinetune.py \
--accum_iter 1 \
--batch_size 32 \
--model rfconvnext_tiny_rfmerge \
--patch_size 4 \
--pretrained_rfnext ${OUTPATH_OF_RFMULTIPLE}/checkpoint-99.pth \
--epochs 100 \
--nb_classes 920 \
--blr 2.5e-4 --layer_decay 0.55 1.0 --layer_multiplier 1.0 10.0 \
--weight_decay 0.05 --drop_path 0.2 \
--data_path ${IMAGENETS_DIR} \
--output_dir ${OUTPATH} \
--dist_eval
Inference command for RF-ConvNeXt-T (rfmerge)
python inference.py --model rfconvnext_tiny_rfmerge \
--patch_size 4 \
--nb_classes 920 \
--output_dir ${OUTPATH}/predictions \
--data_path ${IMAGENETS_DIR} \
--pretrained_rfnext ${OUTPATH_OF_RFMULTIPLE}/checkpoint-99.pth \
--finetune ${OUTPATH}/checkpoint-99.pth \
--mode validation