args used for first table in README #10

tileb1 · 2021-09-29T20:23:10Z

Hello,
Could you please provide the args used to run main_esvit.py for each entry in the table below (the first table in the README)? Do the args differ between entries?

EsViT (Swin) with network configurations of increased model capacities, pre-trained with both view-level and region-level tasks. ResNet-50 trained with both tasks is shown as a reference.

| arch | params | linear | k-nn | download | logs |
|---|---|---|---|---|---|
| ResNet-50 | 23M | 75.7% | 71.3% | full ckpt | train, linear, knn |
| EsViT (Swin-T, W=7) | 28M | 78.0% | 75.7% | full ckpt | train, linear, knn |
| EsViT (Swin-S, W=7) | 49M | 79.5% | 77.7% | full ckpt | train, linear, knn |
| EsViT (Swin-B, W=7) | 87M | 80.4% | 78.9% | full ckpt | train, linear, knn |
| EsViT (Swin-T, W=14) | 28M | 78.7% | 77.0% | full ckpt | train, linear, knn |
| EsViT (Swin-S, W=14) | 49M | 80.8% | 79.1% | full ckpt | train, linear, knn |
| EsViT (Swin-B, W=14) | 87M | 81.3% | 79.3% | full ckpt | train, linear, knn |

Thank you!

ChunyuanLI · 2021-10-06T06:19:38Z

Good question! You can find the args we used for each run in the released full ckpt, by loading each checkpoint and inspecting the `args` key.

In general, we tuned very little to produce the reported results across different runs, so the hyper-parameter settings are similar in different configurations. For example, one typical hyper-parameter setting is (loading the released checkpoint of EsViT (Swin-T, W=7), and printing the dictionary item args):

Namespace(arch='swin_tiny', batch_size_per_gpu=32, cfg='experiments/imagenet/swin/swin_tiny_patch4_window7_224.yaml', clip_grad=3.0, data_path='/msrhyper-weka/public/penzhan/oscar/phillytools/data/sasa/imagenet/2012', dist_url='env://', epochs=300, freeze_last_layer=1, global_crops_scale=(0.4, 1.0), gpu=0, local_crops_number=8, local_crops_scale=(0.05, 0.4), local_rank=0, lr=0.0005, min_lr=1e-06, momentum_teacher=0.996, norm_last_layer=False, num_workers=10, optimizer='adamw', opts=[], out_dim=65536, output_dir='/mnt/output_storage/dino_exp/swin//swin_tiny/bl_lr0.0005_gpu16_bs32_dense_multicrop_epoch300', patch_size=16, rank=0, saveckp_freq=20, seed=0, teacher_temp=0.07, use_bn_in_head=False, use_dense_prediction=True, use_fp16=True, warmup_epochs=10, warmup_teacher_temp=0.04, warmup_teacher_temp_epochs=30, weight_decay=0.04, weight_decay_end=0.4, world_size=16, zip_mode=True)
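For readers who want to reproduce the dump above, here is a minimal sketch of how the stored args could be read back. It assumes the checkpoint is a pickled dictionary whose `args` entry is an `argparse.Namespace` (as the output above suggests) and that a PyTorch version supporting the `weights_only` argument of `torch.load` is installed. The file name `checkpoint.pth` and the helper names `load_checkpoint`, `extract_args`, and `args_to_json` are hypothetical, not part of the EsViT code base.

```python
import argparse
import json


def load_checkpoint(path):
    """Load a checkpoint file onto the CPU (requires PyTorch)."""
    import torch  # imported lazily so the helpers below work without PyTorch

    # Assumption: the checkpoint pickles an argparse.Namespace, so the
    # restricted weights-only loader may reject it. Only pass
    # weights_only=False for files you trust.
    return torch.load(path, map_location="cpu", weights_only=False)


def extract_args(checkpoint):
    """Return the training args stored under the 'args' key as a plain dict."""
    args = checkpoint["args"]
    if isinstance(args, argparse.Namespace):
        return dict(vars(args))
    return dict(args)


def args_to_json(checkpoint):
    """Serialize the args to JSON so they can be shared without the weights."""
    # default=str is a fallback for values that JSON cannot represent directly.
    return json.dumps(extract_args(checkpoint), indent=2, sort_keys=True, default=str)


# Usage (hypothetical local path to a downloaded full ckpt):
#   ckpt = load_checkpoint("checkpoint.pth")
#   print(args_to_json(ckpt))
```

The JSON string is only a few hundred bytes, so it can be pasted or hosted separately from the much larger checkpoint file. With the values shown above, `batch_size_per_gpu * world_size` gives a total batch size of 32 × 16 = 512.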

tileb1 · 2021-10-06T08:10:28Z

Ah yes, didn't think of loading from the checkpoint... Thanks!

shallowtoil · 2021-10-25T06:21:17Z

Hi, @ChunyuanLI. I have been trying to download the checkpoint to load the pre-training args, but the download speed was extremely slow and the download often failed halfway. Could you please kindly share the args in separate links?

ChunyuanLI added the "documentation" and "good first issue" labels · Oct 6, 2021
