Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

args used for first table in README #10

Open
tileb1 opened this issue Sep 29, 2021 · 3 comments
Open

args used for first table in README #10

tileb1 opened this issue Sep 29, 2021 · 3 comments
Labels
documentation Improvements or additions to documentation good first issue Good for newcomers

Comments

@tileb1
Copy link

tileb1 commented Sep 29, 2021

Hello,
Could you please provide the args used for running main_esvit.py with the right arguments for each run in the table below (first table in README)? Are the args used different for each entry?

  • EsViT (Swin) with network configurations of increased model capacities, pre-trained with both view-level and region-level tasks. ResNet-50 trained with both tasks is shown as a reference.
arch params linear k-nn download logs
ResNet-50 23M 75.7% 71.3% full ckpt train linear knn
EsViT (Swin-T, W=7) 28M 78.0% 75.7% full ckpt train linear knn
EsViT (Swin-S, W=7) 49M 79.5% 77.7% full ckpt train linear knn
EsViT (Swin-B, W=7) 87M 80.4% 78.9% full ckpt train linear knn
EsViT (Swin-T, W=14) 28M 78.7% 77.0% full ckpt train linear knn
EsViT (Swin-S, W=14) 49M 80.8% 79.1% full ckpt train linear knn
EsViT (Swin-B, W=14) 87M 81.3% 79.3% full ckpt train linear knn

Thank you!

@ChunyuanLI
Copy link
Contributor

Good question! You may find the args we used for each run in the released full ckpt, by loading the each checkpoint and checking the key args.

In general, we tuned very little to produce the reported results across different runs, so the hyper-parameter settings are similar in different configurations. For example, one typical hyper-parameter setting is (loading the released checkpoint of EsViT (Swin-T, W=7), and printing the dictionary item args):

Namespace(arch='swin_tiny', batch_size_per_gpu=32, cfg='experiments/imagenet/swin/swin_tiny_patch4_window7_224.yaml', clip_grad=3.0, data_path='/msrhyper-weka/public/penzhan/oscar/phillytools/data/sasa/imagenet/2012', dist_url='env://', epochs=300, freeze_last_layer=1, global_crops_scale=(0.4, 1.0), gpu=0, local_crops_number=8, local_crops_scale=(0.05, 0.4), local_rank=0, lr=0.0005, min_lr=1e-06, momentum_teacher=0.996, norm_last_layer=False, num_workers=10, optimizer='adamw', opts=[], out_dim=65536, output_dir='/mnt/output_storage/dino_exp/swin//swin_tiny/bl_lr0.0005_gpu16_bs32_dense_multicrop_epoch300', patch_size=16, rank=0, saveckp_freq=20, seed=0, teacher_temp=0.07, use_bn_in_head=False, use_dense_prediction=True, use_fp16=True, warmup_epochs=10, warmup_teacher_temp=0.04, warmup_teacher_temp_epochs=30, weight_decay=0.04, weight_decay_end=0.4, world_size=16, zip_mode=True)

@ChunyuanLI ChunyuanLI added documentation Improvements or additions to documentation good first issue Good for newcomers labels Oct 6, 2021
@tileb1
Copy link
Author

tileb1 commented Oct 6, 2021

Ah yes, didn't think of loading from the checkpoint... Thanks!

@shallowtoil
Copy link

shallowtoil commented Oct 25, 2021

Hi, @ChunyuanLI. I have been trying to download the checkpoint to load the pre-training args, but the download speed was extremely slow and the download often failed halfway. Could you please kindly share the args in separate links?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

3 participants