This repository has been archived by the owner on Mar 19, 2024. It is now read-only.
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
RoB distillation + JEPA evaluations (#284)
Summary:
Pull Request resolved: fairinternal/ssl_scaling#284
Reviewed By: odelalleau
Differential Revision: D42220017
Pulled By: QuentinDuval
fbshipit-source-id: 742419aa859fdbe4bc80f1f9e9f4771fee0f41a2
1 parent 346114a, commit 04788de
Showing 259 changed files with 13,408 additions and 791 deletions.
14 changes: 14 additions & 0 deletions
configs/config/benchmark/fulltune/imagenet1k/models/mobilenet_v3_timm.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
# @package _global_ | ||
config: | ||
MODEL: | ||
TRUNK: | ||
NAME: mobilenetv3_timm | ||
MOBILE_NET: | ||
NAME: mobilenetv3_large_100 | ||
TRUNK_ONLY: True | ||
HEAD: | ||
PARAMS: [ | ||
["mobilenet_v3_head_timm", {"num_classes": 1000}], | ||
] | ||
OPTIMIZER: | ||
regularize_bn: True |
12 changes: 12 additions & 0 deletions
configs/config/benchmark/fulltune/imagenet1k/models/mobilenet_v3_tv.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
# @package _global_ | ||
config: | ||
MODEL: | ||
TRUNK: | ||
NAME: mobilenetv3_tv | ||
MOBILE_NET: | ||
NAME: mobilenetv3_large_100 | ||
TIMM_BN: False | ||
HEAD: | ||
PARAMS: [ | ||
["mobilenet_v3_head", {"num_classes": 1000}], | ||
] |
9 changes: 9 additions & 0 deletions
configs/config/benchmark/fulltune/imagenet1k/models/resnet18_eval_mlp.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
# @package _global_ | ||
config: | ||
MODEL: | ||
TRUNK: | ||
NAME: resnet | ||
RESNETS: | ||
DEPTH: 18 | ||
HEAD: | ||
PARAMS: [['eval_mlp', {'in_channels': 512, 'dims': [512, 1000]}]] |
9 changes: 9 additions & 0 deletions
configs/config/benchmark/fulltune/imagenet1k/models/resnet34_eval_mlp.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
# @package _global_ | ||
config: | ||
MODEL: | ||
TRUNK: | ||
NAME: resnet | ||
RESNETS: | ||
DEPTH: 34 | ||
HEAD: | ||
PARAMS: [['eval_mlp', {'in_channels': 512, 'dims': [512, 1000]}]] |
9 changes: 9 additions & 0 deletions
configs/config/benchmark/fulltune/imagenet1k/models/resnext50_eval_mlp.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
# @package _global_ | ||
config: | ||
MODEL: | ||
TRUNK: | ||
NAME: resnet | ||
RESNETS: | ||
DEPTH: 50 | ||
HEAD: | ||
PARAMS: [['eval_mlp', {'in_channels': 2048, 'dims': [2048, 1000]}]] |
27 changes: 27 additions & 0 deletions
configs/config/benchmark/fulltune/imagenet1k/models/vit_tiny_cls4_eval_mlp.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
# @package _global_ | ||
config: | ||
MODEL: | ||
FEATURE_EVAL_SETTINGS: | ||
EVAL_MODE_ON: True | ||
FREEZE_TRUNK_AND_HEAD: True | ||
LINEAR_EVAL_FEAT_POOL_OPS_MAP: [ | ||
["concatCLS4", ["Identity", []] ], | ||
] | ||
TRUNK: # Tiny | ||
NAME: vision_transformer | ||
VISION_TRANSFORMERS: | ||
IMAGE_SIZE: 224 | ||
PATCH_SIZE: 16 | ||
NUM_LAYERS: 12 | ||
NUM_HEADS: 3 | ||
HIDDEN_DIM: 192 | ||
MLP_DIM: 768 | ||
CLASSIFIER: token | ||
DROPOUT_RATE: 0 | ||
ATTENTION_DROPOUT_RATE: 0 | ||
QKV_BIAS: True | ||
DROP_PATH_RATE: 0.0 | ||
HEAD: | ||
PARAMS: [ | ||
["eval_mlp", {"in_channels": 768, "dims": [768, 1000]}], | ||
] |
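A note on the head width in `vit_tiny_cls4_eval_mlp.yaml`: the `eval_mlp` head takes 768 input channels while `HIDDEN_DIM` is 192. That is consistent if `concatCLS4` concatenates the class token from the last four transformer blocks, which is an assumption about the feature name and not something this diff shows. A minimal Python sketch of the arithmetic (the helper `concat_feature_dim` is hypothetical, not part of the repository):

```python
def concat_feature_dim(hidden_dim: int, num_blocks: int) -> int:
    """Width of a feature built by concatenating one token per block.

    Assumption: each block contributes a single vector of size hidden_dim
    (for example its class token), and the vectors are concatenated.
    """
    if hidden_dim <= 0 or num_blocks <= 0:
        raise ValueError("hidden_dim and num_blocks must be positive")
    return hidden_dim * num_blocks


# ViT-Tiny from the config above: HIDDEN_DIM = 192, four blocks concatenated.
tiny_cls4_dim = concat_feature_dim(hidden_dim=192, num_blocks=4)
assert tiny_cls4_dim == 768  # matches "in_channels": 768 in HEAD.PARAMS
```

Under the same assumption, changing `HIDDEN_DIM` or the feature name would require updating `in_channels` and the first entry of `dims` together.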
20 changes: 20 additions & 0 deletions
configs/config/benchmark/linear_image_classification/cifar100/models/mobilenet_v3_timm.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
# @package _global_ | ||
config: | ||
MODEL: | ||
FEATURE_EVAL_SETTINGS: | ||
LINEAR_EVAL_FEAT_POOL_OPS_MAP: [ | ||
["flatten", ["Identity", []] ], | ||
["flatten", ["Identity", []] ], | ||
] | ||
TRUNK: | ||
NAME: mobilenetv3_timm | ||
MOBILE_NET: | ||
NAME: mobilenetv3_large_100 | ||
PRETRAINED: False | ||
HEAD: | ||
PARAMS: [ | ||
["eval_mlp", {"in_channels": 1280, "dims": [1280, 100]}], | ||
["mlp", {"dims": [1280, 100]}], | ||
] | ||
OPTIMIZER: | ||
regularize_bn: True |
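In this file `LINEAR_EVAL_FEAT_POOL_OPS_MAP` and `HEAD.PARAMS` both have two entries. Assuming heads are paired with features by position (an assumption about how the config is consumed, not something shown in this diff), a small Python check like the following could catch a mismatch before launching a run. The function name `check_heads_match_features` is hypothetical:

```python
def check_heads_match_features(feat_pool_ops_map, head_params):
    """Pair each feature entry with the head at the same index.

    Assumption: the i-th head consumes the i-th feature. Raises ValueError
    when the two lists differ in length.
    """
    if len(feat_pool_ops_map) != len(head_params):
        raise ValueError(
            f"{len(feat_pool_ops_map)} features but {len(head_params)} heads"
        )
    return [(feat[0], head[0]) for feat, head in zip(feat_pool_ops_map, head_params)]


# Values copied from the cifar100 mobilenet_v3_timm config above.
feat_map = [
    ["flatten", ["Identity", []]],
    ["flatten", ["Identity", []]],
]
heads = [
    ["eval_mlp", {"in_channels": 1280, "dims": [1280, 100]}],
    ["mlp", {"dims": [1280, 100]}],
]
pairs = check_heads_match_features(feat_map, heads)
```

Here `pairs` lists which head name sits on which feature name, which makes the positional pairing explicit when reviewing longer configs.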
105 changes: 105 additions & 0 deletions
configs/config/benchmark/linear_image_classification/cifar100/models/mobilenet_v3_tv.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,105 @@ | ||
# @package _global_ | ||
config: | ||
MODEL: | ||
FEATURE_EVAL_SETTINGS: | ||
EVAL_MODE_ON: True | ||
FREEZE_TRUNK_ONLY: True | ||
SHOULD_FLATTEN_FEATS: True | ||
LINEAR_EVAL_FEAT_POOL_OPS_MAP: [ | ||
# Linear heads on top of representations, normalized or not | ||
["trunk_pool", ["Identity", []] ], | ||
["trunk_pool", ["Identity", []] ], | ||
["trunk_pool", ["Identity", []] ], | ||
| ||
# MobileNet head on top of representations, normalized or not | ||
["trunk_pool", ["Identity", []] ], | ||
["trunk_pool", ["Identity", []] ], | ||
["trunk_pool", ["Identity", []] ], | ||
# ["trunk_pool", ["Identity", []] ], | ||
# ["trunk_pool", ["Identity", []] ], | ||
| ||
# Exploring a two-layer head | ||
["trunk_pool", ["Identity", []] ], | ||
["trunk_pool", ["Identity", []] ], | ||
["trunk_pool", ["Identity", []] ], | ||
| ||
# Combining several levels of representations | ||
["trunk", ["AdaptiveAvgPool2d", [[2, 1]]]], | ||
["trunk", ["AdaptiveAvgPool2d", [[2, 1]]]], | ||
["trunk", ["AdaptiveAvgPool2d", [[2, 1]]]], | ||
["trunk", ["AdaptiveAvgPool2d", [[2, 2]]]], | ||
["trunk", ["AdaptiveAvgPool2d", [[2, 2]]]], | ||
["trunk", ["AdaptiveAvgPool2d", [[2, 2]]]], | ||
] | ||
TRUNK: | ||
NAME: mobilenetv3_tv | ||
MOBILE_NET: | ||
NAME: mobilenetv3_large_100 | ||
PRETRAINED: False | ||
HEAD: | ||
PARAMS: [ | ||
# Linear heads on top of representations, normalized or not | ||
["eval_mlp", {"in_channels": 960, "dims": [960, 100]}], | ||
["eval_mlp", {"in_channels": 960, "dims": [960, 100]}], | ||
["eval_mlp", {"in_channels": 960, "dims": [960, 100]}], | ||
| ||
# MobileNet head on top of representations, normalized or not | ||
["mobilenet_v3_head", {"with_bn": True, "num_classes": 100}], | ||
["mobilenet_v3_head", {"with_bn": True, "num_classes": 100}], | ||
["mobilenet_v3_head", {"with_bn": True, "num_classes": 100}], | ||
# ["mobilenet_v3_head", {"with_bn": True, "drop_out": 0.1, "num_classes": 100}], | ||
# ["mobilenet_v3_head", {"with_bn": True, "drop_out": 0.0, "num_classes": 100}], | ||
| ||
# Exploring a two-layer head | ||
["eval_mlp", {"in_channels": 960, "dims": [960, 1280, 100]}], | ||
["eval_mlp", {"in_channels": 960, "dims": [960, 1280, 100]}], | ||
["eval_mlp", {"in_channels": 960, "dims": [960, 1280, 100]}], | ||
| ||
# Combining several levels of representations | ||
["eval_mlp", {"in_channels": 1920, "dims": [1920, 100]}], | ||
["eval_mlp", {"in_channels": 1920, "dims": [1920, 100]}], | ||
["eval_mlp", {"in_channels": 1920, "dims": [1920, 100]}], | ||
["eval_mlp", {"in_channels": 3840, "dims": [3840, 100]}], | ||
["eval_mlp", {"in_channels": 3840, "dims": [3840, 100]}], | ||
["eval_mlp", {"in_channels": 3840, "dims": [3840, 100]}], | ||
] | ||
OPTIMIZER: | ||
name: sgd | ||
# In the OSS Caffe2 benchmark, RN50 models use 1e-4 and AlexNet models 5e-4 | ||
weight_decay: 0.0005 | ||
momentum: 0.9 | ||
num_epochs: 28 | ||
nesterov: True | ||
regularize_bn: True | ||
regularize_bias: True | ||
param_schedulers: | ||
lr: | ||
auto_lr_scaling: | ||
auto_scale: true | ||
base_value: 0.01 | ||
base_lr_batch_size: 256 | ||
name: multistep | ||
values: [0.01, 0.001, 0.0001, 0.00001] | ||
milestones: [8, 16, 24] | ||
update_interval: epoch | ||
param_group_constructor: linear_eval_heads | ||
linear_eval_heads: | ||
# Linear heads on top of representations, normalized or not | ||
- {"lr": 1.0, "weight_decay": 0.0005, "regularize_bn": True} | ||
- {"lr": 1.0, "weight_decay": 0.0005, "regularize_bn": False} | ||
- {"lr": 1.0, "weight_decay": 0.0} | ||
# MobileNet head on top of representations, normalized or not | ||
- {"lr": 1.0, "weight_decay": 0.0005, "regularize_bn": True} | ||
- {"lr": 1.0, "weight_decay": 0.0005, "regularize_bn": False} | ||
- {"lr": 1.0, "weight_decay": 0.0} | ||
# Exploring a two-layer head | ||
- {"lr": 1.0, "weight_decay": 0.0005} | ||
- {"lr": 1.0, "weight_decay": 0.0001} | ||
- {"lr": 1.0, "weight_decay": 0.0} | ||
# Combining several levels of representations | ||
- {"lr": 1.0, "weight_decay": 0.0005, "regularize_bn": True} | ||
- {"lr": 1.0, "weight_decay": 0.0005, "regularize_bn": False} | ||
- {"lr": 1.0, "weight_decay": 0.0} | ||
- {"lr": 1.0, "weight_decay": 0.0005, "regularize_bn": True} | ||
- {"lr": 1.0, "weight_decay": 0.0005, "regularize_bn": False} | ||
- {"lr": 1.0, "weight_decay": 0.0} |
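Two pieces of arithmetic in this config are easy to get wrong when editing it. First, the heads under "Combining several levels of representations" expect 1920 and 3840 input channels, which matches a 960-channel trunk output pooled to 2x1 and 2x2 and then flattened. Second, the `multistep` schedule switches value at epochs 8, 16 and 24 over 28 epochs. The Python sketch below reproduces both under stated assumptions: the trunk emits 960 channels (taken from the `in_channels` of the first heads), milestones are handled with a right-bisect lookup, and `auto_lr_scaling` multiplies every value by `batch_size / base_lr_batch_size`. The function names are hypothetical and the scaling rule is an assumption, not confirmed by this diff.

```python
from bisect import bisect_right


def flattened_dim(channels, pool_output):
    """Feature width after adaptive pooling to (h, w) and flattening."""
    h, w = pool_output
    return channels * h * w


def multistep_lr(epoch, values, milestones, batch_size=256, base_batch_size=256):
    """Learning rate at a given epoch for a multistep schedule.

    Assumptions: the value changes at each milestone (epoch >= milestone),
    and linear scaling by batch_size / base_batch_size is applied.
    """
    if len(values) != len(milestones) + 1:
        raise ValueError("need exactly one more value than milestones")
    scale = batch_size / base_batch_size
    return values[bisect_right(milestones, epoch)] * scale


# Values copied from the config above.
VALUES = [0.01, 0.001, 0.0001, 0.00001]
MILESTONES = [8, 16, 24]

assert flattened_dim(960, (2, 1)) == 1920
assert flattened_dim(960, (2, 2)) == 3840
assert multistep_lr(7, VALUES, MILESTONES) == 0.01
assert multistep_lr(8, VALUES, MILESTONES) == 0.001
```

With 28 epochs, the last value (0.00001) would apply to epochs 24 through 27 under these assumptions.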
54 changes: 54 additions & 0 deletions
configs/config/benchmark/linear_image_classification/cifar100/models/vit_g16_no_cls.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
# @package _global_ | ||
config: | ||
DATA: | ||
TRAIN: | ||
TRANSFORMS: | ||
- name: RandomResizedCrop | ||
size: 224 | ||
interpolation: 3 | ||
- name: RandomHorizontalFlip | ||
- name: ToTensor | ||
- name: Normalize | ||
mean: [0.485, 0.456, 0.406] | ||
std: [0.229, 0.224, 0.225] | ||
TEST: | ||
TRANSFORMS: | ||
- name: Resize | ||
size: 256 | ||
interpolation: 3 | ||
- name: CenterCrop | ||
size: 224 | ||
- name: ToTensor | ||
- name: Normalize | ||
mean: [0.485, 0.456, 0.406] | ||
std: [0.229, 0.224, 0.225] | ||
MODEL: | ||
FEATURE_EVAL_SETTINGS: | ||
LINEAR_EVAL_FEAT_POOL_OPS_MAP: [ | ||
["concatPOOL4", ["Identity", []] ], | ||
["lastPOOL", ["Identity", []] ], | ||
["concatPOOL4", ["Identity", []] ], | ||
["lastPOOL", ["Identity", []] ], | ||
] | ||
TRUNK: # g-16 | ||
NAME: vision_transformer | ||
VISION_TRANSFORMERS: | ||
IMAGE_SIZE: 224 | ||
PATCH_SIZE: 16 | ||
NUM_LAYERS: 40 | ||
NUM_HEADS: 16 | ||
HIDDEN_DIM: 1408 | ||
MLP_DIM: 6144 | ||
DROPOUT_RATE: 0.0 | ||
ATTENTION_DROPOUT_RATE: 0.0 | ||
CLASSIFIER: token | ||
QKV_BIAS: True | ||
DROP_PATH_RATE: 0.0 | ||
USE_CLASS_TOKEN: False | ||
HEAD: | ||
PARAMS: [ | ||
["eval_mlp", {"in_channels": 5632, "dims": [5632, 100]}], | ||
["eval_mlp", {"in_channels": 1408, "dims": [1408, 100]}], | ||
["mlp", {"dims": [5632, 100]}], | ||
["mlp", {"dims": [1408, 100]}], | ||
] |
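With `USE_CLASS_TOKEN: False`, the trunk in `vit_g16_no_cls.yaml` has only patch tokens to pool, which is presumably why the features are named `concatPOOL4` and `lastPOOL` instead of the CLS variants. A short Python sketch of the resulting sequence length for the 224-pixel crops and 16-pixel patches configured above (the helper `num_tokens` is hypothetical, and it assumes non-overlapping square patches):

```python
def num_tokens(image_size: int, patch_size: int, use_class_token: bool) -> int:
    """Sequence length of a ViT with non-overlapping square patches."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side * per_side + (1 if use_class_token else 0)


# IMAGE_SIZE = 224, PATCH_SIZE = 16, USE_CLASS_TOKEN = False
tokens_no_cls = num_tokens(224, 16, use_class_token=False)
tokens_with_cls = num_tokens(224, 16, use_class_token=True)
assert tokens_no_cls == 196
assert tokens_with_cls == 197
```

The head widths in this file (5632 and 1408) are then four times and one times `HIDDEN_DIM`, assuming the pooled features average over these patch tokens.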