This is the official implementation of the ECCV 2024 paper "POA: Pre-training Once for Models of All Sizes".
POA is a novel self-supervised pre-training framework capable of training models of multiple sizes concurrently. By choosing different elastic widths and depths, the pre-trained teacher model can generate a sufficient number of candidate sub-networks, from which a model tailored to the downstream application can be selected according to the available computational resources. This code uses the same environment as dinov2; please refer to: https://github.com/facebookresearch/dinov2
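As a rough illustration (not the repository's actual code) of how a single elastic network exposes many candidate sub-networks, the toy model below selects a sub-network simply by running the first depth blocks at a reduced width; all class and parameter names here are made up.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ElasticBlock(nn.Module):
    # A block whose linear layers can run at any width up to max_dim.
    def __init__(self, max_dim):
        super().__init__()
        self.fc1 = nn.Linear(max_dim, max_dim)
        self.fc2 = nn.Linear(max_dim, max_dim)

    def forward(self, x, width):
        # Use only the leading `width` rows/columns of each weight matrix.
        h = F.relu(F.linear(x, self.fc1.weight[:width, :x.shape[-1]], self.fc1.bias[:width]))
        return F.linear(h, self.fc2.weight[:width, :width], self.fc2.bias[:width])

class ElasticNet(nn.Module):
    def __init__(self, max_depth=12, max_dim=768):
        super().__init__()
        self.blocks = nn.ModuleList(ElasticBlock(max_dim) for _ in range(max_depth))

    def forward(self, x, depth, width):
        # A candidate sub-network is defined by the pair (depth, width).
        for block in self.blocks[:depth]:
            x = block(x, width)
        return x

net = ElasticNet()
small = net(torch.randn(2, 384), depth=6, width=384)   # a smaller candidate sub-network
full = net(torch.randn(2, 768), depth=12, width=768)   # the full-size network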
The root directory of the ImageNet dataset should hold the following contents:
<ROOT>/train/n01440764/n01440764_10026.JPEG
<ROOT>/train/[...]
<ROOT>/train/n15075141/n15075141_9993.JPEG
<ROOT>/val/n01440764/ILSVRC2012_val_00000293.JPEG
<ROOT>/val/[...]
<ROOT>/val/n15075141/ILSVRC2012_val_00049174.JPEG
<ROOT>/labels.txt
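A quick sanity check of this layout can catch a misplaced dataset root; the snippet below is illustrative and not part of the repository:

import os

root = "<ROOT>"  # the same dataset root used below
for split in ("train", "val"):
    split_dir = os.path.join(root, split)
    num_classes = sum(os.path.isdir(os.path.join(split_dir, d)) for d in os.listdir(split_dir))
    print(f"{split}: {num_classes} class folders")   # expect 1000 for ImageNet-1k
assert os.path.isfile(os.path.join(root, "labels.txt")), "labels.txt is missing"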
The provided dataset implementation expects a few additional metadata files, which can be generated (once) with the following lines of Python code:
from poa.data.datasets import ImageNet

for split in ImageNet.Split:
    dataset = ImageNet(split=split, root="<ROOT>", extra="<EXTRA>")
    dataset.dump_extra()
Run run_pretrain.sh for POA pre-training on 4 A100-80GB nodes (32 GPUs):
sh shells/run_pretrain.sh
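run_pretrain.sh wraps a standard multi-node distributed launch. As a rough sketch of what such a launch looks like, each of the 4 nodes would run something along these lines; the training-script path, config path, and NODE_RANK/MASTER_ADDR variables are placeholders rather than the repository's documented arguments:

# Hypothetical 4-node launch; the script and config paths below are assumptions.
torchrun --nnodes=4 --nproc_per_node=8 --node_rank=$NODE_RANK \
    --master_addr=$MASTER_ADDR --master_port=29500 \
    poa/train/train.py --config-file poa/configs/pretrain.yaml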
Run the k-NN, linear probing, and fine-tuning evaluations:
sh shells/eval/eval_knn.sh
sh shells/eval/eval_linear.sh
sh shells/eval/eval_finetuning.sh
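For reference, k-NN evaluation of frozen features amounts to a similarity-weighted vote over the nearest training features. The snippet below is a self-contained sketch of that idea, not the repository's eval code:

import torch
import torch.nn.functional as F

def knn_classify(train_feats, train_labels, val_feats, k=20, num_classes=1000):
    # Cosine similarity between every val feature and every train feature.
    train_feats = F.normalize(train_feats, dim=1)
    val_feats = F.normalize(val_feats, dim=1)
    sims = val_feats @ train_feats.T
    topk_sims, topk_idx = sims.topk(k, dim=1)
    topk_labels = train_labels[topk_idx]                  # (num_val, k)
    votes = torch.zeros(val_feats.shape[0], num_classes)
    votes.scatter_add_(1, topk_labels, topk_sims)         # similarity-weighted voting
    return votes.argmax(dim=1)

# Toy usage with random features standing in for frozen backbone outputs.
train_feats, train_labels = torch.randn(1000, 384), torch.randint(0, 10, (1000,))
preds = knn_classify(train_feats, train_labels, torch.randn(100, 384), k=20, num_classes=10)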
For downstream task evaluation, please run extract_elastic.sh to extract the desired sub-model. After that, run the corresponding scripts in the shells/eval folder.
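Conceptually, extraction materializes one fixed-size checkpoint from the elastic network by keeping only the first depth blocks and the leading channels of each width-dependent parameter. The sketch below illustrates this with a crude divisibility heuristic; the actual script handles each layer's exact shape, and all parameter names here are made up:

import torch

def extract_submodel(state_dict, depth, max_width, width):
    # Keep the first `depth` blocks and shrink every dimension that scales with the width.
    sub = {}
    for name, param in state_dict.items():
        if name.startswith("blocks."):
            if int(name.split(".")[1]) >= depth:
                continue                                  # drop blocks beyond the target depth
        new_shape = tuple(
            s * width // max_width if s % max_width == 0 else s   # crude width heuristic
            for s in param.shape
        )
        sub[name] = param[tuple(slice(0, n) for n in new_shape)].clone()
    return sub

# Toy usage with a fabricated two-block state dict (shapes are illustrative only).
full = {
    "blocks.0.mlp.weight": torch.randn(3072, 768),
    "blocks.0.mlp.bias": torch.randn(3072),
    "blocks.1.mlp.weight": torch.randn(3072, 768),
    "norm.weight": torch.randn(768),
}
small = extract_submodel(full, depth=1, max_width=768, width=384)
print({k: tuple(v.shape) for k, v in small.items()})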
This repository is built on top of the dinov2 and ibot repositories.
POA code is released under the Apache License 2.0. See LICENSE for additional details.