CLIP

CLIP (ViT-B/32), based on the model provided by the openai/CLIP repository, optimised for Graphcore's IPU.

Framework | Domain | Model | Datasets | Tasks | Training | Inference | Reference
PyTorch | Vision | CLIP | Conceptual Captions (cc3m), ImageNet LSVRC 2012, CIFAR-100 | Image recognition | | |


Min. 8 IPUs (POD16) required

Learning Transferable Visual Models From Natural Language Supervision

Instructions summary

  1. Install and enable the Poplar SDK (see Poplar SDK setup)

  2. Install the system and Python requirements (see Environment setup)

  3. Download the Conceptual Captions (cc3m) dataset and, optionally, the ImageNet LSVRC 2012 dataset (see Dataset setup)

Poplar SDK setup

To check if your Poplar SDK has already been enabled, run:

 echo $POPLAR_SDK_ENABLED

If no path is printed, then follow these steps:

  1. Navigate to your Poplar SDK root directory

  2. Enable the Poplar SDK with:

cd poplar-<OS version>-<SDK version>-<hash>
. enable.sh
  3. Additionally, enable PopART with:
cd popart-<OS version>-<SDK version>-<hash>
. enable.sh

More detailed instructions on setting up your Poplar environment are available in the Poplar quick start guide.
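
As a quick sanity check (assuming the SDK has been enabled in the current shell), the Poplar compiler should now be on your PATH and report its version:

popc --version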

Environment setup

To prepare your environment, follow these steps:

  1. Create and activate a Python3 virtual environment:
python3 -m venv <venv name>
source <venv path>/bin/activate
  2. Navigate to the Poplar SDK root directory

  3. Install the PopTorch (PyTorch) wheel:

cd <poplar sdk root dir>
pip3 install poptorch...x86_64.whl
  4. Navigate to this example's root directory

  5. Install the Python requirements:

pip3 install -r requirements.txt

More detailed instructions on setting up your PyTorch environment are available in the PyTorch quick start guide.
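
To confirm the wheel installed correctly, a minimal check (assuming the virtual environment is still active and the SDK is enabled) is to import PopTorch and print its version:

python3 -c "import poptorch; print(poptorch.__version__)"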

Dataset setup

Conceptual Captions (cc3m)

Download the Conceptual Captions dataset in three steps using the scripts provided:

  1. Download Train_GCC-training.tsv from the Conceptual Captions source

  2. Use the provided script to download the main dataset:

mkdir data
mv Train_GCC-training.tsv data/
mkdir -p data/cc3m/images
python3 datasets/download.py --url_file data/Train_GCC-training.tsv --save_path data/cc3m
  3. Download the word segmentation vocabulary from the official CLIP repository and move it into the datasets directory:
mv bpe_simple_vocab_16e6.txt.gz datasets/

Disk space required: 84 GB

.
├── images
└── img_cap.csv

1 directory, 1 file
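
If you want to sanity-check the download before training, the sketch below (an illustrative snippet, not part of this example's code) counts the images on disk and compares that against the number of entries in img_cap.csv; it assumes img_cap.csv contains one row per downloaded image-caption pair.

import csv
import os

data_dir = "data/cc3m"

# Count the downloaded image files
num_images = len(os.listdir(os.path.join(data_dir, "images")))

# Count image-caption rows in the generated CSV (assumes one row per pair)
with open(os.path.join(data_dir, "img_cap.csv"), newline="") as f:
    num_rows = sum(1 for _ in csv.reader(f))

print(f"images on disk: {num_images}, caption rows: {num_rows}")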

ImageNet LSVRC 2012 (Optional)

Download the ImageNet LSVRC 2012 dataset from the source or via Kaggle.

Disk space required: 144 GB

.
├── bounding_boxes
├── imagenet_2012_bounding_boxes.csv
├── train
└── validation

3 directories, 1 file

Then pre-process the dataset using the script provided:

python3 datasets/preprocess.py

Running and benchmarking

To run a tested and optimised configuration and to reproduce the performance shown on our performance results page, use the examples_utils module (installed automatically as part of the environment setup) to run one or more benchmarks. The benchmarks are provided in the benchmarks.yml file in this example's root directory.

For example:

python3 -m examples_utils benchmark --spec <path to benchmarks.yml file>

Or to run a specific benchmark in the benchmarks.yml file provided:

python3 -m examples_utils benchmark --spec <path to benchmarks.yml file> --benchmark <name of benchmark>

For more information on using the examples-utils benchmarking module, please refer to the README.

Other features

Zero-shot evaluation

After training CLIP on the cc3m dataset, you can run zero-shot classification on the validation sets of ImageNet1k and CIFAR-100 to verify the performance of the trained model. You can choose to use a checkpoint saved from the IPU by setting is_ipu_ckpt to True, or the official checkpoint by setting it to False. Zero-shot evaluation is performed on the ImageNet1k validation set by default. If you want to perform zero-shot evaluation on CIFAR-100, set zeroshot_dataset to cifar100.

# Do zeroshot evaluation on ImageNet
python zero_shot.py \
    --config CLIP_ViT-B-32_cc3m \
    --is_ipu_ckpt True \
    --zeroshot_dataset imagenet \
    --ckpt_file output/ckpt/CLIP_epoch_K.pt

# Do zeroshot evaluation on CIFAR100
python zero_shot.py \
    --config CLIP_ViT-B-32_cc3m \
    --is_ipu_ckpt True \
    --zeroshot_dataset cifar100 \
    --ckpt_file output/ckpt/CLIP_epoch_K.pt
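
Under the hood, zero-shot classification encodes a text prompt for each class name with the text encoder, encodes each image with the image encoder, and predicts the class whose text embedding has the highest cosine similarity with the image embedding. The sketch below illustrates only this scoring step, with random tensors standing in for the real CLIP embeddings; it is not part of this example's code.

import torch

num_classes, embed_dim, batch_size = 1000, 512, 8

# Stand-ins for CLIP outputs: text embeddings of the class prompts and image embeddings
text_features = torch.randn(num_classes, embed_dim)
image_features = torch.randn(batch_size, embed_dim)

# L2-normalise so the dot product becomes a cosine similarity
text_features = text_features / text_features.norm(dim=-1, keepdim=True)
image_features = image_features / image_features.norm(dim=-1, keepdim=True)

# Similarity of every image to every class prompt; the arg-max is the zero-shot prediction
logits = 100.0 * image_features @ text_features.t()
predictions = logits.argmax(dim=-1)
print(predictions)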

Licensing

This application is licensed under the MIT license. Please see the LICENSE file in this directory for full details of the license conditions.

The following files are created by Graphcore and are licensed under the MIT License (* means additional license information is stated after this list):

  • log.py
  • args.py
  • train.py
  • README.md
  • configs.yml
  • benchmarks.yml
  • preprocess.py
  • checkpoint.py
  • ipu_options.py
  • optimization.py
  • requirements.txt
  • tests/import_helper.py
  • tests/cpu_ipu_test.py
  • datasets/preprocess.py
  • datasets/text_templates.pt

The following files include code from this repo, which uses the MIT license:

  • model.py
  • datasets/simple_tokenizer.py

The following files include code from this repo:

  • zero_shot.py
  • datasets/dataset.py
  • datasets/download.py

External packages:

  • wandb, pytest and pyyaml are licensed under the MIT License
  • transformers is licensed under the Apache License 2.0
  • torchvision is licensed under the BSD 3-Clause License