Based on https://github.com/victoresque/pytorch-template
To train a model:
python train.py -c path_to_config.json
If 'n_gpu' in config.json is more than 1, the model will be wrapped with torch.nn.DataParallel().
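The wrapping follows the standard PyTorch pattern; a minimal sketch of the idea (not necessarily this template's exact code):

import torch

n_gpu = 2  # would come from config.json's 'n_gpu' entry
model = torch.nn.Linear(4, 2)  # stand-in for the real model
if n_gpu > 1:
    model = torch.nn.DataParallel(model, device_ids=list(range(n_gpu)))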
To train with DistributedDataParallel (DDP), use train_ddp.py, e.g. resuming from a checkpoint:
python train_ddp.py --resume path_to_model.pth
In addition to using train_ddp.py, the dist_backend and dist_url options should be defined in config.json; otherwise, the default values defined in train_ddp.py will be used.
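For example, a config.json excerpt could look like the following (the values are illustrative; 'nccl' and a TCP init URL are common PyTorch choices, not necessarily this template's defaults):

"dist_backend": "nccl",
"dist_url": "tcp://127.0.0.1:23456"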
To test a trained model:
python test.py --resume path_to_model.pth
fuser.py can be used to fuse the model's layers (e.g., BatchNorm2d into the preceding Conv2d) in a saved checkpoint:
python fuser.py -r /path/to/pth/file
The --bs option controls the batch size.
To extract the model parameters:
python parameters_extractor.py --model path_to_model.pth
To fuse BatchNorm2d into the preceding Conv2d, add the -f option to the previous command.
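For reference, Conv-BN fusion folds the BatchNorm statistics into the convolution's weights and bias. A minimal sketch of the standard folding (illustrative, not this template's exact implementation):

import torch

def fuse_conv_bn(conv: torch.nn.Conv2d, bn: torch.nn.BatchNorm2d) -> torch.nn.Conv2d:
    # Fold BatchNorm2d statistics into the preceding Conv2d
    fused = torch.nn.Conv2d(conv.in_channels, conv.out_channels, conv.kernel_size,
                            stride=conv.stride, padding=conv.padding,
                            dilation=conv.dilation, groups=conv.groups, bias=True)
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)  # gamma / sqrt(var + eps)
    fused.weight.data = conv.weight.data * scale.reshape(-1, 1, 1, 1)
    conv_bias = conv.bias.data if conv.bias is not None else torch.zeros(conv.out_channels)
    fused.bias.data = bn.bias.data + (conv_bias - bn.running_mean) * scale  # beta + (b - mean) * scale
    return fused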
In this template, the weights are saved together with the config and other objects in the .pth file; they can be separated using the following command.
python checkpoint_separator.py --model path_to_model.pth
The resulting weights file will be saved next to the model.
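Conceptually, the separation amounts to loading the full checkpoint and re-saving only the weights. A minimal sketch (assuming the weights are stored under the 'state_dict' key, as in the upstream pytorch-template):

import torch

checkpoint = torch.load('path_to_model.pth', map_location='cpu')
# Keep only the weights; drop config, optimizer state, and other objects
torch.save(checkpoint['state_dict'], 'path_to_model_weights.pth')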
config.json is a JSON file that describes the experiment to run; several examples are available under config_json.
- name: Name of the experiment; a timestamp will be appended to it using '_%d_%m_%H%M%S' as the format. This can be modified in parse_config.py.
- n_gpu: Total number of GPUs to use. If set to -1, all available GPUs will be used.
- dist_backend: Distributed backend to use. For more info, check the relevant PyTorch docs.
- dist_url: URL specifying how to initialize the process group. For more info, check the relevant PyTorch docs.
- arch: Defines the model architecture. The value should be a dictionary with two parameters, type and args. Type can be one of {VGG_net, QuantVGG_pure, vgg16, alexnet}:
  - VGG_net: Local implementation of VGG.
  - QuantVGG_pure: Quantized VGG using Xilinx Brevitas.
  - vgg16 & alexnet: The respective models taken from PyTorch.
Example of QuantVGG_pure
"arch": {
"type": "QuantVGG_pure",
"args": {
"VGG_type":"D",
"batch_norm":false,
"bit_width":16,
"num_classes":1000,
"pretrained_model":"Path to pretrained model or use pytorch to initialize with pytorch's version of the model"
}
}
Example of VGG_net
"arch": {
"type": "VGG_net",
"args": {
"in_channels":3,
"num_classes":1000,
"VGG_type":"D",
"batch_norm":true
}
}
Example of vgg16
"arch": {
"type": "vgg16",
"args": {
"pretrained":true,
"progress":true
}
}
Example of CIFAR_data_loader
"data_loader": {
"type": "CIFAR_data_loader",
"args":{
"data_dir": "/Path/to/Data_set",
"batch_size": 512,
"download": true,
"shuffle": true,
"validation_split": 0.1,
"num_workers": 5,
"flavor": 100,
"training": true
}
}
Example of ImageNet_data_loader
"data_loader": {
"type": "ImageNet_data_loader",
"args":{
"data_dir": "/Path/to/Data_set",
"batch_size": 512,
"shuffle": true,
"num_workers": 5,
"pin_memory":true,
"training": true
}
}
Example of CIFAR test data_loader (note that validation_split is 0.0 and training is false):
"data_loader": {
"type": "CIFAR_data_loader",
"args":{
"data_dir": "/Path/to/Data_set",
"batch_size": 512,
"download": true,
"shuffle": true,
"validation_split": 0.0,
"num_workers": 5,
"flavor": 100,
"training": false
}
}
TODO
Metrics used to evaluate the model. The currently defined metrics are accuracy and top_k_acc; the default value of k in top_k_acc is 5. New metrics can be defined under model/metric.py, as in the sketch below.
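A minimal sketch of a new metric following the same signature as the existing ones (the function name and the value of k are illustrative):

import torch

def top_3_acc(output, target):
    # Fraction of samples whose true label is among the 3 highest-scoring classes
    with torch.no_grad():
        pred = torch.topk(output, 3, dim=1)[1]
        correct = pred.eq(target.view(-1, 1)).sum().item()
    return correct / len(target)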
Example of StepLR
"lr_scheduler": {
"type": "StepLR",
"args": {
"step_size": 30,
"gamma": 0.1,
"verbose": true
}
}
Example of MultiStepLR
"lr_scheduler": {
"type": "MultiStepLR",
"args": {
"milestones": [60,120,160],
"gamma": 0.2,
"verbose": true
}
}
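In the upstream pytorch-template that this repo builds on, such type/args entries are instantiated by looking the type up in the corresponding PyTorch module; a sketch of the idea (an assumption about this repo's internals):

import torch

model = torch.nn.Linear(4, 2)  # stand-in model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

cfg = {"type": "StepLR", "args": {"step_size": 30, "gamma": 0.1}}
scheduler_cls = getattr(torch.optim.lr_scheduler, cfg["type"])  # e.g. torch.optim.lr_scheduler.StepLR
lr_scheduler = scheduler_cls(optimizer, **cfg["args"])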
Example of trainer config
"trainer": {
"epochs": 200,
"save_dir": "/Where/to/save/train_result",
"save_period": 200,
"verbosity": 2, // 0: quiet, 1: per epoch, 2: full
"monitor": "max val_accuracy",
"early_stop": -1,
"tensorboard": true
}
If set, the model parameters will be extracted during testing:
"extract": true,
Configurations to be used in the resulting config.h file:
"extractor": {
"PE": 1,
"SIMD": 1,
"DATAWIDTH": 64,
"SEQUENCE_LENGTH": 120000,
"CLASS_LABEL_BITS": 1,
"MUL_BITS": 16,
"MUL_INT_BITS": 8,
"ACC_BITS": 16,
"ACC_INT_BITS": 8,
"IA_BITS": 8,
"IA_INT_BITS": 4
}
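These values presumably become C macros in the generated config.h; a sketch of what the emission could look like (illustrative; the actual generator in parameters_extractor.py may differ):

# Hypothetical emitter: writes each extractor entry as a #define in config.h
extractor_cfg = {"PE": 1, "SIMD": 1, "DATAWIDTH": 64}
with open("config.h", "w") as f:
    for key, value in extractor_cfg.items():
        f.write(f"#define {key} {value}\n")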
- Generic model initialization from PyTorch
- If you get NCCL errors, rerun with export NCCL_DEBUG=WARN to get more detailed logs.
- If you get RuntimeError: CUDA error: all CUDA-capable devices are busy or unavailable while training with DDP, try setting pin_memory to false in config.json.