MMTracking provides various methods on existing benchmarks. Details about these methods and benchmarks are presented in model_zoo.md and dataset.md respectively. This note will show how to perform common tasks on existing models and standard datasets, including:
- Inference existing models on a given video or image folder.
- Test (inference and evaluate) existing models on standard datasets.
- Train existing models on standard datasets.
We provide demo scripts to perform inference on a given video or a folder that contains continuous images. The source code is available here.
Note that if you use a folder as the input, there should be only images in this folder and the image names must be sortable, which means we can re-order the images according to the filenames.
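If your frames are not already named in a sortable way, a small renaming pass fixes this. The snippet below is a minimal sketch, assuming frames named like `frame_2.jpg` whose numeric suffix gives the temporal order; the folder paths are placeholders, not part of MMTracking:

```python
import os
import shutil

# Hypothetical paths; adjust to your own data.
src_dir = 'data/my_frames'
dst_dir = 'data/my_frames_sorted'
os.makedirs(dst_dir, exist_ok=True)

# Sort by the numeric suffix of the filename (e.g. frame_2.jpg -> 2).
names = sorted(os.listdir(src_dir),
               key=lambda n: int(os.path.splitext(n)[0].split('_')[-1]))
for i, name in enumerate(names):
    ext = os.path.splitext(name)[1]
    # Zero-pad the new name so that lexicographic order equals temporal order.
    shutil.copy(os.path.join(src_dir, name),
                os.path.join(dst_dir, f'{i:06d}{ext}'))
```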
This script can inference an input video / images with a multiple object tracking model.
python demo/demo_mot.py \
${CONFIG_FILE} \
--input ${INPUT} \
[--output ${OUTPUT}] \
[--checkpoint ${CHECKPOINT_FILE}] \
[--device ${DEVICE}] \
[--backend ${BACKEND}] \
[--show]
The `INPUT` and `OUTPUT` support both mp4 video format and the folder format.
Optional arguments:
- `OUTPUT`: Output of the visualized demo. If not specified, `--show` must be set to show the video on the fly.
- `CHECKPOINT_FILE`: The checkpoint is optional in case you have already set up the pretrained models in the config by the key `pretrains`.
- `DEVICE`: The device for inference. Options are `cpu`, `cuda:0`, etc.
- `BACKEND`: The backend used to visualize the boxes. Options are `cv2` and `plt`.
- `--show`: Whether to show the video on the fly.
Examples:
python demo/demo_mot.py configs/mot/deepsort/sort_faster-rcnn_fpn_4e_mot17-private.py --input demo/demo.mp4 --output mot.mp4
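If you prefer calling the Python API over the demo script, the sketch below shows the general flow. It assumes the `init_model` and `inference_mot` helpers from `mmtrack.apis` and uses placeholder paths; check demo/demo_mot.py for the exact arguments used there.

```python
import mmcv
from mmtrack.apis import inference_mot, init_model

# Placeholder paths; replace with your own config / checkpoint / input.
config_file = 'configs/mot/deepsort/sort_faster-rcnn_fpn_4e_mot17-private.py'
input_video = 'demo/demo.mp4'

# Build the model from the config; the checkpoint can be omitted if the config
# already points to pretrained weights via the `pretrains` key.
model = init_model(config_file, checkpoint=None, device='cuda:0')

for frame_id, img in enumerate(mmcv.VideoReader(input_video)):
    # Track objects frame by frame; `result` holds the detected and tracked boxes.
    result = inference_mot(model, img, frame_id=frame_id)
```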
This script can inference an input video with a single object tracking model.
python demo/demo_sot.py \
${CONFIG_FILE} \
--input ${INPUT} \
--checkpoint ${CHECKPOINT_FILE} \
[--output ${OUTPUT}] \
[--device ${DEVICE}] \
[--show]
Optional arguments:
- `OUTPUT`: Output of the visualized demo. If not specified, `--show` must be set to show the video on the fly.
- `DEVICE`: The device for inference. Options are `cpu`, `cuda:0`, etc.
- `--show`: Whether to show the video on the fly.
Examples:
python ./demo/demo_sot.py \
./configs/sot/siamese_rpn/siamese_rpn_r50_1x_lasot.py \
--input ${VIDEO_FILE} \
--checkpoint ../mmtrack_output/siamese_rpn_r50_1x_lasot_20201218_051019-3c522eff.pth \
--output ${OUTPUT} \
--show
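The single object tracking API differs from the MOT one in that it needs an initial bounding box for the target in the first frame. A minimal sketch, assuming the `inference_sot` helper from `mmtrack.apis` and an `(x1, y1, x2, y2)` init box; the paths and box values are placeholders:

```python
import mmcv
from mmtrack.apis import inference_sot, init_model

# Placeholder paths and init box; replace with your own.
config_file = 'configs/sot/siamese_rpn/siamese_rpn_r50_1x_lasot.py'
checkpoint_file = 'checkpoints/siamese_rpn_r50_1x_lasot_20201218_051019-3c522eff.pth'
init_bbox = [100, 100, 200, 200]  # target box in the first frame (x1, y1, x2, y2)

model = init_model(config_file, checkpoint_file, device='cuda:0')

for frame_id, img in enumerate(mmcv.VideoReader('demo/demo.mp4')):
    # The init box is only used on the first frame; later frames track the target.
    result = inference_sot(model, img, init_bbox, frame_id=frame_id)
```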
This script can inference an input video with a video object detection model.
python demo/demo_vid.py \
${CONFIG_FILE} \
--input ${INPUT} \
--checkpoint ${CHECKPOINT_FILE} \
[--output ${OUTPUT}] \
[--device ${DEVICE}] \
[--show]
Optional arguments:
- `OUTPUT`: Output of the visualized demo. If not specified, `--show` must be set to show the video on the fly.
- `DEVICE`: The device for inference. Options are `cpu`, `cuda:0`, etc.
- `--show`: Whether to show the video on the fly.
Examples:
python ./demo/demo_vid.py \
./configs/vid/selsa/selsa_faster_rcnn_r101_dc5_1x_imagenetvid.py \
--input ${VIDEO_FILE} \
--checkpoint ../mmtrack_output/selsa_faster_rcnn_r101_dc5_1x_imagenetvid_20201218_172724-aa961bcc.pth \
--output ${OUTPUT} \
--show
This section will show how to test existing models on supported datasets. The following testing environments are supported:
- single GPU
- single node multiple GPU
- multiple nodes
During testing, different tasks share the same API and we only support `samples_per_gpu = 1`.
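For reference, the per-GPU batch size lives under the `data` section of the config. The fragment below is illustrative only (the `workers_per_gpu` value is an example, and a real config also defines the `train`/`val`/`test` datasets):

```python
# Illustrative config fragment: testing only supports one sample per GPU.
data = dict(
    samples_per_gpu=1,  # batch size per GPU
    workers_per_gpu=2,  # dataloader workers per GPU (example value)
)
```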
You can use the following commands for testing:
# single-gpu testing
python tools/test.py ${CONFIG_FILE} [--checkpoint ${CHECKPOINT_FILE}] [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]
# multi-gpu testing
./tools/dist_test.sh ${CONFIG_FILE} ${GPU_NUM} [--checkpoint ${CHECKPOINT_FILE}] [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]
Optional arguments:
- `CHECKPOINT_FILE`: Filename of the checkpoint. You do not need to define it when applying some MOT methods; in that case, specify the checkpoints in the config instead.
- `RESULT_FILE`: Filename of the output results in pickle format. If not specified, the results will not be saved to a file (see the loading sketch after this list).
- `EVAL_METRICS`: Items to be evaluated on the results. Allowed values depend on the dataset, e.g., `bbox` is available for ImageNet VID, `track` is available for LaSOT, and both `bbox` and `track` are suitable for MOT17.
- `--cfg-options`: If specified, the key-value pair optional cfg will be merged into the config file.
- `--eval-options`: If specified, the key-value pair optional eval cfg will be passed as kwargs to the dataset.evaluate() function; it is only used for evaluation.
- `--format-only`: If specified, the results will be formatted to the official format.
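If you save results with `--out results.pkl`, you can load them back for offline analysis. The sketch below assumes `mmcv.load`, which reads pickle files; the exact structure of the loaded object depends on the task and dataset, so inspect it before writing downstream code.

```python
import mmcv

# Load the results saved by tools/test.py via --out results.pkl.
results = mmcv.load('results.pkl')

# Typically a dict or a list of per-image results; print the type to inspect it.
print(type(results))
```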
Examples:
Assume that you have already downloaded the checkpoints to the directory `checkpoints/`.
- Test DFF on ImageNet VID, and evaluate the bbox mAP.

  python tools/test.py configs/vid/dff/dff_faster_rcnn_r101_dc5_1x_imagenetvid.py \
      --checkpoint checkpoints/dff_faster_rcnn_r101_dc5_1x_imagenetvid_20201218_172720-ad732e17.pth \
      --out results.pkl \
      --eval bbox

- Test DFF with 8 GPUs, and evaluate the bbox mAP.

  ./tools/dist_test.sh configs/vid/dff/dff_faster_rcnn_r101_dc5_1x_imagenetvid.py 8 \
      --checkpoint checkpoints/dff_faster_rcnn_r101_dc5_1x_imagenetvid_20201218_172720-ad732e17.pth \
      --out results.pkl \
      --eval bbox

- Test SiameseRPN++ on LaSOT, and evaluate the success and normed precision.

  python tools/test.py configs/sot/siamese_rpn/siamese_rpn_r50_1x_lasot.py \
      --checkpoint checkpoints/siamese_rpn_r50_1x_lasot_20201218_051019-3c522eff.pth \
      --out results.pkl \
      --eval track

- Test SiameseRPN++ with 8 GPUs, and evaluate the success and normed precision.

  ./tools/dist_test.sh configs/sot/siamese_rpn/siamese_rpn_r50_1x_lasot.py 8 \
      --checkpoint checkpoints/siamese_rpn_r50_1x_lasot_20201218_051019-3c522eff.pth \
      --out results.pkl \
      --eval track

- Test Tracktor on MOT17, and evaluate CLEAR MOT metrics.

  python tools/test.py configs/mot/tracktor/tracktor_faster-rcnn_r50_fpn_4e_mot17-public-half.py \
      --eval track

- Test Tracktor with 8 GPUs, and evaluate CLEAR MOT metrics.

  ./tools/dist_test.sh \
      configs/mot/tracktor/tracktor_faster-rcnn_r50_fpn_4e_mot17-public-half.py 8 \
      --eval track
MMTracking also provides out-of-the-box tools for training models. This section will show how to train predefined models (under `configs`) on standard datasets, e.g., MOT17.
By default, we evaluate the model on the validation set after each epoch. You can change the evaluation interval by adding the interval argument in the training config:
evaluation = dict(interval=12)  # This evaluates the model every 12 epochs.
Important: The default learning rate in config files is for 8 GPUs.
According to the Linear Scaling Rule, you need to set the learning rate proportional to the batch size if you use different GPUs or images per GPU, e.g., `lr=0.01` for 8 GPUs * 1 img/gpu and `lr=0.04` for 16 GPUs * 2 imgs/gpu.
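As a quick sanity check for this rule, the scaled learning rate is the base rate times the ratio of total batch sizes. The snippet below only reuses the example numbers from the note above:

```python
# Linear Scaling Rule: scale the learning rate with the total batch size.
base_lr = 0.01              # learning rate for 8 GPUs * 1 img/gpu
base_batch = 8 * 1

num_gpus, imgs_per_gpu = 16, 2  # your setup (example values)
new_lr = base_lr * (num_gpus * imgs_per_gpu) / base_batch
print(new_lr)  # 0.04
```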
python tools/train.py ${CONFIG_FILE} [optional arguments]
During training, log files and checkpoints will be saved to the working directory, which is specified by `work_dir` in the config file or via the CLI argument `--work-dir`.
We provide `tools/dist_train.sh` to launch training on multiple GPUs.
The basic usage is as follows.
bash ./tools/dist_train.sh \
${CONFIG_FILE} \
${GPU_NUM} \
[optional arguments]
Optional arguments remain the same as stated above.
If you would like to launch multiple jobs on a single machine, e.g., 2 jobs of 4-GPU training on a machine with 8 GPUs, you need to specify different ports (29500 by default) for each job to avoid communication conflict.
If you use `dist_train.sh` to launch training jobs, you can set the port in the commands.
CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_train.sh ${CONFIG_FILE} 4
CUDA_VISIBLE_DEVICES=4,5,6,7 PORT=29501 ./tools/dist_train.sh ${CONFIG_FILE} 4
MMTracking relies on the `torch.distributed` package for distributed training.
Thus, as a basic usage, one can launch distributed training via PyTorch's launch utility.
Slurm is a good job scheduling system for computing clusters.
On a cluster managed by Slurm, you can use `slurm_train.sh` to spawn training jobs. It supports both single-node and multi-node training.
The basic usage is as follows.
[GPUS=${GPUS}] ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} ${CONFIG_FILE} ${WORK_DIR}
You can check the source code to review full arguments and environment variables.
When using Slurm, the port option needs to be set in one of the following ways:
- Set the port through `--options`. This is recommended since it does not change the original configs.

  CUDA_VISIBLE_DEVICES=0,1,2,3 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config1.py ${WORK_DIR} --options 'dist_params.port=29500'
  CUDA_VISIBLE_DEVICES=4,5,6,7 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config2.py ${WORK_DIR} --options 'dist_params.port=29501'

- Modify the config files to set different communication ports.

  In `config1.py`, set

  dist_params = dict(backend='nccl', port=29500)

  In `config2.py`, set

  dist_params = dict(backend='nccl', port=29501)

  Then you can launch two jobs with `config1.py` and `config2.py`.

  CUDA_VISIBLE_DEVICES=0,1,2,3 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config1.py ${WORK_DIR}
  CUDA_VISIBLE_DEVICES=4,5,6,7 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config2.py ${WORK_DIR}
In this note, you will learn how to run inference, test, and train with customized datasets and models.
The basic steps are as below:
- Prepare the customized dataset (if applicable)
- Prepare the customized model (if applicable)
- Prepare a config
- Train, test, and run inference with the new models.
There are two ways to support a new dataset in MMTracking:
- Reorganize the dataset into CocoVID format.
- Implement a new dataset.
We generally recommend the first method, as it is usually easier than the second (a rough CocoVID skeleton is sketched below).
Details for customizing datasets are provided in tutorials/customize_dataset.md.
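To give a feel for the first option, below is a rough CocoVID-style skeleton written as a Python dict. Treat the exact field set as an assumption and follow tutorials/customize_dataset.md for the authoritative format; the key idea is that CocoVID extends COCO with a `videos` list, adds `video_id`/`frame_id` to images, and adds `instance_id` to annotations so that boxes of the same object can be linked across frames.

```python
# Rough CocoVID-style skeleton (illustrative only; see the dataset tutorial).
cocovid_ann = dict(
    categories=[dict(id=1, name='pedestrian')],
    videos=[dict(id=1, name='video_0001')],
    images=[
        # Each image records the video it belongs to and its frame index.
        dict(id=1, video_id=1, frame_id=0, file_name='video_0001/000000.jpg',
             width=1920, height=1080),
    ],
    annotations=[
        # instance_id ties boxes of the same object together across frames.
        dict(id=1, image_id=1, category_id=1, instance_id=1,
             bbox=[100, 150, 40, 80], area=3200, iscrowd=0),
    ],
)
```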
We provide instructions for customizing models of different tasks.
The next step is to prepare a config so that the dataset or the model can be successfully loaded. More details about the config system are provided at tutorials/config.md.
To train a model with the new config, you can simply run
python tools/train.py ${NEW_CONFIG_FILE}
For more detailed usages, please refer to the training instructions above.
To test the trained model, you can simply run
python tools/test.py ${NEW_CONFIG_FILE} ${TRAINED_MODEL} --eval bbox track
For more detailed usages, please refer to the testing or inference instructions above.