mtYOLO: A multi-task model to concurrently obtain the vital characteristics of individuals or animals

This is the official repository for
[2024 IEEE ICME Application/Industry Paper] mtYOLO: A multi-task model to concurrently obtain the vital characteristics of individuals or animals
Kian Eng ONG, Sivaji RETTA, Ramarajulu SRINIVASAN, Shawn TAN, Jun LIU
Information Systems Technology and Design, Singapore University of Technology and Design, Singapore
AnimalEYEQ Private Limited

Paper

[Coming Soon] 2024 IEEE International Conference on Multimedia and Expo (ICME) Official Application/Industry Paper

Citation

[Coming Soon]

Abstract

In multi-task learning, a model learns from several related tasks at the same time. Such a model is especially useful in practical real-world applications (e.g., autonomous driving, precision livestock farming), as it can perform inference for several tasks concurrently. In this work, we present mtYOLO, a single unified multi-task YOLOv8 model that is trained end-to-end and simultaneously outputs all the vital characteristics (e.g., size, keypoints) of a person or animal. Our experiments show that our multi-task YOLOv8 model takes less time to train and performs better than models trained on the individual tasks. Learning several tasks together can be mutually beneficial during training and improve performance; however, the tasks may sometimes conflict with one another and degrade performance. Hence, to further enhance the feature extraction capability of the multi-task model and allow it to learn better features from the various tasks, we incorporate the Efficient Channel Attention (ECA) mechanism into our unified multi-task model architecture. The ECA mechanism dynamically assigns larger weights to more important information and smaller weights to less relevant information. Our experiments show that ECA can improve the model's performance without compromising too much on compute time. Our code can be found at https://github.com/AnimalEyeQ/mtYOLO.
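To make the channel-weighting idea concrete, here is a minimal, dependency-free sketch of the general ECA computation: global average pooling per channel, a small 1D convolution across neighbouring channels, a sigmoid gate, and channel-wise rescaling. It is an illustration only and not the implementation used in this repository. The function names `eca` and `eca_kernel_size` are hypothetical, the kernel weights are passed in by hand rather than learned, and the adaptive kernel-size rule (with `gamma=2`, `b=1`) follows the original ECA-Net paper, which this repository may or may not use.

```python
import math


def eca_kernel_size(channels, gamma=2, b=1):
    """Adaptive 1D kernel size as proposed in the ECA-Net paper (assumed, not
    confirmed for this repository): truncate (log2(C) + b) / gamma, then
    round up to the next odd number if the result is even."""
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 else t + 1


def eca(feature_map, weights):
    """Apply channel attention to `feature_map`.

    feature_map: list of C channels, each an H x W nested list of floats.
    weights:     1D convolution kernel of odd length k (learned in a real
                 model; supplied by hand in this sketch).
    Returns (rescaled feature map, per-channel gates).
    """
    # 1. Squeeze: global average pooling gives one descriptor per channel.
    pooled = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]

    # 2. Local cross-channel interaction: 1D convolution over the channel
    #    axis with zero padding, so the output keeps C entries.
    pad = len(weights) // 2
    padded = [0.0] * pad + pooled + [0.0] * pad
    mixed = [
        sum(w * padded[i + j] for j, w in enumerate(weights))
        for i in range(len(pooled))
    ]

    # 3. Sigmoid turns each mixed descriptor into a gate in (0, 1).
    gates = [1.0 / (1.0 + math.exp(-m)) for m in mixed]

    # 4. Rescale every value in a channel by that channel's gate.
    scaled = [
        [[g * v for v in row] for row in ch]
        for g, ch in zip(gates, feature_map)
    ]
    return scaled, gates


if __name__ == "__main__":
    # Two 2x2 channels: one silent, one strongly activated.
    fmap = [[[0.0, 0.0], [0.0, 0.0]], [[2.0, 2.0], [2.0, 2.0]]]
    _, gates = eca(fmap, weights=[0.0, 1.0, 0.0])
    print(gates)  # the activated channel receives the larger gate
```

Because the only learnable part is a 1D kernel of a few weights, the attention step adds very little compute, which is consistent with the observation above that ECA does not compromise much on compute time.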

Model Architecture

[Figure: mtYOLO architecture]

Datasets

  • MS-COCO Person Multi-Task

    • Download images and annotations from here
    • We would like to thank Andy @yermandy for providing this dataset.
  • CattleEyeView dataset

  • The dataset configuration files are ./config/dataset/cattleeyeview_multitask.yaml (CattleEyeView) and ./config/dataset/coco_multitask.yaml (MS-COCO Person Multi-Task).

    • Instructions to modify the configurations can be found in each file.

Code

  • Run the following commands to install mtYOLOv8:

    cd ultralytics
    pip install -r requirements.txt
  • The mtYOLOv8 model configuration file and instructions to create other configuration files (e.g., pose, segment, without ECA) can be found in ./config/model/yolov8_multitask_cattleeyeview_ECA.yaml.

  • The code and instructions to train, validate or predict can be found in mtYOLO.ipynb.

  • The trained mtYOLOv8 models with ECA for MS-COCO Person Multi-Task and CattleEyeView can be found in ./model_checkpoint.
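
As a rough orientation before opening mtYOLO.ipynb, the sketch below shows how training might be started from Python. It assumes the bundled ultralytics fork keeps the upstream `YOLO(...)` constructor and `train(data=..., epochs=..., imgsz=...)` call; the fork may require additional arguments (for example a task name), so treat the notebook as the authoritative reference. The helper names `build_train_args` and `train`, and the default `epochs` and `imgsz` values, are hypothetical and are not the settings used in the paper. The two paths are the ones listed in this README.

```python
# Paths as given in this README.
MODEL_CFG = "./config/model/yolov8_multitask_cattleeyeview_ECA.yaml"
DATA_CFG = "./config/dataset/cattleeyeview_multitask.yaml"


def build_train_args(epochs=100, imgsz=640):
    """Collect training arguments.

    The defaults are placeholders for illustration, not the paper's
    hyperparameters.
    """
    return {"data": DATA_CFG, "epochs": epochs, "imgsz": imgsz}


def train(epochs=100, imgsz=640):
    """Build the model from its YAML config and start training.

    Assumes the fork exposes the same interface as upstream ultralytics.
    """
    # Imported lazily so this file can be loaded without the fork installed.
    from ultralytics import YOLO

    model = YOLO(MODEL_CFG)  # builds an untrained model from the YAML file
    return model.train(**build_train_args(epochs, imgsz))


if __name__ == "__main__":
    train()
```

To train on MS-COCO Person Multi-Task instead, point `DATA_CFG` at ./config/dataset/coco_multitask.yaml and use a matching model configuration created by following the instructions in the model YAML file.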

Acknowledgments

We would like to express our gratitude to