Implementation of augmentation, SMOTE, ADASYN, AE, and DGAN for oversampling image data used in detection (YOLO and other one-stage detectors). Maritime code flags scenario.

juliuszlosinski/Imbalanced-Data-Problem-for-Image-Detection-One-Stage-Detectors

Goal: Research the Impact of Imbalanced and Balanced Maritime Code Flag Datasets on the Performance of One-Stage Image Detectors (YOLO family, SSD).

Main metrics:

  • intersection over union (IoU),
  • precision and recall,
  • average precision (AP),
  • mean average precision (mAP),
  • F1 score (harmonic mean of precision and recall).
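
As a rough illustration of how these metrics relate, the sketch below computes IoU for two axis-aligned boxes and derives precision, recall, and F1 from detection counts. The function names and the (x1, y1, x2, y2) box layout are assumptions made for this example; they are not taken from the repository's code.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2) corner coordinates."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 from true-positive, false-positive, and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

A detection usually counts as a true positive when its IoU with a ground-truth box of the same class exceeds a chosen threshold (0.5 is a common choice); AP and mAP are then built on the resulting precision-recall curve.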

YOLO detectors:

  • YOLOv11 (YOLOv11m, the medium version) - latest stable version (REQUIRED FOR RESEARCH),
  • YOLOv8 (YOLOv8m, the medium version) (REQUIRED FOR RESEARCH).

SSD detectors:

  • SSD300 - latest stable version (REQUIRED FOR RESEARCH),
  • SSD512 - larger input resolution (ADDITIONAL FOR RESEARCH).

1. UML

[Figure: UML diagram]

2. Project Organization

├── documentation       <- UML diagrams and configuration
├── balancers           <- Package with balancers and utils
│   ├── __init__.py     <- Package marker
│   ├── smote.py        <- SMOTE balancer (interpolation)
│   ├── adasyn.py       <- ADASYN balancer (interpolation)
│   ├── augmentation.py <- Augmentation balancer (augmenting images like rotations, etc.)
│   ├── autoencoder.py  <- Autoencoder balancer (learning needed!)
│   ├── dgan.py         <- DGAN balancer (learning needed!)
│   ├── balancer.py     <- General balancer with all balancers (aggregating all of the above)
│   ├── annotations.py  <- Annotations module
│   └── configuration_reader.py  <- Balancer configuration reader
├── maritime-flags-dataset    <- Source and balanced flags (A-Z)
│   ├── ADASYN_balanced_flags <- Balanced flags by using ADASYN balancer
│   ├── SMOTE_balanced_flags  <- Balanced flags by using SMOTE balancer
│   ├── AUGMENTATION_balanced_flags  <- Balanced flags by using Augmentation balancer
│   ├── DGAN_balanced_flags  <- Balanced flags by using DGAN balancer
│   ├── AE_balanced_flags    <- Balanced flags by using Autoencoder balancer
│   ├── combined_flags       <- Combined/test images 
│   ├── two_flags            <- Two balanced flags (A and B), 1000 images each
│   └── imbalanced_flags     <- Source folder with imbalanced flags
├── datasets   <- YOLO-formatted datasets (the detector looks in this directory by default!)
│   ├── yolo-maritime-flags-dataset (A-Z)
│   │   ├── images
│   │   │   ├── train <- Training images (.jpg)
│   │   │   ├── val   <- Validation images (.jpg)
│   │   │   └── test  <- Testing images (.jpg)
│   │   └── labels
│   │       ├── train <- Training labels (.txt)
│   │       ├── val   <- Validation labels (.txt)
│   │       └── test  <- Testing labels (.txt)
│   └── cross-validation-yolo-formatted-maritime-flags (A-Z)
│       ├── images
│       │   ├── fold_1  <- First fold with images (.jpg)
│       │   │   ├── train <- Training images (.jpg)
│       │   │   └── val   <- Validation images (.jpg)
│       │   ├── ...     <- ... fold with images (.jpg)
│       │   │   ├── train <- Training images (.jpg)
│       │   │   └── val   <- Validation images (.jpg)
│       │   └── fold_n  <- N-th fold with images (.jpg)
│       │       ├── train <- Training images (.jpg)
│       │       └── val   <- Validation images (.jpg)
│       └── labels
│           ├── fold_1  <- First fold with labels (.txt)
│           │   ├── train <- Training labels (.txt)
│           │   └── val   <- Validation labels (.txt)
│           ├── ...     <- ... fold with labels (.txt)
│           │   ├── train <- Training labels (.txt)
│           │   └── val   <- Validation labels (.txt)
│           └── fold_n  <- N-th fold with labels (.txt)
│               ├── train <- Training labels (.txt)
│               └── val   <- Validation labels (.txt)
├── .gitignore        <- Keeps the venv_environment directory from being pushed (VENV)
├── test_packages.py  <- Tests that all necessary packages, such as Torch, can be loaded (VENV)
├── python_3.11_venv_requirements.txt <- List of all packages used in the venv (VENV)
├── balance.py       <- Balances the dataset using the balancers package (BALANCING)
├── balancer_configuration.json <- Balancer configuration
├── detection.py     <- Trains and tests the YOLO detector on balanced/imbalanced data (EVALUATING)
├── yolo_detector.py <- YOLO detector (DETECTING)
├── yolo_data.yaml   <- YOLO data configuration (training and testing)
├── fold_1_dataset.yaml   <- YOLO data configuration for the first fold (k-fold cross-validation)
├── ...                   <- YOLO data configuration for fold ... (k-fold cross-validation)
└── fold_n_dataset.yaml   <- YOLO data configuration for fold n (k-fold cross-validation)
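
The fold_1 ... fold_n layout in the tree can be produced by a plain k-fold split over image file names. The following is a minimal sketch under that assumption; `k_fold_split` is a hypothetical helper, and how the repository itself assigns images to folds is not shown here.

```python
import random


def k_fold_split(names, n_folds, seed=0):
    """Return a list of (train, val) file-name lists, one pair per fold."""
    shuffled = sorted(names)
    random.Random(seed).shuffle(shuffled)
    # Deal the names round-robin so that fold sizes differ by at most one.
    folds = [shuffled[i::n_folds] for i in range(n_folds)]
    splits = []
    for i, val in enumerate(folds):
        # Every fold except the i-th one goes into the training set.
        train = [name for j, fold in enumerate(folds) if j != i for name in fold]
        splits.append((train, val))
    return splits
```

Because a YOLO label file shares its stem with the image it describes, the same split can be applied to the labels directory by swapping the file extension.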

3. Balancing approaches
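
All of the approaches in this section oversample: they add synthetic images to under-represented classes. A minimal sketch of the bookkeeping this implies, assuming the target is the size of the largest class (the repository's actual target may be set differently, for example through its balancer configuration); `images_needed` is a hypothetical helper.

```python
from collections import Counter


def images_needed(labels):
    """Map each class to the number of synthetic images it lacks.

    labels: one class name per image, e.g. ["A", "A", "B", ...].
    Assumption: every class is brought up to the size of the largest class.
    """
    counts = Counter(labels)
    if not counts:
        return {}
    target = max(counts.values())
    return {cls: target - n for cls, n in counts.items()}
```

For example, five images of flag A, two of flag B, and one of flag C give a deficit of 0, 3, and 4 images respectively.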

3.1. Augmentation

[2 figures: augmentation balancer]
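
For detection data, a geometric augmentation has to transform the bounding boxes together with the pixels. The sketch below rotates an image by 90 degrees clockwise and updates YOLO-format labels (class, center x, center y, width, height, all normalized to the image size). It is an illustration only: the image is a plain list of pixel rows, and `rotate_90_clockwise` is a hypothetical name, not a function from `augmentation.py`.

```python
def rotate_90_clockwise(image, labels):
    """Rotate an image (list of pixel rows) and its YOLO labels by 90 degrees clockwise.

    labels: list of (class_id, cx, cy, w, h), all normalized to [0, 1].
    """
    # Reversing the rows and transposing moves the bottom-left pixel to the top-left.
    rotated = [list(column) for column in zip(*image[::-1])]
    # A point (cx, cy) moves to (1 - cy, cx); box width and height swap.
    new_labels = [(c, 1.0 - cy, cx, h, w) for c, cx, cy, w, h in labels]
    return rotated, new_labels
```

Rotations by arbitrary angles need more care, because the rotated box is no longer axis-aligned and has to be replaced by its enclosing rectangle.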

3.2. SMOTE

[3 figures: SMOTE balancer]
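
SMOTE creates a synthetic sample by interpolating between a minority-class sample and one of its nearest neighbours from the same class. A minimal sketch of that step, assuming each image has already been flattened into a list of pixel values; `smote_sample` is a hypothetical helper, not the interface of `smote.py`.

```python
import random


def smote_sample(samples, k=5, rng=None):
    """Create one synthetic vector from same-class `samples` (lists of floats)."""
    if len(samples) < 2:
        raise ValueError("SMOTE needs at least two samples of the class")
    rng = rng or random.Random()
    base = rng.choice(samples)
    # Rank the other samples by squared Euclidean distance to the base sample.
    others = [s for s in samples if s is not base]
    others.sort(key=lambda s: sum((a - b) ** 2 for a, b in zip(base, s)))
    neighbour = rng.choice(others[:k])
    # Pick a random point on the segment between the base and its neighbour.
    lam = rng.random()
    return [a + lam * (b - a) for a, b in zip(base, neighbour)]
```

Every synthetic pixel value lies between the two source values, which is why SMOTE images tend to look like blends of two originals.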

3.3. ADASYN

[2 figures: ADASYN balancer]
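
ADASYN interpolates like SMOTE but generates more synthetic samples around minority samples that are surrounded by other classes. The sketch below covers only that allocation step, assuming flattened image vectors; `adasyn_allocation` is a hypothetical helper, not the interface of `adasyn.py`.

```python
def adasyn_allocation(minority, majority, total, k=5):
    """Return how many synthetic samples to generate around each minority sample."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    ratios = []
    for x in minority:
        # Tag each neighbour candidate: 0 = minority, 1 = majority.
        pool = [(dist2(x, m), 0) for m in minority if m is not x]
        pool += [(dist2(x, m), 1) for m in majority]
        pool.sort()
        # Share of majority samples among the k nearest neighbours.
        ratios.append(sum(flag for _, flag in pool[:k]) / k)
    norm = sum(ratios)
    if norm == 0:
        # No minority sample has majority neighbours: spread the budget evenly.
        return [round(total / len(minority))] * len(minority)
    return [round(total * r / norm) for r in ratios]
```

Because each share is rounded separately, the allocations may not add up exactly to `total`.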

3.4. Autoencoder

[4 figures: autoencoder balancer]

3.5. Deep Convolutional GAN (DGAN)

[2 figures: DGAN balancer]

4. Detectors

4.1. YOLO family

[Figure: YOLO family]

4.2. SSD

[Figure: SSD]

4.3. YOLO vs SSD

[Figure: YOLO vs SSD comparison]
