
# Representative Color Transform for Image Enhancement

An unofficial PyTorch implementation of Kim et al., "Representative Color Transform for Image Enhancement", ICCV 2021.

For more information about our implementation, you can also read our blog.

## RCTNet

Kim et al. (2021) introduce a novel image enhancement approach, Representative Color Transforms (RCT), which offers a large capacity for color transformations. The proposed network comprises four components: encoder, feature fusion, global RCT, and local RCT, as depicted in Figure 1. First, the encoder extracts high-level context information, which is in turn used to determine representative colors and their transformed (RGB) counterparts for the input image. Next, an attention mechanism maps each pixel color in the input image to the representative colors by computing their similarity. Finally, representative color transforms are applied using both coarse- and fine-scale features from the feature-fusion component, yielding enhanced images from the global and local RCT modules, which are combined to produce the final output.

*Figure 1: An overview of RCTNet's architecture.*
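
The core of both RCT modules is an attention step that maps every pixel to a small set of representative colors. The snippet below is only an illustrative sketch of that step (a simplified global RCT), assuming per-pixel features, representative colors in feature space, and their transformed RGB counterparts have already been produced by the encoder and feature-fusion stages; the tensor names and shapes are ours, not the paper's exact notation or the code in this repository.

```python
import torch


def global_rct(feats, rep_feats, rep_colors):
    """Simplified global RCT: attention between pixels and representative colors.

    feats:      (B, C, H, W) per-pixel features from the feature-fusion stage
    rep_feats:  (B, C, N)    N representative colors in feature space
    rep_colors: (B, 3, N)    corresponding transformed colors in RGB
    returns:    (B, 3, H, W) coarsely enhanced image
    """
    b, c, h, w = feats.shape
    queries = feats.flatten(2).transpose(1, 2)                # (B, H*W, C)
    attn = torch.softmax(queries @ rep_feats / c ** 0.5, -1)  # (B, H*W, N)
    out = attn @ rep_colors.transpose(1, 2)                   # (B, H*W, 3)
    return out.transpose(1, 2).reshape(b, 3, h, w)
```

The local RCT module follows the same pattern on fine-scale features, predicting representative colors per spatial region; its output is fused with the global result to form the final image.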

## Experiments

### Dataset

The LOw-Light (LOL) dataset [Wei et al. (2018)] for image enhancement in low-light scenarios was used for our experiments. It comprises a training partition of 485 low-/normal-light image pairs and a test partition of 15 such pairs. All images have a resolution of 400×600. For training, every image pair was randomly cropped and rotated by a multiple of 90 degrees, as sketched below.
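
A minimal sketch of this paired augmentation is given below; the crop size is a placeholder (the repository may use a different value), and the same crop and rotation must be applied to both images of a pair.

```python
import torch


def paired_augment(low, high, crop=256):
    """Random crop plus rotation by a multiple of 90 degrees, applied
    identically to a low-light input and its normal-light ground truth.

    low, high: (3, H, W) tensors; crop=256 is an assumed patch size.
    """
    _, h, w = low.shape
    top = int(torch.randint(0, h - crop + 1, (1,)))
    left = int(torch.randint(0, w - crop + 1, (1,)))
    low = low[:, top:top + crop, left:left + crop]
    high = high[:, top:top + crop, left:left + crop]
    k = int(torch.randint(0, 4, (1,)))  # rotate by k * 90 degrees
    return torch.rot90(low, k, dims=(1, 2)), torch.rot90(high, k, dims=(1, 2))
```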

### Quantitative Evaluation

Table 1 reports the PSNR and SSIM scores of our implementation of RCTNet, together with the results of competing image enhancement methods and of the official implementation, as reported in Kim et al. (2021). Our results fall short of those reported for the official implementation on both metrics.

| Method | PSNR | SSIM |
| --- | --- | --- |
| NPE [Wang et al. (2013)] | 16.97 | 0.589 |
| LIME [Guo et al. (2016)] | 15.24 | 0.470 |
| SRIE [Fu et al. (2016)] | 17.34 | 0.686 |
| RRM [Li et al. (2016)] | 17.34 | 0.686 |
| SICE [Cai et al. (2018)] | 19.40 | 0.690 |
| DRD [Wei et al. (2018)] | 16.77 | 0.559 |
| KinD [Zhang et al. (2019)] | 20.87 | 0.802 |
| DRBN [Yang et al. (2020)] | 20.13 | **0.830** |
| ZeroDCE [Guo et al. (2020)] | 14.86 | 0.559 |
| EnlightenGAN [Jiang et al. (2021)] | 15.34 | 0.528 |
| RCTNet [Kim et al. (2021)] | 22.67 | 0.788 |
| RCTNet (ours)* | 19.96 | 0.768 |
| RCTNet + BF [Kim et al. (2021)] | **22.81** | 0.827 |

*Table 1: Quantitative comparison on the LOL dataset. The best results are boldfaced.*
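
For reference, the two metrics in Table 1 can be computed per image pair as sketched below; this assumes a recent scikit-image (older versions take `multichannel=True` instead of `channel_axis`) and is not necessarily the exact code used by our evaluation script.

```python
from skimage.metrics import peak_signal_noise_ratio, structural_similarity


def score_pair(pred, target):
    """PSNR and SSIM for one enhanced image against its ground truth.

    pred, target: H x W x 3 uint8 arrays (e.g. loaded with PIL and np.asarray).
    """
    psnr = peak_signal_noise_ratio(target, pred, data_range=255)
    ssim = structural_similarity(target, pred, channel_axis=-1, data_range=255)
    return psnr, ssim
```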

Interestingly, the results of Table 1 change significantly when the augmentations proposed by the authors (random cropping and random rotation by a multiple of 90 degrees) are also applied during evaluation. This indicates that the model favours augmented images, since during training we augmented every input image in every epoch. While the authors mention the same augmentations, they do not specify how frequently they were applied. The effect is evident in the quantitative results obtained when augmentations are applied to the test images, shown in Table 2.

| Evaluation Metric | Mean | Standard Deviation | Max | Min |
| --- | --- | --- | --- | --- |
| PSNR | 20.522 | 0.594 | 22.003 | 18.973 |
| SSIM | 0.816 | 0.009 | 0.839 | 0.787 |

*Table 2: Mean, standard deviation, maximum, and minimum values of PSNR and SSIM over 100 runs with different random seeds, when augmentations are also applied to the test set, using our implementation of RCTNet.*
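
A sketch of how such a multi-seed study can be aggregated is shown below; `evaluate` stands for any callable that runs one pass over the augmented test set and returns (PSNR, SSIM), and is an assumed placeholder rather than a function in this repository.

```python
import statistics

import torch


def seed_study(evaluate, n_seeds=100):
    """Repeat an evaluation pass under different seeds and aggregate the metrics.

    evaluate: assumed callable returning (psnr, ssim) for one pass over the
    augmented test set; the seed controls the random crop/rotation draws.
    """
    psnrs, ssims = [], []
    for seed in range(n_seeds):
        torch.manual_seed(seed)
        p, s = evaluate()
        psnrs.append(p)
        ssims.append(s)

    def stats(xs):
        return statistics.mean(xs), statistics.stdev(xs), max(xs), min(xs)

    return stats(psnrs), stats(ssims)  # (mean, std, max, min) per metric
```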

### Qualitative Evaluation

Some enhancement results of our RCTNet implementation are shown below, alongside the low-light inputs and the normal-light ground-truth images. These examples show that RCTNet has successfully learned to enhance low-light images, achieving results comparable to the ground truth in terms of exposure and color tones. Nevertheless, the produced images are slightly less saturated and noise is more prominent. We conjecture that training the network for more epochs could alleviate some of these limitations. We also observe that RCTNet fails to extract representative colors that appear only in small regions of the input image (e.g. the green color in the 4th image).

*(Image grid: Input | RCTNet | Ground-Truth)*

## Scripts

### Training

  1. Download the official LoL dataset from here.
  2. Use the train.py script
```
$ python train.py -h
usage: train.py [-h] --images IMAGES --targets TARGETS [--epochs EPOCHS] [--batch_size BATCH_SIZE] [--lr LR]
                [--weight_decay WEIGHT_DECAY] [--config CONFIG] [--checkpoint CHECKPOINT]
                [--checkpoint_interval CHECKPOINT_INTERVAL] [--device {cpu,cuda}]

options:
  -h, --help            show this help message and exit
  --images IMAGES       Path to the directory of images to be enhanced
  --targets TARGETS     Path to the directory of groundtruth enhanced images
  --epochs EPOCHS       Number of epochs
  --batch_size BATCH_SIZE
                        Number of samples per minibatch
  --lr LR               Initial Learning rate of Adam optimizer
  --weight_decay WEIGHT_DECAY
                        Weight decay of Adam optimizer
  --config CONFIG       Path to configurations file for the RCTNet model
  --checkpoint CHECKPOINT
                        Path to previous checkpoint
  --checkpoint_interval CHECKPOINT_INTERVAL
                        Interval for saving checkpoints
  --device {cpu,cuda}   Device to use for training
```

### Evaluation

  1. Download the official LoL dataset from here.
  2. Download the weights for our pre-trained model from here.
  3. Use the eval.py script.
```
$ python eval.py -h
usage: eval.py [-h] --images IMAGES --targets TARGETS --save SAVE --checkpoint CHECKPOINT [--config CONFIG]
               [--batch_size BATCH_SIZE] [--nseeds NSEEDS] [--device {cpu,cuda}]

options:
  -h, --help            show this help message and exit
  --images IMAGES       Path to the directory of images to be enhanced
  --targets TARGETS     Path to the directory of groundtruth enhanced images
  --save SAVE           Path to the save plots and log file with metrics
  --checkpoint CHECKPOINT
                        Path to the checkpoint
  --config CONFIG       Path to configurations file for the RCTNet model
  --batch_size BATCH_SIZE
                        Number of samples per minibatch
  --nseeds NSEEDS       Number of seeds to run evaluation for, in range [0 .. 1000]
  --device {cpu,cuda}   Device to use for training
```

### Enhance images

  1. Download the weights for our pre-trained model from here.
  2. Use the enhance.py script.
```
$ python enhance.py -h
usage: enhance.py [-h] --image IMAGE --checkpoint CHECKPOINT [--config CONFIG] [--batch_size BATCH_SIZE]
                  [--device {cpu,cuda}]

options:
  -h, --help            show this help message and exit
  --image IMAGE         Path to an image or a directory of images to be enhanced
  --checkpoint CHECKPOINT
                        Path to previous checkpoint
  --config CONFIG       Path to configurations file for the RCTNet model
  --batch_size BATCH_SIZE
                        Number of samples per minibatch
  --device {cpu,cuda}   Device to use for training
```

## References

```
@inproceedings{9710400,
  author={Kim, Hanul and Choi, Su-Min and Kim, Chang-Su and Koh, Yeong Jun},
  booktitle={2021 IEEE/CVF International Conference on Computer Vision (ICCV)},
  title={Representative Color Transform for Image Enhancement},
  year={2021},
  pages={4439-4448},
  doi={10.1109/ICCV48922.2021.00442}
}

@inproceedings{Chen2018Retinex,
  title={Deep Retinex Decomposition for Low-Light Enhancement},
  author={Wei, Chen and Wang, Wenjing and Yang, Wenhan and Liu, Jiaying},
  booktitle={British Machine Vision Conference},
  year={2018}
}
```