Marc Górriz | Marta Mrak | Alan F. Smeaton | Noel E. O’Connor
A joint collaboration between:
BBC Research & Development | Dublin City University (DCU) | Insight Centre for Data Analytics
In this work, recent advances in conditional adversarial networks are investigated to develop an end-to-end architecture based on Convolutional Neural Networks (CNNs) that directly maps realistic colours onto an input greyscale image. Observing that existing colourisation methods sometimes exhibit a lack of colourfulness, this work proposes a method to improve colourisation results. In particular, the method uses Generative Adversarial Networks (GANs) and focuses on improving training stability to enable better generalisation on large multi-class image datasets. Additionally, instance and batch normalisation layers are integrated into both the generator and the discriminator of the popular U-Net architecture, boosting the network's ability to generalise the style changes of the content. The method has been tested on the ILSVRC 2012 dataset, achieving improved automatic colourisation results compared to other GAN-based methods.
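To make the architectural idea concrete, below is a minimal sketch of a shallow U-Net-style generator that mixes instance and batch normalisation. It is illustrative only, assuming tf.keras plus tensorflow-addons for InstanceNormalization; the actual network in this repo is deeper and uses the bundled layers under models/layers.

# Illustrative sketch, not the repo's code: assumes tf.keras and
# tensorflow-addons for InstanceNormalization.
import tensorflow as tf
from tensorflow.keras import layers
import tensorflow_addons as tfa

def encoder_block(x, filters, use_instance_norm=True):
    # Downsampling: strided convolution, normalisation, LeakyReLU.
    x = layers.Conv2D(filters, 4, strides=2, padding="same")(x)
    norm = tfa.layers.InstanceNormalization() if use_instance_norm \
        else layers.BatchNormalization()
    x = norm(x)
    return layers.LeakyReLU(0.2)(x)

def decoder_block(x, skip, filters):
    # Upsampling: transposed convolution, batch norm, skip connection.
    x = layers.Conv2DTranspose(filters, 4, strides=2, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    return layers.Concatenate()([x, skip])

# The generator maps the L (luminance) channel to the two ab chrominance channels.
inp = layers.Input(shape=(256, 256, 1))
e1 = encoder_block(inp, 64)    # 128x128x64
e2 = encoder_block(e1, 128)    # 64x64x128
d1 = decoder_block(e2, e1, 64) # 128x128x128 after the skip concatenation
d2 = layers.Conv2DTranspose(32, 4, strides=2, padding="same", activation="relu")(d1)
out = layers.Conv2D(2, 1, activation="tanh")(d2)  # ab channels in [-1, 1]
generator = tf.keras.Model(inp, out)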
2019 IEEE 21st International Workshop on Multimedia Signal Processing (MMSP). Find the paper describing our work on IEEE Xplore and arXiv.
Please cite with the following BibTeX code:
@inproceedings{blanch2019end,
title={End-to-End Conditional GAN-based Architectures for Image Colourisation},
author={Blanch, Marc G{\'o}rriz and Mrak, Marta and Smeaton, Alan F and O'Connor, Noel E},
booktitle={2019 IEEE 21st International Workshop on Multimedia Signal Processing (MMSP)},
pages={1--6},
year={2019},
organization={IEEE}
}
The model is implemented in Keras, which in turn runs on top of TensorFlow. The code is compatible with Python 3.6.
pip install -r https://github.com/marc-gorriz/ColorGAN/blob/master/requeriments.txt
Import the open-source libraries for instance and spectral normalisation; refer to the models/layers directory.
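For illustration, usage might look like the following; the import paths and class names (InstanceNormalization, ConvSN2D) are assumptions based on common open-source implementations, so check models/layers for the exact names.

# Illustrative only: import paths and class names are assumptions,
# check the models/layers directory for the exact module names.
from keras.layers import Input, Conv2D
from models.layers.instance_normalization import InstanceNormalization
from models.layers.spectral_normalization import ConvSN2D

x = Input(shape=(256, 256, 1))
h = Conv2D(64, 3, padding="same")(x)
h = InstanceNormalization()(h)          # per-sample, per-channel statistics
h = ConvSN2D(64, 3, padding="same")(h)  # spectrally normalised convolution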
Training examples are generated from the ImageNet dataset, specifically from the 1,000 synsets selected for the ImageNet Large Scale Visual Recognition Challenge 2012. Samples are selected from the reduced validation set, containing 50,000 RGB images uniformly distributed as 50 images per class. The test dataset is created by randomly selecting 10 images per class from the training set, yielding 10,000 examples. All images are resized to 256×256 pixels and converted to the CIE Lab colour space.
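As a sketch of this preprocessing step, the following uses scikit-image to resize an image and convert it to CIE Lab; the scaling constants are a common convention and may differ from the repo's own pipeline.

# Sketch of the preprocessing described above; the repo's pipeline may differ.
import numpy as np
from skimage import color, io, transform

def preprocess(path):
    rgb = io.imread(path)
    if rgb.ndim == 2:  # some ImageNet images are greyscale
        rgb = color.gray2rgb(rgb)
    rgb = transform.resize(rgb, (256, 256), anti_aliasing=True)  # floats in [0, 1]
    lab = color.rgb2lab(rgb)
    L = lab[..., :1] / 100.0   # luminance scaled to [0, 1]
    ab = lab[..., 1:] / 128.0  # chrominance roughly in [-1, 1]
    return L.astype(np.float32), ab.astype(np.float32)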
Make sure the data path has the following tree structure:
data
|---- train
|     |---- 0
|     |     |---- img0.png
|     |     |     …
|     |     |---- img49.png
|     |     …
|     |---- 999
|     |     |---- img0.png
|     |     |     …
|     |     |---- img49.png
|---- test
|     |---- 0
|     |     |---- img0.png
|     |     |     …
|     |     |---- img9.png
|     |     …
|     |---- 999
|     |     |---- img0.png
|     |     |     …
|     |     |---- img9.png
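As a quick sanity check of the tree above (1,000 class folders with 50 training and 10 test images each), something like the following can be run; it assumes the data root is ./data.

# Verifies the expected directory layout; assumes the data root is ./data.
import os

def check_split(root, split, expected):
    # Each split should contain 1,000 class folders (0 ... 999).
    split_dir = os.path.join(root, split)
    classes = sorted(os.listdir(split_dir))
    assert len(classes) == 1000, f"{split}: expected 1000 classes, got {len(classes)}"
    for c in classes:
        n = len(os.listdir(os.path.join(split_dir, c)))
        assert n == expected, f"{split}/{c}: expected {expected} images, got {n}"

check_split("data", "train", 50)
check_split("data", "test", 10)
print("Directory structure looks correct.")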
- Make a new configuration file based on the available templates and save it into the config directory (an illustrative template is sketched after this list). Make sure to launch all processes on a GPU.
- To train a new model, run python main.py --config config/[config file].py --action train.
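For orientation, a configuration file might look like the sketch below; every key name here is hypothetical, so treat the templates shipped in the config directory as the authoritative reference.

# Hypothetical configuration template: all key names below are illustrative
# assumptions, not the repo's actual schema.
config = {
    "data_path": "data",    # root of the train/test tree above
    "image_size": 256,      # images are resized to 256x256
    "batch_size": 16,       # assumption: tune to your GPU memory
    "epochs": 30,           # assumption
    "learning_rate": 2e-4,  # common choice for GAN training
    "gpu": 0,               # processes should run on a GPU
}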
This work has been conducted within the JOLT project. This project is funded by the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 765140.
JOLT Project | European Commission
If you have any general questions about our work or code that may be of interest to other researchers, please use the public Issues section of this GitHub repo. Alternatively, drop us an e-mail at [email protected].