This is the official repository of this paper:
DeepSEE: Deep Disentangled Semantic Explorative Extreme Super-Resolution
Marcel Bühler, Andrés Romero, and Radu Timofte.
Computer Vision Lab, ETH Zurich, Switzerland
Abstract: Super-resolution (SR) is by definition ill-posed. There are infinitely many plausible high-resolution variants for a given low-resolution natural image. Most of the current literature aims at a single deterministic solution of either high reconstruction fidelity or photo-realistic perceptual quality. In this work, we propose an explorative facial super-resolution framework, DeepSEE, for Deep disentangled Semantic Explorative Extreme super-resolution. To the best of our knowledge, DeepSEE is the first method to leverage semantic maps for explorative super-resolution. In particular, it provides control of the semantic regions, their disentangled appearance and it allows a broad range of image manipulations. We validate DeepSEE on faces, for up to 32x magnification and exploration of the space of super-resolution.
25. Nov 2020: Updated Demo notebook. You can now run our demo in Google Colab.
19. Nov 2020: Demo notebook and checkpoints released.
8. June 2020: Training code released.
git clone https://github.com/mcbuehler/DeepSEE
cd DeepSEE/
pip install -r requirements.txt
- Paper / supplementary material: Please download them from our Project Page.
- Checkpoints: Download and unpack the model checkpoints to the folder checkpoints: https://drive.google.com/drive/folders/1zZdvCaPExdM51Znw9-4Ku2PDpL0P1Klf?usp=sharing
- Demo data: Download and unpack the demo data to the folder demo_data: https://drive.google.com/drive/folders/1shKT0pDmIPZDpBvTgQhP-Q8juqD_Xsyu?usp=sharing
We provide example training scripts for CelebA and CelebAMask-HQ in scripts/train. If you want to train on your own dataset, you can adjust one of these scripts.
- Make sure to have a dataset consisting of images and segmentation masks, as described below.
- Set the correct paths (and other options, if applicable) in the training script. We list the training commands for the independent and the guided model in the same script. You can uncomment the one that you want to train. Example training command:
sh ./scripts/train/train_8x_256x256.sh
If you interrupt training and want to restart, make sure to use the flag --continue_train. Otherwise, your previous checkpoints might be overwritten.
Note when Training Models for 32x Upscaling
Models for extreme upscaling require significant GPU memory. You will need two V100 GPUs with 16 GB of memory each (or similar).
You can enable model parallelism by setting --model_parallel_mode 1. This computes the first part of the forward pass on one GPU and the second part on the second GPU. This is already pre-set in scripts/train/train_32x_512x512.sh.
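For reference, the snippet below is a minimal sketch of this kind of two-GPU model parallelism in PyTorch. The class name and the split point are hypothetical and do not correspond to the repository's actual implementation of --model_parallel_mode.

```python
import torch.nn as nn


class TwoGPUModel(nn.Module):
    """Hypothetical example: run the first stage on cuda:0 and the second on cuda:1."""

    def __init__(self, first_stage: nn.Module, second_stage: nn.Module):
        super().__init__()
        self.first_stage = first_stage.to("cuda:0")    # e.g. encoder + early generator blocks
        self.second_stage = second_stage.to("cuda:1")  # e.g. remaining generator blocks

    def forward(self, x):
        h = self.first_stage(x.to("cuda:0"))      # part 1 of the pass on the first GPU
        return self.second_stage(h.to("cuda:1"))  # move activations, part 2 on the second GPU
```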
- Obtain images: Download the dataset CelebAMask-HQ.zip from the authors' GitHub repository and extract the contents. For the guided model, you also need the identities. You can download the pre-computed annotations here. Alternatively, you can recompute them via python data/celebamaskhq_compute_identities_file.py. You will find the required mapping files in CelebAMask-HQ.zip.
- Obtain the semantic masks predicted from downscaled images. You have two options: a) download them from here, or b) create them yourself (see the sketch after this list).
- Split the dataset into train / val / test. In data/celebamaskhq_partition.py, you can update the paths for inputs and outputs. Then, run python data/celebamaskhq_partition.py.
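If you create the masks yourself (option b above), the general recipe is to simulate the low-resolution input and run a face-parsing network on it. The sketch below only illustrates that recipe: parse_faces is a stand-in for whatever segmentation model you use, and the exact degradation used for the paper's masks may differ.

```python
from PIL import Image


def predict_lr_mask(image_path, parse_faces, factor=8):
    """Simulate the low-resolution input and predict a semantic mask from it.

    parse_faces is assumed to map a PIL image to a segmentation mask; it is a
    hypothetical placeholder, not part of this repository.
    """
    img = Image.open(image_path).convert("RGB")
    w, h = img.size
    # Downscale to the low-resolution size the super-resolution model sees.
    lr = img.resize((w // factor, h // factor), Image.BICUBIC)
    return parse_faces(lr)
```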
- Obtain images: On the authors' website, click the link Align&Cropped Images. It will open a Google Drive, where you can download the images under "CelebA" -> "Img" -> "img_align_celeba.zip". For the guided model, you also need the identity annotations. You can download these in the same Google Drive under "CelebA" -> "Anno" -> "identity_CelebA.txt".
- Obtain the semantic masks predicted from downscaled images. You have two options: a) download them from here, or b) predict them yourself.
- Create dataset splits: On the authors' website, follow the link Train/Val/Test Partitions and download the file with dataset partitions, CelebA_list_eval_partition.txt. It is located in the Eval folder. Update the paths in data/celeba_partition.py and run python data/celeba_partition.py.
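For orientation, the CelebA partition file lists one image filename per line together with a split index (0 = train, 1 = validation, 2 = test). The sketch below reads such a file; it is not the code in data/celeba_partition.py.

```python
from collections import defaultdict


def read_partitions(path):
    """Group image filenames by split index: 0 = train, 1 = val, 2 = test."""
    splits = defaultdict(list)
    with open(path) as f:
        for line in f:
            if not line.strip():
                continue
            name, split = line.split()
            splits[int(split)].append(name)
    return splits


# Example (assuming the downloaded partition file):
# splits = read_partitions("CelebA_list_eval_partition.txt")
# print(len(splits[0]), len(splits[1]), len(splits[2]))
```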
You need one folder containing the images and another folder with the semantic masks. The filenames of an image and its corresponding label should be the same (ignoring extensions). For example, the image 0001.jpg and the semantic mask 0001.png would belong to the same sample. You can then copy and adapt one of the *_dataset.py classes to your new dataset.
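As a starting point, here is a minimal sketch of pairing images and masks by filename stem. It is a hypothetical example, not one of the repository's *_dataset.py classes, and it omits the resizing, transforms and options those classes handle.

```python
import os
from glob import glob

from PIL import Image
from torch.utils.data import Dataset


class ImageMaskDataset(Dataset):
    """Pairs images and semantic masks that share a filename (ignoring extensions)."""

    def __init__(self, image_dir, mask_dir):
        masks = {os.path.splitext(os.path.basename(p))[0]: p
                 for p in glob(os.path.join(mask_dir, "*"))}
        self.samples = []
        for img_path in sorted(glob(os.path.join(image_dir, "*"))):
            stem = os.path.splitext(os.path.basename(img_path))[0]
            if stem in masks:  # e.g. 0001.jpg pairs with 0001.png
                self.samples.append((img_path, masks[stem]))

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        img_path, mask_path = self.samples[idx]
        return Image.open(img_path).convert("RGB"), Image.open(mask_path)
```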
- train.py: Training script. Run this via the bash scripts in scripts/train/.
- demo.py: Provides the interface for loading data and running inference for the demo notebook.
- data/: Dataset classes, pre-processing scripts and dataset preparation scripts.
- deepsee_models/sr_model.py: Interface to encoder, generator and discriminator.
- deepsee_models/networks/: Actual model code (residual block, normalization block, encoder, discriminator, etc.).
- evaluator: Evaluates during training and testing. evaluator/evaluate_folder.py computes scores for two folders, one containing the upscaled images and one containing the ground truth (see the sketch after this list).
- managers/: Different manager classes for training, inference and the demo.
- options/: Configurations and command line parameters for training and testing.
- scripts: Scripts with pre-defined configurations.
- util/: Code for logging, visualizations and more.
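To make the folder-to-folder evaluation concrete, here is a minimal sketch that averages PSNR over two folders with matching filenames. It is only an illustration and does not reproduce the metrics or interface of evaluator/evaluate_folder.py.

```python
import os
from glob import glob

import numpy as np
from PIL import Image


def psnr(a, b, max_val=255.0):
    """Peak signal-to-noise ratio between two uint8 images of the same size."""
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(max_val ** 2 / mse)


def evaluate_folders(sr_dir, gt_dir):
    """Average PSNR over image pairs that share a filename in both folders."""
    scores = []
    for sr_path in sorted(glob(os.path.join(sr_dir, "*"))):
        gt_path = os.path.join(gt_dir, os.path.basename(sr_path))
        if not os.path.exists(gt_path):
            continue
        sr = np.array(Image.open(sr_path).convert("RGB"))
        gt = np.array(Image.open(gt_path).convert("RGB"))
        scores.append(psnr(sr, gt))
    return sum(scores) / len(scores)


# Example (hypothetical paths): evaluate_folders("results/upscaled/", "results/ground_truth/")
```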
Marcel Christoph Bühler, Andrés Romero, and Radu Timofte.
Deepsee: Deep disentangled semantic explorative extreme super-resolution.
In The 15th Asian Conference on Computer Vision (ACCV), 2020.
Copyright belongs to the authors. All rights reserved. Licensed under the CC BY-NC-SA 4.0 (Attribution-NonCommercial-ShareAlike 4.0 International)
For the SPADE part: Copyright (C) 2019 NVIDIA Corporation.
We built on top of the code from SPADE. Thanks to Jiayuan Mao for the Synchronized Batch Normalization code, to mseitzer for the FID implementation in PyTorch, and to the authors of PerceptualSimilarity.