# Improved Wasserstein GAN Tensorflow

TensorFlow implementation of WGAN-GP (Wasserstein GAN with Gradient Penalty).
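For reference, the heart of WGAN-GP is the gradient penalty added to the critic loss. Below is a minimal, illustrative TensorFlow 1.x sketch of that objective, not the actual code in `train.py`; the function and variable names (`wgan_gp_losses`, `discriminator`, `LAMBDA`) are mine, and `discriminator` is assumed to be a reusable function.

```python
import tensorflow as tf

LAMBDA = 10.0  # gradient-penalty weight suggested in the WGAN-GP paper

def wgan_gp_losses(discriminator, real, fake, batch_size):
    # Critic scores on real and generated images.
    d_real = discriminator(real)
    d_fake = discriminator(fake)

    # Random interpolation between real and generated samples.
    eps = tf.random_uniform([batch_size, 1, 1, 1], minval=0.0, maxval=1.0)
    interp = eps * real + (1.0 - eps) * fake

    # Penalize deviation of the critic's gradient norm from 1 at the interpolates.
    grads = tf.gradients(discriminator(interp), [interp])[0]
    slopes = tf.sqrt(tf.reduce_sum(tf.square(grads), axis=[1, 2, 3]))
    gradient_penalty = tf.reduce_mean((slopes - 1.0) ** 2)

    d_loss = tf.reduce_mean(d_fake) - tf.reduce_mean(d_real) + LAMBDA * gradient_penalty
    g_loss = -tf.reduce_mean(d_fake)
    return d_loss, g_loss
```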

## Datasets


## How to Run

`python train.py --DATASET=celeba --DATA_DIR=/path/to/celeba/`

Other options include SELU activations and layer normalization. The discriminator does not use batch normalization, since normalizing across the batch conflicts with the per-sample gradient penalty; instance normalization or layer normalization can still be used without issue, and the authors suggest layer normalization if any is used. Neither is used by default, but they can be enabled with:

`python train.py --DATASET=celeba --DATA_DIR=/path/to/celeba/ --NORM=1`
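To illustrate the normalization option, here is roughly how a discriminator block could apply layer normalization when `--NORM=1`. This is a sketch under my own assumptions (the name `d_block`, the 4x4 stride-2 convolution, and the leaky ReLU slope are illustrative, not taken from this repo).

```python
import tensorflow as tf

def d_block(x, num_filters, use_norm):
    # Strided convolution for downsampling (assumed block shape).
    x = tf.contrib.layers.conv2d(x, num_filters, kernel_size=4, stride=2,
                                 activation_fn=None)
    if use_norm:
        # Layer norm normalizes each sample independently, so it does not
        # couple examples within a batch the way batch norm does; this keeps
        # the per-sample gradient penalty valid.
        x = tf.contrib.layers.layer_norm(x)
    return tf.nn.leaky_relu(x, alpha=0.2)
```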

## Results

Here are some non-cherry-picked results after 100,000 training steps with a batch size of 128. The first image uses layer normalization in the discriminator; the second does not.

To create an image like this, run `createPhotos.py` and point it at your checkpoint directory, like so:

`python createPhotos.py checkpoints/DATASET_celeba/SCALE_10/NORM_False/SELU_False/`
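If you prefer to sample from a saved model in your own script instead of `createPhotos.py`, a generic TensorFlow 1.x checkpoint restore looks roughly like the following. The tensor names `z:0` and `g_out:0` and the latent size of 100 are placeholders of my own and will differ in this repository.

```python
import numpy as np
import tensorflow as tf

ckpt_dir = 'checkpoints/DATASET_celeba/SCALE_10/NORM_False/SELU_False/'
ckpt = tf.train.latest_checkpoint(ckpt_dir)

with tf.Session() as sess:
    # Rebuild the graph from the .meta file and restore the weights.
    saver = tf.train.import_meta_graph(ckpt + '.meta')
    saver.restore(sess, ckpt)
    graph = tf.get_default_graph()
    z = graph.get_tensor_by_name('z:0')            # hypothetical latent input
    samples = graph.get_tensor_by_name('g_out:0')  # hypothetical generator output
    noise = np.random.normal(size=(64, 100)).astype(np.float32)
    images = sess.run(samples, feed_dict={z: noise})
```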

**Layer Normalization in D**

(layer_norm sample image)

**No Normalization in D**

(no_norm sample image)

## Notes

- Initial trials of SELU activations did not work; the model diverged fairly quickly.
- I was getting terrible results with `tf.layers.conv2d` as opposed to `tf.contrib.layers.conv2d`, and I am still not sure why.
- The last layer of the discriminator is another convolution with stride 1, kernel size 4, and depth 1. I found this to work much better than the typical fully connected layer (see the sketch after these notes).
- Training with layer normalization seems more stable, although each step takes longer (~2 seconds for batch size 128 on a GTX 1080, versus ~1.5 seconds without layer norm). However, it also seems to converge faster, which may offset the extra time per step.
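As promised above, here is a rough sketch of the convolutional output head described in the last-layer note: a final 4x4, stride-1 convolution with a single output channel in place of a fully connected layer. The `VALID` padding and the assumption of a 4x4 final feature map are mine, not confirmed by this repo.

```python
import tensorflow as tf

def d_output(features):
    # `features` is assumed to be the discriminator's last feature map, e.g. 4x4xC.
    logits = tf.contrib.layers.conv2d(features, num_outputs=1, kernel_size=4,
                                      stride=1, padding='VALID',
                                      activation_fn=None)
    # With a 4x4 input and VALID padding this collapses to one score per image.
    return tf.reshape(logits, [-1])
```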