This is a baseline model for a CS236G final project. This repo will be integrated into our central project repo by milestone 2.
Please refer to the Dependencies section in the original README text below.
Data
Please clone the central project repo at the same directory level as your clone of this repo.
Training
- Train a StackGAN-v2 model on the children's book data
python main.py --cfg cfg/childrens_book_3stages.yml --gpu 0
Pretrained Model
- StackGAN-v2 for children's books. Download and save it to models/. Adjust the config to point to these .pth files.
Evaluating
- Coming soon.
PyTorch implementation for reproducing StackGAN_v2 results in the paper StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks by Han Zhang*, Tao Xu*, Hongsheng Li, Shaoting Zhang, Xiaogang Wang, Xiaolei Huang, Dimitris Metaxas.
Dependencies
Python 2.7
PyTorch
In addition, please add the project folder to PYTHONPATH and pip install the following packages:
tensorboard
python-dateutil
easydict
pandas
torchfile
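Before training, it can help to confirm that the packages listed above are importable. The following is a minimal sketch, not part of the repository: `REQUIRED` and `missing_packages` are hypothetical names, and the import names (for example `dateutil` for python-dateutil) are assumptions based on each package's usual convention.

```python
import importlib

# Maps pip package names (from the list above) to the module names they
# are assumed to install. Adjust if your environment differs.
REQUIRED = {
    "tensorboard": "tensorboard",
    "python-dateutil": "dateutil",
    "easydict": "easydict",
    "pandas": "pandas",
    "torchfile": "torchfile",
}


def missing_packages(required):
    """Return the pip names whose modules cannot be imported, sorted by name."""
    missing = []
    for pip_name, module_name in sorted(required.items()):
        try:
            importlib.import_module(module_name)
        except ImportError:
            missing.append(pip_name)
    return missing
```

Calling `missing_packages(REQUIRED)` returns an empty list when everything is installed; any names it returns can be passed to pip install.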
Data
- Download our preprocessed char-CNN-RNN text embeddings for birds and save them to data/
- [Optional] Follow the instructions in reedscot/icml2016 to download the pretrained char-CNN-RNN text encoders and extract text embeddings.
- Download the birds image data. Extract them to data/birds/
- Download the ImageNet dataset and extract the images to data/imagenet/
- Download the LSUN dataset and save the images to data/lsun
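The steps above imply a directory layout under data/. A small sketch for checking that layout before launching training is shown here; `EXPECTED_DIRS` and `missing_dirs` are hypothetical names, and the script only assumes the three paths named above, nothing about the files inside them.

```python
import os

# Directories named in the Data steps above, relative to the repo root.
EXPECTED_DIRS = ["data/birds", "data/imagenet", "data/lsun"]


def missing_dirs(root, expected=EXPECTED_DIRS):
    """Return the expected directories that do not exist under root."""
    return [d for d in expected if not os.path.isdir(os.path.join(root, d))]
```

Running `missing_dirs(".")` from the repo root lists any dataset folders that still need to be downloaded or extracted. Only the datasets you plan to train on need to be present.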
Training
- Train a StackGAN-v2 model on the bird (CUB) dataset using our preprocessed embeddings:
python main.py --cfg cfg/birds_3stages.yml --gpu 0
- Train a StackGAN-v2 model on the ImageNet dog subset:
python main.py --cfg cfg/dog_3stages_color.yml --gpu 0
- Train a StackGAN-v2 model on the ImageNet cat subset:
python main.py --cfg cfg/cat_3stages_color.yml --gpu 0
- Train a StackGAN-v2 model on the LSUN bedroom subset:
python main.py --cfg cfg/bedroom_3stages_color.yml --gpu 0
- Train a StackGAN-v2 model on the LSUN church subset:
python main.py --cfg cfg/church_3stages_color.yml --gpu 0
- *.yml files are example configuration files for training/evaluating our models.
- If you want to try your own datasets, here are some good tips about how to train GANs. Also, we encourage you to try different hyper-parameters and architectures, especially for more complex datasets.
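The training commands above differ only in the config file, so they can be driven from a short script. This is a minimal sketch, assuming it is run from the directory containing main.py; `build_command` and `train_all` are hypothetical helpers, not part of the repository, and only the `--cfg` and `--gpu` flags shown above are used.

```python
import subprocess

# Config files taken from the training commands listed above.
CONFIGS = [
    "cfg/birds_3stages.yml",
    "cfg/dog_3stages_color.yml",
    "cfg/cat_3stages_color.yml",
    "cfg/bedroom_3stages_color.yml",
    "cfg/church_3stages_color.yml",
]


def build_command(cfg_path, gpu_id=0):
    """Build the argument list for one training run."""
    return ["python", "main.py", "--cfg", cfg_path, "--gpu", str(gpu_id)]


def train_all(configs, gpu_id=0):
    """Run the training jobs one after another; stop on the first failure."""
    for cfg in configs:
        subprocess.check_call(build_command(cfg, gpu_id))
```

For example, `train_all(CONFIGS[:1])` trains only the bird model on GPU 0. Running jobs sequentially keeps a single GPU from being oversubscribed; to sweep hyper-parameters, one option is to copy a .yml file, edit it, and add the copy to the list.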
Pretrained Model
- StackGAN-v2 for bird. Download and save it to models/ (The inception score for this model is 4.04±0.05)
- StackGAN-v2 for dog. Download and save it to models/ (The inception score for this model is 9.55±0.11)
- StackGAN-v2 for cat. Download and save it to models/
- StackGAN-v2 for bedroom. Download and save it to models/
- StackGAN-v2 for church. Download and save it to models/
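For readers unfamiliar with the inception score quoted for the bird and dog models, it is defined as exp(E_x[KL(p(y|x) || p(y))]), where p(y|x) is a classifier's class distribution for a generated image and p(y) is the average of those distributions. The sketch below computes that quantity in plain Python from already-computed class probabilities. It is illustrative only: `inception_score` is a hypothetical helper, not the evaluation code used to produce the numbers above, and how those numbers (including the ± term) were obtained is not stated here.

```python
import math


def inception_score(probs):
    """Compute exp(mean KL(p(y|x) || p(y))) from per-image class probabilities.

    probs: list of lists, one probability distribution p(y|x) per image.
    """
    n = float(len(probs))
    k = len(probs[0])
    # Marginal p(y): the average of the per-image distributions.
    marginal = [sum(p[j] for p in probs) / n for j in range(k)]
    total_kl = 0.0
    for p in probs:
        for j in range(k):
            # Terms with p(y|x) = 0 contribute nothing to the KL sum.
            if p[j] > 0:
                total_kl += p[j] * math.log(p[j] / marginal[j])
    return math.exp(total_kl / n)
```

The score is 1 when every image gets the same prediction and rises toward the number of classes when predictions are both confident and evenly spread across classes.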
Evaluating
- Run the following command to generate samples from captions in the birds validation set:
python main.py --cfg cfg/eval_birds.yml --gpu 1
- Change the eval_*.yml files to generate images from other pre-trained models.
Examples generated by StackGAN-v2
t-SNE visualization of randomly generated birds, dogs, cats, churches and bedrooms
If you find StackGAN useful in your research, please consider citing:
@article{Han17stackgan2,
author = {Han Zhang and Tao Xu and Hongsheng Li and Shaoting Zhang and Xiaogang Wang and Xiaolei Huang and Dimitris Metaxas},
title = {StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks},
journal = {arXiv: 1710.10916},
year = {2017},
}
@inproceedings{han2017stackgan,
Author = {Han Zhang and Tao Xu and Hongsheng Li and Shaoting Zhang and Xiaogang Wang and Xiaolei Huang and Dimitris Metaxas},
Title = {StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks},
Year = {2017},
booktitle = {{ICCV}},
}
Our follow-up work
- AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks [Supplementary][code]
References