This is a simplified implementation of the experiment. It is based on two projects:
- The experiment details were taken from the paper https://arxiv.org/abs/1606.03657 and the corresponding project https://github.com/openai/InfoGAN, with some changes.
- The DCGAN Torch implementation details were taken from https://github.com/soumith/dcgan.torch, chosen for its simplicity.
In the original paper the authors maximize an approximation of the mutual information (MI) between the latent parameters and the generated images. Here I do the same thing, but I don't evaluate the complete MI and derive gradients from it. Instead, I keep only the part that is relevant to computing gradients.
For normally distributed parameters I assume that the variance is fixed (just as in the original experiment), so it is enough to minimize the MSE between the latent and estimated parameters. The resulting loss equals the MI term up to constants, so the gradient is the same.
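To see why, note the standard identity for a Gaussian with fixed standard deviation σ, where μ(x) is the estimate produced from the generated image:

$$-\log q(c \mid x) = \frac{(c - \mu(x))^2}{2\sigma^2} + \log\left(\sigma\sqrt{2\pi}\right)$$

The second term is constant, so minimizing the MSE between c and μ(x) produces the same gradients, scaled by the constant factor 1/(2σ²).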
The same reasoning applies to the categorical parameters: the result is equivalent to minimizing the negative log-likelihood.
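A minimal sketch of the two losses in Torch, assuming the estimator network outputs the estimated continuous codes and raw scores for the categorical code (the tensor names and sizes here are illustrative, not taken from the actual code):

```lua
require 'nn'

-- Continuous codes: with a fixed-variance Gaussian, MSE has the same
-- gradient (up to a constant factor) as the negative log-likelihood.
local contCriterion = nn.MSECriterion()

-- Categorical code: negative log-likelihood over class log-probabilities.
local logSoftMax = nn.LogSoftMax()
local catCriterion = nn.ClassNLLCriterion()

-- Example batch (sizes are assumptions: batch of 16, 2 continuous codes,
-- 10 categories).
local c_true = torch.Tensor(16, 2):uniform(-1, 1)  -- sampled continuous codes
local c_hat  = torch.Tensor(16, 2):uniform(-1, 1)  -- estimator outputs
local contLoss = contCriterion:forward(c_hat, c_true)
local contGrad = contCriterion:backward(c_hat, c_true)

local cat_true   = torch.Tensor(16):random(1, 10)  -- sampled category indices
local cat_scores = torch.Tensor(16, 10):uniform()  -- raw estimator scores
local cat_logprobs = logSoftMax:forward(cat_scores)
local catLoss = catCriterion:forward(cat_logprobs, cat_true)
```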
Changing c1 from -2 to 2 in each column and activating a different categorical parameter in each row: