
ELBO function #9

Open
erlebach opened this issue May 29, 2017 · 5 comments

@erlebach

erlebach commented May 29, 2017

Hi,

The ELBO function returns monitor_functions, a dictionary of quantities to monitor.

In train.py and evaluation.py, you call elbo_loss, which returns monitor_functions. So far so good. In train.py, you call optimizer.minimize(loss_op), where loss_op is the return value of the ELBO function (line 259 in train.py). But minimize() should take the quantity to be minimized as its argument, not a dictionary.

Perhaps there is a better explanation of how the code is written, since it seems unlikely you could have gotten the code to work if this were an error.

I just realized that the code calls train() and not train_simple(). The issue I mention above is in train_simple(). I assume it is an error?

Thank you.

@wellecks
Owner

We ended up using the train() function for the experiments, and train_simple() wasn't updated as the code evolved, so train_simple() probably doesn't work right now. In train(), the training loss is extracted from monitor_functions, then turned into train_op and passed into sess.run on line 150 (see the sketch below). Does that help?
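For reference, here is a minimal sketch of that pattern in TensorFlow 1.x. The dictionary key 'loss', the Adam optimizer, and the learning rate are assumptions made for illustration; only the monitor_functions -> train_op -> sess.run flow is from this thread.

```python
# Minimal sketch (TensorFlow 1.x) of extracting the training loss from the
# monitoring dictionary and turning it into a training op.
import tensorflow as tf

def build_train_op(monitor_functions, learning_rate=1e-3):
    # elbo_loss() returns a dictionary of tensors to monitor; the key name
    # 'loss' here is an assumption, not necessarily the repo's actual key.
    loss_op = monitor_functions['loss']
    # In TF 1.x, minimize() takes a loss *tensor* (not a Python callable)
    # and returns an op that updates the model variables when run.
    train_op = tf.train.AdamOptimizer(learning_rate).minimize(loss_op)
    return loss_op, train_op

# Inside the training loop, the op is then executed with a session:
#   _, loss_val = sess.run([train_op, loss_op], feed_dict=feed)
```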

@erlebach
Author

Yes, it helps. Thank you for replying. Just so you know, I am analyzing your code. I converted it to run under Anaconda3 (Python 3.6 + TensorFlow 1.1 or 1.2; I don't recall which). The conversion was very straightforward.

One additional question and a comment:
Question: you have several encoders, but only a basic decoder. Why is that?
Comment, regarding reproducing Rezende's results: Rezende uses a maxout nonlinearity, whereas you use tanh. Might this account for differences between your results and his? You could use ELU or ReLU (why don't you?). Also, haven't some authors used convolutional networks to improve the results?

Thanks again. Gordon.

@wellecks
Owner

Regarding the multiple encoders: we were trying to measure the performance difference between a simple encoder and three types of normalizing flows (residual, Householder, inverse autoregressive), while keeping everything else constant. Introducing the flows only affects the encoder, so we keep the decoder the same for all four cases (see the rough sketch below).
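Roughly, the arrangement looks like the sketch below. This is not the repository's actual code: the layer sizes, function names, and the planar-style flow step are illustrative only, and the residual / Householder / IAF flows we used have different forms.

```python
# Illustrative TF 1.x sketch: a flow only transforms the encoder's sample,
# so the decoder is identical whether or not a flow is used.
import tensorflow as tf

def encoder(x, latent_dim=20):
    h = tf.layers.dense(x, 256, activation=tf.tanh)
    mu = tf.layers.dense(h, latent_dim)
    log_sigma = tf.layers.dense(h, latent_dim)
    eps = tf.random_normal(tf.shape(mu))
    return mu + tf.exp(log_sigma) * eps        # reparameterized sample z0

def planar_flow(z, name):
    # One toy flow step; like the flows in the repo, it only changes z,
    # so nothing downstream of the encoder needs to change.
    d = z.get_shape().as_list()[-1]
    with tf.variable_scope(name):
        u = tf.get_variable('u', [d])
        w = tf.get_variable('w', [d])
        b = tf.get_variable('b', [])
    return z + u * tf.tanh(tf.reduce_sum(w * z, axis=-1, keep_dims=True) + b)

def decoder(z, out_dim=784):
    h = tf.layers.dense(z, 256, activation=tf.tanh)
    return tf.layers.dense(h, out_dim, activation=tf.sigmoid)

# Plain VAE:     x_hat = decoder(encoder(x))
# With a flow:   x_hat = decoder(planar_flow(encoder(x), 'flow1'))
# The decoder network is unchanged in either case.
```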

Regarding tanh: we followed the settings in version 1 of the Householder flows paper since it was fairly comprehensive (they used tanh there). The paper has since been updated, but we weren't able to reproduce the results reported in version 1.

@erlebach
Author

erlebach commented May 30, 2017 via email

@wellecks
Owner

I see, thanks. Right, we didn't use the conv net since we were just testing on MNIST, but in general it could be good to use. However, I don't think we tested the conv_net code, so it might need some minor changes.
