
Training Boltzmann machines in Wiebe et al. #14

Open
apozas opened this issue Mar 13, 2017 · 8 comments

@apozas (Collaborator) commented Mar 13, 2017

In appendix E, section 5 of Quantum Deep Learning, the authors write:

[...] our quantum algorithms [...] can therefore efficiently train full Boltzmann machines given that the mean-field approximation to the Gibbs state has only polynomially small overlap with the true Gibbs state.

However, reading the rest of the article, one is led to the idea that the mean-field state is good enough (better than a uniform state) to be used as a prior in the network.

Does this phrase therefore contain a typo, or am I missing something?

@peterwittek (Owner)

Yes, this looks like a typo. At the top of p. 12 (v2 of the manuscript) and in appendix E.3, they argue that the mean-field approximation prepares a state with large overlap with the true Gibbs state.

@peterwittek (Owner)

This might not be so simple after all. We are trying to understand the conditions under which the mean-field approximation is efficient, and it is not quite clear. In E.5 they say that κ is larger in the fully connected case, that is, the overlap with the true Gibbs state is smaller.
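
For context, here is how I understand the role of κ in their rejection-sampling construction (a paraphrase from memory, to be checked against the appendices of the paper; the inequality below is my reconstruction, not a quote):

```latex
% Sketch of the role of \kappa, as I recall it (check against the paper):
% Q(x) is the mean-field distribution and P(x) = e^{-E(x)}/Z the true
% Gibbs distribution; \kappa is chosen so that
P(x) \;\le\; \kappa \, Q(x) \qquad \text{for all configurations } x.
% Rejection sampling from Q then accepts with probability at least 1/\kappa,
% so a larger \kappa (the fully connected case) means more repetitions,
% i.e. an effectively smaller usable overlap with the true Gibbs state.
```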

@apozas (Collaborator, Author) commented Mar 15, 2017

Appendix E.5 is quite interesting indeed, so let me lay out my understanding of what is going on in a couple of bullet points:

  • The point of using mean-field states as priors is that they are somehow "closer" to the solution (the actual Gibbs state), so the algorithm takes less time to reach it (and, of course, mean-field states take few resources to prepare; otherwise there would be no point). In particular, mean-field states reach the solution more quickly than uniform states.

  • It looks like the mean-field approximation does not work so well for fully connected graphs. They argue this by citing their Ref. 39, where I found:

Our purpose in showing results for the fully connected case, then, is to demonstrate how approximations based on sparse graphs, such as those of this paper, may behave poorly for very densely connected problems

(Nevertheless, I am not entirely sure whether the quote actually says what we are interested in.)

The fact that the mean-field approximation does not work well for fully connected machines also (kind of) makes sense from a physical point of view. In terms of Ising models, a fully connected network represents a system in which every node is a nearest neighbor of every other node. Mean-field theory was devised to study systems with just a few nearest neighbors, like the crystalline lattices you find in magnetic materials or, in this case, restricted Boltzmann machines. With this in mind, it makes sense that mean-field priors are closer to the solutions of RBMs than to those of general BMs, which is what they see (a small numerical sketch below illustrates the effect).

Therefore, it is possible that there is no typo and what they say in the appendix is correct: given that the mean-field approximation is not a good approximation for fully connected networks (BMs), it may be wiser to use priors other than mean-field states.
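
To make this concrete, here is a minimal numerical sketch (my own construction, not code from the paper): it computes the exact Gibbs distribution of a small classical Ising-type model, a naive mean-field product approximation from the standard self-consistency iteration, and their Bhattacharyya overlap, for a fully connected coupling matrix versus a sparse RBM-like bipartite one. All function names, the 1/√n coupling scale, and the choice of overlap measure are my own assumptions.

```python
# Minimal sketch (assumptions noted above): compare the overlap of a naive
# mean-field product distribution with the exact Gibbs distribution for a
# fully connected versus an RBM-like (bipartite, sparse) Ising model.
import itertools
import numpy as np

rng = np.random.default_rng(0)

def gibbs_distribution(J, h, beta=1.0):
    """Exact Gibbs distribution over all 2^n spin configurations (s_i = +/-1)."""
    n = len(h)
    states = np.array(list(itertools.product([-1, 1], repeat=n)))
    energies = -0.5 * np.einsum('si,ij,sj->s', states, J, states) - states @ h
    weights = np.exp(-beta * energies)
    return states, weights / weights.sum()

def mean_field_distribution(J, h, states, beta=1.0, iters=200):
    """Product distribution from the mean-field fixed point
    m_i = tanh(beta * (h_i + sum_j J_ij m_j))."""
    m = np.zeros(len(h))
    for _ in range(iters):
        m = np.tanh(beta * (h + J @ m))
    # Q(s) = prod_i (1 + m_i s_i) / 2, normalized by construction.
    return np.prod((1 + states * m) / 2, axis=1)

def overlap(p, q):
    """Bhattacharyya overlap (sum_x sqrt(p q))^2, a classical analogue of
    the fidelity between the corresponding diagonal quantum states."""
    return np.sum(np.sqrt(p * q)) ** 2

n = 10
h = rng.normal(scale=0.1, size=n)

# Fully connected couplings (symmetric, zero diagonal).
J_full = np.triu(rng.normal(scale=1.0 / np.sqrt(n), size=(n, n)), 1)
J_full = J_full + J_full.T

# RBM-like bipartite couplings: first half "visible", second half "hidden".
J_rbm = np.zeros((n, n))
J_rbm[: n // 2, n // 2:] = rng.normal(scale=1.0 / np.sqrt(n),
                                      size=(n // 2, n - n // 2))
J_rbm = J_rbm + J_rbm.T

for name, J in [("fully connected", J_full), ("RBM-like", J_rbm)]:
    states, p = gibbs_distribution(J, h)
    q = mean_field_distribution(J, h, states)
    print(f"{name:>16}: overlap = {overlap(p, q):.4f}")
```

On typical random instances the RBM-like model should show the larger overlap, in line with the argument above, though the size of the gap depends on the coupling strengths and the temperature.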

@peterwittek (Owner)

@cgogolin, you read this paper too; do you have any insight into the relationship between the graph topology and the mean-field approximation? I remember you dismissed using this approximation for the local Hamiltonian in the QMLN paper.

@cgogolin
I don't remember the details, so what I am going to say might be total bullshit, but:

I think what they meant to say with this sentence is that it is enough for efficient training if the mean-field approximation has an overlap that is no smaller than polynomially small, i.e., at least inverse-polynomial in the system size rather than exponentially small.
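
In symbols, my reading of the intended claim (a formalization I am supplying, not a quote from the paper):

```latex
% My formalization of the intended claim (not a quote from the paper):
% training stays efficient provided the mean-field state satisfies
\left| \langle \psi_{\mathrm{MF}} | \psi_{\mathrm{Gibbs}} \rangle \right|^{2}
  \;\ge\; \frac{1}{\mathrm{poly}(n)},
% since amplitude-amplification-style preparation then needs only
% O(\mathrm{poly}(n)) repetitions, whereas an exponentially small overlap
% would make the preparation step exponentially costly.
```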

@apozas (Collaborator, Author) commented Mar 21, 2017

Don't get me wrong, I would be really happy if this were just a semantic problem and it were indeed the case that the greater the overlap with the Gibbs state, the better you perform. But there really are a lot of better ways of saying what @cgogolin is saying. Coming back to the phrase:

[...] given that the mean-field approximation to the Gibbs state has (only) polynomially small overlap with the true Gibbs state.

I would totally believe that they are defending what Christian is saying if, instead of the expression in parentheses, they had used something like "(no less than)", "(at least)", or "(an overlap greater than)".

Do you think this question is relevant enough to email the authors?

@peterwittek (Owner)

I already tried emailing them with no success.

@peterwittek (Owner)

There is a lot more on this topic in a follow-up paper, which in turn also cites Chowdhury and Somma (2016) and other efficient general protocols for preparing a thermal state.
