
Training Boltzmann machines in Wiebe et al. #14

Open
apozas opened this issue Mar 13, 2017 · 8 comments

@apozas (Collaborator) commented Mar 13, 2017

In appendix E, section 5 of Quantum Deep Learning, the authors write:

[...] our quantum algorithms [...] can therefore efficiently train full Boltzmann machines given that the mean-field approximation to the Gibbs state has only polynomially small overlap with the true Gibbs state.

However, reading the rest of the article, one is led to the idea that the mean-field state is good enough (better than a uniform state) to be used as a prior in the network.

Does this phrase therefore contain a typo, or am I missing something?

@peterwittek (Owner)

Yes, this looks like a typo. At the top of p. 12 (v2 of the manuscript) and in appendix E.3, they argue that the mean-field approximation prepares a state with large overlap with the true Gibbs state.

@peterwittek (Owner)

This might not be so simple after all. We are trying to understand the conditions under which the mean-field approximation is efficient, and it is not quite clear. In E.5 they say that κ is larger in the fully connected case, that is, the overlap with the true Gibbs state is smaller.
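
For context, here is how I understand the role of κ in their rejection-sampling construction (a paraphrase from memory, to be checked against the appendices of the paper; the inequality below is my reconstruction, not a quote):

```latex
% Sketch of the role of \kappa, as I recall it (check against the paper):
% Q(x) is the mean-field distribution and P(x) = e^{-E(x)}/Z the true
% Gibbs distribution; \kappa is chosen so that
P(x) \;\le\; \kappa \, Q(x) \qquad \text{for all configurations } x.
% Rejection sampling from Q then accepts with probability at least 1/\kappa,
% so a larger \kappa (the fully connected case) means more repetitions,
% i.e. an effectively smaller usable overlap with the true Gibbs state.
```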

@apozas (Collaborator, Author) commented Mar 15, 2017

Appendix E.5 is quite interesting indeed, so let me lay out my understanding of what is going on in a couple of bullet points:

  • The point of using mean-field states as priors is that they are somehow "closer" to the solution (the actual Gibbs state), so the algorithm takes less time to reach it (and, of course, mean-field states take few resources to prepare; otherwise there would be no point). In particular, mean-field states reach the solution more quickly than uniform states.

  • It looks like the mean-field approximation does not work so well for fully connected graphs. They argue this by citing their Ref. 39, where I found:

Our purpose in showing results for the fully connected case, then, is to demonstrate how approximations based on sparse graphs, such as those of this paper, may behave poorly for very densely connected problems

(Nevertheless, I am not entirely sure whether the quote actually says what we are interested in.)

The fact that the mean-field approximation does not work well for fully connected machines also (kind of) makes sense from a physical point of view. In terms of Ising models, a fully connected network represents a system in which every node is a nearest neighbor of every other node. Mean-field theory was devised to study systems with just a few nearest neighbors, like the crystalline lattices you find in magnetic materials or, in this case, restricted Boltzmann machines. With this in mind, it makes sense that mean-field priors are closer to the solutions of RBMs than to those of general BMs, which is what they see (a small numerical sketch below illustrates the effect).

Therefore, it is possible that there is no typo and what they say in the appendix is correct: given that the mean-field approximation is not a good approximation for fully connected networks (BMs), it may be wiser to use priors other than mean-field states.
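
To make this concrete, here is a minimal numerical sketch (my own construction, not code from the paper): it computes the exact Gibbs distribution of a small classical Ising-type model, a naive mean-field product approximation from the standard self-consistency iteration, and their Bhattacharyya overlap, for a fully connected coupling matrix versus a sparse RBM-like bipartite one. All function names, the 1/√n coupling scale, and the choice of overlap measure are my own assumptions.

```python
# Minimal sketch (assumptions noted above): compare the overlap of a naive
# mean-field product distribution with the exact Gibbs distribution for a
# fully connected versus an RBM-like (bipartite, sparse) Ising model.
import itertools
import numpy as np

rng = np.random.default_rng(0)

def gibbs_distribution(J, h, beta=1.0):
    """Exact Gibbs distribution over all 2^n spin configurations (s_i = +/-1)."""
    n = len(h)
    states = np.array(list(itertools.product([-1, 1], repeat=n)))
    energies = -0.5 * np.einsum('si,ij,sj->s', states, J, states) - states @ h
    weights = np.exp(-beta * energies)
    return states, weights / weights.sum()

def mean_field_distribution(J, h, states, beta=1.0, iters=200):
    """Product distribution from the mean-field fixed point
    m_i = tanh(beta * (h_i + sum_j J_ij m_j))."""
    m = np.zeros(len(h))
    for _ in range(iters):
        m = np.tanh(beta * (h + J @ m))
    # Q(s) = prod_i (1 + m_i s_i) / 2, normalized by construction.
    return np.prod((1 + states * m) / 2, axis=1)

def overlap(p, q):
    """Bhattacharyya overlap (sum_x sqrt(p q))^2, a classical analogue of
    the fidelity between the corresponding diagonal quantum states."""
    return np.sum(np.sqrt(p * q)) ** 2

n = 10
h = rng.normal(scale=0.1, size=n)

# Fully connected couplings (symmetric, zero diagonal).
J_full = np.triu(rng.normal(scale=1.0 / np.sqrt(n), size=(n, n)), 1)
J_full = J_full + J_full.T

# RBM-like bipartite couplings: first half "visible", second half "hidden".
J_rbm = np.zeros((n, n))
J_rbm[: n // 2, n // 2:] = rng.normal(scale=1.0 / np.sqrt(n),
                                      size=(n // 2, n - n // 2))
J_rbm = J_rbm + J_rbm.T

for name, J in [("fully connected", J_full), ("RBM-like", J_rbm)]:
    states, p = gibbs_distribution(J, h)
    q = mean_field_distribution(J, h, states)
    print(f"{name:>16}: overlap = {overlap(p, q):.4f}")
```

On typical random instances the RBM-like model should show the larger overlap, in line with the argument above, though the size of the gap depends on the coupling strengths and the temperature.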

@peterwittek (Owner)

@cgogolin, you read this paper too; do you have any insight into the relationship between the graph topology and the mean-field approximation? I remember you dismissed using this approximation for the local Hamiltonian in the QMLN paper.

@cgogolin
I don't remember the details, so what I am going to say might be total bullshit, but:

I think what they meant to say with this sentence is that it is enough for efficient training if the mean-field approximation has an overlap that is no smaller than polynomially small, i.e., at least inverse-polynomial in the system size rather than exponentially small.
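
In symbols, my reading of the intended claim (a formalization I am supplying, not a quote from the paper):

```latex
% My formalization of the intended claim (not a quote from the paper):
% training stays efficient provided the mean-field state satisfies
\left| \langle \psi_{\mathrm{MF}} | \psi_{\mathrm{Gibbs}} \rangle \right|^{2}
  \;\ge\; \frac{1}{\mathrm{poly}(n)},
% since amplitude-amplification-style preparation then needs only
% O(\mathrm{poly}(n)) repetitions, whereas an exponentially small overlap
% would make the preparation step exponentially costly.
```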

@apozas (Collaborator, Author) commented Mar 21, 2017

Don't get me wrong, I would be really happy if this were just a semantic problem and it were indeed the case that the greater the overlap with the Gibbs state, the better you perform. But there really are a lot of better ways of saying what @cgogolin is saying. Coming back to the phrase:

[...] given that the mean-field approximation to the Gibbs state has (only) polynomially small overlap with the true Gibbs state.

I would totally believe that they are defending what Christian is saying if, instead of the expression in parentheses, they had used something like "(no less than)", "(at least)", or "(an overlap greater than)".

Do you think this question is relevant enough to email the authors?

@peterwittek (Owner)

I already tried emailing them with no success.

@peterwittek (Owner)

There is a lot more on this topic in a follow-up paper, which in turn also cites Chowdhury and Somma (2016) and other efficient general protocols for preparing a thermal state.
