-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Training Boltzmann machines in Wiebe et. al #14
Comments
Yes, this looks like a typo. On top of p.12 (v2 of the manuscript) and in appendix E.3, they argue that the mean-field approximation prepares the state with a large overlap with the true Gibbs state. |
This might not be so simple after all. We are trying to understand the conditions in which a mean-field approximation is efficient and it is not quite clear. In E.5 they say that κ is larger for the fully connected case, that is, the overlap is smaller with the true Gibbs state. |
This appendix E.5 is quite interesting indeed, so let me show how I understand what is going on with a couple of bullet points:
(Nevertheless, I am not entirely sure if the quote actually says what we are interested in). The fact that the mean-field approximation does not work well in the case of fully-connected machines --kind of-- also makes sense from a physical point of view. If we talk about Ising models, a fully-connected network represents a system in which all other nodes are nearest neighbors of a given one. Mean-field theory was devised to study systems in which you have just a few nearest neighbors, like those that you find in physical cristaline lattices as in magnetic materials, or in this case, in restricted Boltzmann machines. With this in mind, it makes sense that mean-field priors are closer to solutions of RBMs than to solutions of BMs, which is what they see. Therefore, it is possible that there is no typo and what they say in the appendix is correct: given that the mean-field approximation is not a good approximation in fully-connected networks (BMs), it may be cleverer to use priors other than mean-field states. |
@cgogolin, you read this paper too, do you have any insight on the relationship between the graph topology and the mean-field approximation? I remember you dismissed using this approximation in the local Hamiltonian of the QMLN paper. |
I don't remember the details, so what I am going to say might be total bullshit, but: I think what they meant to say with this sentence is that it is enough for efficient training if the mean field approximation has an overlap that is not smaller than polynomially small. |
Don't take me wrong, I'd be really happy if this was just a semantical problem and it was indeed the case that the greater the overlap with a Gibbs state the better you perform, but there are really a lot of better ways of saying what @cgogolin is saying. Coming back to the phrase:
I would totally believe that they are defending what Christian is saying if instead of the expression in parentheses they used something like (no less than), (at least), (an overlap greater than)... Do you think this question is relevant enough to email the authors? |
I already tried emailing them with no success. |
There is a lot more on this topic in a follow-up paper, which in turn also cites Chowdhury and Somma, 2016, and other efficient general protocols to prepare a thermal state. |
In appendix E, section 5 of Quantum Deep Learning, the authors write:
However, by reading the rest of the article one is lead to the idea that using mean-field theory is a good enough state (better than a uniform state) to be used as a prior in the network.
Does this phrase therefore have a typo or am I missing something?
The text was updated successfully, but these errors were encountered: