
Where is the implementation for stage 2 with full distribution matching loss L_distr #16

Open
JingyunLiang opened this issue Sep 13, 2020 · 5 comments

Comments

@JingyunLiang

The given configs (e.g. train_IRN_x4.yml) seem to be the stage 1 (pre-training stage).

@pkuxmq
Owner

pkuxmq commented Sep 13, 2020

train_IRN+_x4.yml

@JingyunLiang
Author

Sorry, in train_IRN+_x4.yml I only found pixel_criterion_forw, pixel_criterion_back, feature_criterion, and gan. Likewise, in models/IRNP_model.py only these four kinds of losses are used for optimization. I thought train_IRN+_x4.yml was for generating visually pleasing images.

According to Eq. 10, there is an extra distribution loss (defined in Eq. 9). On page 10, you state: "After the pre-training stage, we restore the full distribution matching loss L_distr (stage 2) in the objective in place of L'_distr (stage 1)."

Could you please tell me where the code for L_distr in the second stage is? Thank you.

@JingyunLiang JingyunLiang changed the title Where is the training config for stage 2 with full distribution matching loss L_distr Where is the implementation for stage 2 with full distribution matching loss L_distr Sep 13, 2020
@pkuxmq
Owner

pkuxmq commented Sep 14, 2020

We employ the JS divergence as the probability metric for distribution matching (Eq. 7). Following the GAN literature, we implement the JS divergence in an adversarial setting where the function T() acts as a discriminator. The GAN loss in the code is the full distribution matching loss.
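For readers unfamiliar with how a JS divergence is estimated adversarially, here is a minimal NumPy sketch (illustrative only, not the repo's actual PyTorch code): at the optimum of the vanilla GAN objective below, the discriminator's value estimates 2·JS(p_real || p_fake) − log 4 (Goodfellow et al., 2014), so training a generator against it matches the two distributions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def discriminator_loss(t_real, t_fake):
    # t_real / t_fake are the discriminator's raw scores T(x) on real and
    # generated samples. Minimizing this (i.e., maximizing the vanilla GAN
    # objective) yields an estimate of 2 * JS(p_real || p_fake) - log 4.
    return -(np.mean(np.log(sigmoid(t_real))) +
             np.mean(np.log(1.0 - sigmoid(t_fake))))

def generator_loss(t_fake):
    # Non-saturating generator loss commonly used in GAN codebases.
    return -np.mean(np.log(sigmoid(t_fake)))
```

When the discriminator cannot separate the two sets (raw scores near 0, i.e., probabilities near 0.5), the discriminator loss approaches 2·log 2, which corresponds to JS ≈ 0, i.e., matched distributions.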

@JingyunLiang
Author

Thanks for your quick reply. I read the paragraphs about losses, but I am still confused:

1. Besides the latent variable z, which has a prior, there exists y in the IRN model, which is also subject to a distributional constraint. z follows the same distribution as in a GAN, i.e., N(0, I), while y is constrained to be similar to y_bicubic by L_guide. They are concatenated to form the input of the network. Am I right? What is the main difference between this and a conditional GAN, apart from the generator being a bijector?

2. "our model does not have a standalone distribution on x" — what does this mean? Does it mean the model makes no assumption about p(x)?

3. "the conventional way to use adversarial loss simply cannot be applied" — what are the differences in the implementation of the GAN loss compared with ESRGAN?

4. "match towards the data distribution with an essentially different distribution from the GAN model distribution" — from the code, it seems that this model still discriminates between x_real and x_fake, which means it is matching towards the distribution p(x_real).

5. From my understanding, both the JS divergence JS(p(x_real), p(x_fake)) and a GAN try to minimize the difference between p(x_real) and p(x_fake); in some sense, the GAN objective can be derived from the JS divergence. I understand that this model differs from previous normalizing flows because the objective is no longer MLE on z, but I don't see the point of introducing the JS divergence here.

Thank you in advance for answering my questions.

@pkuxmq
Owner

pkuxmq commented Sep 21, 2020

  1. A conditional GAN transforms z into a conditional distribution p(x|y), in which the conditions y are given. In our model, the distribution of y is not fixed but is learned by the model under certain constraints as well. Basically, our method jointly models image downscaling and upscaling, rather than only doing inverse generation with conditions drawn from a fixed distribution.

  2. It means that the distribution of x in the inverse procedure should depend on y = f^y(x), or in other words, (x, y) should follow a joint distribution.

  3 & 4. ESRGAN transforms the distribution of LR images into the distribution of HR images and projects each LR image to a single HR image point. Our model, in the inverse procedure, transforms the distribution of the latent variable z (combined with each LR image y) into the distribution p(x|y = f^y(x)), which models the information lost between HR and LR images. Therefore the adversarial loss plays different roles in principle: in ESRGAN, it encourages the generated point for each input point to lie on the real image manifold (as in a conventional GAN), while in our model it encourages the generated distribution p(x|y = f^y(x)) for each input point y = f^y(x) to follow our target distribution, i.e., the real image manifold around the HR image. So the distributions are essentially different.

  5. We introduce the JS divergence to realize distribution matching and implement it in the form of a GAN loss.
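To make the inverse-procedure description above concrete, here is a small sketch of how the inverse-pass input could be assembled: a case-agnostic z ~ N(0, I) is sampled at upscaling time (nothing case-specific is stored) and combined with the LR image y. The shapes, channel split, and the `inn.inverse` call are illustrative assumptions, not the repository's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: y is the LR image produced by the forward pass, and z
# stands in for the lost high-frequency information, sampled fresh from the
# case-agnostic prior N(0, I) rather than stored per image.
y = rng.standard_normal((1, 3, 32, 32))    # LR image from the forward pass
z = rng.standard_normal((1, 13, 32, 32))   # z ~ N(0, I), sampled at test time

# Concatenate along the channel axis to form the inverse-pass input.
inv_input = np.concatenate([y, z], axis=1)  # shape (1, 16, 32, 32)
# x_hat = inn.inverse(inv_input)  # hypothetical invertible-network call
```

Because the network is a bijection, each sampled z (given the same y) yields a sample from the learned conditional distribution p(x|y = f^y(x)), which is what the GAN loss matches against the real-image manifold around the HR image.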
