Thank you for the work. I recently started working on reinforcement learning for mathematical research (with the formal language and deduction system of a proof assistant as the environment); it's not straightforward to design a proper reward, but novelty is certainly a good measure of progress, and your work is inspiring.
One idea I have, which I also intend to apply in my project, concerns the measurement of prediction error; it seems to me that a GAN-style idea is applicable here. The predictor can be seen as a generator, so how about training a discriminator (conditioned on the current state) with the predicted outcomes as negative samples and the actual outcomes as positive samples? Maybe then you could just predict raw pixels, and the discriminator would extract features automatically and ignore any essentially unpredictable details, like the exact locations of tree leaves in a breeze. It would also be unnecessary to distinguish between things the agent can affect or control and things it cannot.
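To make this concrete, here is a rough sketch of what I have in mind, written in PyTorch. This is only an illustration under my own assumptions: all module and variable names (`Predictor`, `Discriminator`, `obs`, `next_obs`, etc.) are placeholders I made up, not from any existing codebase.

```python
# Hypothetical sketch: a discriminator that measures prediction error,
# GAN-style. Actual next observations are positives, the predictor's
# outputs are negatives, and both are conditioned on the current state.
import torch
import torch.nn as nn

class Predictor(nn.Module):
    """Generator analogue: predicts the next observation from (state, action)."""
    def __init__(self, obs_dim, act_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, obs_dim),
        )

    def forward(self, obs, action):
        return self.net(torch.cat([obs, action], dim=-1))

class Discriminator(nn.Module):
    """Scores (state, outcome) pairs: high for real outcomes, low for predicted ones."""
    def __init__(self, obs_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim * 2, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs, outcome):
        # Condition on the current state by concatenating it with the outcome.
        return self.net(torch.cat([obs, outcome], dim=-1))

def update(predictor, discriminator, d_opt, obs, action, next_obs):
    bce = nn.BCEWithLogitsLoss()
    pred_next = predictor(obs, action)

    # Discriminator step: actual outcomes are positive samples,
    # predicted outcomes are negative samples.
    real_logits = discriminator(obs, next_obs)
    fake_logits = discriminator(obs, pred_next.detach())
    d_loss = (bce(real_logits, torch.ones_like(real_logits))
              + bce(fake_logits, torch.zeros_like(fake_logits)))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Intrinsic reward: how easily the discriminator separates the
    # prediction from reality. Essentially unpredictable detail (leaves
    # in a breeze) should stop separating them once the discriminator
    # learns to ignore it, so the reward would decay for such states.
    with torch.no_grad():
        reward = (torch.sigmoid(discriminator(obs, next_obs))
                  - torch.sigmoid(discriminator(obs, pred_next)))
    return reward.squeeze(-1)
```

The reward here is just one possible choice; the point is only that the discriminator, rather than a hand-picked feature space, decides which differences between prediction and outcome count as error.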
Apart from my participation in the Leela Zero project, I am a beginner in reinforcement learning. I haven't looked much into the details of the various algorithms and NN architectures, and just want some feedback on whether the general idea is promising. Thank you in advance!
My initial thoughts were the same. I've read a few papers that outline the similarities between RL algorithms and GANs, for example https://arxiv.org/pdf/1610.01945.pdf
I'm not sure whether we can augment GANs with RL algorithms or whether it would just complicate things.