Hi Related to RL Project #1

Closed
gaoyuankidult opened this issue Jan 19, 2014 · 10 comments

Comments

@gaoyuankidult
Owner

Hi Everyone

The course webpage [2] gives 26.1 as the deadline for choosing a topic. I hope we can start the project well before that.

We plan to base the project on a paper.
The paper is titled

Playing Atari with Deep Reinforcement Learning [1]

The paper centers on an algorithm called Deep Q-Networks (DQN), which is really just a fancy name for a variant of Q-learning combined with a convolutional neural network (CNN).

I hope we can all go through the paper first. I think we should at least know:

  • Atari[3]
  • Q learning

For Q-learning, you can read the Wikipedia article and then the section "Q-Learning Using Matlab" of this article [4].
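For anyone new to Q-learning, here is a minimal tabular sketch of the update rule in Python. It is not the DQN from [1]; as far as I understand, DQN replaces the table with a CNN that approximates Q. The function names and the default values of `alpha`, `gamma`, and `epsilon` are illustrative assumptions, not taken from the paper.

```python
import random
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.99):
    """Apply one tabular Q-learning update and return the new Q(state, action).

    q is a mapping from (state, action) pairs to estimated values.
    alpha (learning rate) and gamma (discount) are illustrative defaults.
    """
    # Value of the best action available in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Temporal-difference target: immediate reward plus discounted future value.
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


def epsilon_greedy(q, state, actions, epsilon=0.1, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


if __name__ == "__main__":
    q_table = defaultdict(float)  # unseen (state, action) pairs start at 0.0
    moves = ["left", "right"]
    q_learning_update(q_table, "s0", "left", 1.0, "s1", moves)
    print(epsilon_greedy(q_table, "s0", moves, epsilon=0.0))
```

The table only works for small, discrete state spaces, which is why raw Atari frames call for a function approximator instead.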
Basically, the project will run as small-team work. We will use this GitHub repo ( git@github.com:gaoyuankidult/DRL-AI.git ), so please join it. Questions and discussions can be posted on this GitHub thread.

### Discussion of Next Meeting
  1. Understand CNN (nice work done by yaolu [5])
  2. Environment of Project
  3. Licenses
  4. Language and Architecture of Project

If you have other topics, please let all members know.

The proposed meeting time is 9 pm on Tuesday in ida. It should take about two and a half hours (it can be shortened if we reach our goals early).

[1] http://arxiv.org/pdf/1312.5602v1.pdf
[2] http://www.cs.helsinki.fi/en/courses/58314105/2014/k/s/1
[3] http://yavar.naddaf.name/ale/
[4] http://pcframe.net/bbs/zboard.php?id=scrap&page=1&sn1=&divpage=1&sn=off&ss=on&sc=on&select_arrange=headnum&desc=asc&no=4
[5] https://github.com/yaolubrain/cnn_linear_max

@yaolubrain
Collaborator

Nice job, Yuan! The following is the best tutorial on CNNs for object recognition. Please read it.
http://www.cs.toronto.edu/~ranzato/publications/ranzato_cvpr13.pdf

@gaoyuankidult
Owner Author

Hi everyone. It seems everyone is free around 9 pm tomorrow. We will meet at 9 pm at one member's home, preferably mine. The place can be changed if things do not go as planned.

Br

@yaolubrain
Collaborator

no problem!


Yao Lu
Department of Computer Science,
University of Helsinki, Finland

@gaoyuankidult
Owner Author

Hi Yao

Because of the pressure of the course, Nick thinks he should work on another reinforcement learning task instead. As a consequence, the two of us will build this project. I will take over Nick's responsibilities. Let's meet next week about the project.

@gaoyuankidult
Owner Author

Video output format:

A 2D array of 7-bit pixels, 160 pixels wide by 210 pixels high.

The data comes in this order: first the 128 bytes of RAM (each taking a value in 0–255), then the 33,600 screen pixels (each taking a value in 0–127). The screen is provided in row order, i.e. beginning with the 160 pixels that make up the first row.
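A rough sketch of how one such observation could be split in Python, assuming it has already been decoded into a flat sequence of 128 + 33,600 integers in the order described above. The names `split_observation`, `RAM_SIZE`, `SCREEN_WIDTH`, and `SCREEN_HEIGHT` are hypothetical, and the emulator's actual wire encoding may differ.

```python
RAM_SIZE = 128        # bytes of console RAM, values 0-255
SCREEN_WIDTH = 160    # pixels per row
SCREEN_HEIGHT = 210   # rows per frame


def split_observation(values):
    """Split a flat observation into (ram, screen).

    values: sequence of ints, RAM first, then pixels in row order.
    Returns ram as a list of 128 ints and screen as a list of 210 rows,
    each row a list of 160 pixel values.
    """
    expected = RAM_SIZE + SCREEN_WIDTH * SCREEN_HEIGHT
    if len(values) != expected:
        raise ValueError(f"expected {expected} values, got {len(values)}")

    ram = list(values[:RAM_SIZE])
    pixels = values[RAM_SIZE:]

    # Check the value ranges stated in the format description.
    if any(not 0 <= v <= 255 for v in ram):
        raise ValueError("RAM value outside 0-255")
    if any(not 0 <= p <= 127 for p in pixels):
        raise ValueError("pixel value outside 0-127")

    # Row order: the first 160 pixels form row 0, the next 160 form row 1, ...
    screen = [
        list(pixels[row * SCREEN_WIDTH:(row + 1) * SCREEN_WIDTH])
        for row in range(SCREEN_HEIGHT)
    ]
    return ram, screen
```

With this layout, `screen[y][x]` is the pixel at column `x` of row `y`, counting from the top-left corner.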

@gaoyuankidult
Owner Author

Hello everyone

How are things going?

We work in a fairly loose, cooperative style in this project, but we still need to keep up with our goal.

Our final goal is to present the results with a detailed comparison.
To do that, we need to go through the following steps:

  1. Working prototypes of DQN and HyperNEAT
  2. Further enhanced versions (comparable with each other)

As you may already know, my current status is that I can provide a fully functioning environment that outputs a suitable picture stream and the instantaneous reward.

What is your current status on the project?

Cheers

@yaolubrain
Collaborator

I think I can finish the CNN in C++ in at most two weeks. But I will try to
get it done in a week.


@jhb86253817

I am working on NEAT now. If things go well, it can be done by this
weekend. Then I will continue with HyperNEAT, which is based on NEAT. That
may require another week.

Haibo Jin


@gaoyuankidult
Owner Author

Great, guys! You are all excellent collaborators!

@gaoyuankidult
Owner Author

Let us keep working on this project and achieve an impressive result!
