Atari Game Pong in TensorFlow

Reinforcement learning project: training an agent to play the Atari game Pong in TensorFlow

Introduction

In this project, reinforcement learning is used to train an agent to play the game of Pong. Specifically, the policy gradients method is analysed from its theoretical foundations through to a practical implementation. After reading Andrej Karpathy's blog, I was fascinated by how simple yet effective this method could be. In his post, Andrej explains the approach briefly and implements it in Python from scratch. I wanted to do essentially the same implementation, but in TensorFlow instead. For clarity, I will summarize the most important points about policy gradients and explain the environment used for the game simulation.

Environment

A team of OpenAI researchers has developed the Gym environment, which contains several Atari games. In this project, the focus is on Pong. The goal of the game is to get the ball past the opponent by hitting it at an angle, as shown in Figure 1. The agent may move up or down at any moment; a short usage sketch of the environment is shown below the figure.

Figure 1: Pong game environment
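
The snippet below is a minimal sketch (not taken from this repository) of how the Pong environment can be created and stepped, assuming the classic OpenAI Gym API with the Atari extras installed; the action ids 2 and 3 correspond to moving the paddle up and down in Pong.

```python
# Minimal interaction loop with the Gym Pong environment (classic Gym API assumed).
import gym

env = gym.make("Pong-v0")
observation = env.reset()        # raw 210x160x3 RGB frame
UP, DOWN = 2, 3                  # Atari action ids for moving the paddle up / down

for _ in range(100):
    action = UP                  # a trained agent would sample from the policy here
    observation, reward, done, info = env.step(action)
    # reward is +1 when the opponent misses the ball and -1 when the agent misses it
    if done:                     # an episode ends when one player reaches 21 points
        observation = env.reset()
```
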

Policy gradients method

The idea behind this method is to learn an approximately optimal policy $$\pi$$. This is done using a simple neural network, presented in Figure 2. The network computes the probabilities of moving up or down based on its input.

Figure 2: Neural network simulating policy $$\pi$$
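
For reference, the parameters of such a network are typically updated with the standard REINFORCE (policy gradient) estimator, which increases the log-probability of the actions taken in proportion to the reward that followed them (the exact scaling and normalization of the returns may differ in the actual implementation):

$$\nabla_\theta J(\theta) \approx \sum_t \nabla_\theta \log \pi_\theta(a_t \mid x_t)\, R_t$$

where $$x_t$$ is the network input at step $$t$$, $$a_t$$ the chosen action (up or down), and $$R_t$$ the discounted sum of rewards following that action.
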

The raw input is a frame of shape 210x160x3, which is cropped to 80x80x3 and then converted from RGB to a gray-scale matrix of size 80x80. This final matrix is reshaped into a 6400x1 array. In order to detect movement in the game, two adjacent frames are subtracted and their difference is reshaped in the same way. The hidden layer contains 200 neurons. The output layer has 2 neurons: the first gives the probability of moving up, and the second the probability of moving down. These two probabilities are complementary ($$P_{up} = 1 - P_{down}$$).
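
The sketch below illustrates this preprocessing and the 200-unit policy network described above using tf.keras; it is not the repository's exact code, and the crop/downsample choice follows Karpathy's blog, so the details may differ slightly from the actual implementation.

```python
# Sketch of frame preprocessing and the 6400 -> 200 -> 2 policy network (assumed details).
import numpy as np
import tensorflow as tf

def preprocess(frame, prev_input):
    """Crop a 210x160x3 Atari frame, convert to gray, flatten to 6400 values,
    and return the difference with the previous processed frame."""
    gray = frame[35:195].mean(axis=2)        # crop the playing field and drop color
    gray = gray[::2, ::2]                    # downsample to 80x80
    cur = gray.astype(np.float32).ravel()    # flatten to a 6400-element vector
    diff = cur - prev_input if prev_input is not None else np.zeros(6400, np.float32)
    return diff, cur

# Policy network: 6400 inputs -> 200 hidden units -> 2 complementary output probabilities.
policy = tf.keras.Sequential([
    tf.keras.Input(shape=(6400,)),
    tf.keras.layers.Dense(200, activation="relu"),
    tf.keras.layers.Dense(2, activation="softmax"),  # [P(up), P(down)], summing to 1
])

# Example forward pass on one preprocessed frame difference:
# probs = policy(diff[None, :])  -> shape (1, 2), probabilities of moving up / down
```
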
