Hybrid Reward Architecture with Stanford CS221 Pacman

The motivation behind this project is the Stanford CS221 Pacman homework, where we had to implement Expectimax with a static evaluation function and the top three scores were rewarded. After finishing CS221, I read the Hybrid Reward Architecture (HRA) paper at https://papers.nips.cc/paper/7123-hybrid-reward-architecture-for-reinforcement-learning.pdf and decided to make Pacman a bit smarter using Reinforcement Learning.

The Fruit Collection task example from the HRA paper was taken as the basis for this implementation.

The following GVFs (general value functions) and heads were defined:

- one head/GVF per food point and per capsule
- two heads for the ghosts, backed by a number of GVFs equal to the number of possible Pacman locations, i.e. all locations without a wall. At each Pacman location the heads are reset to the GVFs corresponding to their current location.
- for the scared ghosts, a similar approach to the one for the regular ghosts
- a diversification (exploration) head, which adds random Q-values in the range [0, 20] for a configurable number of steps so that each episode starts randomly
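
The diversification head described in the last item can be sketched roughly as follows. This is a minimal illustration, not the code in this repo: the function name `diversification_q` and the parameters `step` and `explore_steps` are hypothetical, and returning zeros once the exploration phase is over is an assumption about how the head is switched off.

```python
import random


def diversification_q(actions, step, explore_steps, rng=random):
    """Return one exploration Q-value per action (hypothetical helper).

    For the first `explore_steps` steps of an episode, every action gets
    a uniform random value in [0, 20]. Afterwards the head contributes
    nothing (assumed behaviour), so the other heads decide alone.
    """
    if step >= explore_steps:
        return {action: 0.0 for action in actions}
    return {action: rng.uniform(0.0, 20.0) for action in actions}


if __name__ == "__main__":
    legal = ["North", "South", "East", "West"]
    print(diversification_q(legal, step=0, explore_steps=15))
    print(diversification_q(legal, step=15, explore_steps=15))
```

Because the random values are added on top of the other heads, a larger `explore_steps` gives more varied episode openings at the cost of more random early moves.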

Each GVF is configured with virtual rewards, and the Q-values it computes are multiplied at the end by the actual game rewards (points).
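
To make the idea of a virtual reward scaled by the game reward concrete, here is a minimal tabular sketch. It is an assumption-laden illustration rather than this repo's implementation: the class `TabularGVF`, its Q-learning update, the virtual reward of 1 on reaching the target, and the game reward of 10 in the usage lines are all hypothetical choices.

```python
from collections import defaultdict


class TabularGVF:
    """Hypothetical GVF that learns how reachable one target cell is."""

    def __init__(self, target, alpha=0.5, gamma=0.9):
        self.target = target          # cell this GVF is about, e.g. a food point
        self.alpha = alpha            # learning rate
        self.gamma = gamma            # discount factor
        self.q = defaultdict(float)   # (state, action) -> value

    def update(self, state, action, next_state, next_actions):
        """One Q-learning step with a virtual reward of 1 at the target."""
        reached = next_state == self.target
        reward = 1.0 if reached else 0.0
        if reached or not next_actions:
            bootstrap = 0.0           # the target is terminal for this GVF
        else:
            bootstrap = max(self.q[(next_state, a)] for a in next_actions)
        td_target = reward + self.gamma * bootstrap
        key = (state, action)
        self.q[key] += self.alpha * (td_target - self.q[key])

    def scaled_q(self, state, action, game_reward):
        """Scale the virtual Q-value by the actual game reward."""
        return game_reward * self.q[(state, action)]


if __name__ == "__main__":
    gvf = TabularGVF(target=(1, 0), alpha=1.0)
    gvf.update((0, 0), "East", (1, 0), [])
    print(gvf.scaled_q((0, 0), "East", game_reward=10))
```

Keeping the virtual reward fixed and applying the game reward only when the values are read means the same learned GVF can be reused for targets with different point values.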

To combine the Q-values at the end, two aggregation methods were implemented, as described in the original paper:

- a linear aggregator, which sums the Q-values of all heads and then takes the action with the largest total
- a custom aggregator, which normalizes the Q-values of all heads with a positive reward and adds the negative Q-values weighted by a weight vector
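
The two aggregators can be sketched as below. This is one possible reading of the description, not the code in this repo: the function names are hypothetical, each head is assumed to be a list with one Q-value per action, and normalizing a positive head by dividing by its largest value is an assumption about what "normalizes" means here.

```python
def linear_aggregate(q_heads):
    """Sum the heads per action and return (greedy action index, totals)."""
    n_actions = len(q_heads[0])
    totals = [sum(head[a] for head in q_heads) for a in range(n_actions)]
    best = max(range(n_actions), key=totals.__getitem__)
    return best, totals


def weighted_aggregate(positive_heads, negative_heads, weights):
    """Normalize positive heads, add weighted negative heads, pick the best.

    Assumes at least one head is given. Positive heads are divided by
    their largest value (assumed normalization); each negative head is
    multiplied by its entry in `weights`.
    """
    n_actions = len((positive_heads or negative_heads)[0])
    totals = [0.0] * n_actions
    for head in positive_heads:
        scale = max(head) or 1.0      # avoid dividing by zero for an all-zero head
        for a in range(n_actions):
            totals[a] += head[a] / scale
    for head, weight in zip(negative_heads, weights):
        for a in range(n_actions):
            totals[a] += weight * head[a]
    best = max(range(n_actions), key=totals.__getitem__)
    return best, totals


if __name__ == "__main__":
    print(linear_aggregate([[1.0, 0.0, 2.0], [0.5, 3.0, 0.0]]))
    print(weighted_aggregate([[2.0, 4.0]], [[-1.0, -10.0]], [0.1]))
```

In the second call the food-like head prefers action 1, but the weighted ghost-like head penalizes it enough that action 0 wins, which is the kind of trade-off the weight vector is meant to control.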

See hra/config.yaml for the set of configurable parameters.

An HRA Pacman pre-trained on 200 games is available in this repo, and a series of ten games can be viewed here: https://www.youtube.com/watch?v=iDUfI4soSxI

To let Pacman play with HRA, type:

    python pacman.py -l smallClassic -p HRAAgent

To train Pacman with HRA, type:

    python pacman.py -l smallClassic -p HRAAgent -a learn=True -q -n 200

For help, type:

    python pacman.py -h

See http://inst.eecs.berkeley.edu/~cs188 for more information about the Pacman game.

Enjoy!
