feat: pacman environment #186

Merged
merged 138 commits into from
Jan 29, 2024
Changes from 98 commits
Commits
74cf5ed
feat: added networks
siddarthsingh1 Jun 30, 2023
f35313d
chore: update registry
siddarthsingh1 Jun 30, 2023
0a9445e
feat: trainable
siddarthsingh1 Jun 30, 2023
028c977
chore: requirement jax verision fix
siddarthsingh1 Jun 30, 2023
0642bc2
chore: limit chex version
siddarthsingh1 Jun 30, 2023
8d055af
chore: typing updates
siddarthsingh1 Jul 1, 2023
66e095c
chore: remove extra player step function
siddarthsingh1 Jul 1, 2023
10d0b44
feat: updated networks to use location data augmentation
siddarthsingh1 Jul 1, 2023
369deda
chore: updated vmaps and column and row sizings to be dynamic
siddarthsingh1 Jul 2, 2023
909b8ce
feat: ascii generator added
siddarthsingh1 Jul 2, 2023
a1da873
feat: max timer added
siddarthsingh1 Jul 2, 2023
ec43136
chore: update requirements to match main
siddarthsingh1 Jul 3, 2023
4d9168a
chore: set correct map and update utils
siddarthsingh1 Jul 3, 2023
7a2e3da
feat: linting erros fixed and generator unit tests added
siddarthsingh1 Jul 3, 2023
2655a2e
chore: reset maze
siddarthsingh1 Jul 3, 2023
9b9c0bb
chore: fixed embedding dims
siddarthsingh1 Jul 3, 2023
afd0139
chore: update env typings
siddarthsingh1 Jul 3, 2023
68e6f47
feat: updated collision detection
siddarthsingh1 Jul 6, 2023
376bf43
feat: added gif, png and updated docs
siddarthsingh1 Jul 6, 2023
fb557d1
chore: update docs
siddarthsingh1 Jul 6, 2023
12e0a31
feat: typing docstrings updated
siddarthsingh1 Jul 6, 2023
513b59e
feat: dosctring up to date
siddarthsingh1 Jul 6, 2023
3293ec9
feat: vmapped distance calculations
siddarthsingh1 Jul 6, 2023
13d9f30
feat: increase obs size
siddarthsingh1 Jul 7, 2023
025ff90
feat: more realistic renderer
siddarthsingh1 Jul 7, 2023
6d795ff
chore: obs size fix
siddarthsingh1 Jul 7, 2023
4b2a1f0
chore: fixed ghost freezing and removed spawn camp exploit
siddarthsingh1 Jul 7, 2023
8540591
chore: smaller obs space
siddarthsingh1 Jul 7, 2023
25ba125
feat: fixed collision detection
siddarthsingh1 Jul 8, 2023
37e00ce
feat: updated action mask
siddarthsingh1 Jul 8, 2023
4f0d986
feat: better action masking
siddarthsingh1 Jul 9, 2023
5c014f8
chore: faster termination
siddarthsingh1 Jul 9, 2023
3ce890e
chore: player directino updated
siddarthsingh1 Jul 9, 2023
86f3996
chore: small obs
siddarthsingh1 Jul 9, 2023
3114d68
chore: updated jax typing retracing
siddarthsingh1 Jul 9, 2023
8407aeb
chore: test larger obs
siddarthsingh1 Jul 9, 2023
62f3917
chore: fix y axis teleporting
siddarthsingh1 Jul 9, 2023
eb2b60f
chore: test small obs
siddarthsingh1 Jul 9, 2023
3a9b410
feat: points only awarded for first ghost eaten
siddarthsingh1 Jul 10, 2023
3428682
chore: big obs
siddarthsingh1 Jul 10, 2023
442607e
chore: no vector
siddarthsingh1 Jul 10, 2023
7177f72
chore: large obs
siddarthsingh1 Jul 10, 2023
6c0e69d
chore: small obs
siddarthsingh1 Jul 10, 2023
223f3e9
feat: ghost point detection
siddarthsingh1 Jul 10, 2023
831e38d
chore: aug network
siddarthsingh1 Jul 10, 2023
214685a
chore: big obs
siddarthsingh1 Jul 10, 2023
36eb9f1
chore: ghost eat reset
siddarthsingh1 Jul 11, 2023
dcebfb4
chore: small obs
siddarthsingh1 Jul 11, 2023
75fae5a
chore: linting errors fixed
siddarthsingh1 Jul 17, 2023
61a930b
chore: linting errors
siddarthsingh1 Jul 17, 2023
dd9a687
feat: better rendering
siddarthsingh1 Jul 17, 2023
4b4ed15
chore: new gifs
siddarthsingh1 Jul 17, 2023
39e63a8
chore: added unicode
siddarthsingh1 Jul 17, 2023
5024f53
chore: linting errors
siddarthsingh1 Jul 17, 2023
2c18bb5
chore: fixed retracing from typing bug
siddarthsingh1 Jul 17, 2023
d7323b8
chore: renderer update
siddarthsingh1 Jul 17, 2023
5c3d9a3
chore: update gif and fix render bug
siddarthsingh1 Jul 17, 2023
a660f29
chore: remove extra comments
siddarthsingh1 Jul 20, 2023
cfd62ef
chore: addessing review
siddarthsingh1 Jul 24, 2023
4495478
chore: added type ignore for state replace
siddarthsingh1 Jul 24, 2023
6d42d91
chore: moved `get_directions` into utils
siddarthsingh1 Jul 24, 2023
980587a
chore: moved larger functions into utils file and moved renderer
siddarthsingh1 Jul 24, 2023
45759f6
chore: update generator
siddarthsingh1 Jul 24, 2023
bf3b58c
fix: typing error causing retracing sometimes
siddarthsingh1 Jul 25, 2023
68bedfd
fix: replace state with next_state in step
siddarthsingh1 Jul 25, 2023
2122a92
chore: easier
siddarthsingh1 Jul 25, 2023
fefb54e
chore: test harder setting
siddarthsingh1 Jul 25, 2023
4e05e44
chore: test harder default settings
siddarthsingh1 Jul 25, 2023
0ccf507
fix: state to next state for variables in step function
siddarthsingh1 Jul 25, 2023
8205568
chore: test easier setting
siddarthsingh1 Jul 25, 2023
502c5fa
Merge branch 'main' into main
clement-bonnet Aug 20, 2023
4552dc5
chore: update test
siddarthsingh1 Aug 20, 2023
fd28a29
Merge branch 'main' of github.com:siddarthsingh1/jumanji
siddarthsingh1 Aug 20, 2023
424d2a8
Merge branch 'main' into main
sash-a Jan 8, 2024
29470ed
Update docs/environments/pacman.md
siddarthsingh1 Jan 8, 2024
8366c0a
chore: remove sizes from state
siddarthsingh1 Jan 8, 2024
0ceee07
chore: jnp.bool_ .> bool
siddarthsingh1 Jan 8, 2024
3344e31
chore: reformat action_spc
siddarthsingh1 Jan 8, 2024
49e8022
chore: action_mask_bool to default_action_mask
siddarthsingh1 Jan 8, 2024
8e9282b
chore: remove action mask from state
siddarthsingh1 Jan 8, 2024
8ee7589
chore: remove ignores
siddarthsingh1 Jan 8, 2024
78e7058
chore: remove ignores
siddarthsingh1 Jan 8, 2024
e057a91
chore: remove ignores
siddarthsingh1 Jan 8, 2024
11c68f8
Update jumanji/environments/routing/pacman/env.py
siddarthsingh1 Jan 8, 2024
7a0493a
chore: rename check_reward variables
siddarthsingh1 Jan 8, 2024
64687c9
chore: added comments to check_rewards
siddarthsingh1 Jan 8, 2024
8d19d78
Update jumanji/environments/routing/pacman/env.py
siddarthsingh1 Jan 8, 2024
5856985
chore: remove *1
siddarthsingh1 Jan 8, 2024
f8755b9
chore: update boolean amsk
siddarthsingh1 Jan 8, 2024
abba40b
chore: renaming in check_power_up
siddarthsingh1 Jan 8, 2024
73a35d9
chore: removed boolean statement
siddarthsingh1 Jan 8, 2024
8b03169
chore: remove extra lmabda function
siddarthsingh1 Jan 8, 2024
5aad4bd
chore: distance to direction
siddarthsingh1 Jan 8, 2024
c224e8a
chore: remove needless lambdas
siddarthsingh1 Jan 8, 2024
07f892c
fix: change not to use correct syntax
siddarthsingh1 Jan 9, 2024
cae21ac
fix: fix retracing caused by pellets typing in state
siddarthsingh1 Jan 9, 2024
fcec2a3
Merge branch 'main' into main
sash-a Jan 10, 2024
12d66e8
Merge branch 'main' into main
clement-bonnet Jan 11, 2024
1d7452b
Update jumanji/environments/routing/pacman/generator.py
siddarthsingh1 Jan 11, 2024
959ccf7
Update docs/environments/pacman.md
siddarthsingh1 Jan 11, 2024
e87864c
Update docs/environments/pacman.md
siddarthsingh1 Jan 11, 2024
53e0e63
Update docs/environments/pacman.md
siddarthsingh1 Jan 11, 2024
0e98dab
Update jumanji/environments/routing/pacman/env.py
siddarthsingh1 Jan 11, 2024
3807120
Update jumanji/training/networks/pacman/actor_critic.py
siddarthsingh1 Jan 11, 2024
823c632
chore: update action mask in docs
siddarthsingh1 Jan 12, 2024
bc8d1db
chore: pacman to pac_man folder name
siddarthsingh1 Jan 12, 2024
813c168
fix: naming and readd folder
siddarthsingh1 Jan 12, 2024
50161bf
chore: update pellet spaces shape setting
siddarthsingh1 Jan 12, 2024
df0ec9c
chore: update docs and naming
siddarthsingh1 Jan 12, 2024
fa1d68d
chore: more naming fixes and doc updates
siddarthsingh1 Jan 12, 2024
e151b44
Merge branch 'main' into main
siddarthsingh1 Jan 15, 2024
59a19bd
chore: add pacman config and set for zathura run
siddarthsingh1 Jan 16, 2024
a482f6f
Merge branch 'main' of github.com:siddarthsingh1/jumanji
siddarthsingh1 Jan 16, 2024
19c1e84
chore: revert zathura changes
siddarthsingh1 Jan 16, 2024
ad46721
fix: ghost pathing issues
siddarthsingh1 Jan 17, 2024
aca46fa
fix: update tests due to bugfix
siddarthsingh1 Jan 19, 2024
4ae9e5a
chore: update train configs
siddarthsingh1 Jan 23, 2024
f8c99ad
fix: set tp a2c
siddarthsingh1 Jan 23, 2024
8ad463e
chore: add back delay to ghost movement
siddarthsingh1 Jan 23, 2024
a306269
chore: another seed
siddarthsingh1 Jan 23, 2024
6b62044
chore: update seed
siddarthsingh1 Jan 23, 2024
4e1c133
chore: try old seed
siddarthsingh1 Jan 23, 2024
1d24a92
chore: different seed
siddarthsingh1 Jan 23, 2024
041abc8
chore: add ghost delay
siddarthsingh1 Jan 23, 2024
ea0af1b
chore: change initial spawns
siddarthsingh1 Jan 23, 2024
9deca42
chore: seed 0
siddarthsingh1 Jan 23, 2024
06b33af
chore: seed 13
siddarthsingh1 Jan 23, 2024
6b9e4a1
chore: seed 24
siddarthsingh1 Jan 23, 2024
1c46d25
chore: seed 6
siddarthsingh1 Jan 24, 2024
e548d63
chore: seed 144
siddarthsingh1 Jan 24, 2024
ca11e71
chore: random run
siddarthsingh1 Jan 25, 2024
e9c4fe1
chore: revert configs and reset tests
siddarthsingh1 Jan 25, 2024
162e2d0
Merge branch 'main' into main
siddarthsingh1 Jan 25, 2024
2055aee
Update jumanji/training/configs/config.yaml
siddarthsingh1 Jan 29, 2024
b58980b
Update jumanji/training/configs/config.yaml
siddarthsingh1 Jan 29, 2024
f4dda17
chore: update gif name
siddarthsingh1 Jan 29, 2024
d778354
chore: update doc names and add api and mkdir
siddarthsingh1 Jan 29, 2024
8c4b93d
chore: update api referance
siddarthsingh1 Jan 29, 2024
4 changes: 4 additions & 0 deletions README.md
@@ -47,6 +47,9 @@
<img src="docs/env_anim/tetris.gif" alt="Tetris" width="16%">
<img src="docs/env_anim/tsp.gif" alt="Tetris" width="16%">
</div>
<div class="row" align="center">
<img src="docs/env_anim/pacman.gif" alt="PacMan" width="16%">
</div>
</div>


@@ -110,6 +113,7 @@ problems.
| 🐍 Snake | Routing | `Snake-v1` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/snake/) | [doc](https://instadeepai.github.io/jumanji/environments/snake/) |
| 📬 TSP (Travelling Salesman Problem) | Routing | `TSP-v1` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/tsp/) | [doc](https://instadeepai.github.io/jumanji/environments/tsp/) |
| Multi Minimum Spanning Tree Problem | Routing | `MMST-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/mmst) | [doc](https://instadeepai.github.io/jumanji/environments/mmst/) |
| ᗧ•••ᗣ•• PacMan | Routing | `PacMan-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/pacman/) | [doc](https://instadeepai.github.io/jumanji/environments/pacman/) |

<h2 name="install" id="install">Installation 🎬</h2>

Binary file added docs/env_anim/pacman.gif
Binary file added docs/env_img/pacman.png
66 changes: 66 additions & 0 deletions docs/environments/pacman.md
@@ -0,0 +1,66 @@
# PacMan Environment

<p align="center">
<img src="../env_anim/pacman.gif" width="600"/>
</p>

We provide here a minimal JAX JIT-able implementation of the game PacMan. The game is played on a 2D grid where each cell is a free space (black), a wall (dark blue), pacman (yellow) or a ghost.


The goal is for the agent (yellow) to collect all of the pellets (small pink blocks) on the map without touching any of the ghosts. The agent receives a reward of +10 when collecting a pellet for the first time and pellets are removed from the map after being collected.

The power-ups (large pink blocks) trigger a 'scatter mode' which changes the colour of the ghosts to dark blue for 30 in-game steps. While the ghosts are in this state, the player can touch them, which causes them to return to the center of the map. This gives a reward of +200 for each unique ghost.

The agent selects an action at each timestep (up, left, right, down, no-op) which determines the direction it will travel for that step. An action in an invalid direction (for example, into a wall) is still accepted as input, and the player remains stationary. If the no-op action is used, the player does not stop but instead repeats the last action that was selected.

The game takes place on a fixed map and the same map is generated on each reset. The generator can be used to generate new maps based on an ASCII representation of the desired map.
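As a rough illustration of the ASCII-based map description mentioned above, the sketch below converts rows of characters into a 0/1 wall grid. It is not the generator from this PR: the helper name `ascii_to_grid`, the toy maze, and the choice to treat only `X` as a wall (and every other character as free space) are assumptions made for the example.

```python
from typing import List


def ascii_to_grid(ascii_maze: List[str], wall_char: str = "X") -> List[List[int]]:
    """Convert rows of ASCII characters into a grid of 1 (wall) and 0 (free).

    Hypothetical helper: any character other than `wall_char` is treated as
    free space, which is an assumption rather than the PR's actual legend.
    """
    width = len(ascii_maze[0])
    if any(len(row) != width for row in ascii_maze):
        raise ValueError("all rows of the ASCII maze must have the same width")
    return [[1 if ch == wall_char else 0 for ch in row] for row in ascii_maze]


# A small toy maze, not the default map.
toy_maze = [
    "XXXXX",
    "X   X",
    "X X X",
    "X   X",
    "XXXXX",
]
grid = ascii_to_grid(toy_maze)
print(grid[2])  # [1, 0, 1, 0, 1]
```

Rejecting ragged rows early keeps the resulting grid rectangular, which is what an array-based environment would need.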

## Observation
As an observation, the agent has access to the current maze configuration in the array named
`grid`. It also has access to its current position `player_locations`, the ghosts' locations
`ghost_locations`, the power-pellet locations `power_up_locations`, the time left for the scatter state `frightened_state_time`, the pellet locations `pellet_locations` and the action
mask `action_mask`.

- `player_locations`: Position(row, col) (int32) each of shape `()`, agent position in the maze.

- `ghost_locations`: jax array (int) of shape `(4,4)`, with the (y,x) coordinates of each ghost

- `power_up_locations`: jax array (int) of shape `(4,4)`, with the (y,x) coordinates of each power-pellet

- `pellet_locations`: jax array (int) of shape `(4,4)`, with the (y,x) coordinates of each pellet

- `frightened_state_time`: jax array (int32) of shape `()`, number of steps left of the scatter state.

- `action_mask`: jax array (bool) of shape `(4,)`, binary values denoting whether each action is
possible.

- `score`: (int32) tracking the total points accumulated since the last reset.

An example 5x5 observation `grid` array is shown below. 1 represents a wall, and 0 represents free
space.

```
[0, 1, 0, 0, 0],
[0, 1, 0, 1, 1],
[0, 1, 0, 0, 0],
[0, 0, 0, 1, 1],
[0, 0, 0, 0, 0]
```
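To make the relationship between the grid and the action mask concrete, here is a minimal sketch that derives a four-entry movement mask from the example grid above. The function name `movement_mask`, the `(row, col)` offsets, and the rule that leaving the grid counts as blocked are assumptions for illustration; the sketch does not reproduce the `MOVES` constant from `constants.py` or any wrap-around behaviour of the real map.

```python
# Example grid from the docs: 1 is a wall, 0 is free space.
GRID = [
    [0, 1, 0, 0, 0],
    [0, 1, 0, 1, 1],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0],
]

# Assumed (row, col) offsets in the documented action order:
# up (0), right (1), down (2), left (3).
OFFSETS = [(-1, 0), (0, 1), (1, 0), (0, -1)]


def movement_mask(grid, row, col):
    """Return one bool per movement action: True if the target cell is free."""
    n_rows, n_cols = len(grid), len(grid[0])
    mask = []
    for d_row, d_col in OFFSETS:
        r, c = row + d_row, col + d_col
        in_bounds = 0 <= r < n_rows and 0 <= c < n_cols
        # Out-of-bounds targets are treated as blocked (an assumption).
        mask.append(in_bounds and grid[r][c] == 0)
    return mask


print(movement_mask(GRID, 0, 0))  # [False, False, True, False]
print(movement_mask(GRID, 3, 1))  # [False, True, True, True]
```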


## Action
The action space is a `DiscreteArray` of integer values in the range of [0, 4], i.e. the agent can
take one of five actions: up (`0`), right (`1`), down (`2`), left (`3`) or no-op (`4`). If an invalid action is
taken, or an action is blocked by a wall, the move is not applied and the agent's position remains
unchanged. Additionally, if the no-op action is selected, the agent repeats the last movement action used.
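The action semantics described above can be sketched in plain Python as follows. This is an illustrative model only, not the environment's step function: `move_player`, the `(row, col)` offsets, and treating the grid edge as a wall are assumptions introduced for the example.

```python
UP, RIGHT, DOWN, LEFT, NOOP = range(5)

# Assumed (row, col) offsets for each action. The NOOP entry is only used
# if no movement action has been selected yet.
OFFSETS = {UP: (-1, 0), RIGHT: (0, 1), DOWN: (1, 0), LEFT: (0, -1), NOOP: (0, 0)}

GRID = [
    [0, 1, 0, 0, 0],
    [0, 1, 0, 1, 1],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0],
]


def move_player(grid, position, action, last_action):
    """Return (new_position, new_last_action) for one step.

    A no-op repeats the last action; a blocked move is still recorded as
    the last action but leaves the position unchanged.
    """
    if action == NOOP:
        action = last_action
    d_row, d_col = OFFSETS[action]
    row, col = position[0] + d_row, position[1] + d_col
    out_of_bounds = not (0 <= row < len(grid) and 0 <= col < len(grid[0]))
    if out_of_bounds or grid[row][col] == 1:
        return position, action
    return (row, col), action


position, last = move_player(GRID, (0, 0), DOWN, RIGHT)  # moves to (1, 0)
position, last = move_player(GRID, position, NOOP, last)  # repeats DOWN
print(position)  # (2, 0)
position, last = move_player(GRID, position, RIGHT, last)  # wall at (2, 1)
print(position, last == RIGHT)  # (2, 0) True
```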


## Reward
PacMan is a dense reward setting, where the agent receives a reward of +10 for each pellet collected. The agent also receives a reward of +20 for collecting a power pellet. The game ends when the agent has collected all 316 pellets on the map or touches a ghost.

Eating a ghost while scatter mode is active also awards +200 points, but points are only awarded the first time each unique ghost is eaten.
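The reward rules above can be summarised with a small sketch. The function `step_reward`, its arguments, and the use of a set to remember which ghosts have already paid out are hypothetical; when (or whether) that record is reset is not modelled here.

```python
PELLET_REWARD = 10
POWER_UP_REWARD = 20
GHOST_REWARD = 200


def step_reward(ate_pellet, ate_power_up, eaten_ghost, rewarded_ghosts):
    """Return (reward, updated_rewarded_ghosts) for a single step.

    `eaten_ghost` is the index of a ghost eaten this step, or None.
    `rewarded_ghosts` holds the indices of ghosts that already gave points.
    """
    reward = 0
    if ate_pellet:
        reward += PELLET_REWARD
    if ate_power_up:
        reward += POWER_UP_REWARD
    rewarded = set(rewarded_ghosts)  # copy so the caller's set is untouched
    if eaten_ghost is not None and eaten_ghost not in rewarded:
        reward += GHOST_REWARD
        rewarded.add(eaten_ghost)
    return reward, rewarded


reward, rewarded = step_reward(True, False, 2, set())
print(reward)  # 210: one pellet plus the first time ghost 2 is eaten
reward, rewarded = step_reward(False, False, 2, rewarded)
print(reward)  # 0: ghost 2 has already been rewarded
```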


## Registered Versions 📖
- `PacMan-v0`, Pacman in a 31x28 map with simple grid observations.
3 changes: 3 additions & 0 deletions jumanji/__init__.py
@@ -127,3 +127,6 @@

# TSP with 20 randomly generated cities and a dense reward function.
register(id="TSP-v1", entry_point="jumanji.environments:TSP")

# Pacman - minimal version of Atari Pacman game
register(id="PacMan-v0", entry_point="jumanji.environments:PacMan")
2 changes: 2 additions & 0 deletions jumanji/environments/__init__.py
@@ -32,6 +32,7 @@
maze,
mmst,
multi_cvrp,
pacman,
robot_warehouse,
snake,
tsp,
@@ -42,6 +43,7 @@
from jumanji.environments.routing.maze.env import Maze
from jumanji.environments.routing.mmst.env import MMST
from jumanji.environments.routing.multi_cvrp import MultiCVRP
from jumanji.environments.routing.pacman.env import PacMan
from jumanji.environments.routing.robot_warehouse.env import RobotWarehouse
from jumanji.environments.routing.snake.env import Snake
from jumanji.environments.routing.tsp.env import TSP
16 changes: 16 additions & 0 deletions jumanji/environments/routing/pacman/__init__.py
@@ -0,0 +1,16 @@
# Copyright 2022 InstaDeep Ltd. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from jumanji.environments.routing.pacman.env import PacMan
from jumanji.environments.routing.pacman.types import Observation, State
55 changes: 55 additions & 0 deletions jumanji/environments/routing/pacman/constants.py
@@ -0,0 +1,55 @@
# Copyright 2022 InstaDeep Ltd. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import jax.numpy as jnp

MOVES = jnp.array(
[[0, -1], [-1, 0], [0, 1], [1, 0], [0, 0]]
) # Up, Right, Down, Left, No-op


# Default Maze design
DEFAULT_MAZE = [
"XXXXXXXXXXXXXXXXXXXXXXXXXXXX",
"X S XX S X",
"X XXXX XXXXX XX XXXXX XXXX X",
"X XXXXOXXXXX XX XXXXXOXXXX X",
"X XXXX XXXXX XX XXXXX XXXX X",
"X X",
"X XXXX XX XXXXXXXX XX XXXX X",
"X XXXX XX XXXXXXXX XX XXXX X",
"X XX TXXT XX X",
"XXXXXX XXXXX XX XXXXX XXXXXX",
"XXXXXX XXXXX XX XXXXX XXXXXX",
"XXXXXX XXT TXX XXXXXX",
"XXXXXX XX XXX XXXX XX XXXXXX",
"XXXXXX XX X G X XX XXXXXX",
" GXXXXG ",
"XXXXXX XX X G X XX XXXXXX",
"XXXXXX XX XXX XXXX XX XXXXXX",
"XXXXXX XX XX XXXXXX",
"XXXXXX XX XXXXXXXX XX XXXXXX",
"XXXXXX XX XXXXXXXX XX XXXXXX",
"X XX X",
"X XXXX XXXXX XX XXXXX XXXX X",
"X XXXX XXXXX XX XXXXX XXXX X",
"X XX S P S XX X",
"XXX XX XX XXXXXXXX XX XX XXX",
"XXX XX XX XXXXXXXX XX XX XXX",
"X XX XX XX X",
"X XXXXXXXXXX XX XXXXXXXXXX X",
"X XXXXXXXXXX XX XXXXXXXXXX X",
"X O O X",
"XXXXXXXXXXXXXXXXXXXXXXXXXXXX",
]