Merge pull request #114 from Bam4d/rllib_level_generator
Rllib level generator
Bam4d authored Jun 4, 2021
2 parents 2c99a7b + faeb208 commit 0e0c311
Showing 20 changed files with 321 additions and 104 deletions.
4 changes: 4 additions & 0 deletions .gitmodules
@@ -18,3 +18,7 @@
path = python/examples/experiments/rts-self-play
url = https://github.com/Bam4d/rts-self-play
ignore = dirty
[submodule "python/examples/experiments/autoregressive-cats"]
path = python/examples/experiments/autoregressive-cats
url = https://github.com/Bam4d/autoregressive-cats
ignore = dirty
1 change: 0 additions & 1 deletion bindings/wrapper/GriddlyLoaderWrapper.cpp
@@ -6,7 +6,6 @@
#include "../../src/Griddly/Core/GDY/Objects/ObjectGenerator.hpp"
#include "../../src/Griddly/Core/GDY/TerminationGenerator.hpp"
#include "../../src/Griddly/Core/Grid.hpp"
#include "../../src/Griddly/Core/Observers/Vulkan/VulkanObserver.hpp"
#include "GDYWrapper.cpp"

namespace griddly {
1 change: 1 addition & 0 deletions docs/_static/video/.gitignore
@@ -0,0 +1 @@
!*.mp4
Binary file added docs/_static/video/griddly_rts.mp4
Binary file not shown.
2 changes: 1 addition & 1 deletion docs/about/halloffame.rst
@@ -2,7 +2,7 @@
Hall of Fame
############

If you create a project that uses Griddly, please let us know and we will link it here. This includes if you use Griddly in any papers, use the griddly engine in another game project and want to share your work.
If you create a project that uses Griddly, please let us know and we will link it here. This includes if you use Griddly in any papers, use the Griddly engine in another game project and want to share your work.

.. note:: You can be the first!

2 changes: 1 addition & 1 deletion docs/about/introduction.rst
@@ -4,7 +4,7 @@
Introduction
############

One of the most important things about AI research is data. In many Game Environments the rate of data (rendered frames per second, or state representations per second) is relatively slow meaning very long training times. Researchers can compensate for this problem by parallelising the number of games being played, sometimes on expensive hardward and sometimes on several servers requiring network infrastructure to pass states to the actual learning algorithms. For many researchers and hobbyists who want to learn. This approach is unobtainable and only the research teams with lots of funding and engineers supporting the hardware and infrastrcuture required.
One of the most important things about AI research is data. In many game environments the rate of data (rendered frames per second, or state representations per second) is relatively slow, meaning very long training times. Researchers can compensate for this problem by parallelizing the number of games being played, sometimes on expensive hardware and sometimes across several servers, requiring network infrastructure to pass states to the actual learning algorithms. For many researchers and hobbyists who want to learn, this approach is unattainable; it is only available to research teams with enough funding and engineers to support the required hardware and infrastructure.

Griddly provides a solution to this issue.

240 changes: 240 additions & 0 deletions docs/getting-started/procedural content generation/index.rst
@@ -0,0 +1,240 @@
.. _doc_tutorials_pcg:

#############################
Procedural Content Generation
#############################

Reinforcement learning can be prone to overfitting in environments where the initial conditions are limited and the environment dynamics are deterministic.
Procedural content generation is an important tool in reinforcement learning, as it allows level maps to be created on the fly. This gives the agent a much more varied challenge and stops it from overfitting to a small dataset of levels.


**********
Level Maps
**********

Levels in Griddly environments are defined by strings of characters. The ``MapCharacter`` for each object is defined in the game's GDY file, and the characters for each game are also listed in its documentation.

Basic Map
=========

.. code-block:: python

    W W W W W W
    W A . . . W
    W . . . . W
    W . . . . W
    W . . . g W
    W W W W W W

.. figure:: img/Doggo-level-Sprite2D-0.png
    :align: center

    How the above Doggo level is rendered.


In the map example above, the ``A`` character defines the Dog and the ``g`` character defines the goal. ``W`` defines the walls and ``.`` is reserved for empty space.

This is a basic example and generating levels for this environment might not be too interesting...


************************
Clusters Level Generator
************************

A much more complicated example is to use the :ref:`Clusters <doc_clusters>` game and generate new levels for it. The aim of the Clusters game is for the agent to push coloured blocks together to form "clusters", whilst avoiding spikes.
The game is fully deterministic and only 5 levels are supplied in the original GDY file. This makes it a perfect candidate for generating new levels and testing whether Reinforcement Learning can still solve them!


Level Generator Class
=====================

Here's an example of a level generator for the Clusters game.

The ``LevelGenerator`` class can be used as a base class. Only the ``generate`` function needs to be implemented.

.. code-block:: python

    class ClustersLevelGenerator(LevelGenerator):
        BLUE_BLOCK = 'a'
        BLUE_BOX = '1'
        RED_BLOCK = 'b'
        RED_BOX = '2'
        GREEN_BLOCK = 'c'
        GREEN_BOX = '3'
        AGENT = 'A'
        WALL = 'w'
        SPIKES = 'h'

        def __init__(self, config):
            super().__init__(config)

            self._width = config.get('width', 10)
            self._height = config.get('height', 10)
            self._p_red = config.get('p_red', 1.0)
            self._p_green = config.get('p_green', 1.0)
            self._p_blue = config.get('p_blue', 1.0)
            self._m_red = config.get('m_red', 5)
            self._m_blue = config.get('m_blue', 5)
            self._m_green = config.get('m_green', 5)
            self._m_spike = config.get('m_spike', 5)

        def _place_walls(self, map):
            # top/bottom wall
            wall_y = np.array([0, self._height - 1])
            map[:, wall_y] = ClustersLevelGenerator.WALL

            # left/right wall
            wall_x = np.array([0, self._width - 1])
            map[wall_x, :] = ClustersLevelGenerator.WALL

            return map

        def _place_blocks_and_boxes(self, map, possible_locations, p, block_char, box_char, max_boxes):
            if np.random.random() < p:
                block_location_idx = np.random.choice(len(possible_locations))
                block_location = possible_locations[block_location_idx]
                del possible_locations[block_location_idx]
                map[block_location[0], block_location[1]] = block_char

                num_boxes = 1 + np.random.choice(max_boxes - 1)
                for k in range(num_boxes):
                    box_location_idx = np.random.choice(len(possible_locations))
                    box_location = possible_locations[box_location_idx]
                    del possible_locations[box_location_idx]
                    map[box_location[0], box_location[1]] = box_char

            return map, possible_locations

        def generate(self):
            map = np.chararray((self._width, self._height), itemsize=2)
            map[:] = '.'

            # Generate walls
            map = self._place_walls(map)

            # all possible locations
            possible_locations = []
            for w in range(1, self._width - 1):
                for h in range(1, self._height - 1):
                    possible_locations.append([w, h])

            # Place Red
            map, possible_locations = self._place_blocks_and_boxes(
                map,
                possible_locations,
                self._p_red,
                ClustersLevelGenerator.RED_BLOCK,
                ClustersLevelGenerator.RED_BOX,
                self._m_red
            )

            # Place Blue
            map, possible_locations = self._place_blocks_and_boxes(
                map,
                possible_locations,
                self._p_blue,
                ClustersLevelGenerator.BLUE_BLOCK,
                ClustersLevelGenerator.BLUE_BOX,
                self._m_blue
            )

            # Place Green
            map, possible_locations = self._place_blocks_and_boxes(
                map,
                possible_locations,
                self._p_green,
                ClustersLevelGenerator.GREEN_BLOCK,
                ClustersLevelGenerator.GREEN_BOX,
                self._m_green
            )

            # Place Spikes
            num_spikes = np.random.choice(self._m_spike)
            for k in range(num_spikes):
                spike_location_idx = np.random.choice(len(possible_locations))
                spike_location = possible_locations[spike_location_idx]
                del possible_locations[spike_location_idx]
                map[spike_location[0], spike_location[1]] = ClustersLevelGenerator.SPIKES

            # Place Agent
            agent_location_idx = np.random.choice(len(possible_locations))
            agent_location = possible_locations[agent_location_idx]
            map[agent_location[0], agent_location[1]] = ClustersLevelGenerator.AGENT

            level_string = ''
            for h in range(0, self._height):
                for w in range(0, self._width):
                    level_string += map[w, h].decode().ljust(4)
                level_string += '\n'

            return level_string

This generates levels like the following:

.. figure:: img/generated_clusters.png
    :align: center

    A 10x10 map generated by the above code.



Using ``LevelGenerator``
========================

In the simplest case, the level generator can be used just before the level resets, and the generated string can be passed to ``env.reset(level_string=...)``:

.. code-block:: python

    if __name__ == '__main__':
        config = {
            'width': 10,
            'height': 10
        }

        renderer = RenderToFile()

        level_generator = ClustersLevelGenerator(config)

        env = gym.make('GDY-Clusters-v0')
        env.reset(level_string=level_generator.generate())

        ...
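
Because ``generate`` produces a fresh level string each time it is called, a new map can be drawn for every episode. The loop below is a minimal sketch of that pattern (the random policy is only a placeholder, not part of the example above):

.. code-block:: python

    # A new level is generated on every reset, so the agent rarely
    # sees the same map twice.
    for episode in range(100):
        obs = env.reset(level_string=level_generator.generate())
        done = False
        while not done:
            action = env.action_space.sample()  # placeholder policy
            obs, reward, done, info = env.step(action)
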
Using a ``LevelGenerator`` with RLlib
=====================================

The ``LevelGenerator`` base class is compatible with RLlib and can be configured through the standard RLlib configuration.

For example, the level generator and its parameters can be set up in the ``env_config`` in the following way:

.. code-block:: python

    'config': {
        ...
        'env_config': {
            'generate_valid_action_trees': True,
            'level_generator': {
                'class': ClustersLevelGenerator,
                'config': {
                    'width': 6,
                    'height': 6,
                    'p_red': 0.7,
                    'p_green': 0.7,
                    'p_blue': 0.7,
                    'm_red': 4,
                    'm_blue': 4,
                    'm_green': 4,
                    'm_spike': 4
                }
            },
            ...
        }
    }
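
To make the ``level_generator`` entry concrete, the sketch below shows how such a config could be consumed: the configured class is instantiated with its ``config`` dictionary and asked for a new level string on reset. This is an illustration of the intent, not the wrapper's actual implementation:

.. code-block:: python

    # Illustration only: resolve the configured generator and use it on reset.
    lg_config = env_config['level_generator']
    level_generator = lg_config['class'](lg_config['config'])
    env.reset(level_string=level_generator.generate())
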
1 change: 1 addition & 0 deletions docs/index.rst
@@ -32,6 +32,7 @@ Griddly documentation.
getting-started/action spaces/index
getting-started/observation spaces/index
getting-started/visualization/index
getting-started/procedural content generation/index

.. toctree::
:maxdepth: 2
1 change: 1 addition & 0 deletions python/examples/experiments/autoregressive-cats
Submodule autoregressive-cats added at d2143f
2 changes: 1 addition & 1 deletion python/examples/experiments/rts-self-play
Submodule rts-self-play updated 2 files
+3 −0 .gitignore
+64 −20 rts_self_play.py
2 changes: 1 addition & 1 deletion python/griddly/GymWrapper.py
@@ -110,7 +110,7 @@ def step(self, action):
    elif len(action) == self.player_count:

        if np.ndim(action) == 1 or np.ndim(action) == 3:
            if isinstance(action[0], list) or isinstance(action[0], np.ndarray):
            if isinstance(action[0], list) or isinstance(action[0], np.ndarray) or isinstance(action[0], tuple):
                # Multiple agents that can perform multiple actions in parallel
                # Used in RTS games
                reward = []
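
A hedged illustration of what this change enables (the environment and per-action layout below are assumptions, not part of the commit): each player's entry in the action list may now be a tuple of several actions, which suits RTS-style games where multiple units act in the same step.

.. code-block:: python

    # Hypothetical two-player step; the exact action layout depends on the
    # game's GDY and its action space, so treat these values as placeholders.
    player_1_actions = ([3, 4, 1, 2], [7, 2, 0, 1])  # two unit actions as a tuple
    player_2_actions = ([5, 5, 1, 3],)               # a single action, still a tuple
    obs, reward, done, info = env.step([player_1_actions, player_2_actions])
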