Merge pull request #105 from Bam4d/docs_update

Docs update
Bam4d · Jan 30, 2021 · 3c40bac
2 parents 27c561b + 9f85ede

Showing 53 changed files with 1,857 additions and 303 deletions.
diff --git a/bindings/python.cpp b/bindings/python.cpp
@@ -35,6 +35,9 @@ PYBIND11_MODULE(python_griddly, m) {
   gdy.def("get_avatar_object", &Py_GDYWrapper::getAvatarObject);
   gdy.def("create_game", &Py_GDYWrapper::createGame);
 
+  // Get list of objects in the order of their assigned ID
+  gdy.def("get_object_names", &Py_GDYWrapper::getObjectNames);
+
 
   py::class_<Py_GameWrapper, std::shared_ptr<Py_GameWrapper>> game_process(m, "GameProcess");
 
@@ -55,6 +58,8 @@ PYBIND11_MODULE(python_griddly, m) {
   // Get available actions for objects in the current game
   game_process.def("get_available_actions", &Py_GameWrapper::getAvailableActionNames);
   game_process.def("get_available_action_ids", &Py_GameWrapper::getAvailableActionIds);
+
+
 
   // Width and height of the game grid 
   game_process.def("get_width", &Py_GameWrapper::getWidth);

diff --git a/bindings/wrapper/GDYWrapper.cpp b/bindings/wrapper/GDYWrapper.cpp
@@ -35,6 +35,10 @@ class Py_GDYWrapper {
     return gdyFactory_->getExternalActionNames();
   }
 
+  std::vector<std::string> getObjectNames() const {
+    return gdyFactory_->getObjectGenerator()->getObjectNames();
+  }
+
   py::dict getActionInputMappings() const {
     auto actionInputsDefinitions = gdyFactory_->getActionInputsDefinitions();
     py::dict py_actionInputsDefinitions;

diff --git a/docs/about/artwork.rst b/docs/about/artwork.rst
@@ -1,6 +1,6 @@
-=======
+#######
 Artwork
-=======
+#######
 
 The Artwork is provided by the `Oryx Design Lab <https://www.oryxdesignlab.com/>`_.
 

diff --git a/docs/about/community.rst b/docs/about/community.rst
@@ -1,6 +1,6 @@
-=========
+#########
 Community
-=========
+#########
 
 Come join the `Griddly Discord <https://discord.gg/xuR8Dsv>`_ community, get support and share game levels that you have created.
 

diff --git a/docs/about/faq.rst b/docs/about/faq.rst
@@ -1,7 +1,7 @@
 .. _doc_about_faq:
 
-==========================
+##########################
 Frequently Asked Questions
-==========================
+##########################
 
 Nothing here yet!
diff --git a/docs/about/halloffame.rst b/docs/about/halloffame.rst
@@ -1,14 +1,14 @@
-============
+############
 Hall of Fame
-============
+############
 
 If you create a project that uses Griddly, please let us know and we will link it here. This includes papers that use Griddly and other game projects built on the Griddly engine.
 
 .. note:: You can be the first!
 
-
+********
 Academia
-========
+********
 
 Please use the following snippet to reference the Griddly project:
 

diff --git a/docs/about/index.rst b/docs/about/index.rst
@@ -1,6 +1,6 @@
-=====
+#####
 About
-=====
+#####
 
 .. toctree::
    :maxdepth: 1

diff --git a/docs/about/introduction.rst b/docs/about/introduction.rst
@@ -1,8 +1,8 @@
 .. _doc_about_introduction:
 
-============
+############
 Introduction
-============
+############
 
 One of the most important things about AI research is data. In many game environments the rate of data (rendered frames per second, or state representations per second) is relatively slow, meaning very long training times. Researchers can compensate for this problem by parallelising the number of games being played, sometimes on expensive hardware and sometimes on several servers requiring network infrastructure to pass states to the actual learning algorithms. For many researchers and hobbyists who want to learn, this approach is unobtainable; only research teams with plenty of funding and engineers supporting the required hardware and infrastructure can afford it.
 
@@ -12,26 +12,30 @@ Griddly is an open-source project aimed to be a all-encompassing platform for gr
 
 Here are some of the highlighted features:
 
+***********
 Flexibility
------------
+***********
 
 Griddly games are defined using a simple configuration language, GDY, in which you can configure the number of players, how inputs are converted into game mechanics, the objects and how they are rendered, and the design of the levels.
 
 Read more about :ref:`GDY here<doc_getting_started_gdy>`
 
+********************
 Speed + Memory Usage
---------------------
+********************
 
 The Griddly engine is written entirely in C++ and uses the `Vulkan API <https://www.khronos.org/vulkan/>`_ to render observational states. This means that all the games have significantly faster frame rates. Griddly also offers lightweight vectorized state rendering, which can render game states at 30k+ FPS in some games.
 
+*****************
 Pre-Defined Games
------------------
+*****************
 
 Visit the :ref:`games section<doc_games>` here to see which games are currently available. Several games have been ported from the GVGAI and MiniGrid RL environments, which can now be run at significantly higher speeds and less memory overhead.
 
 .. note:: More games are being added as Griddly is being developed. Feel free to design your own games and let the Discord community see what you have built!
 
+********************
 OpenAI Gym Interface
---------------------
+********************
 
 Griddly provides an OpenAI Gym interface out-of-the-box which wraps the underlying raw API, making Reinforcement Learning research significantly easier.
diff --git a/docs/conf.py b/docs/conf.py
@@ -33,9 +33,12 @@
 extensions = [
     'recommonmark',
     'sphinx_rtd_theme',
-    'sphinxcontrib.images'
+    'sphinxcontrib.images',
+    'sphinx.ext.autosectionlabel',
 ]
 
+autosectionlabel_prefix_document=True
+
 # Add any paths that contain templates here, relative to this directory.
 templates_path = ['_templates']
 

diff --git a/docs/getting-started/action spaces/index.rst b/docs/getting-started/action spaces/index.rst
@@ -0,0 +1,251 @@
+.. _doc_action_spaces:
+
+#############
+Action Spaces
+#############
+
+********
+Overview
+********
+
+Griddly provides a common interface for action spaces in Python, which can be accessed using:
+
+.. code-block:: python
+
+  env = gym.make('GDY-[your game here]-v0')
+  
+  # This contains a description of the action space
+  env.action_space
+
+All actions use the following format:
+
+.. code-block:: python
+
+  action = [
+
+      # (Only required if there is no avatar)
+      x, # X coordinate of action to perform. 
+      y, # Y coordinate of action to perform.
+
+      # (Only required if there is more than one action type)
+      action_type, # The type of action to perform (move, gather, attack, etc.)
+
+      # (Always required)
+      action_id, # The ID of the action (These are defined by InputMapping in GDY)
+  ]
+
+  env.step(action)
+
+All values in this array are integers.
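
The optional fields described above can be illustrated with a small helper. This is only a sketch: ``build_action`` is a hypothetical name that is not part of Griddly, and it simply assembles the integers in the documented ``[x, y, action_type, action_id]`` order, leaving out the parts that are not needed.

```python
from typing import List, Optional


def build_action(action_id: int,
                 action_type: Optional[int] = None,
                 x: Optional[int] = None,
                 y: Optional[int] = None) -> List[int]:
    """Assemble an action as [x, y, action_type, action_id] (hypothetical helper)."""
    if (x is None) != (y is None):
        raise ValueError("x and y must be supplied together")

    action: List[int] = []
    if x is not None:
        # Coordinates are only needed when there is no avatar to control.
        action.extend([int(x), int(y)])
    if action_type is not None:
        # Only needed when the game defines more than one action type.
        action.append(int(action_type))
    # The action id is always required; 0 is the no-op.
    action.append(int(action_id))
    return action
```

For instance, ``build_action(3)`` yields ``[3]`` for a single-action-type avatar game, while ``build_action(3, action_type=0, x=3, y=10)`` yields ``[3, 10, 0, 3]``.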
+
+
+:x, y:
+    These coordinates are required when the environment does not specify that there is an avatar to control. The coordinates chosen become the location of the action that will be performed.
+
+    For example in a game like chess, or checkers, the coordinates would correspond to the piece that the player wants to move.
+
+:action_type:
+  The action type refers to the index of the action type as defined in the GDY, for example ``move``, ``gather`` or ``push``.
+
+  A list of the registered (and correctly ordered for use in actions) types can be found using ``env.gdy.get_action_names()``.
+
+:action_id:
+  The action id is commonly used for the "direction" component of the action. The action_id directly corresponds to the ``InputMapping`` of the action. 
+
+.. note:: If no ``InputMapping`` is set for an action, a default of 4 action ids is applied. These action ids resolve to "UP", "DOWN", "LEFT" and "RIGHT".
+
+.. note:: All action types include action_id ``0``, which corresponds to a no-op.
+
+
+Sampling
+========
+
+Sampling the action space is the same as any other environment:
+
+:env.action_space.sample():
+  This will always produce the correct format of actions for the environment that is loaded.
+
+
+Sampling Valid Actions
+======================
+
+In many environments, certain actions may have no effect at all, for example moving an avatar into an immovable object such as a wall, or attacking a tile that has no objects.
+
+Griddly provides some helper methods for reducing the action space so that only valid actions are sampled, and for producing masks that can be used to calculate valid policies.
+
+:env.game.get_available_actions(player_id):
+  Returns a dict of locations of objects that can be controlled and the actions that can be used at those locations
+
+.. warning:: player_id=0 is reserved for NPCs and internal actions
+
+:env.game.get_available_action_ids(location, action_names):
+  Returns a dict of available action_ids at the given location for the given action_names.
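
The two helper methods above could be combined to sample a valid action, roughly as sketched below. The return shapes are assumptions not confirmed by this page: ``get_available_actions`` is taken to return ``{location: [action_name, ...]}`` with an indexable ``(x, y)`` location, and ``get_available_action_ids`` is taken to return ``{action_name: [action_id, ...]}``. ``sample_valid_action`` is a hypothetical name, and ``action_names`` is expected to be the ordered list from ``env.gdy.get_action_names()``.

```python
import random


def sample_valid_action(game, player_id, action_names, rng=random):
    """Pick a random valid [x, y, action_type, action_id], or None if none exist."""
    # Assumed shape: {location: [action_name, ...]}
    available = game.get_available_actions(player_id)

    candidates = []
    for location, names in available.items():
        # Assumed shape: {action_name: [action_id, ...]}
        ids_by_name = game.get_available_action_ids(location, list(names))
        for name, action_ids in ids_by_name.items():
            # action_type is the index of the name in the ordered action list.
            action_type = action_names.index(name)
            for action_id in action_ids:
                candidates.append([location[0], location[1], action_type, action_id])

    if not candidates:
        return None
    return rng.choice(candidates)
```

A call might look like ``sample_valid_action(env.game, 1, env.gdy.get_action_names())``. Enumerating every candidate keeps the sketch simple; for large RTS maps a two-stage sample (location first, then action) would avoid building the full list.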
+
+ValidActionSpaceWrapper
+-----------------------
+
+In order to easily support games with large action spaces such as RTS games, several helper functions are included in a wrapper, ``ValidActionSpaceWrapper``. The ``ValidActionSpaceWrapper`` serves two purposes:
+
+- Sampling actions using this wrapper only returns valid actions in the environment. 
+- Two helper functions are available to create action masks which can be applied during neural network training to force the network to choose only valid actions.
+
+:env.get_unit_location_mask(player_id, mask_type='full'):
+  Returns a mask of all the locations in the grid which can be selected by a particular player.
+
+  If ``mask_type == 'full'`` then a mask of dimensions (grid_height, grid_width) is returned. This mask can be used in the case where a one-hot representation of the entire grid is used for location selection. 
+
+  If ``mask_type == 'reduced'`` then two masks are returned, one for ``grid_height`` and one for ``grid_width``. These masks can be used when two separate one-hot representations are used for ``x`` and ``y`` selection.
+
+.. warning:: player_id=0 is reserved for NPCs and internal actions
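
To make the difference between the ``'full'`` and ``'reduced'`` mask types concrete, here is a minimal pure-Python sketch. It is not the wrapper's implementation: ``location_masks`` is a hypothetical function, the unit locations are passed in directly as ``(x, y)`` pairs, and the order of the reduced pair (height mask first, then width mask) is an assumption.

```python
def location_masks(unit_locations, grid_width, grid_height):
    """Build a full (height x width) mask and a reduced (height, width) mask pair."""
    full = [[0] * grid_width for _ in range(grid_height)]
    height_mask = [0] * grid_height
    width_mask = [0] * grid_width

    for x, y in unit_locations:
        full[y][x] = 1       # exact cell containing a selectable unit
        height_mask[y] = 1   # some unit exists in this row
        width_mask[x] = 1    # some unit exists in this column

    return full, (height_mask, width_mask)
```

Note that in this construction the reduced pair is looser than the full mask: it allows any ``(x, y)`` whose row and column each contain a unit, even when that exact cell is empty.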
+
+:env.get_unit_action_mask(location, action_names, padded=True):
+  Returns a mask for the ``action_type`` and ``action_id``.
+
+  If ``padded == True`` all masks will be returned with the length padded to the size of the largest number of action ids across all the actions.
+
+  If ``padded == False`` all masks are returned with the length of the number of action ids per action.
+
+.. code-block:: python
+
+    env.reset() # Wrapper must be applied after the reset
+
+    env = ValidActionSpaceWrapper(env)
+
+    unit_location_mask = env.get_unit_location_mask(player_id, mask_type='full')
+    unit_action_mask = env.get_unit_action_mask(location, action_names, padded=True)
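
The effect of the ``padded`` flag can be sketched without Griddly. The function below is hypothetical and only mirrors the description above: it takes the valid action ids per action name and the total number of action ids each action defines (both supplied by the caller here), and returns one 0/1 mask per action.

```python
def action_id_masks(valid_ids, num_ids_per_action, padded=True):
    """Return one 0/1 mask per action name marking which action ids are valid.

    valid_ids:          {action_name: [valid action ids]}
    num_ids_per_action: {action_name: number of action ids, including no-op 0}
    """
    longest = max(num_ids_per_action.values(), default=0)

    masks = {}
    for name, total in num_ids_per_action.items():
        # Padded masks all share the length of the largest action.
        length = longest if padded else total
        mask = [0] * length
        for action_id in valid_ids.get(name, []):
            mask[action_id] = 1
        masks[name] = mask
    return masks
```

Padded masks are convenient when a network has a single fixed-size output head for ``action_id``; unpadded masks suit one head per action type.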
+
+
+
+
+.. seealso:: A Closer Look at Action Masking in Policy Gradient Algorithms: https://arxiv.org/abs/2006.14171
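
One common way to apply such a mask to a policy's output is to exclude invalid actions before normalising. The sketch below is a plain-Python masked softmax, given here only as an illustration of the idea; it is not taken from Griddly or from the paper above, and ``masked_softmax`` is a hypothetical name.

```python
import math


def masked_softmax(logits, mask):
    """Softmax over logits where entries with mask == 0 get probability 0."""
    if len(logits) != len(mask):
        raise ValueError("logits and mask must have the same length")

    valid = [logit for logit, keep in zip(logits, mask) if keep]
    if not valid:
        raise ValueError("mask must allow at least one action")

    peak = max(valid)  # subtract the largest valid logit for numerical stability
    exps = [math.exp(logit - peak) if keep else 0.0
            for logit, keep in zip(logits, mask)]
    total = sum(exps)
    return [value / total for value in exps]
```

In a deep learning framework the same effect is usually achieved by replacing masked logits with a large negative value before the softmax.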
+
+
+
+********
+Examples
+********
+
+In this section we break down some example action spaces. In all Griddly environments, ``env.action_space.sample()`` can be used to see what valid action spaces look like.
+
+The following sections explain what valid actions look like in different environments and how to use them.
+
+Single Player
+=============
+
+Single Action Type
+------------------
+
+If the environment has a single action type then only the ``action_id`` needs to be sent to ``env.step``.
+
+This is usually the case in environments where there is an avatar that can only be moved and there are no special actions defined like ``attack`` or ``pick_up``.
+
+Assuming that our only ``action_type`` in the environment is ``move`` then the following code can be used to move the avatar in a particular direction:
+
+.. code-block:: python
+
+  # env.step(action_id)
+  # OR env.step([action_id])
+
+  env.step(3) # Move the avatar right 
+  env.step(1) # Move the avatar left
+
+
+Multiple Action Types
+---------------------
+
+In the case where there may be a more complicated action space, for example if there is an avatar that can "move", but also "attack" in any direction around it, the ``action_type`` and ``action_id`` must both be supplied.
+
+For example:
+
+.. code-block:: python
+
+  # env.step([action_type, action_id])
+
+  env.step([0, 3]) # Move the avatar right 
+  env.step([1, 1]) # Attack to the left of the avatar
+
+Multi-Agent
+===========
+
+Multiple Player Actions
+-----------------------
+
+In multi-agent environments, ``env.step`` expects a list of actions for all players. To send actions to individual players in a call to ``env.step``, set ``action_id = 0`` for any of the players that are not performing an action.
+
+For example:
+
+.. code-block:: python
+
+  env.step([
+    1, # Action for player 1
+    0 # Action for player 2 (which is a no-op)
+  ])
+
+
+Single Action Type
+------------------
+
+If there is only a single action type available, a list of ``action_id`` values can be sent directly to ``env.step``:
+
+.. code-block:: python
+  
+  env.step([
+    1, # Action for player 1
+    2 # Action for player 2
+  ])
+
+Multiple Action Types
+---------------------
+
+If there are multiple action types available, ``env.step`` must be given a list containing, for each player, the ``action_type`` and ``action_id``.
+
+Given that there are two action types "move" and "attack" and each action type has default ``InputMapping``, the following code can be used to send "move left" to player 1 and "attack forward" to player 2.
+
+.. code-block:: python
+  
+  env.step([
+    [0, 1], # Action for player 1 (move left)
+    [1, 2]  # Action for player 2 (attack forward)
+  ])
+
+
+Real Time Strategy (RTS)
+========================
+
+Multiple players, Multiple Action Types, Action Coordinates
+-----------------------------------------------------------
+
+In RTS games, multiple actions for multiple players can be performed in single time-steps. 
+
+Let's say our RTS game has units that have an action ``move`` and an action ``gather`` (to gather resources). Let's also say that there are three units for each player. We can control them in one call to ``env.step``.
+
+.. code-block:: python
+
+  # env.step([
+  #   [ # List of actions for player 1
+  #     [x1, y1, action_type1, action_id1],
+  #     [x2, y2, action_type2, action_id2],
+  #     ...
+  #   ], 
+  #   [ # List of actions for player 2
+  #     [x1, y1, action_type1, action_id1],
+  #     [x2, y2, action_type2, action_id2],
+  #     ...
+  #   ],
+  # ])
+
+  env.step([
+    # Player 1
+    [ 
+      [3, 10, 0, 3], # Move the unit at [3,10] right
+      [4, 7, 1, 1], # The unit at [4,7] will gather resources to the left
+      [4, 4, 0, 0] # The unit at [4,4] will do nothing (this can also be omitted with the same effect)
+    ],
+
+    # Player 2
+    [
+      [10, 4, 1, 3], # The unit at [10,4] will gather resources to the right
+      [13, 2, 1, 1] # The unit at [13,2] will gather resources to the left
+    ]
+  ])
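
Since a unit action with ``action_id = 0`` can be omitted with the same effect, an RTS action list can be trimmed before it is passed to ``env.step``. The helper below is a hypothetical sketch of that idea; it assumes every unit action is a list whose last element is the ``action_id``, as in the format above.

```python
def drop_noop_unit_actions(player_actions):
    """Remove unit actions whose action_id (the last element) is 0, the no-op."""
    return [
        [unit_action for unit_action in actions if unit_action[-1] != 0]
        for actions in player_actions
    ]
```

Applied to the example above, player 1's list shrinks to the two actions for the units at ``[3, 10]`` and ``[4, 7]``, and player 2's list is unchanged.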
+