Updated various documentations

GAIGResearch · Dec 3, 2020 · d6bd3fc
1 parent 4539976
commit d6bd3fc
Showing 7 changed files with 303 additions and 128 deletions.
diff --git a/README.md b/README.md
@@ -1,54 +1,47 @@
 # Malmö #
-
 Project Malmö is a platform for Artificial Intelligence experimentation and research built on top of Minecraft. We aim to inspire a new generation of research into challenging new problems presented by this unique environment.
 
 [![Join the chat at https://gitter.im/Microsoft/malmo](https://badges.gitter.im/Microsoft/malmo.svg)](https://gitter.im/Microsoft/malmo?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) [![Build Status](https://travis-ci.org/Microsoft/malmo.svg?branch=master)](https://travis-ci.org/Microsoft/malmo) [![license](https://img.shields.io/github/license/mashape/apistatus.svg?maxAge=2592000)](https://github.com/Microsoft/malmo/blob/master/LICENSE.txt)
 ----
-
 ## Getting Started ##
 
 ### MalmoEnv ###
 
-MalmoEnv implements an Open AI "gym"-like environment in Python without any native code (communicating directly with Java Minecraft). If you only need this functionallity then please see [MalmoEnv](https://github.com/Microsoft/malmo/tree/master/MalmoEnv). This will most likely be the preferred way to develop with Malmo Minecraft going forward.
-
-If you wish to use the "native" Malmo implementation, either install the "Malmo native Python wheel" (if available for your platform) or a pre-built binary release (more on these options below). Building Malmo yourself from source is always an option!
-
-Advantages:
-
-1. No native code - you don't have to build or install platform dependent code.
-2. A single network connection is used to run missions. No dynamic ports means it's more virtualization friendly.
-3. A simpler multi-agent coordination protocol. 
-One Minecraft client instance, one single port is used to start missions.
-4. Less impedance miss-match with the gym api.
-
-Disadvantages:
+MalmoEnv implements an Open AI "gym"-like environment in Python without any native code (communicating directly with Java Minecraft). If you only need this functionality then please see [MalmoEnv](MalmoEnv/README.md). This will most likely be the preferred way to develop with Malmo Minecraft going forward.
 
-1. The existing Malmo examples are not supported (as API used is different). 
-Marlo envs should work with this [port](https://github.com/AndKram/marLo/tree/malmoenv).
-2. The API is more limited (e.g. selecting video options) - can edit mission xml directly.
+## Setup process 
+- 1, Clone Malmo (```git clone https://github.com/martinballa/malmo```).
+- 2, Install Java 8 and Python 3.
+- 3, ```cd malmo/``` and install Malmo using pip: ```pip install -e MalmoEnv/```
+- 4, Test that Malmo works correctly by running the examples in the ```examples/``` directory.
+- 4*, Some examples require ```ray``` (with ```tune``` and ```rllib```) and ```ffmpeg-python``` to be installed.
+- +1, To run Malmo on a headless Linux server, install xvfb: ```sudo apt-get install -y xvfb```
 
-### Malmo as a native Python wheel ###
+*Note:* Minecraft uses Gradle to build the project, and the build is not compatible with newer versions of Java, so use Java 8 for the build and make sure that $JAVA_HOME points to that version.
+If you have any issues running Malmo, check the [FAQ](FAQ.md), as it might cover them.
 
-On common Windows, MacOSX and Linux variants it is possible to use ```pip3 install malmo``` to install Malmo as a python with native code package: [Pip install for Malmo](https://github.com/Microsoft/malmo/blob/master/scripts/python-wheel/README.md). Once installed, the malmo Python module can be used to download source and examples and start up Minecraft with the Malmo game mod. 
+This repository contains various improvements to the Malmo framework. The main one is a launcher that handles the Malmo instances automatically, so they no longer need to be run manually. We also updated the ```malmoenv``` Python package to make working with Malmo easier, and added guides and examples that show how to work with Malmo in both single- and multi-agent setups. The examples use RLlib, which provides a wide range of state-of-the-art reinforcement learning algorithms. The examples include wrappers that make Malmo compatible with RLlib, and they can serve as a starting point for adapting Malmo to other frameworks.
 
-Alternatively, a pre-built version of Malmo can be installed as follows:
+We provide some examples with explanations in the form of IPython notebooks that are ready to run once the dependencies are installed.
+The notebooks go through the basics, and we recommend reading them in the following order, as they explain different ideas along the way:
+- 1, [Random Player in Malmo](notebooks/random_agent_malmo.ipynb) - Explains the setup and shows how to interact with the environment using random action sampling.
+- 2, [RLlib single agent training](notebooks/rllib_single_agent.ipynb) - Extends the random agent example by using RLlib to handle RL experiments.
+- 3, [RLlib multi-agent training](notebooks/rllib_multi_agent.ipynb) - A multi-agent version of the previous example.
+- 4, [RLlib checkpoint restoration](notebooks/rllib_restore_checkpoint.ipynb) - Loads a checkpoint and evaluates the trained agent while capturing the agent's observations as a GIF. The same method can be used to continue training with Ray's tune API.
+- 5, [RLlib checkpoint evaluation](notebooks/rllib_evaluate_checkpoint.ipynb) - Loads a checkpoint and manually evaluates it by extracting the agent's policy.
 
-1. [Download the latest *pre-built* version, for Windows, Linux or MacOSX.](https://github.com/Microsoft/malmo/releases)   
-      NOTE: This is _not_ the same as downloading a zip of the source from Github. _Doing this **will not work** unless you are planning to build the source code yourself (which is a lengthier process). If you get errors along the lines of "`ImportError: No module named MalmoPython`" it will probably be because you have made this mistake._
+We also provide non-notebook versions of these guides, which contain less explanation but might be more reusable in your projects.
 
-2. Install the dependencies for your OS: [Windows](doc/install_windows.md), [Linux](doc/install_linux.md), [MacOSX](doc/install_macosx.md).
-
-3. Launch Minecraft with our Mod installed. Instructions below.
-
-4. Launch one of our sample agents, as Python, C#, C++ or Java. Instructions below.
-
-5. Follow the [Tutorial](https://github.com/Microsoft/malmo/blob/master/Malmo/samples/Python_examples/Tutorial.pdf) 
+----
+## Baseline results
+**PPO Single-agent mobchase**
 
-6. Explore the [Documentation](http://microsoft.github.io/malmo/). This is also available in the readme.html in the release zip.
+We trained PPO in single- and multi-agent setups on the Mob chase tasks. The TensorBoard learning curves shown below are from a run of 1 million agent-environment interactions. The checkpoint is available in the ```examples/checkpoints/``` directory.
+![Single Agent PPO learning curves](imgs/PPO_single_agent_mobchase.png)
 
-7. Read the [Blog](http://microsoft.github.io/malmo/blog) for more information.
+![Evaluation](imgs/PPO_single_agent_mobchase.gif)
 
-If you want to build from source then see the build instructions for your OS: [Windows](doc/build_windows.md), [Linux](doc/build_linux.md), [MacOSX](doc/build_macosx.md).
+**PPO Multi-agent mobchase**
 
 ----
 
@@ -84,65 +77,6 @@ a machine for network use these TCP ports should be open.
 
 ----
 
-## Launch an agent: ##
-
-#### Running a Python agent: ####
-
-```
-cd Python_Examples
-python3 run_mission.py
-``` 
-
-#### Running a C++ agent: ####
-
-`cd Cpp_Examples`
-
-To run the pre-built sample:
-
-`run_mission` (on Windows)  
-`./run_mission` (on Linux or MacOSX)
-
-To build the sample yourself:
-
-`cmake .`  
-`cmake --build .`  
-`./run_mission` (on Linux or MacOSX)  
-`Debug\run_mission.exe` (on Windows)
-
-#### Running a C# agent: ####
-
-To run the pre-built sample (on Windows):
-
-`cd CSharp_Examples`  
-`CSharpExamples_RunMission.exe`
-
-To build the sample yourself, open CSharp_Examples/RunMission.csproj in Visual Studio.
-
-Or from the command-line:
-
-`cd CSharp_Examples`
-
-Then, on Windows:  
-```
-msbuild RunMission.csproj /p:Platform=x64
-bin\x64\Debug\CSharpExamples_RunMission.exe
-```
-
-#### Running a Java agent: ####
-
-`cd Java_Examples`  
-`java -cp MalmoJavaJar.jar:JavaExamples_run_mission.jar -Djava.library.path=. JavaExamples_run_mission` (on Linux or MacOSX)  
-`java -cp MalmoJavaJar.jar;JavaExamples_run_mission.jar -Djava.library.path=. JavaExamples_run_mission` (on Windows)
-
-#### Running an Atari agent: (Linux only) ####
-
-```
-cd Python_Examples
-python3 ALE_HAC.py
-```
-
-----
-
 # Citations #
 
 Please cite Malmo as:
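
Note on the setup steps added to README.md above: the new text asks for Java 8 and a matching $JAVA_HOME. A minimal pre-flight sketch of that check could look like the following. It is not part of this commit or of Malmo; the function names `parse_java_major` and `check_java_setup` are hypothetical, and it assumes `java` is reachable on the PATH.

```python
import os
import re
import subprocess


def parse_java_major(version_output):
    """Return the major Java version found in `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Java 8 and older report "1.8.0_271"; newer releases report "11.0.9".
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def check_java_setup():
    """Print the detected Java major version and the current JAVA_HOME."""
    try:
        # `java -version` writes its banner to stderr, not stdout.
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True
        )
    except FileNotFoundError:
        print("java was not found on PATH")
        return
    major = parse_java_major(result.stderr)
    status = "(OK)" if major == 8 else "(the README asks for 8)"
    print("java major version:", major, status)
    print("JAVA_HOME:", os.environ.get("JAVA_HOME", "<not set>"))
```

Calling `check_java_setup()` before running the examples prints what was detected. It only reports and does not change the environment, so a mismatch still has to be fixed by hand.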

diff --git a/examples/README.md b/examples/README.md
@@ -1,32 +1 @@
 # Malmo
-
-This repository contains various improvements to the Malmo framework. This mainly involves the launcher to automatically handle the Malmo instances instead of the need to run them manually. We also updated the ```malmoenv``` python package to facilitate working with malmo. We also got some guides and examples to show how to work with Malmo in both single and multi-agent setups. The examples use RLlib, which provides a wide range of state-of-the-art Reinforcement Learning algorithms. In the examples we have created wrappers to make Malmo compatible to RLlib, but based on these examples it is easy to adapt Malmo to other frameworks.
-
-We provide some examples with explanations in the form of IPython notebooks that are ready to run after getting the dependencies installed.
-The notebooks go through the basics and we recommend to check them in the following order as they explain different ideas along the way:
-- 1 [Random Player in Malmo](notebooks/random_agent_malmo.ipynb) - Explains the setup and shows how to interact with the environment using random action sampling.
-- 2 [RLlib single agent training](notebooks/rllib_single_agent.ipynb) - Expands the random agent example with using RLlib to handle RL experiments.
-- 3 [RLlib multi-agent training](notebooks/rllib_multi_agent.ipynb) - A multi-agent version of the previous example.
-- 4 [RLlib checkpoint restoration](notebooks/rllib_restore_checkpoint.ipynb) - load checkpoint and evaluate the trained agent with capturing the agent's observations as a GIF. Can use this method to continue a training using ray's tune API.
-- 5 [RLlib checkpoint evaluation](notebooks/rllib_evaluate_checkpoint.ipynb) - load a checkpoint and manually evaluate it by extracting the agent's policy.
-
-We also provided non-notebook versions of these guides, which contain less explanation, but might be more reusable in your projects.
-
-## Setup process 
-- 1, clone malmo (```git clone https://github.com/martinballa/malmo```)
-- 2, install java 8 and python 3. 
-- 3, ```cd malmo/``` and install malmo using pip ```pip install -e MalmoEnv/``` 
-- 4, Test if Malmo works correctly by running the examples in the ```examples/``` directory.
-- 4*, Some examples requires ```ray``` (with ```tune``` and ```rllib```) installed and ```ffmpeg-python```. 
-- +1, to run malmo headless on a linux headless server you should install xvfb ```sudo apt-get install -y xvfb```
-
-*Note:* Minecraft uses gradle to build the project and it's not compatible with newer versions of Java, so make sure that you use java version 8 for the build and make sure that $JAVA_HOME is pointing to the correct version.
-If you have any issues with running Malmo check the [FAQ](FAQ.md) as it might cover the issues.
-
-## Baseline results
-**Single-agent PPO**
-
-We trained PPO in single and multi-agent setups on the Mob chases tasks. The tensorboard learning curves are shown below from a run of 1 million agent-env interactions. The checkpoint is available in the ```examples/checkpoints/``` package.
-![Single Agent PPO learning curves](imgs/PPO_single_agent_mobchase.png)
-
-Multi-agent PPO
diff --git a/examples/checkpoints/PPO_malmo_single_agent/checkpoint_209/checkpoint-209 b/examples/checkpoints/PPO_malmo_single_agent/checkpoint_209/checkpoint-209