[Docs] Build the docs #34

Merged
merged 60 commits into from
Nov 25, 2023
Commits
cae412e
docs
matteobettini Oct 21, 2023
7ffdbbe
docs
matteobettini Oct 21, 2023
1c7f033
docs banner
matteobettini Oct 21, 2023
bfb8ff2
change
matteobettini Oct 21, 2023
578114c
change
matteobettini Oct 21, 2023
b99e4bd
api
matteobettini Oct 21, 2023
2e51ea2
amend
matteobettini Oct 21, 2023
27dd56a
amend
matteobettini Oct 21, 2023
e933b2b
amend
matteobettini Oct 21, 2023
f3ebfbd
amend
matteobettini Oct 21, 2023
ef2a7cf
empty
matteobettini Oct 25, 2023
4896984
Merge branch 'main' into docs
matteobettini Nov 2, 2023
19af36a
amend
matteobettini Nov 2, 2023
982ff5a
amend
matteobettini Nov 2, 2023
685dc7d
amend
matteobettini Nov 2, 2023
a39b100
amend
matteobettini Nov 2, 2023
dcc988d
amend
matteobettini Nov 2, 2023
67597a5
amend
matteobettini Nov 2, 2023
084ff11
amend
matteobettini Nov 2, 2023
9b02ce0
amend
matteobettini Nov 2, 2023
1757b1a
Merge branch 'main' into docs
matteobettini Nov 13, 2023
ff02eac
amend
matteobettini Nov 14, 2023
56c1007
amend
matteobettini Nov 14, 2023
a5b199a
amend
matteobettini Nov 14, 2023
e2f747c
amend
matteobettini Nov 14, 2023
fd34c83
Amend
matteobettini Nov 22, 2023
dd880a4
Amend
matteobettini Nov 22, 2023
9dac567
Amend
matteobettini Nov 22, 2023
644ba75
Amend
matteobettini Nov 22, 2023
00faabf
Amend
matteobettini Nov 22, 2023
15b6cc3
Amend
matteobettini Nov 22, 2023
2cd5eb0
Amend
matteobettini Nov 22, 2023
eb58787
Amend
matteobettini Nov 22, 2023
935af2f
Amend
matteobettini Nov 22, 2023
e57ab45
Amend
matteobettini Nov 22, 2023
eb3eaef
Amend
matteobettini Nov 22, 2023
ac0faa6
Amend
matteobettini Nov 22, 2023
c88d311
Amend
matteobettini Nov 22, 2023
ff22b40
Revert "Amend"
matteobettini Nov 22, 2023
7cbefee
Amend
matteobettini Nov 23, 2023
9d759f1
Amend
matteobettini Nov 23, 2023
3b0811b
Amend
matteobettini Nov 23, 2023
0ad0510
Amend
matteobettini Nov 24, 2023
5be8ea8
Amend
matteobettini Nov 24, 2023
3d013e0
Amend
matteobettini Nov 24, 2023
e0263ce
Amend
matteobettini Nov 24, 2023
5f19111
Merge branch 'main' into docs
matteobettini Nov 24, 2023
b5891c3
Amend
matteobettini Nov 24, 2023
9589d86
Amend
matteobettini Nov 24, 2023
7a2c917
Make utils private
matteobettini Nov 25, 2023
62791dd
Amend
matteobettini Nov 25, 2023
1707806
Amend
matteobettini Nov 25, 2023
4eb2d3e
Merge remote-tracking branch 'origin/docs' into docs
matteobettini Nov 25, 2023
ba6c8d0
Docs
matteobettini Nov 25, 2023
3a6ae9d
Docs
matteobettini Nov 25, 2023
d2bde6b
Docs
matteobettini Nov 25, 2023
8e4bd82
Algorithm docs
matteobettini Nov 25, 2023
1b4d6e2
Algorithm docs
matteobettini Nov 25, 2023
bb2bffe
Model docs
matteobettini Nov 25, 2023
7244734
Links
matteobettini Nov 25, 2023
5 changes: 4 additions & 1 deletion .gitignore
Expand Up @@ -4,7 +4,10 @@
**/outputs/
**/multirun/


# Docs
docs/output/
docs/source/generated/
docs/build/

# Byte-compiled / optimized / DLL files
__pycache__/
Expand Down
31 changes: 31 additions & 0 deletions .readthedocs.yaml
@@ -0,0 +1,31 @@
# .readthedocs.yaml
# Read the Docs configuration file
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details

# Required
version: 2

# Set the OS, Python version and other tools you might need
build:
  os: ubuntu-22.04
  tools:
    python: "3.10"

# Build documentation in the "docs/" directory with Sphinx
sphinx:
  fail_on_warning: true
  configuration: docs/source/conf.py

# Optionally build your docs in additional formats such as PDF and ePub
formats:
  - epub

# Optional but recommended, declare the Python requirements required
# to build your documentation
# See https://docs.readthedocs.io/en/stable/guides/reproducible-builds.html
python:
  install:
    - requirements: docs/requirements.txt
    # Install our python package before building the docs
    - method: pip
      path: .
23 changes: 11 additions & 12 deletions README.md
@@ -1,19 +1,20 @@
![BenchMARL](https://github.com/matteobettini/vmas-media/blob/main/media/benchmarl.png?raw=true)
![BenchMARL](https://raw.githubusercontent.com/matteobettini/benchmarl_sphinx_theme/master/benchmarl_sphinx_theme/static/img/benchmarl.png?raw=true)


# BenchMARL
[![tests](https://github.com/facebookresearch/BenchMARL/actions/workflows/unit_tests.yml/badge.svg)](test)
[![codecov](https://codecov.io/github/facebookresearch/BenchMARL/coverage.svg?branch=main)](https://codecov.io/gh/facebookresearch/BenchMARL)
[![Documentation Status](https://readthedocs.org/projects/benchmarl/badge/?version=latest)](https://benchmarl.readthedocs.io/en/latest/?badge=latest)
[![Python](https://img.shields.io/badge/python-3.8%20%7C%203.9%20%7C%203.10-blue.svg)](https://www.python.org/downloads/)
<a href="https://pypi.org/project/benchmarl"><img src="https://img.shields.io/pypi/v/benchmarl" alt="pypi version"></a>
[![Downloads](https://static.pepy.tech/personalized-badge/benchmarl?period=total&units=international_system&left_color=grey&right_color=blue&left_text=Downloads)](https://pepy.tech/project/benchmarl)
[![Discord Shield](https://dcbadge.vercel.app/api/server/jEEWCn6T3p?style=flat)](https://discord.gg/jEEWCn6T3p)

```bash
python benchmarl/run.py algorithm=mappo task=vmas/balance
```



[![Examples](https://img.shields.io/badge/Examples-blue.svg)](examples) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/facebookresearch/BenchMARL/blob/main/notebooks/run.ipynb)
[![Static Badge](https://img.shields.io/badge/Benchmarks-Wandb-yellow)](https://wandb.ai/matteobettini/benchmarl-public/reportlist)

Expand Down Expand Up @@ -58,6 +59,7 @@ the domain and want to easily take a picture of the landscape.
* [Reporting and plotting](#reporting-and-plotting)
* [Extending](#extending)
* [Configuring](#configuring)
+ [Experiment](#experiment)
+ [Algorithm](#algorithm)
+ [Task](#task)
+ [Model](#model)
Expand Down Expand Up @@ -280,10 +282,9 @@ Currently available ones are:

In the following, we report a table of the results:

| **<p align="center">Environment</p>** | **<p align="center">Sample efficiency curves (all tasks)</p>** | **<p align="center">Performance profile</p>** | **<p align="center">Aggregate scores</p>** |
|---------------------------------------|-------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------|
| VMAS | <img src="https://drive.google.com/uc?export=view&id=1fzfFn0q54gsALRAwmqD1hRTqQIadGPoE"/> | <img src="https://drive.google.com/uc?export=view&id=151pSR2sBluSpWiYxtq3jNX0tfE0vgAuR"/> | <img src="https://drive.google.com/uc?export=view&id=1q2So9V6sL8NHMtj6vL-S3KyzZi11Vfia"/> |

| **<p align="center">Environment</p>** | **<p align="center">Sample efficiency curves (all tasks)</p>** | **<p align="center">Performance profile</p>** | **<p align="center">Aggregate scores</p>** |
|---------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| VMAS | <img src="https://raw.githubusercontent.com/matteobettini/benchmarl_sphinx_theme/master/benchmarl_sphinx_theme/static/img/benchmarks/vmas/environemnt_sample_efficiency_curves.png"/> | <img src="https://raw.githubusercontent.com/matteobettini/benchmarl_sphinx_theme/master/benchmarl_sphinx_theme/static/img/benchmarks/vmas/performance_profile_figure.png"/> | <img src="https://raw.githubusercontent.com/matteobettini/benchmarl_sphinx_theme/master/benchmarl_sphinx_theme/static/img/benchmarks/vmas/aggregate_scores.png"/> |

## Reporting and plotting

Expand All @@ -295,9 +296,9 @@ your benchmarks. No more struggling with matplotlib and latex!

[![Example](https://img.shields.io/badge/Example-blue.svg)](examples/plotting)

![aggregate_scores](https://drive.google.com/uc?export=view&id=1q2So9V6sL8NHMtj6vL-S3KyzZi11Vfia)
![sample_efficiancy](https://drive.google.com/uc?export=view&id=1fzfFn0q54gsALRAwmqD1hRTqQIadGPoE)
![performace_profile](https://drive.google.com/uc?export=view&id=151pSR2sBluSpWiYxtq3jNX0tfE0vgAuR)
![aggregate_scores](https://raw.githubusercontent.com/matteobettini/benchmarl_sphinx_theme/master/benchmarl_sphinx_theme/static/img/benchmarks/vmas/aggregate_scores.png)
![sample_efficiancy](https://raw.githubusercontent.com/matteobettini/benchmarl_sphinx_theme/master/benchmarl_sphinx_theme/static/img/benchmarks/vmas/environemnt_sample_efficiency_curves.png)
![performace_profile](https://raw.githubusercontent.com/matteobettini/benchmarl_sphinx_theme/master/benchmarl_sphinx_theme/static/img/benchmarks/vmas/performance_profile_figure.png)


## Extending
Expand All @@ -322,7 +323,6 @@ in the script itself or via [hydra](https://hydra.cc/docs/intro/).
We suggest reading the Hydra documentation
to get familiar with all its functionality.

The project can be configured either the script itself or via hydra.
Each component in the project has a corresponding yaml configuration in the BenchMARL
[conf tree](benchmarl/conf).
Components' configurations are loaded from these files into python dataclasses that act
Expand All @@ -333,8 +333,7 @@ You can also directly load and validate configuration yaml files without using h

### Experiment

Experiment configurations are in [`benchmarl/conf/config.yaml`](benchmarl/conf/config.yaml),
with the experiment hyperparameters in [`benchmarl/conf/experiment`](benchmarl/conf/experiment).
Experiment configurations are in [`benchmarl/conf/config.yaml`](benchmarl/conf/config.yaml).
Running custom experiments is greatly simplified by the [Hydra](https://hydra.cc/) configurations.
The default configuration for the library is contained in the [`benchmarl/conf`](benchmarl/conf) folder.

Expand Down
13 changes: 11 additions & 2 deletions benchmarl/__init__.py
Expand Up @@ -4,13 +4,22 @@
# LICENSE file in the root directory of this source tree.
#


__version__ = "0.0.4"

import importlib

import benchmarl.algorithms
import benchmarl.benchmark
import benchmarl.environments
import benchmarl.experiment
import benchmarl.models

_has_hydra = importlib.util.find_spec("hydra") is not None

if _has_hydra:

def load_hydra_schemas():
def _load_hydra_schemas():
from hydra.core.config_store import ConfigStore

from benchmarl.algorithms import algorithm_config_registry
Expand All @@ -28,4 +37,4 @@ def load_hydra_schemas():
for task_schema_name, task_schema in _task_class_registry.items():
cs.store(name=task_schema_name, group="task", node=task_schema)

load_hydra_schemas()
_load_hydra_schemas()
22 changes: 22 additions & 0 deletions benchmarl/algorithms/__init__.py
Expand Up @@ -4,6 +4,7 @@
# LICENSE file in the root directory of this source tree.
#

from .common import Algorithm, AlgorithmConfig
from .iddpg import Iddpg, IddpgConfig
from .ippo import Ippo, IppoConfig
from .iql import Iql, IqlConfig
Expand All @@ -14,6 +15,27 @@
from .qmix import Qmix, QmixConfig
from .vdn import Vdn, VdnConfig

classes = [
"Iddpg",
"IddpgConfig",
"Ippo",
"IppoConfig",
"Iql",
"IqlConfig",
"Isac",
"IsacConfig",
"Maddpg",
"MaddpgConfig",
"Mappo",
"MappoConfig",
"Masac",
"MasacConfig",
"Qmix",
"QmixConfig",
"Vdn",
"VdnConfig",
]

# A registry mapping "algoname" to its config dataclass
# This is used to aid loading of algorithms from yaml
algorithm_config_registry = {
Expand Down
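The registry comment in this file describes a mapping from an algorithm name to its config dataclass, used when loading algorithms from yaml. A self-contained sketch of that idea, under the assumption that lookup is by lowercase name, is shown here. `ToyConfig`, `OtherToyConfig`, `toy_registry` and `config_from_name` are all hypothetical stand-ins, not BenchMARL names.

```python
from dataclasses import dataclass
from typing import Dict, Type


@dataclass
class ToyConfig:
    """Hypothetical stand-in for an algorithm config dataclass."""

    loss_function: str = "l2"
    delay_value: bool = True


@dataclass
class OtherToyConfig:
    """Second hypothetical config, to show that the registry holds several."""

    clip_epsilon: float = 0.2


# Hypothetical registry: lowercase algorithm name -> config class.
toy_registry: Dict[str, Type] = {
    "toy": ToyConfig,
    "othertoy": OtherToyConfig,
}


def config_from_name(name: str, **overrides):
    """Look up the config class by name and build it with optional overrides."""
    try:
        config_cls = toy_registry[name.lower()]
    except KeyError:
        known = ", ".join(sorted(toy_registry))
        raise ValueError(f"Unknown algorithm {name!r}; known: {known}") from None
    return config_cls(**overrides)
```

Keeping the mapping in one dictionary means a string read from a configuration file can be resolved to a typed config without any `if`/`elif` chain.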
28 changes: 15 additions & 13 deletions benchmarl/algorithms/common.py
Expand Up @@ -23,7 +23,7 @@
from torchrl.objectives.utils import HardUpdate, SoftUpdate, TargetNetUpdater

from benchmarl.models.common import ModelConfig
from benchmarl.utils import DEVICE_TYPING, read_yaml_config
from benchmarl.utils import _read_yaml_config, DEVICE_TYPING


class Algorithm(ABC):
Expand All @@ -32,7 +32,7 @@ class Algorithm(ABC):
This should be overridden by implemented algorithms
and all abstract methods should be implemented.

Args:
Args:
experiment (Experiment): the experiment class
"""

Expand Down Expand Up @@ -104,14 +104,13 @@ def _check_specs(self):
def get_loss_and_updater(self, group: str) -> Tuple[LossModule, TargetNetUpdater]:
"""
Get the LossModule and TargetNetUpdater for a specific group.
This function calls the abstract self._get_loss() which needs to be implemented.
This function calls the abstract :class:`~benchmarl.algorithms.Algorithm._get_loss()` which needs to be implemented.
The function will cache the output at the first call and return the cached values in future calls.

Args:
group (str): agent group of the loss and updater

Returns: LossModule and TargetNetUpdater for the group

"""
if group not in self._losses_and_updaters.keys():
action_space = self.action_spec[group, "action"]
Expand Down Expand Up @@ -144,7 +143,7 @@ def get_replay_buffer(
) -> ReplayBuffer:
"""
Get the ReplayBuffer for a specific group.
This function will check self.on_policy and create the buffer accordingly
This function will check ``self.on_policy`` and create the buffer accordingly

Args:
group (str): agent group of the loss and updater
Expand All @@ -165,7 +164,7 @@ def get_replay_buffer(
def get_policy_for_loss(self, group: str) -> TensorDictModule:
"""
Get the non-explorative policy for a specific group loss.
This function calls the abstract self._get_policy_for_loss() which needs to be implemented.
This function calls the abstract :class:`~benchmarl.algorithms.Algorithm._get_policy_for_loss()` which needs to be implemented.
The function will cache the output at the first call and return the cached values in future calls.

Args:
Expand All @@ -192,7 +191,7 @@ def get_policy_for_loss(self, group: str) -> TensorDictModule:
def get_policy_for_collection(self) -> TensorDictSequential:
"""
Get the explorative policy for all groups together.
This function calls the abstract self._get_policy_for_collection() which needs to be implemented.
This function calls the abstract :class:`~benchmarl.algorithms.Algorithm._get_policy_for_collection()` which needs to be implemented.
The function will cache the output at the first call and return the cached values in future calls.

Returns: TensorDictSequential representing all explorative policies
Expand All @@ -217,7 +216,7 @@ def get_policy_for_collection(self) -> TensorDictSequential:
def get_parameters(self, group: str) -> Dict[str, Iterable]:
"""
Get the dictionary mapping loss names to the relative parameters to optimize for a given group.
This function calls the abstract self._get_parameters() which needs to be implemented.
This function calls the abstract :class:`~benchmarl.algorithms.Algorithm._get_parameters()` which needs to be implemented.

Returns: a dictionary mapping loss names to a parameters' list
"""
Expand Down Expand Up @@ -323,13 +322,16 @@ class AlgorithmConfig:
Dataclass representing an algorithm configuration.
This should be overridden by implemented algorithms.
Implementors should:
1. add configuration parameters for their algorithm
2. implement all abstract methods

1. add configuration parameters for their algorithm
2. implement all abstract methods

"""

def get_algorithm(self, experiment) -> Algorithm:
"""
Main function to turn the config into the associated algorithm

Args:
experiment (Experiment): the experiment class

Expand All @@ -349,7 +351,7 @@ def _load_from_yaml(name: str) -> Dict[str, Any]:
/ "algorithm"
/ f"{name.lower()}.yaml"
)
return read_yaml_config(str(yaml_path.resolve()))
return _read_yaml_config(str(yaml_path.resolve()))

@classmethod
def get_from_yaml(cls, path: Optional[str] = None):
Expand All @@ -359,7 +361,7 @@ def get_from_yaml(cls, path: Optional[str] = None):
Args:
path (str, optional): The full path of the yaml file to load from.
If None, it will default to
benchmarl/conf/algorithm/self.associated_class().__name__
``benchmarl/conf/algorithm/self.associated_class().__name__``

Returns: the loaded AlgorithmConfig
"""
Expand All @@ -370,7 +372,7 @@ def get_from_yaml(cls, path: Optional[str] = None):
)
)
else:
return cls(**read_yaml_config(path))
return cls(**_read_yaml_config(path))

@staticmethod
@abstractmethod
Expand Down
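Several docstrings in this file state that a getter caches its output at the first call and returns the cached values afterwards, keyed by agent group. A minimal sketch of that build-once-per-group pattern, with a hypothetical `GroupCache` helper and string placeholders instead of real loss modules, could be:

```python
from typing import Callable, Dict, Tuple


class GroupCache:
    """Hypothetical helper that builds a value once per group and reuses it."""

    def __init__(self, builder: Callable[[str], Tuple[str, str]]):
        self._builder = builder
        self._cache: Dict[str, Tuple[str, str]] = {}
        self.build_calls = 0  # counts how often the builder actually ran

    def get(self, group: str) -> Tuple[str, str]:
        # Build only on the first request for this group.
        if group not in self._cache:
            self.build_calls += 1
            self._cache[group] = self._builder(group)
        return self._cache[group]


# Placeholder builder: returns names instead of real loss and updater objects.
cache = GroupCache(lambda group: (f"loss[{group}]", f"updater[{group}]"))
first = cache.get("agents")
second = cache.get("agents")  # served from the cache, builder is not called again
```

The second call returns the same object as the first, which matters when the cached value holds trainable parameters that must not be re-created.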
12 changes: 12 additions & 0 deletions benchmarl/algorithms/iddpg.py
Expand Up @@ -19,6 +19,16 @@


class Iddpg(Algorithm):
"""Same as :class:`~benchmarl.algorithms.Maddpg` (from `https://arxiv.org/abs/1706.02275 <https://arxiv.org/abs/1706.02275>`__) but with decentralized critics.

Args:
share_param_critic (bool): Whether to share the parameters of the critics within agent groups
loss_function (str): loss function for the value discrepancy. Can be one of "l1", "l2" or "smooth_l1".
delay_value (bool): whether to separate the target value networks from the value networks used for
data collection.

"""

def __init__(
self, share_param_critic: bool, loss_function: str, delay_value: bool, **kwargs
):
Expand Down Expand Up @@ -227,6 +237,8 @@ def get_value_module(self, group: str) -> TensorDictModule:

@dataclass
class IddpgConfig(AlgorithmConfig):
"""Configuration dataclass for :class:`~benchmarl.algorithms.Iddpg`."""

share_param_critic: bool = MISSING
loss_function: str = MISSING
delay_value: bool = MISSING
Expand Down
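The `loss_function` argument documented here accepts "l1", "l2" or "smooth_l1". As a rough illustration, assuming these names carry their usual textbook meaning (and a smooth-L1 threshold of 1, which is an assumption of this sketch rather than something the diff states), the three choices applied to a single value error could be written as:

```python
def value_discrepancy(error: float, loss_function: str) -> float:
    """Per-element loss for a value error, under textbook definitions.

    This is an illustrative sketch, not the library's implementation.
    """
    if loss_function == "l1":
        return abs(error)
    if loss_function == "l2":
        return error**2
    if loss_function == "smooth_l1":
        # Quadratic near zero, linear for large errors (threshold assumed to be 1).
        if abs(error) < 1.0:
            return 0.5 * error**2
        return abs(error) - 0.5
    raise ValueError(f"Unknown loss function: {loss_function!r}")
```

The smooth variant grows linearly for large errors, so outliers in the value targets pull the gradient less than they would under "l2".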
17 changes: 17 additions & 0 deletions benchmarl/algorithms/ippo.py
Expand Up @@ -22,6 +22,21 @@


class Ippo(Algorithm):
"""Independent PPO (from `https://arxiv.org/abs/2011.09533 <https://arxiv.org/abs/2011.09533>`__).

Args:
share_param_critic (bool): Whether to share the parameters of the critics within agent groups
clip_epsilon (scalar): weight clipping threshold in the clipped PPO loss equation.
entropy_coef (scalar): entropy multiplier when computing the total loss.
critic_coef (scalar): critic loss multiplier when computing the total loss.
loss_critic_type (str): loss function for the value discrepancy.
Can be one of "l1", "l2" or "smooth_l1".
lmbda (float): The GAE lambda
scale_mapping (str): positive mapping function to be used with the std.
choices: "softplus", "exp", "relu", "biased_softplus_1".

"""

def __init__(
self,
share_param_critic: bool,
Expand Down Expand Up @@ -270,6 +285,8 @@ def get_critic(self, group: str) -> TensorDictModule:

@dataclass
class IppoConfig(AlgorithmConfig):
"""Configuration dataclass for :class:`~benchmarl.algorithms.Ippo`."""

share_param_critic: bool = MISSING
clip_epsilon: float = MISSING
entropy_coef: float = MISSING
Expand Down
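The `clip_epsilon` argument refers to the clipped PPO loss equation. For a single sample, the standard clipped surrogate objective from the PPO paper can be sketched in plain Python as below. The function name `clipped_surrogate` is hypothetical, and the sketch works on scalars instead of the batched tensors the real loss operates on.

```python
def clipped_surrogate(ratio: float, advantage: float, clip_epsilon: float) -> float:
    """Standard PPO clipped surrogate for one sample (to be maximised).

    ratio: probability ratio pi_new(a|s) / pi_old(a|s)
    advantage: advantage estimate for the sampled action
    clip_epsilon: half-width of the interval the ratio is clipped to
    """
    clipped_ratio = max(1.0 - clip_epsilon, min(ratio, 1.0 + clip_epsilon))
    # Taking the minimum removes any incentive to move the ratio outside the interval.
    return min(ratio * advantage, clipped_ratio * advantage)
```

With `clip_epsilon=0.25`, a ratio of 1.5 on a positive advantage of 2.0 is credited as 1.25 * 2.0 rather than 1.5 * 2.0, so the update gains nothing from pushing the policy further in that direction.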
11 changes: 11 additions & 0 deletions benchmarl/algorithms/iql.py
Expand Up @@ -18,6 +18,15 @@


class Iql(Algorithm):
"""Independent Q Learning (from `https://www.semanticscholar.org/paper/Multi-Agent-Reinforcement-Learning%3A-Independent-Tan/59de874c1e547399b695337bcff23070664fa66e <https://www.semanticscholar.org/paper/Multi-Agent-Reinforcement-Learning%3A-Independent-Tan/59de874c1e547399b695337bcff23070664fa66e>`__).

Args:
loss_function (str): loss function for the value discrepancy. Can be one of "l1", "l2" or "smooth_l1".
delay_value (bool): whether to separate the target value networks from the value networks used for
data collection.

"""

def __init__(self, delay_value: bool, loss_function: str, **kwargs):
super().__init__(**kwargs)

Expand Down Expand Up @@ -175,6 +184,8 @@ def process_batch(self, group: str, batch: TensorDictBase) -> TensorDictBase:

@dataclass
class IqlConfig(AlgorithmConfig):
"""Configuration dataclass for :class:`~benchmarl.algorithms.Iql`."""

delay_value: bool = MISSING
loss_function: str = MISSING

Expand Down
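The `delay_value` flag documented here controls whether a separate target value network is kept. When one is kept, it is commonly refreshed from the online network either by a full copy or by a slow moving average. The sketch below shows both generic schemes on plain lists of floats. The names `hard_update`, `soft_update` and `tau` are hypothetical, and the diff does not show which scheme or parameterisation the library applies.

```python
from typing import List


def hard_update(online: List[float]) -> List[float]:
    """Replace the target parameters with a copy of the online parameters."""
    return list(online)


def soft_update(target: List[float], online: List[float], tau: float) -> List[float]:
    """Move each target parameter a fraction `tau` of the way towards the online one."""
    if len(target) != len(online):
        raise ValueError("target and online must have the same length")
    return [(1.0 - tau) * t + tau * o for t, o in zip(target, online)]


# Hypothetical parameters: the target lags behind the online network.
online_params = [1.0, 1.0]
target_params = [0.0, 1.0]
target_params = soft_update(target_params, online_params, tau=0.5)
```

A small `tau` makes the target change slowly, which is the usual motivation for keeping it separate from the network being trained.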