Releases: Farama-Foundation/MO-Gymnasium
MO-Gymnasium 1.3.1 Release: Doc fixes
This patch release contains documentation fixes.
Full Changelog: v1.3.0...v1.3.1
MO-Gymnasium 1.3.0 Release: New MuJoCo v5 Environments
This release adds the new MuJoCo v5 environments (a quick usage sketch follows the list):
- mo-ant-v5
- mo-ant-2obj-v5
- mo-hopper-v5
- mo-hopper-2obj-v5
- mo-walker2d-v5
- mo-halfcheetah-v5
- mo-humanoid-v5
- mo-swimmer-v5
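The new environments follow the same API as the rest of the library. A minimal sketch, assuming the registered IDs listed above:

import mo_gymnasium as mo_gym

# The 2-objective Ant variant returns a reward vector with one entry per objective.
env = mo_gym.make("mo-ant-2obj-v5")
obs, info = env.reset(seed=42)
obs, vector_reward, terminated, truncated, info = env.step(env.action_space.sample())
print(vector_reward.shape)  # e.g., (2,) for a 2-objective variant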
What's Changed
- Add Mujoco v5 environments by @LucasAlegre in #85
- Add Python 3.12 support by @ffelten in #108
- Remove pymoo dep by @ffelten in #109
Full Changelog: v1.2.0...v1.3.0
MO-Gymnasium 1.2.0 Release: Update Gymnasium to v1.0.0, New Mountain Car Environments, Documentation and Test Improvements, and more
Breaking Changes
- Similar to Gymnasium v1.0, VecEnvs now differ from normal Envs, and the associated wrappers differ as well. See the Gymnasium 1.0.0 release notes.
- Wrappers have been moved to their wrappers subpackage (import change sketched below), e.g., from mo_gymnasium import MORecordEpisodeStatistics -> from mo_gymnasium.wrappers import MORecordEpisodeStatistics. Vector wrappers can be found under mo_gymnasium.wrappers.vector. See the tests/ folder or our documentation for example usage.
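In practice, the migration is an import-path change. A minimal sketch (MOSyncVectorEnv is one of the vector utilities; check the documentation for the full list):

# Before v1.2.0:
# from mo_gymnasium import MORecordEpisodeStatistics
# From v1.2.0 on:
from mo_gymnasium.wrappers import MORecordEpisodeStatistics
from mo_gymnasium.wrappers.vector import MOSyncVectorEnv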
Environments
- Update Gymnasium to v1.0.0 by @LucasAlegre & @ffelten in #95
- Add Gymnasium performance improvement to Lunar Lander by @pseudo-rnd-thoughts in #89
- Update Lunar lander step to match performance with Gymnasium by @pseudo-rnd-thoughts in #91
- Add three different mo-mountain-car environments by @pranavg23 in #97
- Lunar-Lander is now v3 in #95
Documentation and Tests
- Add pydoc on how to map the multi-objective reward to the original gymnasium reward by @LucasAlegre in #92
- Documentation of new Mountain Car environments by @pranavg23 in #101
- Add test that Gymnasium and MO-Gymnasium envs match by @pseudo-rnd-thoughts in #90
- Add forgotten envs to doc by @ffelten in #94
- Add py.typed to allow mypy type checking by @sebimarkgraf in #107
Bug Fixes
- breakable-bottles observation space correction by @scott-j-johnson in #93
- Fix fishwood's inconsistent observation dimension by @timondesch in #103
- Fix Docs Generation by @LucasAlegre in #106
- (Issue #99) Fix disc_episode_returns off-by-one error by @Katze2664 in #100
- Bump deprecated action by @ffelten in #105
New Contributors
- @scott-j-johnson made their first contribution in #93
- @pranavg23 made their first contribution in #97
- @Katze2664 made their first contribution in #100
- @timondesch made their first contribution in #103
- @sebimarkgraf made their first contribution in #107
Full Changelog: v1.1.0...v1.2.0
MO-Gymnasium 1.1.0 Release: New MuJoCo environments, Mirrored Deep Sea Treasure, Fruit Tree rendering, and more
Environments
- Add new MuJoCo environments by @LucasAlegre in #87
- Add mirror DST env by @ffelten in #79
Other improvements and utils
- Use .unwrapped to access reward_space by @LucasAlegre in #77
- Add rendering for fruit_tree env by @tomekster in #81
Documentation
- Group environments by type in docs by @LucasAlegre in #83
- Add mirrored DST to docs by @ffelten in #80
- Update citations by @LucasAlegre in #86
Bug fixes
- Unpin mujoco by @Kallinteris-Andreas in #84
Full Changelog: v1.0.1...v1.1.0
MO-Gymnasium 1.0.1 Release: Support Gymnasium 0.29, breakable-bottles pygame render, and more
Environments
- Add pygame render to breakable-bottles by @LucasAlegre in #75
Wrappers
- Add MOMaxAndSkipObservation Wrapper by @LucasAlegre in #76
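A minimal usage sketch, assuming MOMaxAndSkipObservation mirrors Gymnasium's MaxAndSkipObservation (repeat each action skip times and max-pool the most recent frames) and is importable from the top-level package as in this release:

import mo_gymnasium as mo_gym
from mo_gymnasium import MOMaxAndSkipObservation

# Repeat each action for 4 frames and max-pool observations, as is common for pixel envs.
env = mo_gym.make("mo-supermario-v0")
env = MOMaxAndSkipObservation(env, skip=4)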
Other improvements and utils
- Modify LinearReward to return reward weights as part of info_dict by @ianleongudri in #69
- Add warning for order of wrapping in the MORecordEpisodeStatistics Wrapper by @ffelten in #70
- Support Gymnasium 0.29 by @LucasAlegre in #73
Bug fixes
- Fix test worker by @ffelten in #67
- Fix PF and CCS computation of minecart-deterministic-v0 by @LucasAlegre in #74
Full Changelog: v1.0.0...v1.0.1
MO-Gymnasium becomes mature
MO-Gymnasium 1.0.0 Release Notes
We are thrilled to introduce the mature release of MO-Gymnasium, a standardized API and collection of environments designed for Multi-Objective Reinforcement Learning (MORL).
MORL expands the capabilities of RL to scenarios where agents must optimize multiple objectives, which may conflict with each other. Each objective is represented by a distinct reward function, and the agent learns to make trade-offs between them based on a reward vector received after each step. For instance, in the well-known MuJoCo HalfCheetah environment, reward components are combined linearly using predefined weights, as shown in the following code snippet from Gymnasium:
ctrl_cost = self.control_cost(action)  # penalty for large control inputs
forward_reward = self._forward_reward_weight * x_velocity  # reward for forward velocity
reward = forward_reward - ctrl_cost  # fixed linear scalarization into a single scalar
With MORL, users have the flexibility to choose the compromises they desire based on their preferences for each objective. Consequently, the environments in MO-Gymnasium do not have predefined weights: MO-Gymnasium extends Gymnasium to the multi-objective setting, where the agent receives a vector reward.
For example, the animation on this release's page illustrates the multiple policies learned by an MORL agent on the mo-halfcheetah domain, each striking a different balance between saving battery and speed.
This release marks the first mature version of MO-Gymnasium within Farama, indicating that the API is stable and that the library has reached a high level of quality.
API
import gymnasium as gym
import mo_gymnasium as mo_gym
import numpy as np
# It follows the original Gymnasium API ...
env = mo_gym.make('minecart-v0')
obs, info = env.reset()
# but vector_reward is a numpy array!
next_obs, vector_reward, terminated, truncated, info = env.step(your_agent.act(obs))
# Optionally, you can scalarize the reward function with the LinearReward wrapper.
# This allows falling back to single-objective RL.
env = mo_gym.LinearReward(env, weight=np.array([0.8, 0.2, 0.2]))
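Under LinearReward, step() returns a scalar reward (the dot product of the weights and the reward vector), so standard single-objective agents can be used unchanged; as of v1.0.1, the wrapper also reports the weights in the info dict (see #69 above).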
Environments
We support environments ranging from classical MORL benchmarks to inherently multi-objective versions of popular RL problems, such as the MuJoCo locomotion tasks. An exhaustive list of environments is available on our documentation website.
Wrappers
Additionally, we provide a set of wrappers tailor-made for MORL, such as MONormalizeReward, which normalizes a single component of the reward vector, or LinearReward, which transforms the MOMDP into an MDP via linear scalarization. See also our documentation.
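For instance, a minimal sketch of composing the two (the idx argument name follows this release's API; check the documentation if it differs):

import mo_gymnasium as mo_gym
import numpy as np

env = mo_gym.make("minecart-v0")
# Normalize only the first component of the reward vector.
env = mo_gym.MONormalizeReward(env, idx=0)
# Collapse the MOMDP into an MDP with a fixed linear scalarization.
env = mo_gym.LinearReward(env, weight=np.array([0.8, 0.2, 0.2]))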
New features and improvements
- Bump highway-env version in #50
- Add mo-lunar-lander-continuous-v2 and mo-hopper-2d-v4 environments in #51
- Add normalized action option to water-reservoir-v0 in #52 (sketched after this list)
- Accept zero-dimension numpy array as discrete action in #55
- Update pre-commit versions and fix small spelling mistake in #56
- Add method to compute known Pareto Front of fruit tree in #57
- Improve reward bounds on: Mario, minecart, mountain car, resource gathering, reacher in #59, #60, #61
- Add Python 3.11 support, drop Python 3.7 in #65
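As an example, a sketch of the normalized-action option from #52, assuming it is exposed as a make keyword argument named normalized_action:

import mo_gymnasium as mo_gym

# Assumed kwarg from #52: normalizes the continuous action range of the reservoir.
env = mo_gym.make("water-reservoir-v0", normalized_action=True)
print(env.action_space)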
Bug fixes and documentation updates
- Fix water-reservoir bug caused by numpy randint deprecation in #53
- Fix missing edit button in website in #58
- Fix reward space and add reward bound tests in #62
- Add MO-Gymnasium logo to docs in #64
Full Changelog: v0.3.4...v1.0.0
MO-Gymnasium 0.3.4 Release: Known Pareto Front, improved renders and documentation
Changelog
Environments
- Add new pixel art rendering for deep-sea-treasure-v0, resource-gathering-v0 and water-reservoir-v0 by @LucasAlegre in #41
- Add pareto_front function to get the known optimal front in DST, Minecart and Resource Gathering by @LucasAlegre and @ffelten in #45, #43 (see the sketch after this list)
- Add deep-sea-treasure-concave-v0 by @ffelten in #43
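A minimal sketch of querying the known front, assuming the function lives on the unwrapped environment and takes a discount factor:

import mo_gymnasium as mo_gym

env = mo_gym.make("deep-sea-treasure-v0")
# Known Pareto front of discounted vector returns for the chosen gamma.
front = env.unwrapped.pareto_front(gamma=0.99)
print(front)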
Utils
- Moved evaluation utils to MORL-Baselines by @ffelten in #47
Documentation
- Improve documentation and README by @LucasAlegre in #40
- Create docs/README.md to link to a new CONTRIBUTING.md for docs by @mgoulao in #42
- Enable documentation versioning and release notes in website by @mgoulao in #46
Full Changelog: v0.3.3...0.3.4
MO-Gymnasium 0.3.3 Release: Policy Evaluation bug fix, better documentation page
New improvements/features
- Add EzPickle to all envs by @ffelten in #34
- Automatic generation of tests by @LucasAlegre in #37
Bugs fixed
- Fix highway env observation conversion by @LucasAlegre in #33
- Fix bug in eval_mo which was passing None to all weight vectors
- Fix minecart and water-reservoir ObservationSpace dtype and bounds
Documentation
- Improve documentation and readme by @LucasAlegre in #35
Full Changelog: 0.3.2...v0.3.3
MO-Gymnasium 0.3.2 Release: Bug fixes, improved webpage
Bug fixes
- Bump highway-env version to fix rendering
- Add assets to the pypi release package
Documentation
- Add gifs to the webpage
Full Changelog: 0.3.1...0.3.2
MO-Gymnasium 0.3.1 Release: Improved documentation and MuJoCo MO-Reacher environment
This minor release adds "mo-reacher-v4", a MuJoCo version of the Reacher environment, fixes a bug in Lunar Lander, and improves the library documentation.
Environments
- Add mo-reacher-v4 by @LucasAlegre in #25
Documentation
- Use readme info directly in website by @ffelten in #26
- Add pydoc to all environments by @LucasAlegre in #31
Bug Fixes
- Hotfix lunar lander by @ffelten in #27
- MORecordEpisodeStatistics returns scalars when not a VecEnv by @ffelten in #30
Full Changelog: 0.3.0...0.3.1