gym.vector.SyncVectorEnv does not work with "mo-mountaincarcontinuous-v0" #112

Closed
wilhem opened this issue Aug 9, 2024 · 4 comments

wilhem commented Aug 9, 2024

A similar, related issue was reported here.
I'm using gymnasium 0.29.1 and created a vector.SyncVectorEnv for the mo-mountaincarcontinuous-v0 environment.
The problem is that the environment's step method raises the following error:

TypeError                                 Traceback (most recent call last)

TypeError: only length-1 arrays can be converted to Python scalars

The above exception was the direct cause of the following exception:
ValueError                                Traceback (most recent call last)

[<ipython-input-32-4571864c046e>](https://localhost:8080/#) in <cell line: 53>()
     52 
     53 if __name__ == "__main__":
---> 54     main()

2 frames

[<ipython-input-32-4571864c046e>](https://localhost:8080/#) in main()
     10 
     11     action = env.action_space.sample()
---> 12     obs, reward, terminated, truncated, info = env.step(action)
     13 
     14     print(reward)

[/usr/local/lib/python3.10/dist-packages/gymnasium/vector/vector_env.py](https://localhost:8080/#) in step(self, actions)
    202         """
    203         self.step_async(actions)
--> 204         return self.step_wait()
    205 
    206     def call_async(self, name, *args, **kwargs):

[/usr/local/lib/python3.10/dist-packages/gymnasium/vector/sync_vector_env.py](https://localhost:8080/#) in step_wait(self)
    143             (
    144                 observation,
--> 145                 self._rewards[i],
    146                 self._terminateds[i],
    147                 self._truncateds[i],

ValueError: setting an array element with a sequence.

The minimal code to reproduce the error is the following:

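import mo_gymnasium as mo_gyms
import gymnasium as gyms
import numpy as np
import torch
import time
import os

# Environment factory used below (the same import appears in the full example)
from morl_baselines.single_policy.ser.mo_ppo import make_env

def main():

    env_id = "mo-mountaincarcontinuous-v0"

    # env setup: a single sub-environment wrapped in Gymnasium's SyncVectorEnv
    env = gyms.vector.SyncVectorEnv([make_env(env_id, 1, 0, "Example", 0.98)])
    assert isinstance(env.single_action_space, gyms.spaces.Box), "ERROR: only continuous action space is supported"

    obs, info = env.reset()

    # The ValueError is raised here, when the vector reward is stored
    action = env.action_space.sample()
    obs, reward, terminated, truncated, info = env.step(action)

    print(reward)

if __name__ == "__main__":
    main()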

wilhem commented Aug 9, 2024

The above is just a minimal example; the same error occurs when running the following code:

from morl_baselines.single_policy.ser.mo_ppo import MOPPO
from morl_baselines.single_policy.ser.mo_ppo import MOPPONet
from morl_baselines.single_policy.ser.mo_ppo import make_env

def main():

    env_id = "mo-mountaincarcontinuous-v0"

    # env setup
    env = gyms.vector.SyncVectorEnv([make_env(env_id = env_id, 
                                              seed = 1,
                                              idx = 0, 
                                              run_name = "Example", 
                                              gamma = 0.98)],
                                   )
    
    assert isinstance(env.single_action_space, gyms.spaces.Box), "ERROR: only continuous action space is supported"

    device = "cuda" if torch.cuda.is_available() else "cpu"

    moppo_net = MOPPONet(obs_shape = env.observation_space.shape,
                         action_shape = env.action_space.shape,
                         reward_dim = 1,
                         net_arch = [64, 64],
                        ).to(device)

    algo = MOPPO(id = 0,
                 networks = moppo_net,
                 weights = np.array([0.8, 0.2]),
                 envs = env,
                 log = False,
                 steps_per_iteration = 2048,
                 num_minibatches = 32,
                 update_epochs = 10,
                 learning_rate = 3e-4,
                 gamma = 0.98,
                 anneal_lr = False,
                 clip_coef = 0.2,
                 ent_coef = 0.01,
                 vf_coef = 0.1,
                 clip_vloss = True,
                 max_grad_norm = 0.5,
                 norm_adv = True,
                 target_kl = None,
                 gae = True,
                 gae_lambda = 0.95,
                 device = device,
                 seed = 88,
                 )

    algo.train(start_time = time.time(),
               current_iteration = 0,
               max_iterations = 1e6,
               )

if __name__ == "__main__":
    main()

Collaborator

ffelten commented Aug 12, 2024

Could you try with Gymnasium 0.28.1?

wilhem commented Aug 13, 2024

That didn't help.
I had already tried older versions. I recreated several virtual environments and even changed the installation order.

The only thing that works is changing the file sync_vector_env.py (which is part of gymnasium) as follows:

# Replace the following line...
# self._rewards = np.zeros((self.num_envs,), dtype=np.float64)
# ...with this one, where self.single_reward_space.shape[0] is the length of the reward vector
self._rewards = create_empty_array(self.single_reward_space.shape[0], n = self.num_envs, fn = np.zeros)
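
The error and the effect of the buffer change described above can be reproduced with NumPy alone. This is a minimal sketch, assuming a two-objective reward; `num_envs`, `reward_dim` and the sample reward values are illustrative and not taken from Gymnasium:

```python
import numpy as np

num_envs = 1
reward_dim = 2  # assumption: two objectives per step

# Buffer shaped like the original line in sync_vector_env.py: one float per env.
scalar_rewards = np.zeros((num_envs,), dtype=np.float64)

# Hypothetical vector reward, as a multi-objective env would return it.
vector_reward = np.array([-1.0, -0.25])

try:
    # A length-2 array cannot be stored in a single float64 slot.
    scalar_rewards[0] = vector_reward
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# A buffer with one row per env and one column per objective accepts the reward.
vector_rewards = np.zeros((num_envs, reward_dim), dtype=np.float64)
vector_rewards[0] = vector_reward
print(vector_rewards)
```

The first assignment fails for the same reason as `self._rewards[i]` in the traceback: the buffer has shape `(num_envs,)`, while each reward has shape `(reward_dim,)`.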

Collaborator

ffelten commented Aug 13, 2024

Sorry, I misread the problem. Indeed, Gymnasium's SyncVectorEnv does not handle vector rewards out of the box.
That's why we have MOSyncVectorEnv: https://mo-gymnasium.farama.org/wrappers/wrappers/#mosyncvectorenv

Note: these will change in the next release, as Gymnasium now differentiates vector wrappers from normal wrappers.
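
For readers landing here, a minimal sketch of how MOSyncVectorEnv might be used in place of gym.vector.SyncVectorEnv. It assumes that `MOSyncVectorEnv` and `make` are exposed at the top level of `mo_gymnasium`, as in the release current at the time of this thread; the import path may differ in later releases, so check the linked documentation. `make_mo_vector_env` is a hypothetical helper name, not part of any library:

```python
def make_mo_vector_env(env_id="mo-mountaincarcontinuous-v0", num_envs=2):
    """Build a multi-objective vector env (hypothetical helper for illustration)."""
    # Imported lazily so that this sketch can be loaded without the package installed.
    import mo_gymnasium as mo_gym

    # Assumption: MOSyncVectorEnv takes a list of env factories,
    # like gymnasium.vector.SyncVectorEnv does.
    env_fns = [lambda: mo_gym.make(env_id) for _ in range(num_envs)]
    return mo_gym.MOSyncVectorEnv(env_fns)


def demo():
    """Step the vector env once and return the batch of vector rewards."""
    envs = make_mo_vector_env()
    envs.reset(seed=0)
    _, rewards, _, _, _ = envs.step(envs.action_space.sample())
    return rewards
```

Calling `demo()` requires MO-Gymnasium to be installed. If the assumptions above hold, the returned rewards contain one reward vector per sub-environment rather than one scalar.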

ffelten closed this as completed Aug 13, 2024