Implementation of MOPPO fails #102
Comments
Hi @thomaslautenb, thanks for reporting this. Which version of Gymnasium are you using? The vector wrapper relies heavily on Gymnasium, and Gymnasium's vector API changed recently. The last time I ran that code it worked; that was (I think) with Gymnasium 0.28.1. For the longer term, we are revamping the wrapper implementations to match Gymnasium 1.0. The PR is here: Farama-Foundation/MO-Gymnasium#95. It would be great if we could validate that the new vector wrappers work with MORL-Baselines. I don't have time to finish this month as I'm defending my thesis and moving abroad, but it is on my todo list for this summer. If you have time, you could also try it on your own by working from the current PR on MO-Gymnasium and Gymnasium 1.0 :-)
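A minimal sketch of what such a validation run might look like, assuming the revamped MO-Gymnasium exposes a MOSyncVectorEnv vector wrapper under mo_gymnasium.wrappers.vector; the module path and environment id below are assumptions based on the PR, not a confirmed API:

```python
# Hedged smoke test for the new vector wrappers. Assumes the revamped
# MO-Gymnasium exposes MOSyncVectorEnv under mo_gymnasium.wrappers.vector
# (module path is an assumption based on the PR, not confirmed here).
import mo_gymnasium as mo_gym
from mo_gymnasium.wrappers.vector import MOSyncVectorEnv

# Build 4 copies of a multi-objective env; deep-sea-treasure-v0 has 2 objectives.
envs = MOSyncVectorEnv([lambda: mo_gym.make("deep-sea-treasure-v0") for _ in range(4)])

obs, info = envs.reset(seed=42)
actions = envs.action_space.sample()
obs, rewards, terminations, truncations, infos = envs.step(actions)

# With vector rewards, the reward batch should be (num_envs, reward_dim),
# not (num_envs,) as with scalar rewards.
print(rewards.shape)  # expected: (4, 2)
```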
Hi @thomaslautenb, First, the PR for the migration to 1.0 is moving forward, see #109. Second, I've talked with several people running PGMORL recently and they did not experience any issues. Could you tell me if it is still buggy?
Hi @ffelten, Thanks for the feedback!
I encountered a problem with the implementation of MOPPO.
There was a mismatch in the dimensions of the _reward vector in the sync_vector_env environment.
I worked around the issue by extending the dimension of the reward vector to the number of objectives (reward_dim); a sketch of that workaround is shown below. (It's hardcoded and not pretty.)
You might want to have a look at this.
I can provide a more extensive report on the issue if required.
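For concreteness, a minimal sketch of this kind of workaround, assuming Gymnasium ~0.28.x, where SyncVectorEnv allocates its per-env reward buffer as self._rewards with shape (num_envs,); the subclass name and reward_dim argument are illustrative, not the exact hardcoded patch:

```python
# Minimal sketch of the workaround: widen SyncVectorEnv's reward buffer so it
# can hold vector (multi-objective) rewards. Assumes Gymnasium ~0.28.x, where
# SyncVectorEnv stores per-env rewards in `self._rewards` of shape (num_envs,).
# The subclass and the `reward_dim` argument are illustrative, not the actual patch.
import numpy as np
from gymnasium.vector import SyncVectorEnv


class VectorRewardSyncVectorEnv(SyncVectorEnv):
    """SyncVectorEnv whose reward buffer is (num_envs, reward_dim)."""

    def __init__(self, env_fns, reward_dim):
        super().__init__(env_fns)
        # Replace the scalar-per-env buffer with one row of `reward_dim`
        # entries per env, so vector rewards assign without broadcast errors.
        self._rewards = np.zeros((self.num_envs, reward_dim), dtype=np.float64)
```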