-
What training settings and hyperparameters were used for the captioned videos? The Humanoid's gait looks very realistic; what was the final episodic return? I trained PPO on Humanoid, but it always ends up with a jumping behavior (~14000 episodic return).
-
@btaba Is it > 13000? I see in the notebook that humanoidstandup reaches 75000. BTW, how many training steps were used? The same as in the notebook (50M)?
-
(Ah, misspoke in my last comment, deleting it.)
We set the hparams in https://github.com/google/brax/blob/main/notebooks/training.ipynb based on a sweep a while ago, and at the time were getting (averaged over 3 seeds) ~11300 for humanoid and ~55000 for humanoidstandup. The walking gaits then were a bit better than the one the notebook produces now, although there is definitely variance between seeds. It's not too surprising that things have drifted since then, but I suspect another set of params/seeds could recover a better walking gait.
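
For anyone comparing runs across seeds, here is a rough sketch of the averaging described above. It only aggregates final episodic returns that you have already collected; it does not call Brax. The function name `summarize_seeds` and the numbers in `example` are hypothetical placeholders chosen for illustration, not results from the sweep or the notebook.

```python
from statistics import mean, stdev


def summarize_seeds(returns_by_seed):
    """Summarize final episodic returns keyed by seed.

    Returns a tuple (mean, sample standard deviation, best seed).
    The standard deviation is 0.0 when only one seed is given.
    """
    if not returns_by_seed:
        raise ValueError("need at least one seed")
    values = list(returns_by_seed.values())
    avg = mean(values)
    spread = stdev(values) if len(values) > 1 else 0.0
    # Seed with the highest final return; note that a high return does not
    # guarantee a good-looking gait (e.g. a jumping policy can score well).
    best_seed = max(returns_by_seed, key=returns_by_seed.get)
    return avg, spread, best_seed


# Placeholder values only (hypothetical, NOT measured results).
example = {0: 10000.0, 1: 11000.0, 2: 12000.0}
avg, spread, best_seed = summarize_seeds(example)
print(f"mean={avg:.0f} std={spread:.0f} best_seed={best_seed}")
```

Since the return alone can hide a jumping gait, it is probably worth rendering a rollout for each seed rather than picking the best seed by number.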