AutoResetWrapper does not, in fact, call env.reset() #278
Theo-Cheynel started this conversation in General
Replies: 2 comments 2 replies
-
Hi @Theo-Cheynel! The intention of AutoResetWrapper is indeed to use the …
-
Thanks, that makes sense! I think this affects the training of my models, though, since I have to use a moderate batch size to stay within my GPU's VRAM. Do you know what could be done to provide more randomness without hurting training performance too much?
-
Hi,
The AutoResetWrapper (brax/brax/envs/wrappers.py, line 123 in 5ef4f65) does not call the environment's reset method: instead, it uses the qp stored in state.info['first_qp'] after the last external call to reset.

This assumes that the call to env.reset will always return the same value. However, in the example environments provided, some randomness is introduced during the reset, and the current AutoResetWrapper does not capture it.

I think the expected behaviour is for env.reset to be called when needed, although I guess this might make things really slow. Could anything be done to fix this?
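
To make the behaviour concrete, here is a minimal toy sketch in plain Python. It is not the brax implementation: `State`, `ToyEnv`, and `CachedAutoReset` are hypothetical stand-ins, and the only thing borrowed from the wrapper is the idea of caching the initial qp under `info['first_qp']` and restoring it when an episode ends.

```python
import random
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class State:
    """Toy stand-in for an environment state (hypothetical, not brax's State)."""
    qp: float  # stands in for the physics state
    done: bool = False
    info: dict = field(default_factory=dict)


class ToyEnv:
    """Hypothetical environment whose reset() is randomized."""

    def reset(self, rng: random.Random) -> State:
        # Each external reset draws a fresh random starting position.
        return State(qp=rng.uniform(-1.0, 1.0))

    def step(self, state: State, action: float) -> State:
        qp = state.qp + action
        # The episode ends once the position leaves [-2, 2].
        return replace(state, qp=qp, done=abs(qp) > 2.0)


class CachedAutoReset:
    """Restores the cached first state on `done` instead of calling env.reset()."""

    def __init__(self, env: ToyEnv):
        self.env = env

    def reset(self, rng: random.Random) -> State:
        state = self.env.reset(rng)
        # Cache the initial qp, mirroring the role of state.info['first_qp'].
        return replace(state, info={**state.info, "first_qp": state.qp})

    def step(self, state: State, action: float) -> State:
        state = self.env.step(state, action)
        if state.done:
            # No new randomness here: the cached value is reused verbatim.
            state = replace(state, qp=state.info["first_qp"])
        return state


env = CachedAutoReset(ToyEnv())
state = env.reset(random.Random(0))
first_qp = state.qp

# Collect the starting position of every automatic restart.
restart_qps = []
for _ in range(20):
    state = env.step(state, 1.0)
    if state.done:
        restart_qps.append(state.qp)
```

In this sketch every entry of `restart_qps` equals `first_qp`, even though `ToyEnv.reset` is random: the randomness is sampled only once per external call to reset, which is the effect I am seeing.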
Thanks