-
Notifications
You must be signed in to change notification settings - Fork 420
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do the V5 environments perform frame pooling? #467
Comments
Hey @danijar, max-pooling isn't performed in the environment itself. We've gone back and forth whether these type of preprocessing steps should be in the environment or not. We ended up deciding that although doing preprocessing in the emulator would result in improved performance we don't want users to think that these preprocessing steps are the benchmark. Over time they've been confounded with the benchmark but they are primarily a result of design decisions made in 2015 specifically for DQN. The v5 environments define the necessary environment specific parameters, e.g., environment stochasticity, episode truncation, etc. It could be said that max-pooling is necessary to have environment side (to solve flickering issues in some games) but all of the different forms of max-pooling have their own issues which I didn't want bake into the environment. A couple of my thoughts specifically on max-pooling:
Finally, if you are in need of a good preprocessing stack I'm a big fan of the abstractions made in DQN Zoo. Preprocessing is entirely implemented in agent space and is the most faithful to the original DQN preprocessing. It does use DeepMind Environment (which I prefer) but it would be fairly easy to adapt to Gym: https://github.com/deepmind/dqn_zoo/blob/807379c19f8819e407329ac1b95dcaccb9d536c3/dqn_zoo/processors.py#L405-L577. I think going forward it might make sense to have an official preprocessing wrapper in Hope that helps! |
Thanks for the detailed comment! I agree that averaging is nicer than taking the maximum, but at this point it's more important to be compatible with the vast existing literature than to have nicer preprocessing. The problem I'm running into is that I think you can't efficiently perform pooling on the agent side because the maximum should be taken over two consecutive frames. With action repeat, the environment doesn't give me access to the actual last frame. Without action repeat, the environment renders at every step and thus 2x more than necessary. This makes the env Btw if it would be possible to add max pooling as a kwarg to the v5 envs, that would be amazing! It's basically guaranteed that a decent number of researchers will continue to use max pooling for comparability to the prior literature despite it not being implemented in the v5 envs, so providing an implementation as part of the env will consolidate more than if everybody writes their own. |
Hi @JesseFarebro, do you have an idea for a workaround here? It's the only issue holding me back from switching to (That said, I think having a standard implementation that everybody can use would still be valuable! It could also easily avoid the slowdown since it can be implemented inside the env.) |
Hey @danijar, you make a good point about the excess renders. I'm guessing you want to stick to frame pooling and not switch over to averaging? If so, are you currently pooling over Grayscale or RGB? I think pooling does make sense to have in the emulator and it's probably the lowest hanging fruit to improving emulation speed. I'm about to make a release of ale-py so I'll see if it can be done this release. If you want to discuss further I'll be more responsive via email (see my profile). |
Awesome, would be great if you can add it in the release! Both grayscale and RGB would be nice but grayscale is more important because that's the most common evaluation protocol. |
Hi Jesse, do you have any update on whether frame pooling is implemented now? I would really like to upgrade to using ale_py and along that allow unpinning the old Gym version I'm stuck at because of this. Thanks! |
Having looked into this a little more it should be feasible but will require reworking how we render frames. Right now every emulator step we perform a memory copy of the current frame buffer to the Also, the only way this makes sense is if we make frame maxing and frame averaging mutually exclusive which should be fine. |
That would be great! My implementation is currently 4x slower than the old one on top of atari_py and this is likely the reason. Adding this natively would allow me to switch over and use the ale_py version :) |
Ended up getting everything working and it's now about ~25% faster than |
Fantastic, thanks! |
it seems PR stuck in review :( any chance we could see it soon? |
The README says the V5 environments follow Machado et al. 2018, who use max-pooling over the last two frames. I couldn't find any code or config option of this in the repository, so I'm wondering do the V5 environments automatically perform frame pooling?
The text was updated successfully, but these errors were encountered: