Env
- add taxi env (#799) (#807)
- add ising model env (#782)
- add new Frozen Lake env (#781)
- optimize ppo continuous config in MuJoCo (#801)
- fix masac smac config multi_agent=True bug (#791)
- update/speed up pendulum ppo
Algorithm
- fix gtrxl compatibility bug (#796)
- fix complex obs demo for ppo pipeline (#786)
- add naive PWIL demo
- fix marl nstep td compatibility bug
Enhancement
Style
- relax flask requirement (#811)
- add new badge (hellogithub) in readme (#805)
- update discord link and badge in readme (#795)
- fix typo in config.py (#776)
- polish rl_utils api docs
- add numpy<2 version constraint
- update macOS platform test version to 12
- polish ci python version
News
- PsyDI: Towards a Multi-Modal and Interactive Chatbot for Psychological Assessments
- ReZero: Boosting MCTS-based Algorithms by Backward-view and Entire-buffer Reanalyze
- UniZero: Generalized and Efficient Planning with Scalable Latent World Models
Full Changelog: v0.5.1...v0.5.2
Contributors: @PaParaZz1 @zjowowen @YinminZhang @TuTuHuss @nighood @ruiheng123 @rongkunxue @ooooo-create @eltociear