RL Algorithms and Environments in PyTorch

Welcome to the RL Algorithms and Environments (RLAlgoEnv) repository! This project provides a collection of reinforcement learning (RL) algorithms implemented in PyTorch, together with a set of customized or pre-packaged environments.

Requirements

  • The code has been tested with pytorch==2.0.1+cu117.
  • Install all dependencies:
    pip install -r requirements.txt
  • The project is based on the gymnasium>=0.29.1 package. To render the mujoco, robotics, and similar environments, you need to modify the site-packages\gymnasium\envs\mujoco\mujoco_rendering.py file: replace solver_iter (at around line 593) with solver_niter.
  • The running scripts can be launched directly from PyCharm. When running them from a shell, you may first need to execute: export PYTHONPATH=<path to RLEnvsAlgos>:$PYTHONPATH.
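
The manual edit to mujoco_rendering.py described above could also be scripted. The following is only a minimal sketch under stated assumptions: the helper names patch_solver_attr and patch_mujoco_rendering are hypothetical and not part of this repository, and the file location is derived from wherever gymnasium happens to be installed. It renames every standalone solver_iter to solver_niter and keeps a .bak copy of the original file.

```python
import re
from importlib.util import find_spec
from pathlib import Path


def patch_solver_attr(source: str) -> str:
    """Rename every standalone `solver_iter` identifier to `solver_niter`.

    The word boundaries make the substitution idempotent: text that already
    uses `solver_niter` is left untouched.
    """
    return re.sub(r"\bsolver_iter\b", "solver_niter", source)


def patch_mujoco_rendering() -> Path:
    """Hypothetical helper: patch the installed gymnasium file in place.

    Assumes the file lives at <gymnasium>/envs/mujoco/mujoco_rendering.py,
    as stated in the requirements list.
    """
    spec = find_spec("gymnasium")
    if spec is None or spec.origin is None:
        raise RuntimeError("gymnasium is not installed in this environment")

    # spec.origin points at gymnasium/__init__.py, so its parent is the package root.
    target = Path(spec.origin).parent / "envs" / "mujoco" / "mujoco_rendering.py"
    original = target.read_text(encoding="utf-8")
    patched = patch_solver_attr(original)

    if patched != original:
        # Keep a backup next to the file before overwriting it.
        target.with_name(target.name + ".bak").write_text(original, encoding="utf-8")
        target.write_text(patched, encoding="utf-8")
    return target
```

Calling patch_mujoco_rendering() once from a Python session inside the project's environment should be enough. Because the substitution is idempotent, running it a second time leaves the file unchanged.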

Algorithms

Each RL algorithm is implemented as a separate, self-contained class, which keeps the code easy to read and modify.

The implementations are primarily based on the CleanRL library, which we also recommend as a reference.

| Algorithm | Description | Author & Year | Discrete Control | Continuous Control |
| --- | --- | --- | --- | --- |
| DQN | An enhanced version of Deep Q-Networks algorithm. | Mnih et al., 2015 | ✔️ | |
| CategoricalDQN | An extension of DQN with categorical distributional Q-learning. | Bellemare et al., 2017 | ✔️ | |
| NoisyNet (DQN) | An extension of DQN with noisy networks for exploration. | Fortunato et al., 2019 | ✔️ | |
| PPO | Proximal Policy Optimization. | Schulman et al., 2017 | ✔️ | ✔️ |
| RPO | An improved version of PPO. | Md Masudur Rahman and Yexiang Xue, 2023 | ✔️ | ✔️ |
| RND | Random Network Distillation, extended from PPO. | Burda et al., 2018 | ✔️ | ✔️ |
| DDPG | Deep Deterministic Policy Gradient algorithm. | Silver et al., 2014 | | ✔️ |
| TD3 | Twin Delayed DDPG, an improved version of DDPG. | Fujimoto et al., 2018 | | ✔️ |
| SAC | Soft Actor-Critic. | Haarnoja et al., 2018 | ✔️ | ✔️ |

Algorithms to be Implemented

  • Soft Q-Learning (SQL)
  • Advantage Actor-Critic (A2C)
  • Asynchronous Advantage Actor-Critic (A3C)
  • and more...

Running Scripts

We provide a variety of running scripts for different algorithms and environments. You can find them here.

Environments

We package and customize a variety of environments for testing and benchmarking RL algorithms. All environment packages can be found in the RLEnvs folder.

Templates

We provide some templates for interacting with the environments.
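
As a rough illustration of the interaction loop such a template typically wraps, here is a minimal sketch. It is not taken from the repository's templates. CoinFlipEnv and run_episode are hypothetical names introduced only for this example; the toy environment simply mimics the reset/step signatures of Gymnasium so that the snippet runs without any dependencies.

```python
import random
from typing import Any, Callable, Dict, Tuple


class CoinFlipEnv:
    """Hypothetical toy environment with Gymnasium-style reset/step signatures.

    The observation is the current coin face (0 or 1); the agent earns a
    reward of 1.0 whenever its action matches that face.
    """

    def __init__(self, episode_length: int = 5) -> None:
        self.episode_length = episode_length
        self._rng = random.Random()
        self._t = 0
        self._coin = 0

    def reset(self, *, seed=None, options=None) -> Tuple[int, Dict[str, Any]]:
        if seed is not None:
            self._rng.seed(seed)
        self._t = 0
        self._coin = self._rng.randint(0, 1)
        return self._coin, {}

    def step(self, action: int) -> Tuple[int, float, bool, bool, Dict[str, Any]]:
        reward = 1.0 if action == self._coin else 0.0
        self._t += 1
        self._coin = self._rng.randint(0, 1)
        terminated = False  # this toy task has no terminal state
        truncated = self._t >= self.episode_length  # time limit reached
        return self._coin, reward, terminated, truncated, {}


def run_episode(env: Any, policy: Callable[[Any], Any], seed=None) -> float:
    """Roll out one episode and return the undiscounted sum of rewards."""
    obs, _info = env.reset(seed=seed)
    total_reward = 0.0
    done = False
    while not done:
        action = policy(obs)
        obs, reward, terminated, truncated, _info = env.step(action)
        total_reward += float(reward)
        done = terminated or truncated
    return total_reward


# A policy that copies the observation always matches the coin.
episode_return = run_episode(CoinFlipEnv(episode_length=5), lambda obs: obs, seed=0)
```

Because run_episode relies only on reset and step, the same loop should work with an environment created through gymnasium.make, provided the policy returns actions that are valid for that environment's action space.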

Contributing

We welcome contributions!

The code has not been thoroughly tested, so we sincerely invite you to help us improve the repository. If you have improvements or bug fixes, please feel free to open an issue or a pull request. Thanks in advance for your help!