XuanJing is a benchmark library of decision-making algorithms, covering reinforcement learning, imitation learning, multi-agent learning, and planning.
In both supervised learning and reinforcement learning, an algorithm consists of two main components: the data and the update formula. XuanJing abstracts these two parts so that reinforcement learning algorithms can be trained in the same way as supervised learning models, as sketched below.
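To make the analogy concrete, here is a minimal, self-contained sketch of this two-component view. The names (`train`, `random_batches`) are hypothetical illustrations, not XuanJing's actual API: the same generic loop works whether the data source is a fixed dataset (supervised learning) or a stream of agent-environment interactions (reinforcement learning).

```python
# Hypothetical sketch, NOT XuanJing's actual API: training reduces to
# "a data source + an update formula", for both SL and RL.
import random
from typing import Any, Callable, Iterable, Iterator


def train(data_source: Iterable[Any], update: Callable[[Any], float], steps: int) -> None:
    """Generic loop: pull a batch from the data source, apply the update rule."""
    it: Iterator[Any] = iter(data_source)
    for step in range(steps):
        batch = next(it)
        loss = update(batch)  # e.g. cross-entropy loss (SL) or a TD error (RL)
        if step % 100 == 0:
            print(f"step {step}: loss = {loss:.4f}")


def random_batches(batch_size: int = 8) -> Iterator[list]:
    """Toy data source standing in for a dataset or an experience stream."""
    while True:
        yield [random.random() for _ in range(batch_size)]


train(random_batches(), update=lambda b: sum(b) / len(b), steps=201)
```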
WIP. Not released yet.
Env is responsible for parallelizing and wrapping the environment. Interacting with the environment is the actor's task, and the data produced by that interaction is stored in the buffer (if needed). enhancement is used to enhance the data in the buffer, and the learner is in charge of managing the data and the algorithm: it updates the model parameters using the (enhanced) data. utils is a collection of useful helper functions. The sketch below shows how these pieces fit together.
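The following is a minimal, self-contained sketch of that pipeline. All class and function names here (`ToyEnv`, `Actor`, `Buffer`, `enhance`, `Learner`) are hypothetical stand-ins chosen for illustration; XuanJing's real interfaces may differ.

```python
# Hypothetical pipeline sketch, NOT XuanJing's actual API:
# env -> actor -> buffer -> enhancement -> learner.
import random


class ToyEnv:
    """Stand-in for a wrapped (and possibly parallelized) environment."""

    def reset(self) -> float:
        return 0.0

    def step(self, action: int) -> tuple[float, float]:
        return random.random(), 1.0 if action == 1 else 0.0


class Actor:
    """Interacts with the environment using the current policy (random here)."""

    def act(self, obs: float) -> int:
        return random.choice([0, 1])


class Buffer:
    """Stores the transitions produced by actor-environment interaction."""

    def __init__(self) -> None:
        self.data: list[tuple] = []

    def push(self, transition: tuple) -> None:
        self.data.append(transition)


def enhance(buffer: Buffer) -> list[tuple]:
    """Placeholder for data enhancement, e.g. normalization or relabeling."""
    return buffer.data[-32:]  # here: simply keep the most recent transitions


class Learner:
    """Consumes (enhanced) data and applies the algorithm's update formula."""

    def update(self, batch: list[tuple]) -> float:
        return sum(r for (_, _, r, _) in batch) / max(len(batch), 1)


env, actor, buffer, learner = ToyEnv(), Actor(), Buffer(), Learner()
obs = env.reset()
for step in range(200):
    action = actor.act(obs)
    next_obs, reward = env.step(action)
    buffer.push((obs, action, reward, next_obs))
    obs = next_obs
    if step % 50 == 49:  # periodically hand enhanced data to the learner
        print(f"step {step}: learner metric = {learner.update(enhance(buffer)):.3f}")
```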
TODO
TODO
Supported algorithms are as follows (a minimal sketch of the DQN update appears after the list):
- Deep Q-Network (DQN)
- Double DQN
- Dueling DQN
- Proximal Policy Optimization (PPO)
- Soft Actor-Critic (SAC)
- Cross-Entropy Method (CEM)
- Evolution Strategies as a Scalable Alternative to Reinforcement Learning (ES)
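To give a flavor of what these algorithms implement, here is a minimal, self-contained sketch of the temporal-difference update at the core of DQN, the first algorithm in the list. It assumes PyTorch and uses toy networks and random data; it is not XuanJing's actual implementation.

```python
# Hypothetical DQN-style update sketch (assumes PyTorch is installed);
# NOT XuanJing's actual implementation.
import torch
import torch.nn as nn

q_net = nn.Linear(4, 2)       # toy Q-network: 4-dim state, 2 actions
target_net = nn.Linear(4, 2)  # periodically-synced target network
target_net.load_state_dict(q_net.state_dict())
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
gamma = 0.99

# One gradient step on a dummy batch of transitions (s, a, r, s', done).
s = torch.randn(32, 4)
a = torch.randint(0, 2, (32, 1))
r = torch.randn(32, 1)
s_next = torch.randn(32, 4)
done = torch.zeros(32, 1)

q_sa = q_net(s).gather(1, a)  # Q(s, a) for the actions actually taken
with torch.no_grad():         # TD target: r + gamma * max_a' Q_target(s', a')
    td_target = r + gamma * (1 - done) * target_net(s_next).max(1, keepdim=True).values
loss = nn.functional.mse_loss(q_sa, td_target)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```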
This project exists thanks to all the people who contribute.
MIT © tinyzqh
If you find XuanJing useful, please cite it in your publications.
@software{XuanJing,
  author       = {Zhiqiang He},
  title        = {XuanJing},
  year         = {2022},
  publisher    = {GitHub},
  journal      = {GitHub repository},
  howpublished = {\url{https://github.com/tinyzqh/XuanJing}},
}