About this package

This package implements Recurrent Behavior Cloning (BC).

It is compatible with the Imitation library.

The expert dataset may come from a human demonstrator, and it can be used whether or not it contains a recurrent state (such as lstm_state or gru_state).
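One way to picture the point above: when a demonstration carries no stored recurrent state, the hidden state can start from zeros at the beginning of the episode. The snippet below is a minimal sketch of that idea, not this package's actual code. The dictionary layout, the `initial_hidden_state` helper, and the `hidden_size` parameter are hypothetical; only the `gru_state` and `lstm_state` names come from the text above.

```python
from typing import Dict, List, Sequence

# Only these two key names appear in the README; the rest of the layout is assumed.
RECURRENT_KEYS = ("gru_state", "lstm_state")


def initial_hidden_state(episode: Dict[str, Sequence], hidden_size: int) -> List[float]:
    """Pick the hidden state an episode starts with (hypothetical helper).

    If the expert episode recorded a recurrent state, reuse its first entry;
    otherwise (e.g. a human demonstration) fall back to an all-zero state.
    """
    for key in RECURRENT_KEYS:
        states = episode.get(key)
        if states is not None and len(states) > 0:
            return list(states[0])
    return [0.0] * hidden_size


# A human demonstration without any recurrent state.
human_episode = {"obs": [[0.1, 0.2], [0.3, 0.4]], "acts": [[0.0], [1.0]]}
# An agent-generated demonstration that recorded its GRU state at each step.
agent_episode = {"obs": [[0.1, 0.2]], "acts": [[0.5]], "gru_state": [[0.7, -0.2, 0.1]]}

print(initial_hidden_state(human_episode, hidden_size=3))
print(initial_hidden_state(agent_episode, hidden_size=3))
```

With this kind of fallback, both dataset types can go through the same training loop, because the policy always receives some starting hidden state.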

Training

python3 train_gru_bc.py

Results on BipedalWalker-v3

BC loss (ent_weight = 1e-3, l2_weight = 0.0)

[Image: BC loss curve]
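To show what the two weights control, here is a small sketch of a BC loss. It assumes the usual form: the mean negative log-likelihood of the expert actions, minus an entropy bonus scaled by `ent_weight`, plus an L2 penalty on the parameters scaled by `l2_weight`. The `bc_loss` function and its plain-list inputs are illustrative only and are not taken from this package.

```python
from typing import Sequence


def bc_loss(
    log_probs: Sequence[float],   # log pi(a_expert | obs, hidden) per sample
    entropies: Sequence[float],   # policy entropy per sample
    params: Sequence[float],      # flattened policy parameters
    ent_weight: float = 1e-3,
    l2_weight: float = 0.0,
) -> float:
    """Assumed BC objective: neglogp + entropy term + L2 term."""
    # Negative log-likelihood of the expert actions (the imitation term).
    neglogp = -sum(log_probs) / len(log_probs)
    # Entropy bonus: higher entropy lowers the loss, scaled by ent_weight.
    ent_loss = -ent_weight * sum(entropies) / len(entropies)
    # L2 regularisation: half the squared parameter norm, scaled by l2_weight.
    l2_loss = l2_weight * sum(w * w for w in params) / 2.0
    return neglogp + ent_loss + l2_loss


# Toy numbers, chosen only to make the arithmetic easy to follow.
loss = bc_loss(log_probs=[-1.0, -3.0], entropies=[2.0, 4.0], params=[1.0, 2.0])
print(loss)
```

With l2_weight = 0.0, as in the setting above, the L2 term vanishes and the loss is the negative log-likelihood with a small entropy bonus.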

Compatibility

PyTorch == 1.12.1

Stable-Baselines3 == 2.0.0

SB3-Contrib == 2.0.0

Imitation == 1.0.0
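A quick way to check an environment against these pins is sketched below, using only the standard library. The distribution names (`torch`, `stable-baselines3`, `sb3-contrib`, `imitation`) are the usual PyPI names, the SB3-Contrib pin is taken to be 2.0.0, and the `find_mismatches` helper is hypothetical.

```python
from importlib import metadata
from typing import Callable, Dict, Optional

# Pinned versions from the list above (PyPI distribution names assumed).
PINNED = {
    "torch": "1.12.1",
    "stable-baselines3": "2.0.0",
    "sb3-contrib": "2.0.0",
    "imitation": "1.0.0",
}


def find_mismatches(
    pinned: Dict[str, str],
    get_version: Callable[[str], str] = metadata.version,
) -> Dict[str, Optional[str]]:
    """Return {package: installed_version_or_None} for every pin not met."""
    mismatches: Dict[str, Optional[str]] = {}
    for name, wanted in pinned.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Local build tags such as "+cu113" are ignored in the comparison.
        if installed is None or installed.split("+")[0] != wanted:
            mismatches[name] = installed
    return mismatches


for name, installed in find_mismatches(PINNED).items():
    print(f"{name}: expected {PINNED[name]}, found {installed or 'not installed'}")
```

If nothing is printed, every pinned package is installed at the listed version.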

Sister GRU packages

RecurrentRLHF (preference-based RL with a recurrent reward model)

GRU_AC (Actor-Critic or Proximal Policy Optimization with a GRU)

References

BipedalWalker policy hyperparameters [git repo]

GRU BC reference [git repo] [paper]