
Safety_Guided_DRL_with_GP

This repository is the official implementation of the paper "Safety-guided deep reinforcement learning via online Gaussian process estimation".

Prerequisites

Our implementation is based on GPflow and OpenAI Baselines.

Tensorflow and GPflow

In our implementation, we use TensorFlow 1.13.1 and GPflow 1.3.0.

  • You can install TensorFlow via

    pip install tensorflow-gpu==1.13.1  # if you have a CUDA-compatible GPU and proper drivers
    pip install numpy==1.17.5

    or

    pip install tensorflow==1.13.1
    pip install numpy==1.17.5
  • Install GPflow via

    pip install gpflow==1.3.0
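
After installing both packages, you can optionally confirm that the expected versions are importable. This one-liner is only a sanity check we suggest, not part of the official setup:

    python -c "import tensorflow as tf, gpflow; print(tf.__version__, gpflow.__version__)"  # expect 1.13.1 and 1.3.0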

OpenAI Baselines

You can find detailed instructions for installing OpenAI Baselines here.

Our implementation is based on commit c57528573ea695b19cd03e98dae48f0082fb2b5e.
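
If you install Baselines from source, one way to pin it to that commit is sketched below. The repository URL and the editable install are our assumptions; follow the linked instructions for the full dependency setup:

    git clone https://github.com/openai/baselines.git
    cd baselines
    git checkout c57528573ea695b19cd03e98dae48f0082fb2b5e
    pip install -e .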

MuJoCo

Instructions on setting up MuJoCo can be found here.

The MuJoCo environments used in our paper depend on OpenAI Gym as well.

  • Install Gym from the Gym source directory via (a quick environment check follows below):
    pip install -e ".[classic_control]"
    pip install -e ".[mujoco]"

Installation

Run the following command from the project directory:

pip install -e .

How to use

Our implementation includes two methods: vanilla DDPG and DDPG with online GP estimation.

To train a vanilla DDPG policy, use the code in ddpg_baseline.

For DDPG with online GP estimation, use the code in safe_ddpg.

By default, training results are saved to data_ddpg.

We also provide some samples of console output in outputs.

DDPG with Online GP Estimation

An example script for training a DDPG policy with online GP estimation on pendulum can be found in train:

./train/pendulum_0.1M_safe_ddpg.sh

Vanilla DDPG

To train with vanilla DDPG:

./train/pendulum_1M_ddpg.sh

DDPG with init GP

To train DDPG with an initialized GP on half cheetah:

./train/half_cheetah_0.1M_init_safe_ddpg.sh
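
The training scripts print progress to the console. If you want to keep a log of a run alongside the provided samples, a generic shell pattern (not specific to this repository) is:

    ./train/pendulum_0.1M_safe_ddpg.sh 2>&1 | tee pendulum_0.1M_safe_ddpg.log  # capture console output
    ls data_ddpg/                                                              # default location of training results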
