Self-supervised representation learning for RL promises to alleviate two problems agents face in real-world environments:
- Learning good policies from high dimensional and noisy input.
- Data-efficient learning in settings where training data is expensive to obtain.
DeepMDP is one of the first algorithms in this area that brings theoretical guarantees to the deep learning world, and it also shows good empirical results in comparison to model-free RL. It can be combined with any model-free policy learning strategy; in the paper the authors used C51, while we use a DQN for simplicity.
Since the code was not published, we provide an implementation of the algorithm with a DQN as the base algorithm. Along with the code, we ran experiments to check the authors' claims. Overall, we found evidence that the learnt latent space is more expressive than that of a baseline using no self-supervision 🎉.
Have a look at the report for more details.
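To make the objective concrete, here is a minimal sketch of how the auxiliary DeepMDP losses (latent reward prediction and latent transition prediction) can be added on top of the standard DQN TD loss. This is an illustrative PyTorch sketch with assumed module and variable names (`encoder`, `q_head`, `transition_head`, `reward_head`), not the exact code from this repo.

```python
import torch
import torch.nn.functional as F

def deepmdp_losses(encoder, q_head, transition_head, reward_head,
                   obs, action, reward, next_obs, done, gamma=0.99):
    """Sketch: DQN TD loss plus DeepMDP auxiliary losses.

    Module names and the way the heads consume (latent, action) are
    assumptions for illustration; the repo's architecture may differ.
    """
    z = encoder(obs)                      # latent state
    with torch.no_grad():
        z_next = encoder(next_obs)        # latent next state

    # Standard DQN TD loss computed on the shared latent representation.
    q_values = q_head(z).gather(1, action.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        q_next = q_head(z_next).max(dim=1).values
        td_target = reward + gamma * (1.0 - done) * q_next
    td_loss = F.smooth_l1_loss(q_values, td_target)

    # DeepMDP auxiliary losses: predict the reward and the next latent
    # state from the current latent state and the action.
    reward_pred = reward_head(z, action)
    z_next_pred = transition_head(z, action)
    reward_loss = F.mse_loss(reward_pred.squeeze(-1), reward)
    transition_loss = F.mse_loss(z_next_pred, z_next)

    return td_loss + reward_loss + transition_loss
```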
- Clone the repo.
- Run `pip install -e .` to install deepmdp as a Python package.
All experiments are logged and executed using sacred. The configs for the experiments are located in `scripts/configs`.
If you want to use the auxiliary DeepMDP losses, simply set the corresponding flag in the config to `true`.
The specified DQN architecture is then split into an encoder (shared between the Q-head, transition head, and reward head) and the Q-head itself.
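As a rough illustration of that split, the sketch below shows a shared encoder feeding a Q-head, a transition head, and a reward head. The layer sizes and module names are assumptions for the sketch; the actual architecture is whatever the config specifies.

```python
import torch
import torch.nn as nn

class DeepMDPDQN(nn.Module):
    """Illustrative split of a DQN into a shared encoder plus heads.

    Layer sizes and names are assumptions; the real architecture is
    defined by the config files in scripts/configs.
    """

    def __init__(self, obs_dim, n_actions, latent_dim=64):
        super().__init__()
        # Encoder shared by the Q-head, transition head, and reward head.
        self.encoder = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim), nn.ReLU(),
        )
        # Q-head used for action selection and TD learning.
        self.q_head = nn.Linear(latent_dim, n_actions)
        # Auxiliary DeepMDP heads: predict the next latent state and the
        # reward from the current latent state and a one-hot action.
        self.transition_head = nn.Linear(latent_dim + n_actions, latent_dim)
        self.reward_head = nn.Linear(latent_dim + n_actions, 1)
        self.n_actions = n_actions

    def forward(self, obs, action=None):
        z = self.encoder(obs)
        q_values = self.q_head(z)
        if action is None:
            return q_values
        a_onehot = nn.functional.one_hot(action, self.n_actions).float()
        za = torch.cat([z, a_onehot], dim=-1)
        return q_values, self.transition_head(za), self.reward_head(za)
```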
When you are done, simply run `python scripts/experiment_dqn_baseline.py --config_path [path]`.
Furthermore, we use Visdom to log experiment data. Make sure your Visdom server is running on port 9098 by running `visdom -port 9098`.
Then you can access it any time via http://localhost:9098.
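The experiments log to Visdom on their own; if you additionally want to push your own data to the same server from a script, a minimal Visdom usage sketch looks like this (the window and axis names below are arbitrary, not something the repo defines).

```python
import numpy as np
from visdom import Visdom

# Connect to the Visdom server started with `visdom -port 9098`.
viz = Visdom(port=9098)

# Plot a dummy training curve in a window called "episode_return".
steps = np.arange(10)
returns = np.random.randn(10).cumsum()
viz.line(X=steps, Y=returns, win="episode_return",
         opts=dict(title="episode return", xlabel="episode", ylabel="return"))
```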