Reinforcement-Learning-repo

A repo where I look into the details and practice on how to properly utilize reinforcement learning. I try out different AI models and evaluate their performance to test out which does best. Please note this is a project i do in my free time and will update periodically when i get the time to complete it!

Please note - running current model on cpu will be mind numbingly slow. I highly recommend either reducing network size (i.e parameters/layers/number of hidden layers) and reducing batch size(if running on cpu). Running on cuda cores is highly recommended

Phases(base line alg and setup).

Classic control(1/3): pendulum(solved), mountain car, mountain car continuous.
Mujoco (0/10)
Robotics (0/10)

Model(s) used/Evaluation metrics.

Classic control.

Pendulum(TD(0)) - Doesnt work.

Episodes trained	#-	#-
Average/Overall reward(per 200 eps)	-	-

Pendulum(Policy grad monte carlo) - Doesnt work.

Episodes trained	#-	#-
Average/Overall reward(per 200 eps)	-	-

Pendulum(DDPG) - Works like a charm.

Episodes trained	#300	#-
Average/Overall reward(last 200 eps)	-	-

Mountain car(LSTM + 2 FC LAYER).

Episodes trained	#-	#-
Average/Overall reward(per 200 eps)	-	-

Mountain car(Convnet + 2 FC LAYER).

Episodes trained	#-	#-
Average/Overall reward(per 200 eps)	-	-

Mountain car cont.(LSTM + 2 FC LAYER).

Episodes trained	#-	#-
Average/Overall reward(per 200 eps)	-	-

Mountain car cont.(Convnet + 2 FC LAYER).

Episodes trained	#-	#-
Average/Overall reward(per 200 eps)	-	-

Credits.

All credits goes to deeplizard. Visit their yt channel here: https://www.youtube.com/channel/UC4UJ26WkceqONNF5S26OiVw

Name		Name	Last commit message	Last commit date
Latest commit History 187 Commits
.ipynb_checkpoints		.ipynb_checkpoints
Classic Control		Classic Control
Mujoco		Mujoco
Robotics		Robotics
grokking-advanced-discrete-actions		grokking-advanced-discrete-actions
grokking-practice-folder		grokking-practice-folder
state_of_the_art		state_of_the_art
.gitattributes		.gitattributes
README.md		README.md
versions.ipynb		versions.ipynb
versions.json		versions.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Reinforcement-Learning-repo

Phases(base line alg and setup).

Model(s) used/Evaluation metrics.

Classic control.

Pendulum(TD(0)) - Doesnt work.

Pendulum(Policy grad monte carlo) - Doesnt work.

Pendulum(DDPG) - Works like a charm.

Mountain car(LSTM + 2 FC LAYER).

Mountain car(Convnet + 2 FC LAYER).

Mountain car cont.(LSTM + 2 FC LAYER).

Mountain car cont.(Convnet + 2 FC LAYER).

Credits.

Other resources.

About

Releases

Packages

Languages

Kikumu/Reinforcement-Learning-repo

Folders and files

Latest commit

History

Repository files navigation

Reinforcement-Learning-repo

Phases(base line alg and setup).

Model(s) used/Evaluation metrics.

Classic control.

Pendulum(TD(0)) - Doesnt work.

Pendulum(Policy grad monte carlo) - Doesnt work.

Pendulum(DDPG) - Works like a charm.

Mountain car(LSTM + 2 FC LAYER).

Mountain car(Convnet + 2 FC LAYER).

Mountain car cont.(LSTM + 2 FC LAYER).

Mountain car cont.(Convnet + 2 FC LAYER).

Credits.

Other resources.

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages