Reinforcement learning

Reinforcement learning practice

Lab 1

Basic maze vs Minotaur problem

Creating an MDP and running value iteration until it converges, then reading off the resulting policy.
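
As a rough illustration of the idea, here is a minimal value iteration sketch. It does not reproduce the lab's maze: the four-cell corridor, the `transition` and `reward` functions, the discount `gamma = 0.9`, and the tolerance are all assumptions chosen to keep the example small.

```python
def value_iteration(states, actions, transition, reward, gamma=0.9, tol=1e-8):
    """Generic tabular value iteration.

    transition(s, a) returns a list of (probability, next_state) pairs.
    reward(s, a) returns the immediate reward for taking a in s.
    """
    def q_value(V, s, a):
        # Expected return of taking a in s, then following the current values.
        return reward(s, a) + gamma * sum(p * V[s2] for p, s2 in transition(s, a))

    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(q_value(V, s, a) for a in actions)
            delta = max(delta, abs(best - V[s]))
            V[s] = best  # in-place (Gauss-Seidel style) update
        if delta < tol:
            break

    # Greedy policy with respect to the converged values.
    policy = {s: max(actions, key=lambda a: q_value(V, s, a)) for s in states}
    return V, policy


# Hypothetical toy problem: cells 0..3 in a corridor, cell 3 is the exit.
GOAL = 3
states = [0, 1, 2, 3]
actions = ["left", "right"]


def next_cell(s, a):
    if s == GOAL:  # the exit is absorbing
        return GOAL
    return max(0, s - 1) if a == "left" else min(GOAL, s + 1)


def transition(s, a):
    return [(1.0, next_cell(s, a))]  # deterministic moves


def reward(s, a):
    # Reward 1 only for the move that enters the exit.
    return 1.0 if s != GOAL and next_cell(s, a) == GOAL else 0.0


V, policy = value_iteration(states, actions, transition, reward)
print(V)
print(policy)
```

In the maze vs Minotaur setting the state would presumably also include the Minotaur's position and the transitions would be stochastic, but the update loop keeps the same shape.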

Police vs Bank Robber

Q-learning and SARSA algorithms for the robber to maximize its reward in the game.
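
The difference between the two algorithms comes down to the bootstrap target, which the sketch below tries to make explicit. The environment is a hypothetical four-cell corridor rather than the police vs bank robber grid, and the names `step`, `epsilon_greedy`, `train`, and all hyperparameters are assumptions made for illustration.

```python
import random

ACTIONS = ["left", "right"]  # hypothetical action set for a toy corridor
GOAL = 3                     # hypothetical terminal cell


def step(state, action):
    """Toy deterministic environment: cells 0..3, reward 1 on reaching GOAL."""
    nxt = max(0, state - 1) if action == "left" else min(GOAL, state + 1)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL


def epsilon_greedy(Q, state, epsilon, rng):
    """Random action with probability epsilon, else greedy (ties broken at random)."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(Q[state].values())
    return rng.choice([a for a in ACTIONS if Q[state][a] == best])


def q_learning_update(Q, s, a, r, s2, done, alpha, gamma):
    """Off-policy: bootstrap from the best action in the next state."""
    target = r if done else r + gamma * max(Q[s2].values())
    Q[s][a] += alpha * (target - Q[s][a])


def sarsa_update(Q, s, a, r, s2, a2, done, alpha, gamma):
    """On-policy: bootstrap from the action actually chosen in the next state."""
    target = r if done else r + gamma * Q[s2][a2]
    Q[s][a] += alpha * (target - Q[s][a])


def train(algorithm, episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2,
          seed=0, max_steps=200):
    """Run epsilon-greedy episodes from cell 0 and return the learned Q table."""
    rng = random.Random(seed)
    Q = {s: {a: 0.0 for a in ACTIONS} for s in range(GOAL + 1)}
    for _episode in range(episodes):
        s = 0
        a = epsilon_greedy(Q, s, epsilon, rng)
        for _t in range(max_steps):  # cap episode length
            s2, r, done = step(s, a)
            a2 = epsilon_greedy(Q, s2, epsilon, rng)
            if algorithm == "sarsa":
                sarsa_update(Q, s, a, r, s2, a2, done, alpha, gamma)
            else:
                q_learning_update(Q, s, a, r, s2, done, alpha, gamma)
            if done:
                break
            s, a = s2, a2
    return Q


q_table = train("q_learning")
sarsa_table = train("sarsa")
print(q_table)
print(sarsa_table)
```

Q-learning estimates the value of the greedy policy regardless of how the robber explores, while SARSA estimates the value of the exploring policy itself, so the two can settle on different behaviour when exploration is risky.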