When I change the size to 12 (and set mode = "player"), the agent no longer learns. It always moves toward the borders, i.e., it keeps taking the action that moves it toward a border even when it is already at the border.
Is it because there is no penalty for such an action?
I added the following penalty and it is still not working:

pos = self.board.components['Player'].pos
if (pos[0] < 0 or pos[1] < 0 or
        pos[0] > self.board.size - 1 or pos[1] > self.board.size - 1):
    return -10
This is likely because, with a larger grid, it becomes increasingly rare for the agent to hit the goal by chance, and it needs to do so a few times before the algorithm can learn anything. When the grid is too large, a random walk will almost never reach the goal. To solve large grids with very sparse rewards you will have to implement some of the more advanced techniques, like curiosity, that are covered later in the book.
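You can see the effect directly with a quick Monte Carlo estimate. The sketch below is a hypothetical toy model, not the book's Gridworld code: the agent and goal sit in opposite corners, moves are clamped at the walls, and we count how often a purely random policy ever touches the goal within a step budget, for a 4x4 vs. a 12x12 board.

```python
import random

def random_hit_rate(size, episodes=2000, max_steps=50, seed=0):
    """Fraction of purely random episodes that reach the goal at all.

    Toy model (assumption, not the book's environment): agent starts at
    (0, 0), goal is at the opposite corner, and stepping into a wall
    leaves the agent in place.
    """
    rng = random.Random(seed)
    goal = (size - 1, size - 1)
    hits = 0
    for _ in range(episodes):
        x, y = 0, 0
        for _ in range(max_steps):
            dx, dy = rng.choice([(0, 1), (0, -1), (1, 0), (-1, 0)])
            # Clamp moves so the agent never leaves the board.
            x = min(max(x + dx, 0), size - 1)
            y = min(max(y + dy, 0), size - 1)
            if (x, y) == goal:
                hits += 1
                break
    return hits / episodes

print("4x4 :", random_hit_rate(4))
print("12x12:", random_hit_rate(12))
```

On a 4x4 board a random walk stumbles onto the goal in a sizable fraction of episodes, while on 12x12 the rate collapses toward zero, so the agent almost never sees a positive reward and has no gradient signal to learn from. The border penalty does not change this, because it says nothing about where the goal is.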