
TicTacToe playing agent based on Reinforcement learning.


marzekan/tictactoe-rl-agent


🤖 TicTacToe agent based on Reinforcement learning

This project was made as a part of the Intelligent Systems course at the Faculty of Organization and Informatics.

🥅 Project goals

The goal of this project was simple: to make an agent learn to play Tic-tac-toe!

And to showcase it with a slick tkinter GUI!

 

playing_gif

 

⚒️ Features

  • Train Reinforcement learning Tic-Tac-Toe agents.
  • Set training parameters (number of iterations and learning strategy).
  • Save/Load trained agents.
  • Keep game score.
  • Switch players each game.
  • No dependencies.

 


Install

  1. Clone the repo or download ZIP

    $ git clone https://github.com/marzekan/tictactoe-rl-agent.git
  2. Run main.py

    $ cd tictactoe-rl-agent
    
    $ python3 main.py

 


📜 Docs

The next sections are meant to be quick, easy, straight-to-the-point documentation.

Documentation begins with a demo and a step-by-step explanation of it.

Train-Save-Load-Play Demo

The following demo shows:

  • Training the agent
  • Saving trained agent
  • Loading trained agent
  • Playing loaded agent

playing_gif

The next subsections contain step-by-step explanations of the actions performed in the gif above.

Training the agent

  • Click the TRAIN AGENT button to open the Agent training window.

  • Select 'Q Agent vs. Q Agent' from the Pick agent strategy dropdown.

  • Enter 20000 as the number of training iterations in the Num. iter textbox.

  • Click the green Train! button.

  • This begins agent training. When training is done, the File explorer window opens.

Saving trained agent

  • When agent training finishes, the File explorer window opens.

  • Enter 'Another Bot' as the name and click Save.

Loading trained agent

  • After saving, the app tells you that the agent is trained and that you need to load it to play.

  • Press LOAD AGENT to load the trained agent.

  • This opens the File explorer window, where you can choose the agent's save folder.

  • Select the 'Another Bot' folder and press Select Folder.

  • After loading, the app instructs you to restart the game to play against the loaded agent.

Playing loaded agent

  • Press RESET GAME and play against the trained agent.

Game rules

This chapter explains the rules of engagement.

  • You always play first (as 'X') when a new game is started.
  • When the app starts, the opposing agent IS NOT trained. You need to load a trained agent from a file to play against it.
  • EACH game, you and the agent switch signs.
  • Whoever plays as 'X' plays first.
  • To win a single game, you need to get three of your signs in a line (row, column, or diagonal).
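The win condition above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the board representation (a list of 9 cells holding 'X', 'O', or None) and the names `WIN_LINES`, `winner`, and `is_draw` are assumptions made for the example.

```python
# Hypothetical board layout: indices 0-8, read left to right, top to bottom.
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def winner(board):
    """Return 'X' or 'O' if that sign fills a whole line, otherwise None."""
    for a, b, c in WIN_LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None


def is_draw(board):
    """A draw is a full board on which nobody has three in a line."""
    return winner(board) is None and all(cell is not None for cell in board)
```

For example, `winner(['X', 'X', 'X', None, 'O', 'O', None, None, None])` returns `'X'` because the top row is filled with one sign.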

Scoring

  • The app keeps score across games.
  • You gain 1 point if you win; the agent gains 1 point if it wins.
  • Nobody gets a point when a game ends in a draw.
  • Score is kept only while the app runs. Scores ARE NOT saved.
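The scoring rules, together with the sign switch after each game, could be modelled roughly as follows. The `Scoreboard` class and its attribute names are hypothetical and only illustrate the rules listed above; the app itself may track this differently.

```python
class Scoreboard:
    """In-memory score keeping (hypothetical helper, nothing is written to disk)."""

    def __init__(self):
        self.points = {"human": 0, "agent": 0}
        # The human always starts the first game as 'X'.
        self.signs = {"human": "X", "agent": "O"}

    def record(self, winning_sign):
        """Register a finished game; winning_sign is 'X', 'O', or None for a draw."""
        for player, sign in self.signs.items():
            if sign == winning_sign:
                self.points[player] += 1
        # Players switch signs after every game, whatever the result.
        self.signs["human"], self.signs["agent"] = (
            self.signs["agent"],
            self.signs["human"],
        )
```

Because the scores live only in the object, they disappear when the program exits, which matches the rule that scores are not saved.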

Agent training

Agents are trained in the Agent training window, which is accessed via the TRAIN AGENT button.

There are 2 hyperparameters you can tune for agent training:

  • Learning strategy: Q-Learning or random.
  • Number of training iterations.

Learning strategies

The main learning strategy is, of course, the now-famous Q-Learning. When this strategy is selected, the agent plays against itself for the given number of iterations. While training, it fills its Q-table with a score value for each move. This table is later used to determine the best move to play against you.

While training, one instance of the Q-Learning agent always plays as 'O' and the other as 'X'. In other words, when training is done there are 2 agents, each of which has learned to play as a different sign. This way, we can save both of their Q-tables as parts of a SINGLE agent that knows how to play both as 'O' and as 'X'.

The second possible strategy has the agent fill its Q-table by just making random moves and trying to learn that way.
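To make the idea of "filling a Q-table" concrete, here is a minimal sketch of the standard tabular Q-learning update together with an epsilon-greedy move picker. This README does not state which update rule, exploration scheme, or parameter values the project uses, so `alpha`, `gamma`, `epsilon`, the `(state, move)` key layout, and both function names are assumptions.

```python
import random
from collections import defaultdict


def choose_move(q_table, state, legal_moves, epsilon=0.1, rng=random):
    """Epsilon-greedy choice: explore with probability epsilon, else pick the best-scored move."""
    if rng.random() < epsilon:
        return rng.choice(legal_moves)
    return max(legal_moves, key=lambda m: q_table[(state, m)])


def q_update(q_table, state, move, reward, next_state, next_moves,
             alpha=0.5, gamma=0.9):
    """Standard Q-learning step: Q(s,a) += alpha * (r + gamma * max Q(s',a') - Q(s,a))."""
    # With no moves left (terminal state) the future value is taken as 0.
    best_next = max((q_table[(next_state, m)] for m in next_moves), default=0.0)
    old = q_table[(state, move)]
    q_table[(state, move)] = old + alpha * (reward + gamma * best_next - old)


# Unseen (state, move) pairs default to a score of 0.0.
q_table = defaultdict(float)
```

Calling `q_update(q_table, "s", 4, 1.0, "t", [])` on a fresh table moves the score of move 4 in state "s" halfway from 0 towards the reward of 1.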

Training iterations

The number of iterations is the number of games the agent plays against itself before playing you. As with other reinforcement learning bots, the meat of its knowledge comes from the large number of games played.

We found that an agent trained with the Q-Learning strategy for 200,000 iterations is quite hard to beat. Feel free to train agents even longer, there is no limit ;)
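A self-play loop driven by the iteration count might look like the sketch below, where each iteration is one full game and each sign keeps its own Q-table. Everything here is an assumption made for illustration: the reward values (+1 win, -1 loss, 0 draw), the hyperparameters, the string board encoding, and the `train` function itself are not taken from the project's code.

```python
import random
from collections import defaultdict

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
             (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return the sign that fills a line, or None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def train(num_iterations, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Play num_iterations self-play games; return one Q-table per sign."""
    rng = random.Random(seed)
    q = {"X": defaultdict(float), "O": defaultdict(float)}
    for _ in range(num_iterations):
        board = [" "] * 9
        last = {"X": None, "O": None}  # each sign's latest (state, move)
        sign = "X"                     # 'X' always moves first
        while True:
            state = "".join(board)
            moves = [i for i, cell in enumerate(board) if cell == " "]
            if last[sign] is not None:
                # The game went on: back up the best value now available.
                best = max(q[sign][(state, m)] for m in moves)
                old = q[sign][last[sign]]
                q[sign][last[sign]] = old + alpha * (gamma * best - old)
            if rng.random() < epsilon:
                move = rng.choice(moves)
            else:
                move = max(moves, key=lambda m: q[sign][(state, m)])
            board[move] = sign
            last[sign] = (state, move)
            other = "O" if sign == "X" else "X"
            won = winner(board) == sign
            if won or " " not in board:
                # Terminal state: reward both signs for their final move.
                rewards = {sign: 1.0 if won else 0.0,
                           other: -1.0 if won else 0.0}
                for s in ("X", "O"):
                    if last[s] is not None:
                        old = q[s][last[s]]
                        q[s][last[s]] = old + alpha * (rewards[s] - old)
                break
            sign = other
    return q
```

Under these assumptions, `train(20000)` returns a dictionary with an 'X' table and an 'O' table. Setting `epsilon=1.0` makes every move random, which is one possible reading of the random strategy.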

Saving agents

  • You can save the agent's save folder anywhere on your disk.

  • As mentioned, saving a trained agent really just saves its Q-table values for 'X' and 'O' moves.

  • After saving, if you look in the save directory you will see 2 files, one starting with trained_O_<name of save folder> and the other with trained_X_<name of save folder>. These contain the Q-tables.

  • Files are saved as pickle files (.pkl).

  • If you don't provide a name for your save folder, the app gives it the DEFAULT name: a directory named save_<datetime>.
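The save layout described above can be reproduced with the standard `pickle` and `pathlib` modules. This is only a sketch: the function name `save_agent`, the `q_tables` dictionary keyed by sign, and the exact timestamp format of the default folder name are assumptions, since the README only says the default is save_<datetime>.

```python
import pickle
from datetime import datetime
from pathlib import Path


def save_agent(q_tables, parent_dir, name=None):
    """Write one pickle file per sign into a save folder and return that folder."""
    if not name:
        # Assumed timestamp format; the app's real format is not documented here.
        name = "save_" + datetime.now().strftime("%Y%m%d_%H%M%S")
    folder = Path(parent_dir) / name
    folder.mkdir(parents=True, exist_ok=True)
    for sign in ("X", "O"):
        with open(folder / f"trained_{sign}_{name}.pkl", "wb") as f:
            # Convert to a plain dict so the file does not depend on the table's type.
            pickle.dump(dict(q_tables[sign]), f)
    return folder
```

With `name="Another Bot"`, this produces a folder called 'Another Bot' containing `trained_X_Another Bot.pkl` and `trained_O_Another Bot.pkl`.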

Loading agents

  • When loading an agent, you need to select the save FOLDER containing the saved agent files, not the save files themselves.
  • After loading, you need to press the RESET GAME button to continue playing against the loaded agent.
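Loading works on the folder rather than on individual files, which a sketch like the following makes explicit. The function name `load_agent` and the file lookup by `trained_<sign>_*.pkl` pattern are assumptions based on the file names described in the saving section, not the project's actual loader.

```python
import pickle
from pathlib import Path


def load_agent(folder):
    """Read both Q-tables from a save folder; return {'X': table, 'O': table}."""
    folder = Path(folder)
    q_tables = {}
    for sign in ("X", "O"):
        matches = sorted(folder.glob(f"trained_{sign}_*.pkl"))
        if len(matches) != 1:
            raise FileNotFoundError(
                f"expected exactly one trained_{sign}_*.pkl file in {folder}"
            )
        with open(matches[0], "rb") as f:
            q_tables[sign] = pickle.load(f)
    return q_tables
```

Note that unpickling can execute arbitrary code, so only load save folders that come from a source you trust.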

🔗 Credits

This project was greatly inspired by the following sources:

[1] How to use reinforcement learning to play tic-tac-toe by Rickard Karlsson

[2] Reinforcement Learning Tic Tac Toe Python Implementation by Marius Borcan

Thank you for doing great work!