This repository contains a Tetris game combined with reinforcement learning. The reinforcement learning is done with stable-baselines3, and some pretrained models are included in the `models` folder.
The Tetris game is custom made and not based on any existing Tetris implementation. It is written in Python and uses Pygame for the graphics. The game is simplified to improve the performance of the reinforcement learning: only 1x1 blocks fall instead of the regular tetromino shapes.
The application consists of two parts:
- The training script for training a model
- The loading script for loading a trained model
To install the necessary packages, run:
pip install -r requirements.txt
To train a model, run the following command:
python train_model.py
The model will train for 100 episodes of 10,000 timesteps each. After each episode, the model is saved to the `models` folder.
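The exact contents of `train_model.py` may differ, but a training loop along these lines is a reasonable mental model. This is only a sketch: the environment class name `TetrisEnv` and its import are assumptions, not names confirmed by this repository.

```python
# Hedged sketch of the training loop described above; TetrisEnv is an
# assumed class name for the custom environment in environment.py.
import time

from stable_baselines3 import PPO

from environment import TetrisEnv  # hypothetical import

EPISODES = 100      # training episodes
TIMESTEPS = 10_000  # timesteps per episode

env = TetrisEnv()
start_time = int(time.time())
model = PPO("MlpPolicy", env, verbose=1, tensorboard_log="logs")

for episode in range(1, EPISODES + 1):
    # Keep one continuous timestep counter across all episodes
    model.learn(total_timesteps=TIMESTEPS, reset_num_timesteps=False)
    # Save after every episode, roughly following the naming scheme described below
    model.save(f"models/PPO{start_time}_PPO{TIMESTEPS * episode}")
```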
To load a model, run the following command:
python load_model.py
The script loads a model from the `models` folder and plays the game for 5 episodes. Model names are defined as follows: `{used_algorithm}{start_training_time_epoch}_{used_algorithm}{number_of_timesteps_trained}.zip`
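A sketch of how such a model could be loaded and evaluated is shown below. The filename is a placeholder, the `TetrisEnv` class name is an assumption, and the classic Gym `reset`/`step` API is assumed.

```python
# Hedged sketch of loading a saved model and letting it play 5 episodes.
from stable_baselines3 import PPO

from environment import TetrisEnv  # hypothetical import

env = TetrisEnv()
# Placeholder filename; use an actual file from the models folder
model = PPO.load("models/PPO1234567890_PPO1000000", env=env)

for episode in range(5):
    obs = env.reset()
    done = False
    while not done:
        action, _ = model.predict(obs)
        obs, reward, done, info = env.step(action)
        env.render()  # assumes the environment implements render()
```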
It is possible to view the training results using TensorBoard. To do this, run the following command:
python -m tensorboard.main --logdir=logs
After TensorBoard has started, open a browser and go to http://localhost:6006/.
It is possible to customize the environment by changing the `environment.py` file. One of the easiest ways to change the environment is to change the reward function, which is defined in `get_reward(...)`.
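As an illustration only, a modified reward function could look like the sketch below. The attributes it references (`lines_cleared`, `column_heights`, `game_over`) are assumptions about the environment's internals, not names taken from `environment.py`.

```python
# Hypothetical reward shaping for get_reward in environment.py.
# All attributes referenced here are assumed, not confirmed, names.
def get_reward(self):
    reward = 10.0 * self.lines_cleared        # reward cleared lines
    reward -= 0.1 * max(self.column_heights)  # discourage tall stacks
    if self.game_over:
        reward -= 100.0                       # heavy penalty for losing
    return reward
```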
It is also possible to change the algorithm used by editing the `train_model.py` file. To do this, replace all instances of `PPO` (on lines 4, 10, 11, 23, 27 and 28) with the algorithm of your choice. The available algorithms are defined in the `stable_baselines3` package.
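For example, a sketch of the swap using `DQN` (whether a particular algorithm suits the environment's observation and action spaces has to be checked separately; the constructor arguments shown are illustrative):

```python
# Sketch of using another stable-baselines3 algorithm (DQN) instead of PPO.
from stable_baselines3 import DQN

from environment import TetrisEnv  # hypothetical import

env = TetrisEnv()
model = DQN("MlpPolicy", env, verbose=1, tensorboard_log="logs")
model.learn(total_timesteps=10_000)
model.save("models/DQN_example")
```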
After the environment has been modified, it is recommended to check whether it still functions as expected. To do this, run the following command:
python check_environment.py
This will run the environment for 50 episodes while executing random actions. The observation, action and reward will be printed to the console.
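A minimal sketch of what such a check could do, assuming the `TetrisEnv` class name and the classic Gym `reset`/`step` API:

```python
# Hedged sketch of an environment sanity check using random actions.
from environment import TetrisEnv  # hypothetical import

env = TetrisEnv()

for episode in range(50):
    obs = env.reset()
    done = False
    while not done:
        action = env.action_space.sample()          # random action
        obs, reward, done, info = env.step(action)
        print(f"obs={obs} action={action} reward={reward}")
```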