Qniffel

A reinforcement-learning-based App to play the dice game Kniffel (Yahtzee)

General information

The final goal of this project is to build an App that plays the game Kniffel (English: Yahtzee) automatically with super-human performance. More about the game can be found here. The project consists of two main parts: training a Kniffel agent with reinforcement learning, and implementing an iOS App that plays Kniffel with physical dice based on the trained agent. The training is based on OpenAI Gym and PyTorch. The iOS App will be implemented using SwiftUI.

RL agent training

The plan is to test two different reinforcement learning algorithms. One is the deep Q-network (DQN), which was applied here to the game of Yahtzee. The other is the advantage actor-critic algorithm (A2C), whose performance on Yahtzee was reported here.

Training details

  • We only optimize the agent in single-player mode. Although multi-player mode is potentially helpful for training, it could make it difficult for the agent to learn a strategy (see the discussion here).
  • Augmented input features similar to this paper are used. The 112-dimensional input vector encodes the current round of the game, the current dice roll, the sum of the dice, the availability of the score categories, and the current upper-section score (for the bonus).
  • Invalid action masking is used to prevent the agent from choosing invalid actions in the game (e.g. picking an already used score category).
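Invalid action masking as described above can be sketched as follows; the helper name and the tiny three-action example are illustrative, not taken from the repository:

```python
import torch

def mask_invalid_actions(q_values: torch.Tensor, valid_mask: torch.Tensor) -> torch.Tensor:
    """Set Q-values of invalid actions to -inf so argmax (or a softmax policy)
    can never select them.

    q_values:   (batch, n_actions) network outputs
    valid_mask: (batch, n_actions) boolean, True where the action is legal
    """
    return q_values.masked_fill(~valid_mask, float("-inf"))

# Example: three actions; the middle one (an already used score category)
# is invalid, so the agent picks the best among the remaining two.
q = torch.tensor([[1.0, 5.0, 2.0]])
mask = torch.tensor([[True, False, True]])
best = mask_invalid_actions(q, mask).argmax(dim=1)  # -> tensor([2])
```

Masking at the Q-value level keeps the environment free of invalid-action handling and avoids wasting exploration steps on moves that are never legal.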

DQN model

  • Double DQN is used to improve performance.
  • NN structure: 2 linear hidden layers of 128 units with ReLU activation.
  • Tricks for convergence:
    • a lower target-network update frequency (around 2000 steps) stabilizes training
    • linear decay of epsilon requires less tuning and works well
    • auxiliary features help the agent understand the game (e.g. that the order of the dice is irrelevant)

Setup

The gym environment is mostly taken from this repository, with some modifications to the joker rules and some debugging. All experimental results are available on wandb. For simplicity, all training is done on a MacBook Pro with an M1 Pro chip using the mps backend of PyTorch.
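The device setup can be sketched as follows: use PyTorch's mps backend on Apple Silicon when available, falling back to CPU elsewhere (the layer sizes are illustrative):

```python
import torch

# Prefer the Metal Performance Shaders backend on Apple Silicon,
# fall back to CPU on other machines.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

net = torch.nn.Linear(112, 44).to(device)  # sizes are illustrative
x = torch.randn(1, 112, device=device)
q = net(x)
```

Because the code only ever references `device`, the same training script runs unchanged on machines without an Apple GPU.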

Performance

Agent             Avg. score in 3000 games
random            46.5
greedy            TODO
DQN               106.5
DQN in paper      77.8
A2C               TODO
A2C in article    239.7

iOS App

Core functionality

  • Detection and understanding of dice rolls, as in the example, using the Vision framework
  • Correct feature conversion and forward pass of the PyTorch model
  • Understandable instructions for the user on how to roll the dice based on the agent's decision
  • Recognizing and handling situations where the instruction is not followed
  • Automatic calculation of the score
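The automatic score calculation can be sketched in Python (the App itself is written in Swift); the function below is a hypothetical helper covering only a few representative categories, using the standard Kniffel point values:

```python
from collections import Counter

def score(category: str, dice: list[int]) -> int:
    """Score a roll of five dice for the given category.
    Category names and coverage here are illustrative."""
    counts = Counter(dice)
    if category in {"ones", "twos", "threes", "fours", "fives", "sixes"}:
        face = {"ones": 1, "twos": 2, "threes": 3,
                "fours": 4, "fives": 5, "sixes": 6}[category]
        return face * counts[face]          # upper section: sum of matching dice
    if category == "three_of_a_kind":
        return sum(dice) if max(counts.values()) >= 3 else 0
    if category == "full_house":
        return 25 if sorted(counts.values()) == [2, 3] else 0
    if category == "kniffel":
        return 50 if len(counts) == 1 else 0
    if category == "chance":
        return sum(dice)
    raise ValueError(f"unknown category: {category}")

score("full_house", [2, 2, 5, 5, 5])  # -> 25
```

The same per-category logic would also back the "availability of score categories" feature fed to the agent.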
