A reinforcement-learning-based app to play the dice game Kniffel (Yahtzee)
The final goal of this project is to build an app that plays the game Kniffel (English: Yahtzee) automatically with super-human performance. More about the game can be found here. The project contains two main parts: the training of a Kniffel agent using reinforcement learning, and the implementation of an iOS app that plays Kniffel with physical dice based on the trained agent. The training is based on OpenAI Gym and PyTorch. The iOS app will be implemented using SwiftUI.
The plan is to test two different reinforcement learning algorithms. One is the deep Q-network (DQN), which was applied here to the game of Yahtzee. The other is the advantage actor-critic algorithm (A2C), whose performance for the game of Yahtzee was reported here.
- We only optimize the agent for single-player mode. Although multi-player mode is potentially helpful for training, it could make it difficult for the agent to learn the strategy (see discussion here).
- An augmented input feature similar to this paper is used. The 112-dimensional input feature encodes the current round of the game, the current dice roll, the sum of the dice, the availability of score categories, and the current upper-section score (for the bonus); a sketch of the idea follows this list.
- Invalid action masking is used to prevent the agent from choosing invalid actions in the game (e.g. choosing an already used score category); see the sketch after this list.
- Double DQN is used to improve performance (its target computation is also sketched after this list).
- NN structure: 2 linear hidden layers of 128 units with ReLU activation.
- Tricks for convergence:
  - a low target network update frequency (around 2000 steps) to stabilize training
  - linear decay of epsilon, which requires less tuning and works well
  - auxiliary features to help the agent understand the game (e.g. that the order of the dice is irrelevant)
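As a rough illustration of the augmented-feature idea (not the project's actual encoding: the function name `encode_state`, the component sizes, and the normalizations below are hypothetical, and the real 112-dimensional feature is richer):

```python
import numpy as np

def encode_state(round_idx, dice, category_open, upper_score):
    """Hypothetical sketch of an augmented state encoding.

    round_idx:     current round, 0..12 (13 rounds in Kniffel)
    dice:          list of 5 die values in 1..6
    category_open: 13 booleans, True if the category is still available
    upper_score:   current upper-section score (relevant for the 35-point bonus)
    """
    round_onehot = np.eye(13)[round_idx]                      # 13 dims
    # Counting how often each face shows up makes the encoding invariant
    # to the order of the dice (the "auxiliary feature" mentioned above).
    face_counts = np.bincount(dice, minlength=7)[1:] / 5.0    # 6 dims
    dice_sum = np.array([sum(dice) / 30.0])                   # 1 dim
    open_mask = np.asarray(category_open, dtype=np.float32)   # 13 dims
    upper = np.array([min(upper_score, 63) / 63.0])           # 1 dim
    return np.concatenate([round_onehot, face_counts, dice_sum,
                           open_mask, upper]).astype(np.float32)
```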
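A minimal sketch of the network described above, combined with invalid action masking during greedy action selection; the action count `N_ACTIONS` and the helper `greedy_action` are placeholders, not the project's actual code:

```python
import torch
import torch.nn as nn

N_FEATURES, N_ACTIONS = 112, 44   # action count is a placeholder

q_net = nn.Sequential(            # 2 hidden layers of 128 units, ReLU
    nn.Linear(N_FEATURES, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, N_ACTIONS),
)

def greedy_action(state, valid_mask):
    """Pick the best *valid* action.

    valid_mask: bool tensor of shape (N_ACTIONS,), False for invalid
    actions (e.g. an already used score category).
    """
    with torch.no_grad():
        q = q_net(state.unsqueeze(0)).squeeze(0)
    q[~valid_mask] = float("-inf")   # masked actions can never be the argmax
    return int(q.argmax())
```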
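A sketch of the double-DQN target together with the two training tricks (infrequent target network syncs and linear epsilon decay); all hyperparameter values are illustrative, and masking of invalid next-state actions is omitted for brevity:

```python
import copy
import torch

gamma = 0.99
target_net = copy.deepcopy(q_net)   # synced only every ~2000 steps

def double_dqn_targets(reward, next_state, done):
    """Double DQN: the online net picks the action, the target net rates it."""
    with torch.no_grad():
        next_a = q_net(next_state).argmax(dim=1, keepdim=True)
        next_q = target_net(next_state).gather(1, next_a).squeeze(1)
    return reward + gamma * next_q * (1.0 - done)

def epsilon(step, eps_start=1.0, eps_end=0.05, decay_steps=200_000):
    """Linear epsilon decay; needs less tuning than exponential schedules."""
    frac = min(step / decay_steps, 1.0)
    return eps_start + frac * (eps_end - eps_start)

# every ~2000 environment steps:
# target_net.load_state_dict(q_net.state_dict())
```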
The Gym environment is mostly taken from this repository, with some modifications to the joker rules and some debugging. All experimental results are available on wandb. For simplicity, all training is done on a MacBook Pro with an M1 Pro chip, using the MPS backend of PyTorch; a typical device-selection pattern is shown below.
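Selecting the MPS device in PyTorch usually looks like this, falling back to the CPU when MPS is unavailable:

```python
import torch

# Use Apple's Metal Performance Shaders backend if the build supports it.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
model = q_net.to(device)
```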
| Agent | Average score over 3000 games |
|---|---|
| random | 46.5 |
| greedy | TODO |
| DQN | 106.5 |
| DQN in paper | 77.8 |
| A2C | TODO |
| A2C in article | 239.7 |
- Detection and recognition of the dice roll, as in this example, using the Vision framework
- Correct feature conversion and forward pass of the PyTorch model
- Understandable instructions for the user to roll the dice based on the agent's decision
- Recognizing and handling situations where the instructions are not followed
- Automatic calculation of the score