greedy-tictactoe-agent

This is an experimental project on reinforcement learning. It is still in progress. Feel free to reuse my code or send me suggestions!

TODOs

The following changes are still needed to make it work properly:

Start training with epsilon = 1 and decrease it slowly until it reaches 0.
Create two DNNs with the same structure, one as the "target DNN" and one as the "current DNN", and use the target DNN to calculate target_y. Example: target_y = r + Qs_of_Target_DNN. Then train the current DNN toward target_y, using the difference between target_y and its own prediction as the error.
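
The two TODOs above could be sketched roughly as follows. This is only an illustration, not the code in agent.py: the function names (epsilon_schedule, choose_action, td_target), the linear decay, the discount factor gamma, and the representation of Q-values as a plain list indexed by board cell are all assumptions. With gamma = 1.0 the target reduces to the formula from the TODO, target_y = r + Qs_of_Target_DNN, taking the best Q-value of the target DNN for the next state.

```python
import random


def epsilon_schedule(episode, total_episodes, start=1.0, end=0.0):
    """Linearly decay epsilon from `start` to `end` over `total_episodes` (assumed schedule)."""
    if total_episodes <= 1:
        return end
    fraction = min(episode / (total_episodes - 1), 1.0)
    return start + fraction * (end - start)


def choose_action(q_values, legal_actions, epsilon, rng=random):
    """Epsilon-greedy: a random legal move with probability epsilon, else the best-rated one."""
    if rng.random() < epsilon:
        return rng.choice(legal_actions)
    return max(legal_actions, key=lambda a: q_values[a])


def td_target(reward, next_q_target, next_legal_actions, done, gamma=0.9):
    """target_y = r + gamma * max_a Q_target(s', a); only r once the game has ended.

    `next_q_target` stands for the output of the target DNN for the next state
    (hypothetical: one value per board cell).
    """
    if done or not next_legal_actions:
        return reward
    return reward + gamma * max(next_q_target[a] for a in next_legal_actions)
```

In a training loop, the current DNN would be fitted to td_target(...) for the action taken, and the target DNN's weights would be copied from the current DNN every few episodes; how often to copy is a tuning choice left open here.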
