A neural network-based Blackjack AI that learns optimal playing strategies through deep Q-learning. The project includes both training capabilities and interactive GUI interfaces for playing against or getting advice from the trained agent.
- Deep Q-Learning implementation for Blackjack strategy learning
- Interactive Blackjack GUI for playing games
- Decision Helper GUI for getting real-time advice
- Detailed training metrics and logging
- Basic and advanced reward shaping based on optimal Blackjack strategy
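As a rough illustration of reward shaping based on optimal strategy, the sketch below adds a small bonus when the chosen action agrees with a simplified basic-strategy table and a small penalty otherwise. The function names, the `bonus` value, and the restriction to hard totals with only hit/stand are assumptions made for this example, not the project's actual reward function.

```python
def basic_strategy_action(player_total: int, dealer_card: int) -> str:
    """Simplified basic strategy for hard totals, hit/stand only.

    dealer_card is the dealer's visible card value (2-10, with 11 for an ace).
    Doubling, splitting and soft hands are deliberately left out.
    """
    if player_total >= 17:
        return "stand"
    if player_total <= 11:
        return "hit"
    if player_total == 12:
        return "stand" if 4 <= dealer_card <= 6 else "hit"
    # Hard 13-16: stand against a weak dealer card, otherwise hit.
    return "stand" if 2 <= dealer_card <= 6 else "hit"


def shaped_reward(game_reward: float, action: str, player_total: int,
                  dealer_card: int, bonus: float = 0.1) -> float:
    """Nudge the game reward toward the strategy table (hypothetical shaping)."""
    if action == basic_strategy_action(player_total, dealer_card):
        return game_reward + bonus
    return game_reward - bonus
```

Keeping `bonus` small relative to the win/loss reward means the game outcome still dominates what the agent learns.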
- main.py: Core Blackjack game environment and logic
- blackjack_agent.py: Deep Q-Learning agent implementation
- train_agent.py: Training script with metrics logging
- blackjack_gui.py: Interactive Blackjack game interface
- decision_helper_gui.py: Real-time strategy advice interface
- Clone the repository:
git clone https://github.com/jonathanung/blackjackNN.git
cd blackjackNN
- Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate
- Install required packages:
pip install -r requirements.txt
python ./agent/train_agent.py [num_episodes]
This will:
- Train the agent for the specified number of episodes (default is 5000)
- Generate training metrics plots
- Save the trained model as 'blackjack_agent.pkl'
- Create detailed training logs
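The optional `[num_episodes]` argument could be handled along these lines. This is only a sketch: `parse_num_episodes` and `DEFAULT_EPISODES` are hypothetical names, and the real script may parse its arguments differently.

```python
DEFAULT_EPISODES = 5000  # default stated in this README


def parse_num_episodes(argv):
    """Return the episode count from argv[1], or the default when it is absent."""
    if len(argv) < 2:
        return DEFAULT_EPISODES
    episodes = int(argv[1])  # raises ValueError on non-numeric input
    if episodes <= 0:
        raise ValueError("num_episodes must be a positive integer")
    return episodes
```

A script would typically call this as `parse_num_episodes(sys.argv)` after `import sys`.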
To train the agent on a GPU (CUDA) or on Apple's MPS backend, uncomment the relevant lines in the __init__ method of blackjack_agent.py.
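For reference, device selection in PyTorch is commonly written as below. This is a generic sketch, not the code in blackjack_agent.py; `pick_device` is a hypothetical helper, and the guarded import is only there so the snippet also runs where PyTorch is not installed.

```python
try:
    import torch
except ImportError:  # allow the sketch to run without PyTorch
    torch = None


def pick_device() -> str:
    """Prefer CUDA, then Apple's MPS backend, then fall back to the CPU."""
    if torch is None:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # torch.backends.mps is missing on older PyTorch releases.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"
```

The returned string can be passed to `torch.device(...)`, and the network and its input tensors then moved to that device with `.to(device)`.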
python ./gui/blackjack_gui.py
Features:
- Play Blackjack with a graphical interface
- Get real-time suggestions from the trained agent
- View agent's confidence in its suggestions
- Color-coded advice based on confidence levels
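Color-coding by confidence can be as simple as a threshold lookup. The thresholds and color names below are assumptions chosen for illustration; the GUI may use different cut-offs.

```python
def advice_color(confidence: float) -> str:
    """Map a confidence in [0, 1] to a display color (hypothetical thresholds)."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence >= 0.75:
        return "green"   # strong recommendation
    if confidence >= 0.55:
        return "yellow"  # mild preference
    return "red"         # close to a coin flip
```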
python ./gui/decision_helper_gui.py
This will open a GUI interface for getting real-time strategy advice from the trained agent.
Input your:
- Current hand value
- Number of cards
- Dealer's visible card
Get:
- Recommended action
- Confidence levels
- Q-values for all possible actions
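One common way to turn Q-values into a recommended action plus confidence levels is a softmax over the Q-values, sketched below. Whether the helper GUI derives confidence this way is an assumption, and the action names `"hit"` and `"stand"` in the usage line are illustrative.

```python
import math


def recommend(q_values: dict) -> tuple:
    """Return (best_action, confidences) from a mapping of action -> Q-value."""
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(q_values.values())
    exps = {action: math.exp(q - top) for action, q in q_values.items()}
    total = sum(exps.values())
    confidences = {action: e / total for action, e in exps.items()}
    best_action = max(q_values, key=q_values.get)
    return best_action, confidences
```

For example, `recommend({"hit": -0.4, "stand": 0.3})` recommends `"stand"` and returns confidences that sum to 1.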
The training process generates:
- Win rate over time
- Average reward progression
- Detailed state-action logs
- Agent's exploration rate (epsilon) decay
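Curves such as win rate over time and average reward progression are usually smoothed with a trailing window. The sketch below shows one way to compute them from per-episode rewards; the window size and the convention that a positive reward counts as a win are assumptions.

```python
def rolling_mean(values, window=100):
    """Mean of the trailing `window` values at each step (shorter at the start)."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        out.append(running / min(i + 1, window))
    return out


def win_rate_curve(rewards, window=100):
    """Rolling fraction of episodes with a positive reward."""
    wins = [1 if r > 0 else 0 for r in rewards]
    return rolling_mean(wins, window)
```

Calling `rolling_mean(rewards)` on the raw rewards gives the average reward progression, and `win_rate_curve(rewards)` gives the win rate.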
- Built using PyTorch for deep learning
- Implements standard Blackjack rules, with rewards shaped toward optimal Blackjack strategy
- Uses epsilon-greedy exploration strategy
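Epsilon-greedy exploration with a decaying epsilon can be sketched as follows. The decay factor and minimum epsilon are illustrative assumptions, not the values used by the agent.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a random action index, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])


def decay_epsilon(epsilon, decay=0.995, minimum=0.01):
    """Multiplicative decay with a floor, applied once per episode."""
    return max(minimum, epsilon * decay)
```

Starting with a high epsilon and decaying it lets the agent explore widely early in training and rely on its learned Q-values later.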
- More sophisticated reward shaping
- Train with multiple agents in parallel to simulate multiplayer Blackjack
- Integrate agents into a multiplayer Blackjack game, letting players choose their seat at the table