Flappy Bird AI

This project implements a reinforcement learning agent that uses Q-learning to play the Flappy Bird game.

Flappy Bird is a popular mobile game that was developed by Dong Nguyen. The game features a bird which the player controls by tapping the screen. Each tap causes the bird to briefly fly upwards, and then the bird starts falling due to gravity. The objective of the game is to navigate the bird through pairs of pipes that are coming from the right side of the screen to the left without touching them. The pipes are positioned at different heights and have a gap in between them. The game is over if the bird touches a pipe or the ground. The player's score is increased by one each time the bird successfully passes through a pair of pipes. The game is known for its high level of difficulty and addictive nature.

🌟 About the Project

📷 Screenshots

[Three screenshots]

🧰 Getting Started

‼️ Prerequisites

  • Install Python (preferably Python 3.8)
  • Install the gym environment
pip install gym
  • Install flappy_bird_gym
pip install flappy_bird_gym

📜 Code Overview

Here's a breakdown of the SmartFlappyBird class and its methods:

__init__(self, iterations): Initializes the agent with an empty Q-table, learning rate (alpha), discount factor (landa), exploration rate (epsilon), and the number of training iterations.

policy(self, state): Returns the action with the highest Q-value for the given state.

get_all_actions(): Returns all possible actions (0 and 1).

convert_continuous_to_discrete(state): Converts the continuous state values to discrete values by rounding them to one decimal place.
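
The rounding step might look like the sketch below. It assumes the state is a sequence of floats and returns a tuple so the result can be used as a dictionary key; the body is an illustration, not the project's actual source.

```python
from typing import Sequence, Tuple


def convert_continuous_to_discrete(state: Sequence[float]) -> Tuple[float, ...]:
    """Round every state component to one decimal place.

    Returning a tuple keeps the result hashable, which is convenient if the
    Q-table is a dictionary (an assumption about how the table is stored).
    """
    return tuple(round(float(x), 1) for x in state)


print(convert_continuous_to_discrete([0.4321, -0.0789]))  # (0.4, -0.1)
```

Rounding to one decimal place keeps the number of distinct states small enough for a tabular method, at the cost of treating nearby positions as identical.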

compute_reward(self, prev_info, new_info, done, observation): Computes the reward based on the game status. If the game is over (done is True), it returns a large negative reward. If the bird is inside the pipe, it increases the score and returns a large positive reward. Otherwise, it returns a small positive reward.
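
A reward function with this shape could be sketched as follows. The reward magnitudes and the use of a "score" entry in the info dictionary to detect a passed pipe are assumptions made for illustration; the observation argument is omitted because the description does not say how it is used.

```python
def compute_reward(prev_info: dict, new_info: dict, done: bool) -> float:
    """Return a reward for one step of the game.

    All three magnitudes below are hypothetical placeholders.
    """
    if done:
        # The bird hit a pipe or the ground: large penalty.
        return -1000.0
    if new_info.get("score", 0) > prev_info.get("score", 0):
        # The score went up, so a pair of pipes was cleared: large bonus.
        return 1000.0
    # The bird is still alive: small bonus for surviving the step.
    return 1.0
```

The small per-step reward gives the agent a learning signal before it clears its first pipe, when score-based rewards alone would be too sparse.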

get_action(self, state): Decides whether to take a random action (exploration) or the action with the highest Q-value (exploitation) based on the epsilon value. In test mode (mode is 1), it always chooses the action with the highest Q-value.
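
This is the usual epsilon-greedy rule. A minimal sketch, assuming the Q-table is a dictionary keyed by (state, action) pairs and that test mode is passed as a flag (both are assumptions, not details taken from the project):

```python
import random
from collections import defaultdict

ACTIONS = (0, 1)  # 0 = do nothing, 1 = flap


def get_action(q_table, state, epsilon, test_mode=False, rng=random):
    """Pick an action with an epsilon-greedy rule.

    With probability epsilon a random action is taken (exploration);
    otherwise the action with the highest Q-value is taken (exploitation).
    In test mode the choice is always greedy.
    """
    if not test_mode and rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_table[(state, a)])


q_table = defaultdict(float)
q_table[("s", 1)] = 2.0
print(get_action(q_table, "s", epsilon=0.1, test_mode=True))  # 1
```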

maxQ(self, state): Returns the maximum Q-value for the given state.

max_arg(self, state): Returns the action with the maximum Q-value for the given state.

update(self, reward, state, action, next_state): Updates the Q-value for the given state-action pair based on the Q-Learning update rule.
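
The standard Q-learning rule moves Q(s, a) toward reward + landa * max over a' of Q(s', a'), scaled by alpha. A minimal sketch, assuming a dictionary Q-table keyed by (state, action); the helper max_q stands in for the maxQ method described above:

```python
from collections import defaultdict

ACTIONS = (0, 1)


def max_q(q_table, state):
    """Highest Q-value available in the given state."""
    return max(q_table[(state, a)] for a in ACTIONS)


def update(q_table, reward, state, action, next_state, alpha, landa):
    """Apply one Q-learning update to the (state, action) entry."""
    td_target = reward + landa * max_q(q_table, next_state)
    td_error = td_target - q_table[(state, action)]
    q_table[(state, action)] += alpha * td_error


q_table = defaultdict(float)
q_table[("s1", 1)] = 10.0
update(q_table, reward=1.0, state="s0", action=0, next_state="s1",
       alpha=0.5, landa=0.9)
print(q_table[("s0", 0)])  # 0.5 * (1 + 0.9 * 10 - 0) = 5.0
```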

update_epsilon_alpha(self): Decreases the alpha and epsilon values over time to reduce the learning rate and the exploration rate.
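
The description does not state the decay schedule, so the sketch below assumes a multiplicative decay with lower bounds. The decay factor and both floors are hypothetical values chosen for illustration.

```python
def update_epsilon_alpha(epsilon, alpha, decay=0.999,
                         min_epsilon=0.01, min_alpha=0.05):
    """Shrink epsilon and alpha by a constant factor, never below a floor.

    decay, min_epsilon and min_alpha are illustrative defaults.
    """
    new_epsilon = max(min_epsilon, epsilon * decay)
    new_alpha = max(min_alpha, alpha * decay)
    return new_epsilon, new_alpha


print(update_epsilon_alpha(0.5, 0.5, decay=0.5))  # (0.25, 0.25)
```

Keeping a floor under both values lets the agent keep exploring occasionally and keep adjusting its estimates late in training.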

run_with_policy(self, landa): Runs the game with the current policy for a given number of iterations. It updates the Q-table after each action and resets the game when it's over.
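
The training loop can be sketched against any environment that exposes the classic gym interface, where reset() returns a state and step(action) returns (state, reward, done, info). ToyEnv below is a hypothetical stand-in so the example runs without the game installed; the hyperparameter defaults are also assumptions.

```python
import random
from collections import defaultdict

ACTIONS = (0, 1)


class ToyEnv:
    """Hypothetical corridor: action 1 advances, action 0 crashes."""

    def __init__(self, length=5):
        self.length = length
        self.pos = 0

    def reset(self):
        self.pos = 0
        return (float(self.pos),)

    def step(self, action):
        if action == 0:
            return (float(self.pos),), -10.0, True, {}
        self.pos += 1
        done = self.pos >= self.length
        return (float(self.pos),), 1.0, done, {}


def run_with_policy(env, iterations, alpha=0.5, landa=0.9, epsilon=0.2, seed=0):
    """Train a tabular Q-learning agent for a fixed number of steps."""
    rng = random.Random(seed)
    q_table = defaultdict(float)
    state = env.reset()
    for _ in range(iterations):
        # Epsilon-greedy action selection.
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q_table[(state, a)])
        next_state, reward, done, _info = env.step(action)
        # Terminal states have no future value to bootstrap from.
        best_next = 0.0 if done else max(q_table[(next_state, a)] for a in ACTIONS)
        q_table[(state, action)] += alpha * (
            reward + landa * best_next - q_table[(state, action)]
        )
        # Start a new episode when the game is over.
        state = env.reset() if done else next_state
    return q_table


q_table = run_with_policy(ToyEnv(), iterations=2000)
print(q_table[((0.0,), 1)] > q_table[((0.0,), 0)])  # True
```

With the real game, the environment would be the one provided by flappy_bird_gym, and the state would pass through the discretization step before being used as a Q-table key.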

run_with_no_policy(self, landa): Runs the game without the policy (always choosing random actions) and prints the scores.

run(self): Runs the game with the policy, switches to test mode, and then runs the game without the policy.

The last lines of the code create an instance of the SmartFlappyBird class and run the game.

⚠️ License

Distributed under the MIT License.

This code is provided by ErfanXH and all rights are reserved.

Project Link: https://github.com/ErfanXH/Flappy-Bird