This project demonstrates Deep Q-Network (DQN) and Proximal Policy Optimization (PPO) agents for reinforcement learning (RL) experiments in Gymnasium environments. The agents are trained and evaluated under configurable settings, and results are recorded and logged.
- `agents/`: Contains implementations of the DQN and PPO agents.
  - `dqn.py`: DQN agent implementation.
  - `ppo.py`: PPO agent implementation.
- `config/`: Holds configuration classes for agents and experiments.
  - `agents_config.py`: Agent-specific hyperparameters.
  - `experiment_config.py`: Settings for training and evaluation.
- `utils/`: Utility functions.
  - `helpers.py`: Various utility functions.
- `main.py`: The main entry point of the project, containing the training and evaluation pipeline.
- `environment.yml`: Contains all dependencies required for Conda environment setup.
- Clone the repository: `git clone [email protected]:infamous-flu/RL_04.git`
- Create the Conda environment: `conda env create -f environment.yml`
- Activate the environment: `conda activate deep_rl_env` (or replace with the environment name given in the YAML file)
Use the `main.py` script to train and evaluate the agents with customizable command-line arguments:

`python main.py --env_id LunarLander-v2 --agent_type dqn --device cuda --n_timesteps 300000`

This will run the DQN agent on the LunarLander-v2 environment for 300,000 timesteps on a CUDA device, if available.
- Environment ID (`--env_id`): Specify the Gymnasium environment ID.
- Agent type (`--agent_type`): Choose between `dqn` and `ppo` for the type of RL agent.
- Computation device (`--device`): Choose `cuda` or `cpu` depending on available resources.
- Number of training timesteps (`--n_timesteps`): Set the number of timesteps for training.
- Seed (`--seed`): Provide a seed for reproducibility of training results.
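The actual argument handling lives in `main.py`. As a rough, hypothetical sketch (the parser and the `parse_args` helper below are illustrative, not the repository's actual code), these flags could be wired up with `argparse` along these lines:

```python
import argparse

def parse_args():
    # Hypothetical parser mirroring the command-line options described above.
    parser = argparse.ArgumentParser(description="Train and evaluate DQN/PPO agents.")
    parser.add_argument("--env_id", type=str, default="LunarLander-v2",
                        help="Gymnasium environment ID")
    parser.add_argument("--agent_type", type=str, choices=["dqn", "ppo"], default="dqn",
                        help="Type of RL agent to train")
    parser.add_argument("--device", type=str, choices=["cuda", "cpu"], default="cpu",
                        help="Computation device")
    parser.add_argument("--n_timesteps", type=int, default=300_000,
                        help="Number of training timesteps")
    parser.add_argument("--seed", type=int, default=None,
                        help="Random seed for reproducibility")
    return parser.parse_args()

if __name__ == "__main__":
    args = parse_args()
    print(args)  # e.g. Namespace(env_id='LunarLander-v2', agent_type='dqn', ...)
```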
- Update the agent configuration (`DQNConfig` or `PPOConfig`) directly in the respective configuration classes as needed.
- Update the experiment configuration (`TrainingConfig` or `EvaluationConfig`) directly in the respective configuration classes as needed.
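The authoritative fields are defined in `config/agents_config.py` and `config/experiment_config.py`. As a hedged illustration only, a dataclass-style agent configuration might look like the sketch below; the field names and default values here are hypothetical, not the repository's actual settings:

```python
from dataclasses import dataclass

@dataclass
class DQNConfig:
    # Hypothetical hyperparameters; check config/agents_config.py for the real fields.
    learning_rate: float = 1e-4
    gamma: float = 0.99            # discount factor
    buffer_size: int = 100_000     # replay buffer capacity
    batch_size: int = 64
    epsilon_start: float = 1.0     # initial exploration rate
    epsilon_end: float = 0.05      # final exploration rate
```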
- Check the `recordings` folder for training and evaluation videos (if recording is enabled).
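How recording is toggled is governed by the experiment configuration. If you want to reproduce something similar outside this project, Gymnasium's standard `RecordVideo` wrapper can save episode videos to a folder; the `recordings` path and the episode trigger below are illustrative, not necessarily how this project does it:

```python
import gymnasium as gym
from gymnasium.wrappers import RecordVideo

# Wrap the environment so every episode is saved as a video under recordings/.
env = gym.make("LunarLander-v2", render_mode="rgb_array")
env = RecordVideo(env, video_folder="recordings", episode_trigger=lambda ep: True)

obs, info = env.reset(seed=0)
done = False
while not done:
    # Random actions just to generate a recorded episode.
    obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
    done = terminated or truncated
env.close()
```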
- Use TensorBoard for training progress visualization: `tensorboard --logdir=runs`
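The `runs` directory is populated during training. The snippet below is a generic illustration of how scalar metrics end up there via PyTorch's `SummaryWriter`; the run name and metric tag are examples, not necessarily what this project logs:

```python
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir="runs/example_run")
for step in range(100):
    fake_return = step * 0.5  # stand-in for an episode return
    writer.add_scalar("charts/episodic_return", fake_return, step)
writer.close()
```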