Fix Gym Env and Implement RL Training #203

Open
wants to merge 17 commits into
base: master
@tztsai tztsai commented Oct 23, 2024

  • Fixes gym.py in psl so that BuildingEnv wraps any BuildingEnvelope system. The original version accepted any instance of ODE_NonAutonomous, but its implementation appears to assume that the system is a BuildingEnvelope.
  • Implements two DRL algorithms, ppo.py and sac.py, in the rl folder (largely adapted from https://github.com/vwxyzjn/cleanrl). Both run successfully in the BuildingEnv environment.
  • Implements gym_dpc.py and gym_nssm.py, in which a DPCTrainer or NSSMTrainer accepts a gym environment directly as input and uses a neuromancer.Trainer to train a neural network in that environment.
  • Drafts hybrid_control.py as a first attempt at the technical proposal in README.md, illustrated by diagram.svg. Currently the program trains a DPC policy in a BuildingEnv, inserts the policy model as the actor network of an actor-critic PPO agent, and then continues training in an RL workflow. This hybrid approach may improve the DPC policy by learning from long-term cumulative reward, and may accelerate DRL training by starting from a pre-trained DPC policy model.
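
For readers unfamiliar with the wrapper pattern in the first bullet, here is a minimal, standard-library-only sketch of a gym-style environment around a system model. It is not the BuildingEnv from this PR and does not use psl or gymnasium: ToyThermalSystem, ToyBuildingEnv, rollout, the dynamics, and the reward are all hypothetical and chosen for illustration. Only the reset/step return shapes follow the Gymnasium convention.

```python
import random
from dataclasses import dataclass


@dataclass
class ToyThermalSystem:
    """Hypothetical stand-in for a building model: one zone, one heater."""
    a: float = 0.9       # heat retention per step
    b: float = 0.5       # heating gain per unit of control input
    t_out: float = 10.0  # outdoor temperature (deg C)

    def step(self, temp: float, u: float) -> float:
        # Discrete first-order dynamics: relax toward outdoor temp, add heat.
        return self.a * temp + (1.0 - self.a) * self.t_out + self.b * u


class ToyBuildingEnv:
    """Gym-style wrapper (assumed interface) exposing reset() and step()."""

    def __init__(self, system, setpoint=21.0, horizon=48, seed=0):
        self.system = system
        self.setpoint = setpoint
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.temp = 0.0
        self.t = 0

    def reset(self):
        # Start each episode from a random indoor temperature.
        self.temp = self.rng.uniform(15.0, 25.0)
        self.t = 0
        return [self.temp], {}

    def step(self, action):
        u = min(max(float(action), 0.0), 5.0)  # clip to actuator limits
        self.temp = self.system.step(self.temp, u)
        self.t += 1
        # Reward penalizes deviation from the setpoint and energy use.
        reward = -abs(self.temp - self.setpoint) - 0.01 * u
        terminated = False                  # no failure state in this toy
        truncated = self.t >= self.horizon  # episode ends at the horizon
        return [self.temp], reward, terminated, truncated, {}


def rollout(env, policy):
    """Run one episode and return the cumulative reward."""
    obs, _ = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, terminated, truncated, _ = env.step(policy(obs))
        total += reward
        done = terminated or truncated
    return total
```

Any callable that maps an observation to an action can be passed to rollout, for example `rollout(ToyBuildingEnv(ToyThermalSystem()), lambda obs: 2.0 * (21.0 - obs[0]))` for a simple proportional controller. Because the environment depends on the system only through its step method, a policy trained one way (such as DPC) and one trained another way (such as PPO) can be evaluated on the same interface.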

@drgona drgona self-requested a review October 25, 2024 20:14
@tztsai tztsai changed the title from "Fix Gym Env and Implement PPO RL Training" to "Fix Gym Env and Implement RL Training" Oct 30, 2024
@tztsai tztsai marked this pull request as ready for review October 30, 2024 22:42