[WIP] ReinforcementLearning.jl integration #9
base: main
Conversation
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

```diff
@@            Coverage Diff             @@
##             main       #9      +/-   ##
==========================================
- Coverage   92.48%   92.31%   -0.17%
==========================================
  Files          81       81
  Lines        4005     3761     -244
==========================================
- Hits         3704     3472     -232
+ Misses        301      289      -12
```

☔ View full report in Codecov by Sentry.
Force-pushed from 22e4549 to b606aa1
examples/deeprl/cartpole_ppo.jl (outdated)
```julia
actor = Chain(
    Dense(ns, 256, relu; init = glorot_uniform(rng)),
    Dense(256, na; init = glorot_uniform(rng)),
),
```
Note that you are using the discrete version of PPO here, but the cartpole env here seems to be a continuous version (the action space is [-1.0, 1.0]). So you may take reference from https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/blob/935f68b6cb378f9929a8d9914eb388e86213c86d/src/ReinforcementLearningExperiments/deps/experiments/experiments/Policy%20Gradient/JuliaRL_PPO_Pendulum.jl#L43-L50
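For comparison, a continuous-action actor along the lines of the linked Pendulum experiment would look roughly like the sketch below. This is illustrative only: the 64-unit layer widths are assumptions rather than the experiment's exact settings, and `ns`, `na`, and `rng` are taken to be defined as in the surrounding example file.

```julia
using Flux: Chain, Dense, relu, glorot_uniform
using ReinforcementLearningCore: GaussianNetwork

# A GaussianNetwork emits a mean and log-std per action dimension,
# which the continuous PPO variant samples actions from.
# Layer widths are illustrative; ns, na, rng assumed from context.
actor = GaussianNetwork(
    pre = Chain(
        Dense(ns, 64, relu; init = glorot_uniform(rng)),
        Dense(64, 64, relu; init = glorot_uniform(rng)),
    ),
    μ = Chain(Dense(64, na; init = glorot_uniform(rng))),
    logσ = Chain(Dense(64, na; init = glorot_uniform(rng))),
)
```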
Good point! Thanks for checking. Currently, though, I also need to define the reward/cost function for cartpole on the Dojo side.
We should probably rethink the interface to ReinforcementLearning.jl once their updates are done (JuliaReinforcementLearning/ReinforcementLearning.jl#614).
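For concreteness, such a cost might be a simple quadratic, something like the sketch below. The state ordering [x, θ, ẋ, θ̇] and all weights are purely illustrative assumptions, not Dojo's actual definitions.

```julia
using LinearAlgebra: Diagonal

# Hypothetical quadratic cartpole cost: penalize cart offset, pole angle
# away from upright, velocities, and control effort. Weights and the
# assumed state ordering [x, θ, ẋ, θ̇] are illustrative only.
function cartpole_cost(x::AbstractVector, u::AbstractVector)
    Q = Diagonal([1.0, 10.0, 0.1, 0.1])
    R = 0.01
    return x' * Q * x + R * u[1]^2
end
```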
I realized that CommonRLInterface.jl never settled on what to do with continuous action spaces, so I am integrating directly with RLBase from ReinforcementLearning.jl instead. Will add tests and examples with PPO and DDPG.
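For the RLBase route, a minimal sketch of what such an adapter could look like follows. Everything here is an assumption rather than this PR's final interface: the type name, fields, reward term, and the `dojo_step!` stub standing in for the actual Dojo.jl simulation call are all hypothetical.

```julia
import ReinforcementLearningBase as RLBase
using IntervalSets: (..)

# Stand-in for the real Dojo.jl step; replace with the actual simulator call.
dojo_step!(x, u) = (x, false)

# Hypothetical adapter; the real wrapper in this PR may differ.
mutable struct DojoCartpoleEnv <: RLBase.AbstractEnv
    state::Vector{Float64}   # assumed layout: [x, θ, ẋ, θ̇]
    action::Float64          # last applied control, cached for the reward
    done::Bool
end

DojoCartpoleEnv() = DojoCartpoleEnv(zeros(4), 0.0, false)

RLBase.action_space(::DojoCartpoleEnv) = -1.0..1.0  # continuous, per the review above
RLBase.state_space(::DojoCartpoleEnv) = RLBase.Space(fill(-Inf..Inf, 4))
RLBase.state(env::DojoCartpoleEnv) = env.state
RLBase.is_terminated(env::DojoCartpoleEnv) = env.done
RLBase.reward(env::DojoCartpoleEnv) = -(env.state[2]^2 + 0.01 * env.action^2)  # placeholder cost

function RLBase.reset!(env::DojoCartpoleEnv)
    env.state .= 0.0
    env.action = 0.0
    env.done = false
end

# Acting on the env advances the underlying physics one step.
function (env::DojoCartpoleEnv)(a::Real)
    env.action = Float64(a)
    env.state, env.done = dojo_step!(env.state, env.action)
    nothing
end
```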