Chapter7 Discussion #80

Open
PaParaZz1 opened this issue May 31, 2023 · 2 comments
Labels
discussion Topic discussion

Comments

@PaParaZz1
Member

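This issue tracks and records questions and further discussion about Lecture 7 of the course. Anyone interested is welcome to comment here; the course team will periodically collate the information.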

@PaParaZz1 PaParaZz1 added the discussion Topic discussion label May 31, 2023
@PaParaZz1 PaParaZz1 pinned this issue May 31, 2023
@xianglunkai

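@PaParaZz1
Hello! Thank you very much for sharing.
I recently ran into a problem: when the action space is continuous but has a gap (for example, the action space is [-1, 1] and the sub-interval [-0.3, 0.6] is not allowed), standard algorithms such as DDPG, SAC, and PPO all seem unable to cope. At the moment I restrict the agent's action mapping by setting is_done=true as a large penalty whenever a forbidden action is taken.
I would greatly appreciate your advice. Thank you!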

@zjowowen

My suggestion is to use an env wrapper that clips the action into a valid interval before it is passed to the env's step method.

Here is an example:

example.txt
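
The attachment's contents are not reproduced in this thread. As a rough illustration of the wrapper idea only, here is a minimal sketch that assumes a scalar action, the [-1, 1] range with the forbidden gap from the question above, and an env object that exposes `step` and `reset`. The names `project_action`, `ForbiddenGapActionWrapper`, and the `margin` parameter are hypothetical and do not come from the attachment or from any library. An action that falls inside the gap is snapped to the nearest edge; a positive `margin` keeps it strictly outside if the endpoints themselves are not allowed.

```python
def project_action(action, low=-1.0, high=1.0,
                   gap_low=-0.3, gap_high=0.6, margin=0.0):
    """Map a raw scalar action to the nearest value outside the forbidden gap.

    All bounds are assumptions taken from the example in the question.
    """
    # First clip to the outer bounds of the action space.
    a = min(max(float(action), low), high)
    # Edges of the forbidden region, optionally widened by `margin`.
    lo_edge = gap_low - margin
    hi_edge = gap_high + margin
    # If the action lies strictly inside the gap, snap to the closer edge
    # (ties go to the lower edge).
    if lo_edge < a < hi_edge:
        a = lo_edge if (a - lo_edge) <= (hi_edge - a) else hi_edge
    return a


class ForbiddenGapActionWrapper:
    """Hypothetical duck-typed wrapper: projects the action, then calls env.step."""

    def __init__(self, env, **projection_kwargs):
        self.env = env
        self.projection_kwargs = projection_kwargs

    def reset(self, *args, **kwargs):
        return self.env.reset(*args, **kwargs)

    def step(self, action):
        valid_action = project_action(action, **self.projection_kwargs)
        return self.env.step(valid_action)
```

With a wrapper like this the env never receives a forbidden action, so the is_done penalty is no longer needed to enforce the constraint. One caveat: the policy does not see the projection, so every raw action inside the gap produces the same executed action as one of the two edges.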
