value_iteration algorithm does not converge? #138

Open
chensisi0730 opened this issue Jun 1, 2023 · 1 comment

@chensisi0730

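The success rate in the value_iteration test is 0.638. Value iteration needs to iterate repeatedly, performing policy evaluation each time, but the code performs only a single iteration.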
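To illustrate the point about repeated sweeps, here is a minimal sketch on a made-up 4-state chain. It is not the repository's code. The transition table `P`, the helper names, and the values of `gamma` and `theta` are all assumptions chosen for illustration. It compares the greedy policy after a single Bellman backup with the one obtained after sweeping until the values stop changing.

```python
def q_values(P, V, s, gamma):
    """Expected return of each action in state s under the estimate V."""
    return [
        sum(p * (r + gamma * (0.0 if done else V[s2])) for p, s2, r, done in outcomes)
        for outcomes in P[s]
    ]


def bellman_sweep(P, V, gamma):
    """One synchronous Bellman optimality backup over all states."""
    return [max(q_values(P, V, s, gamma)) for s in range(len(P))]


def value_iteration(P, gamma=0.9, theta=1e-8, max_sweeps=10_000):
    """Repeat sweeps until the largest value change drops below theta."""
    V = [0.0] * len(P)
    sweep = 0
    for sweep in range(1, max_sweeps + 1):
        new_V = bellman_sweep(P, V, gamma)
        delta = max(abs(a - b) for a, b in zip(new_V, V))
        V = new_V
        if delta < theta:
            break
    return V, sweep


def greedy_policy(P, V, gamma):
    """Pick the first action with the highest Q-value in every state."""
    policy = []
    for s in range(len(P)):
        q = q_values(P, V, s, gamma)
        policy.append(max(range(len(q)), key=q.__getitem__))
    return policy


def build_chain(n=4):
    """Hypothetical toy MDP: states 0..n-1 in a row, the last one is the goal.

    P[s][a] is a list of (probability, next_state, reward, done) tuples.
    Action 0 moves left, action 1 moves right; entering the goal pays 1.
    """
    P = []
    for s in range(n):
        if s == n - 1:  # the goal is absorbing and pays nothing further
            P.append([[(1.0, s, 0.0, True)], [(1.0, s, 0.0, True)]])
            continue
        left, right = max(s - 1, 0), s + 1
        goal = right == n - 1
        P.append([
            [(1.0, left, 0.0, False)],
            [(1.0, right, 1.0 if goal else 0.0, goal)],
        ])
    return P


gamma = 0.9
P = build_chain()

# Policy extracted after a single backup from V = 0.
V_one = bellman_sweep(P, [0.0] * len(P), gamma)
policy_one = greedy_policy(P, V_one, gamma)

# Policy extracted after sweeping to convergence.
V_star, sweeps = value_iteration(P, gamma=gamma)
policy_star = greedy_policy(P, V_star, gamma)

print("one sweep :", V_one, policy_one)
print("converged :", V_star, policy_star, "after", sweeps, "sweeps")
```

In this toy chain a single backup only propagates the goal reward one step back, so states farther from the goal still have a value of zero and their greedy action is decided by tie-breaking rather than by the reward. Whether this explains the 0.638 success rate depends on the actual environment and code, which are not shown in this thread.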

@sherlcok314159

FYI, quoting Reinforcement Learning: An Introduction: "All of these algorithms converge to an optimal policy for discounted finite MDPs." You could try adding a discount factor.
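For context, the value iteration backup with an explicit discount factor can be written as follows, using the book's notation. The symbol $\gamma$ is the discount factor and $p(s', r \mid s, a)$ is the environment's transition model. Neither is taken from this repository's code.

$$
v_{k+1}(s) = \max_a \sum_{s', r} p(s', r \mid s, a)\,\bigl[\, r + \gamma\, v_k(s') \,\bigr]
$$

When $0 \le \gamma < 1$, this backup is a contraction in the max norm, so each sweep shrinks the distance to the optimal value function $v_*$ by at least a factor of $\gamma$:

$$
\lVert v_{k+1} - v_* \rVert_\infty \le \gamma\, \lVert v_k - v_* \rVert_\infty
$$

This is why the guarantee is stated for discounted finite MDPs, and why the backup has to be applied repeatedly rather than once.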

3 participants