Add prioritized experience replay #1

Open
r7vme opened this issue Oct 18, 2018 · 0 comments

Comments


r7vme commented Oct 18, 2018

Right now the policy-learning process is erratic and does not improve reliably over time (e.g. a good policy can be learned in 3 episodes, but after 10 episodes the policy can degrade completely).

After spending time tuning the buffer size and the number of optimization steps, I see that the outcome is fairly random. I expect that a prioritized experience replay buffer for DDPG will help improve the situation.

In short, it will make sure:

  • to sample unseen observations (by assigning them the maximum priority, effectively "infinite" relative to the rest)
  • to sample "valuable" observations (by computing priority from the TD error)
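As a rough illustration of the two points above, here is a minimal sketch of a proportional prioritized replay buffer in plain Python (not the code in this repo; class and method names are made up for the example). New transitions get the current maximum priority so they are sampled at least once, and after each learning step the priorities of the sampled transitions are updated from their absolute TD errors:

```python
import random

class PrioritizedReplayBuffer:
    """Sketch of proportional prioritized experience replay."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha    # how strongly priorities skew sampling (0 = uniform)
        self.eps = eps        # keeps priorities strictly positive
        self.buffer = []
        self.priorities = []
        self.pos = 0          # next write position once the buffer is full

    def add(self, transition):
        # Unseen transitions get the current max priority ("infinite" in spirit),
        # guaranteeing each one is sampled at least once.
        max_prio = max(self.priorities, default=1.0)
        if len(self.buffer) < self.capacity:
            self.buffer.append(transition)
            self.priorities.append(max_prio)
        else:
            self.buffer[self.pos] = transition
            self.priorities[self.pos] = max_prio
            self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size):
        # Sample indices with probability proportional to priority ** alpha.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        indices = random.choices(range(len(self.buffer)),
                                 weights=probs, k=batch_size)
        batch = [self.buffer[i] for i in indices]
        return indices, batch

    def update_priorities(self, indices, td_errors):
        # Priority = |TD error| + eps, so transitions the critic
        # predicts badly are replayed more often.
        for i, err in zip(indices, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a DDPG training loop this would replace the uniform buffer: after computing the critic's TD errors for a sampled batch, call `update_priorities(indices, td_errors)` before the next sample. A production version would use a sum-tree for O(log n) sampling and importance-sampling weights to correct the bias, as in the original prioritized replay paper.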