train/enjoy_husky_gibson_flagrun.py issues! #104
Comments
Can you plot your reward curve during your training process? This would be insightful! Thanks.
Thank you for your quick response, Fei. You are awesome :)
Hello everyone,
I have been studying the husky flagrun algorithm for a long time and I have run into some problems with it. Despite trying everything, the agent is not able to learn how to go to the cube (target).
First of all, I couldn't understand the reward function, which contains only alive_score, progress, and obstacle_dist. There is no close_to_target term that rewards moving toward the target.
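To make my confusion concrete, here is roughly how I would expect a roboschool-style progress term to be computed. This is only my own sketch, not the actual Gibson code; attribute names such as `body_xyz`, `walk_target_x`, and `scene.dt` are my guesses:

```python
import numpy as np

def calc_progress(self):
    # Sketch of a roboschool-style progress term (assumed names, not the repo's code).
    old_potential = self.potential
    dist_to_target = np.linalg.norm(
        np.array(self.body_xyz[:2])
        - np.array([self.walk_target_x, self.walk_target_y]))
    # Potential = negative distance to the current flag, scaled by 1/dt.
    self.potential = -dist_to_target / self.scene.dt
    # Progress is the change in potential, i.e. how much closer the agent got
    # to the target this step -- an implicit "close to target" signal.
    return self.potential - old_potential
```

If progress is computed this way, it would already encode closeness to the target, but I cannot tell whether that is what the code actually does.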
Second, the target location does not seem to change in any file. There are only two lines in _flagreposition, such as self.walk_to_target = ballxyz, and this does not seem to contribute to the reward function or the learning process.
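For comparison, this is the kind of flag repositioning I expected to see, where moving the flag directly changes what the progress term measures. Again this is just a sketch with assumed names (walk_target_x/y, arena bounds), not the repository's code:

```python
import random

def _flag_reposition(self):
    # Sample a new target position within assumed arena bounds.
    self.walk_target_x = random.uniform(-self.arena_halflen, self.arena_halflen)
    self.walk_target_y = random.uniform(-self.arena_halfwidth, self.arena_halfwidth)
    self.flag_timeout = 600  # steps until the flag is moved again
    # Because the progress/potential term is computed from walk_target_x/y,
    # updating them here is what would feed the new target back into the reward.
```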
The last thing: there is a sentence in the paper: "We trained a perceptual and non-perceptual husky agent according to the setting in Sec. 4.1 with PPO [78] for 150 episodes (300 iterations, 150k frames)." Is the correct calculation 150k frames / 300 iterations = 500 timesteps * batch? The product of timesteps and batch size seems too low.
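For concreteness, this is my own back-of-the-envelope check of those numbers (not taken from the paper):

```python
# Back-of-the-envelope check of the figures quoted from the paper.
total_frames = 150_000
iterations = 300
frames_per_iteration = total_frames // iterations
print(frames_per_iteration)  # 500 -> about 500 timesteps collected per PPO iteration
```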
I would be grateful for answers to these questions. Thanks.