
train/enjoy_husky_gibson_flagrun.py issues! #104

Open
Berk035 opened this issue Dec 10, 2019 · 4 comments

Berk035 commented Dec 10, 2019

Hello everyone,

I have been studying the husky flagrun training for a while and have run into some problems. Despite trying everything, the agent is not able to learn how to go to the cube (the target).

  • First of all, I couldn't understand the reward function, which contains only alive_score, progress, and obstacle_dist. There is no close_to_target term that drives the agent toward the target.

  • Second, the target location does not change in any file. There are only a couple of lines in _flagreposition, such as self.walk_to_target = ballxyz, and they do not seem to contribute to the reward function or the learning process.

  • Finally, there is a sentence in the paper: "We trained a perceptual and non-perceptual husky agent according to the setting in Sec. 4.1 with PPO [78] for 150 episodes (300 iterations, 150k frames)." Is the correct calculation 150k frames / 300 iterations = 500 timesteps × batch (see the quick check below)? That product of timesteps and batch size seems too low.
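
For reference, this is the arithmetic I am assuming (just my own reading of the numbers quoted above):

```python
frames = 150_000      # "150k frames" from the paper
iterations = 300      # "300 iterations"
episodes = 150        # "150 episodes"

print(frames / iterations)  # 500.0 frames collected per iteration
print(frames / episodes)    # 1000.0 frames per episode
```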

I would be grateful for any answers to these questions. Thanks.

fxia22 (Collaborator) commented Dec 10, 2019

  • Reward: alive_score is a reward term that keeps the agent from tipping over; progress is the difference of the potential function between two consecutive timesteps (a dense reward); the obstacle distance term penalizes getting too close to an obstacle. A rough sketch of this composition is given after this list.

  • The target location is changed in _flag_reposition(): a random force is applied to the red cube, throwing it somewhere in the room, and this is how the target location changes (see the second sketch after this list).

  • The policy is able to converge with a small number of environment steps because it receives ground truth localization, i.e. the agent knows where the target is and only needs to perform local planning/obstacle avoidance.
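
To illustrate the reward composition, here is a minimal sketch (helper names and constants such as alive_bonus, obstacle_margin, and obstacle_penalty are placeholders, not the exact implementation in the repo):

```python
import numpy as np

def potential(robot_xyz, target_xyz):
    # Potential = negative distance to the target, so moving closer increases it.
    return -np.linalg.norm(np.asarray(target_xyz) - np.asarray(robot_xyz))

def step_reward(robot_xyz, target_xyz, prev_potential,
                tipped_over, obstacle_dist,
                alive_bonus=1.0, obstacle_margin=0.3, obstacle_penalty=-1.0):
    # alive_score: constant bonus while upright, large penalty when tipped over
    alive = -10.0 if tipped_over else alive_bonus

    # progress: difference of the potential between two consecutive timesteps (dense reward)
    curr_potential = potential(robot_xyz, target_xyz)
    progress = curr_potential - prev_potential

    # obstacle_dist: penalize being closer than a margin to the nearest obstacle
    obstacle = obstacle_penalty if obstacle_dist < obstacle_margin else 0.0

    return alive + progress + obstacle, curr_potential
```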
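
And a rough sketch of the repositioning idea with pybullet (again, flag_body_id and force_scale are placeholders for illustration, not the real variables in the environment):

```python
import numpy as np
import pybullet as p

def flag_reposition(flag_body_id, force_scale=500.0):
    # Pick a random planar direction and add an upward component so the cube is "thrown".
    fx, fy = np.random.uniform(-1.0, 1.0, size=2)
    force = [force_scale * fx, force_scale * fy, 0.3 * force_scale]

    pos, _ = p.getBasePositionAndOrientation(flag_body_id)
    p.applyExternalForce(flag_body_id, -1, force, pos, p.WORLD_FRAME)

    # Once the physics settles over the next few simulation steps, the cube's new
    # position is read back and used as the walk-to target for the reward's potential.
```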

fxia22 (Collaborator) commented Dec 10, 2019

Can you plot your reward curve during your training process? This would be insightful! Thanks.
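
Something simple like the following would already help (a minimal sketch, assuming you log one cumulative reward per episode; the file name and format are placeholders for whatever your training script records):

```python
import matplotlib.pyplot as plt
import numpy as np

# Assumed logging format: one cumulative reward value per episode, one per line.
episode_rewards = np.loadtxt("episode_rewards.csv")

plt.plot(episode_rewards, alpha=0.3, label="per-episode reward")

# A running mean makes the trend easier to see.
window = 10
smoothed = np.convolve(episode_rewards, np.ones(window) / window, mode="valid")
plt.plot(range(window - 1, len(episode_rewards)), smoothed, label=f"{window}-episode mean")

plt.xlabel("episode")
plt.ylabel("reward")
plt.legend()
plt.show()
```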

Berk035 (Author) commented Dec 10, 2019

> Can you plot your reward curve during your training process? This would be insightful! Thanks.

Thank you for your quick response, Fei. You are awesome :)
I understand the rewards, but according to the enjoy results the agent still couldn't reach the target. I also tried training after adding self.robot.set_target_position(ball_xyz). Anyway, I will plot my results in a few minutes. Thank you.

Berk035 (Author) commented Dec 10, 2019

[Figure_1: reward curve from training]

Timesteps: 600, Episode: 20, Iterations: 250
