
Reinforcement Learning for Robot Obstacle Avoidance #3

Open · Gizmotronn opened this issue Jan 27, 2020 · 9 comments
Assignees: Gizmotronn
Labels: acord (published on github/acord-robotics/stellarios/issues), aws, documentation (improvements or additions to documentation), rl-agent, robomaker

Comments

@Gizmotronn (Contributor)

This issue is for us to document how we will use reinforcement learning to train our rl-agent to avoid obstacles.

A useful link to get started: https://pdfs.semanticscholar.org/0fcd/a4e464c9d55ccd9f8e8e3521c286e4b47933.pdf

Gizmotronn self-assigned this on Jan 27, 2020
Gizmotronn added the acord, aws, documentation, rl-agent, robomaker labels on Jan 27, 2020
@Gizmotronn (Contributor, Author)

SemanticScholar - RL-agent Obstacle Avoidance

Abstract

Reinforcement Learning

  • Reinforcement learning is learning how to map environment situations to actions, with the goal of maximising a reward signal/value
  • It is a computational approach to learning from interaction, and learning from interaction is a foundational idea behind almost all learning methods
  • The agent must learn from its own experience
  • Exploration vs exploitation (a minimal ε-greedy sketch follows this list):
    • To accumulate the best total reward, the agent must take the actions that score highest on the reward function
    • However, to find the best actions in a given situation, the agent needs to try actions it has not selected before
    • The agent has no idea what the reward will be until it takes the action (otherwise it could finish the program perfectly on the first try, every time)
    • The agent therefore has to exploit the best known actions to obtain rewards, while also exploring unknown options (to either increase its reward or get further)
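
As a concrete illustration of that trade-off, here is a minimal ε-greedy action-selection sketch. It is not from the paper; the function and variable names are assumptions for illustration only.

```python
import random

def epsilon_greedy(q_values, epsilon=0.1):
    """Pick an action from estimated action values, exploring with probability epsilon."""
    if random.random() < epsilon:
        # Explore: try an action whose reward we are still uncertain about.
        return random.randrange(len(q_values))
    # Exploit: pick the action with the best known reward estimate.
    return max(range(len(q_values)), key=lambda a: q_values[a])

# Example: with these estimates the agent usually picks action 1,
# but occasionally samples the others to refine its estimates.
q_values = [0.2, 0.5, 0.1]
action = epsilon_greedy(q_values)
```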


@Gizmotronn (Contributor, Author)

These resources may be useful

@Gizmotronn (Contributor, Author)

SemanticScholar - RL-agent Obstacle Avoidance (continued from the notes above)

Experiments

  • The experiment in this paper was not just about obstacle avoidance; it also covered wall following (i.e. the agent attempts to stay as close to the wall as possible and "trace" its path with the wall as its target line)
  • I'm not sure whether our OSR agent needs to start from a specific location, but a wall-following policy is not practical for us: on Mars there are no walls. We could tell the agent something like "stay at the edge of the map unless an obstacle impedes you", but in real life Mars is a sphere, so there are no map edges and that approach would fail. Since the challenge is to get the highest reward score, wall/map-edge following would technically be eligible, but it would not do much good for the actual OSR, which is where our code will run. The article is still interesting to read and has useful information (a sketch of a reward that sidesteps this trap follows this list)
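
Along those lines, here is a rough sketch of a reward function that pays for progress and clearance rather than edge following. This is hypothetical: the callback shape is only in the style of an AWS DeepRacer/RoboMaker reward function, and the `params` keys are assumptions, not a confirmed API.

```python
def reward_function(params):
    """Hypothetical reward: favour progress toward the goal, penalise close obstacles."""
    distance_to_obstacle = params["distance_to_nearest_obstacle"]  # metres (assumed key)
    progress = params["progress"]  # fraction of the route completed (assumed key)

    SAFE_DISTANCE = 0.5  # metres; would need tuning for the OSR's size and speed

    if distance_to_obstacle < SAFE_DISTANCE:
        # Near-zero reward for getting dangerously close to an obstacle.
        return 1e-3
    # Otherwise reward progress toward the goal, not hugging map edges.
    return float(progress)
```

Because nothing here rewards proximity to an edge, an edge-following policy earns no more than any other path with the same progress, so the agent has no incentive to hug boundaries.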

@Gizmotronn (Contributor, Author)

Might also want to have a look at these links:

@Gizmotronn (Contributor, Author)

Arxiv.org - Unmanned Aerial Vehicles

https://arxiv.org/pdf/1811.03307.pdf

Part 1

Introduction

  • To be able to avoid obstacles, UAVs (or rl-agents) need to perceive the distance between themselves and nearby obstacles (see the range-scan sketch below)
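
For instance, a minimal sketch of extracting the nearest-obstacle distance from a 2D range scan (e.g. the `ranges` array of a ROS `sensor_msgs/LaserScan`). The scan values and function name here are made up for illustration.

```python
import math

def min_obstacle_distance(ranges, range_max):
    """Return the closest valid range reading, ignoring inf/NaN and out-of-range values."""
    valid = [r for r in ranges if math.isfinite(r) and 0.0 < r < range_max]
    return min(valid) if valid else float("inf")

# Example with made-up readings: the nearest obstacle is 0.8 m away.
scan = [2.5, 1.9, float("inf"), 0.8, 3.2]
print(min_obstacle_distance(scan, range_max=10.0))  # -> 0.8
```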


@Gizmotronn (Contributor, Author)

Google Research - Comparison of DRL Policies for Moving Obstacle Avoidance

Abstract

  • Deep RL can learn, directly from robotic sensors, to simultaneously predict the motion of objects and choose the corresponding avoidance actions (a toy sketch follows)
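
As a toy illustration of that idea (not the architecture from the paper): stacking a few consecutive sensor frames lets a single network both infer obstacle motion and score avoidance actions. All sizes and names below are assumptions.

```python
import torch
import torch.nn as nn

class AvoidancePolicy(nn.Module):
    """Toy network mapping a short history of range-sensor frames to action scores."""

    def __init__(self, n_readings=64, n_frames=4, n_actions=5):
        super().__init__()
        # Stacked frames carry information about how obstacles are moving,
        # so motion prediction and action selection are learned jointly.
        self.net = nn.Sequential(
            nn.Linear(n_readings * n_frames, 128),
            nn.ReLU(),
            nn.Linear(128, n_actions),  # one score per discrete avoidance action
        )

    def forward(self, stacked_scans):
        # stacked_scans: (batch, n_frames * n_readings) flattened sensor history
        return self.net(stacked_scans)

# Example: scores for 5 candidate actions from one stacked observation.
obs = torch.rand(1, 64 * 4)
print(AvoidancePolicy()(obs).shape)  # torch.Size([1, 5])
```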

@Gizmotronn (Contributor, Author)

More resources:
