This package contains the winning solution for the MyoChallenge die-reorientation task. It is adapted from the EvoTorch starter baseline.
We build upon the EvoTorch baseline, which uses PGPE with the ClipUp optimizer on a 3-layer RNN. Our primary approach to solving this hard-exploration challenge combines potential-function-based reward shaping (see RUDDER, Arjona-Medina et al., for a detailed overview) with task subdivision using adaptive curricula (similar to POET, Wang et al.). We maintain a population of 256 environments, each of which starts at an "easy" difficulty and adapts its difficulty based on the success the agent achieved over the recent past (the last 20 episodes). The environment difficulty is controlled via the goal_rot value, so the difficulty distribution is shaped by the agent's current performance. A sketch of this adaptation loop is shown below.
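As an illustration of the adaptive curriculum, here is a minimal Python sketch of how a single environment in the population might adjust its goal_rot range from the agent's recent success rate. The class name, success thresholds, and step sizes are illustrative assumptions, not the exact values used in training; the actual implementation lives in the training notebook.

```python
from collections import deque


class AdaptiveDifficultyEnv:
    """Sketch of one member of the environment population.

    The difficulty (here, the magnitude of the goal_rot sampling range)
    is adjusted from the agent's success rate over the most recent
    20 episodes. Thresholds and multipliers below are assumptions
    chosen for illustration only.
    """

    def __init__(self, min_goal_rot=0.1, max_goal_rot=3.14, window=20):
        self.goal_rot = min_goal_rot            # start at "easy" difficulty
        self.min_goal_rot = min_goal_rot
        self.max_goal_rot = max_goal_rot
        self.recent_outcomes = deque(maxlen=window)

    def record_episode(self, success: bool):
        """Call once per finished episode with the task outcome."""
        self.recent_outcomes.append(float(success))
        if len(self.recent_outcomes) < self.recent_outcomes.maxlen:
            return  # wait until the window is full before adapting
        success_rate = sum(self.recent_outcomes) / len(self.recent_outcomes)
        if success_rate > 0.8:
            # the agent is doing well: increase difficulty
            self.goal_rot = min(self.goal_rot * 1.2, self.max_goal_rot)
        elif success_rate < 0.2:
            # the agent is struggling: decrease difficulty
            self.goal_rot = max(self.goal_rot * 0.8, self.min_goal_rot)
```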
Our recently published work (presented at the Deep RL Workshop at NeurIPS 2022) shows that minimizing task-irrelevant exploration speeds up learning and improves generalization. Visiting task-irrelevant states forces the policy/value networks to fit irrelevant targets, which consumes capacity and hurts generalization. This was also our primary motivation for entering the challenge: to verify some of our ideas on the challenging continuous-control tasks posed here.
Essentially, using a reward defined as the difference of a potential function avoids spurious optima (e.g., the agent terminating the episode early, i.e. "committing suicide") and yields a much easier optimization target, since a positive reward is obtainable at every state. Further, the curriculum minimizes task-irrelevant exploration, speeding up learning and allowing the trained policy to generalize much better to downstream tasks in the curriculum. A minimal sketch of this shaping scheme follows.
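The shaping term takes the standard potential-based form r' = r + γ Φ(s') − Φ(s). Below is a minimal Python sketch under the assumption that the observation exposes the die's current and goal orientations; the observation keys and the specific potential are hypothetical stand-ins for what the notebook implements.

```python
import numpy as np


def potential(obs):
    """Illustrative potential: negative distance between the die's current
    orientation and the goal orientation. The keys "die_rot" and "goal_rot"
    are assumed observation fields, not the exact names in the environment."""
    return -np.linalg.norm(obs["die_rot"] - obs["goal_rot"])


def shaped_reward(base_reward, obs, next_obs, gamma=0.99):
    """Potential-based shaping (Ng et al., 1999): adding the discounted
    difference of a potential preserves the optimal policy while giving
    a dense learning signal at every step."""
    return base_reward + gamma * potential(next_obs) - potential(obs)
```

Because the shaped reward only adds a telescoping potential difference, it changes how quickly the agent learns but not which policies are optimal, which is what lets us shape aggressively without introducing new spurious optima.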
Special thanks to Prof. Sepp Hochreiter & Dr. Michael Kopp for their guidance and support.
- Create the Conda environment:
  conda env create -f env.yml
- Activate the environment and start JupyterLab:
  conda activate iarai-jku-myochallenge
  jupyter lab
- Open and run train_die_reorient.ipynb. Further instructions and explanations can be found there.