From 6ebb9207e7a7cf5158fc1f0703acf8563d1aa7c2 Mon Sep 17 00:00:00 2001
From: Ryan Sullivan
Date: Tue, 26 Nov 2024 01:48:26 -0500
Subject: [PATCH] Update docs

---
 docs/curricula/custom_curricula.html | 12 +-
 docs/curricula/implemented_curricula.html | 4 +-
 docs/genindex.html | 432 ++++++++++++++--
 docs/index.html | 6 +-
 docs/logging.html | 2 +-
 docs/modules/modules.html | 17 +-
 docs/modules/syllabus.core.curriculum.html | 8 +-
 docs/modules/syllabus.core.html | 359 ++++++++++++-
 ...llabus.curricula.domain_randomization.html | 6 +-
 docs/modules/syllabus.curricula.html | 476 +++++++++++++++++-
 .../syllabus.curricula.learning_progress.html | 2 +-
 docs/modules/syllabus.curricula.plr.html | 303 ++++++++++-
 docs/modules/syllabus.examples.html | 8 +-
 .../syllabus.examples.task_wrappers.html | 42 +-
 docs/modules/syllabus.html | 146 +++++-
 docs/modules/syllabus.tests.html | 46 +-
 docs/objects.inv | Bin 3839 -> 4492 bytes
 docs/py-modindex.html | 80 ++-
 docs/search.html | 4 +-
 docs/searchindex.js | 2 +-
 .../syllabus.examples.task_wrappers.rst | 17 +-
 21 files changed, 1857 insertions(+), 115 deletions(-)

diff --git a/docs/curricula/custom_curricula.html b/docs/curricula/custom_curricula.html
index fb7dda7..d49b43a 100644
--- a/docs/curricula/custom_curricula.html
+++ b/docs/curricula/custom_curricula.html
@@ -254,7 +254,7 @@

Creating Your Own Curriculum

Your curriculum method is REQUIRED to implement the following methods:

The sample method is how your curriculum decides which task the environments will play. Most methods use some combination of logic and probability distributions to choose tasks, but there are no restrictions on how you choose tasks.

@@ -264,9 +264,9 @@

Curriculum Dependent Methods

-  • update_task_progress(task, progress) - is called either after each step or each episode [1]. It receives a task name and a boolean or float value indicating the current progress on the provided task. Values of True or 1.0 typically indicate a completed task.

-  • update_on_step(obs, rew, term, trunc, info) - is called once for each environment step.

-  • update_on_episode - (Not yet implemented) will be called once for each completed episode by the environment synchronization wrapper.

+  • update_task_progress(task, progress) - is called either after each step or each episode [1]. It receives a task name and a boolean or float value indicating the current progress on the provided task. Values of True or 1.0 typically indicate a completed task.

+  • update_on_step(obs, rew, term, trunc, info) - is called once for each environment step.

+  • update_on_episode - (Not yet implemented) will be called once for each completed episode by the environment synchronization wrapper.

   • update_on_demand(metrics) - is meant to be called by the main learner process to update a curriculum with information from the training process, such as TD errors or gradient norms. It is never used by the individual environments. It receives a dictionary of metrics of arbitrary types.

  • Your curriculum will probably only use one of these methods, so you can choose to only override the one that you need. For example, the Learning Progress Curriculum

@@ -279,7 +279,7 @@

    Recommended Methods
  • _sample_distribution() - Returns a probability distribution over tasks

-  • log_metrics(writer) - Logs curriculum-specific metrics to the provided tensorboard or weights and biases logger.

+  • log_metrics(writer) - Logs curriculum-specific metrics to the provided tensorboard or weights and biases logger.

  • If your curriculum uses a probability distribution to sample tasks, you should implement _sample_distribution(). The default implementation of log_metrics will log the probabilities from _sample_distribution() for each task in a discrete task space to tensorboard or weights and biases. You can also override log_metrics to log other values for your specific curriculum.
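As a rough illustration of the relationship described above, the sketch below shows a `sample` method that draws tasks from whatever `_sample_distribution()` returns. It is a standalone stand-in, not Syllabus code: the class name `SuccessWeightedCurriculum`, the constructor arguments, the moving-average update, and the weighting rule are all assumptions made for the example. Only the method names `sample`, `_sample_distribution`, and `update_task_progress` come from the text.

```python
import random


class SuccessWeightedCurriculum:
    """Hypothetical stand-in for a curriculum; it does not inherit from Syllabus."""

    def __init__(self, num_tasks, seed=None):
        self.num_tasks = num_tasks
        # Running completion-rate estimate per task, starting at 0.
        self.completion = [0.0] * num_tasks
        self.rng = random.Random(seed)

    def update_task_progress(self, task, progress):
        # True / 1.0 counts as a completed task; keep a simple moving average.
        self.completion[task] = 0.9 * self.completion[task] + 0.1 * float(progress)

    def _sample_distribution(self):
        # Assumed rule: favor tasks that are completed less often.
        # The 0.05 floor keeps every task reachable.
        weights = [1.0 - c + 0.05 for c in self.completion]
        total = sum(weights)
        return [w / total for w in weights]

    def sample(self, k=1):
        # Draw k task indices according to the current distribution.
        probs = self._sample_distribution()
        return self.rng.choices(range(self.num_tasks), weights=probs, k=k)
```

Keeping the probabilities in one method means a logger can read the same distribution that `sample` uses, which is the pattern the default `log_metrics` behavior described above relies on.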

    @@ -288,7 +288,7 @@

    Recommended Methods

    You can optionally choose to implement these additional methods:

    update_curriculum_batch and update_on_step_batch can be overridden to provide a more efficient curriculum-specific implementation. The default implementation simply iterates over the updates.
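The difference between the iterating default and a curriculum-specific override might look like the sketch below. Both classes are hypothetical and independent of Syllabus. In particular, the shape of the batch argument (a list of `(obs, rew, term, trunc, info)` tuples) is an assumption for illustration, not the library's documented signature.

```python
class StepCountingCurriculum:
    """Hypothetical stand-in; method names mirror the docs, the rest is assumed."""

    def __init__(self):
        self.total_reward = 0.0
        self.steps = 0

    def update_on_step(self, obs, rew, term, trunc, info):
        self.total_reward += rew
        self.steps += 1

    def update_on_step_batch(self, step_results):
        # Default-style behavior: iterate over the updates one at a time.
        for obs, rew, term, trunc, info in step_results:
            self.update_on_step(obs, rew, term, trunc, info)


class BatchedStepCountingCurriculum(StepCountingCurriculum):
    def update_on_step_batch(self, step_results):
        # Curriculum-specific override: fold in the whole batch at once
        # instead of making one method call per step.
        rewards = [rew for _, rew, _, _, _ in step_results]
        self.total_reward += sum(rewards)
        self.steps += len(rewards)
```

An override like this is only worthwhile when the per-step work can be vectorized or aggregated. It should leave the curriculum in the same state as the iterating version.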

diff --git a/docs/curricula/implemented_curricula.html b/docs/curricula/implemented_curricula.html
index 11f6528..b072e5c 100644
--- a/docs/curricula/implemented_curricula.html
+++ b/docs/curricula/implemented_curricula.html
@@ -252,7 +252,7 @@

    Curriculum Methods

-   Domain Randomization

+   Domain Randomization

    A simple but strong baseline for curriculum learning that uniformly samples a task from the task space.
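Uniform sampling from a discrete task space can be sketched in a few lines. The helper name `sample_uniform` and its parameters are hypothetical and are not part of Syllabus. The sketch also assumes the task space is a finite list.

```python
import random


def sample_uniform(tasks, k=1, rng=random):
    """Return k tasks, each drawn independently and uniformly from `tasks`."""
    # Every task has the same probability, 1 / len(tasks), on every draw.
    return [rng.choice(tasks) for _ in range(k)]
```

Because the sampler ignores all feedback from training, it needs none of the update methods, which is part of what makes it a convenient baseline.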

    @@ -267,7 +267,7 @@

-   Learning Progress

+   Learning Progress

    Uses a heuristic to estimate the learning progress of a task. It maintains a fast and slow exponential moving average (EMA) of the task completion rates for a set of discrete tasks. By measuring the difference between the fast and slow EMAs and reweighting it to adjust for the time delay created by the EMA, this method can

diff --git a/docs/genindex.html b/docs/genindex.html
index 5bf31ce..ad7a6f8 100644
--- a/docs/genindex.html
+++ b/docs/genindex.html
@@ -184,7 +184,7 @@

    Synchronization:

   • close_extras() (syllabus.examples.utils.vecenv.VecEnv method)

+  • closed (syllabus.examples.utils.vecenv.VecEnv attribute)