diff --git a/tutorials/ENV.md b/tutorials/ENV.md
index 98ffd069..3fb30718 100644
--- a/tutorials/ENV.md
+++ b/tutorials/ENV.md
@@ -15,7 +15,7 @@ The logic of the environment is handled by the methods `maskless_step` and `mask
 - The `log_reward` function that assigns the logarithm of a nonnegative reward to every terminating state (i.e. state with all $s_f$ as a child in the DAG). If `log_reward` is not implemented, `reward` needs to be.
 
-For `DiscreteEnv`s, the user can define a`get_states_indices` method that assigns a unique integer number to each state, and a `n_states` property that returns an integer representing the number of states (excluding $s_f$) in the environment. The function `get_terminating_states_indices` can also be implemented and serves the purpose of uniquely identifying terminating states of the environment, which is useful for [tabular `GFNModule`s](https://github.com/saleml/torchgfn/tree/master/src/gfn/estimators.py). Other properties and functions can be implemented as well, such as the `log_partition` or the `true_dist_pmf` properties.
+For `DiscreteEnv`s, the user can define a`get_states_indices` method that assigns a unique integer number to each state, and a `n_states` property that returns an integer representing the number of states (excluding $s_f$) in the environment. The function `get_terminating_states_indices` can also be implemented and serves the purpose of uniquely identifying terminating states of the environment, which is useful for [tabular `GFNModule`s](https://github.com/saleml/torchgfn/tree/master/src/gfn/utils/modules.py). Other properties and functions can be implemented as well, such as the `log_partition` or the `true_dist_pmf` properties.
 
 For reference, it might be useful to look at one of the following provided environments:
 - [HyperGrid](https://github.com/saleml/torchgfn/tree/master/src/gfn/gym/hypergrid.py) is an example of a discrete environment where all states are terminating states.
 - [DiscreteEBM](https://github.com/saleml/torchgfn/tree/master/src/gfn/gym/discrete_ebm.py) is an example of a discrete environment where all trajectories are of the same length but only some states are terminating.