I think it would be useful to use a callback to save the best model during training. The usage of callbacks is described in https://stable-baselines.readthedocs.io/en/master/guide/examples.html#using-callback-monitoring-training
import numpy as np
from stable_baselines.results_plotter import load_results, ts2xy

best_mean_reward, n_steps = -np.inf, 0

def callback(_locals, _globals):
    """
    Callback called at each step (for DQN and others) or after n steps (see ACER or PPO2)
    :param _locals: (dict)
    :param _globals: (dict)
    """
    global n_steps, best_mean_reward
    # Print stats every 1000 calls
    if (n_steps + 1) % 1000 == 0:
        # Evaluate policy training performance: load_results() reads the
        # monitor.csv written by the Monitor wrapper into log_dir
        x, y = ts2xy(load_results(log_dir), 'timesteps')
        if len(x) > 0:
            # Mean reward over the last 100 episodes
            mean_reward = np.mean(y[-100:])
            print(x[-1], 'timesteps')
            print("Best mean reward: {:.2f} - Last mean reward per episode: {:.2f}".format(best_mean_reward, mean_reward))
            # New best model, you could save the agent here
            if mean_reward > best_mean_reward:
                best_mean_reward = mean_reward
                # Example for saving best model
                print("Saving new best model")
                _locals['self'].save(log_dir + 'best_model.pkl')
    n_steps += 1
    return True
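For context, a minimal sketch of how this callback is typically wired up, following the linked docs example (the CartPole environment, the PPO2 algorithm, and the /tmp/gym/ path are assumptions, not from this issue):

import os

import gym
from stable_baselines import PPO2
from stable_baselines.bench import Monitor
from stable_baselines.common.vec_env import DummyVecEnv

# Directory the Monitor wrapper logs to; the callback above reads it back
log_dir = "/tmp/gym/"
os.makedirs(log_dir, exist_ok=True)

# Monitor records each finished episode's reward and length to monitor.csv
env = Monitor(gym.make('CartPole-v1'), log_dir, allow_early_resets=True)
env = DummyVecEnv([lambda: env])

model = PPO2('MlpPolicy', env, verbose=0)
model.learn(total_timesteps=int(1e5), callback=callback)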
But for our custom environment, how can we specify that x is the timestep and y is the reward?
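As far as I understand, x and y are not specified by the environment at all: the Monitor wrapper logs every finished episode's reward and length to monitor.csv, and ts2xy(load_results(log_dir), 'timesteps') turns that log into cumulative timesteps (x) and per-episode rewards (y). So a custom environment only needs the same wrapper; a sketch, where CustomEnv is a hypothetical gym.Env subclass standing in for your own:

from stable_baselines.bench import Monitor

# Wrap the custom env exactly like a built-in one; no extra configuration
# is needed for the callback's x (timesteps) and y (episode reward).
env = Monitor(CustomEnv(), log_dir, allow_early_resets=True)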