-
Hello!
-
Hello,

There might be different reasons why your agent does not use redispatching. The first is that it is simply not the right thing to do: with the default rewards, redispatching is pretty "costly", so your agent will most likely prefer topological actions (which are basically free) in these conditions.

Another option is to build an agent that performs only redispatching actions, so that it uses redispatching by design. If your agent predicts a redispatching vector of the right size (the number of generators), you can easily transform it into a valid grid2op action:

    def act(self, observation, reward, done):
        redisp_vect = ...  # get a redispatching vector any way you like: RL, a heuristic, an optimizer, etc.
        action = self.action_space()
        action.redispatch = redisp_vect
        return action

This way your agent will only ever perform redispatching actions.
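To make that concrete, here is a minimal, self-contained sketch of such a redispatch-only agent. The zero vector is a hypothetical placeholder for whatever actually produces your redispatching values, and l2rpn_case14_sandbox is just an example environment:

    import numpy as np
    from grid2op import make
    from grid2op.Agent import BaseAgent

    class RedispatchOnlyAgent(BaseAgent):
        """An agent whose only degree of freedom is redispatching."""
        def act(self, observation, reward, done=False):
            action = self.action_space()
            # placeholder policy: redispatch nothing; replace this with the
            # output of your RL policy, heuristic or optimizer
            redisp_vect = np.zeros(observation.n_gen)
            action.redispatch = redisp_vect
            return action

    env = make("l2rpn_case14_sandbox")
    agent = RedispatchOnlyAgent(env.action_space)
    obs = env.reset()
    reward, done = 0.0, False
    obs, reward, done, info = env.step(agent.act(obs, reward, done))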
-
Hello,

The good news is that it is possible with l2rpn-baselines, and rather easy to do:

    import shutil
    import numpy as np
    from l2rpn_baselines.DuelQSimple import train
    from l2rpn_baselines.utils import NNParam, TrainingParam
    from grid2op import make

    def filter_action_fun(grid2op_act):
        # filter out all non-redispatching actions
        if np.any(grid2op_act.set_bus != 0):
            return False
        if np.any(grid2op_act.change_bus):
            return False
        if np.any(grid2op_act.curtail != -1.):
            return False
        if np.any(grid2op_act.storage_p != 0):
            return False
        if np.any(grid2op_act.line_set_status != 0):
            return False
        if np.any(grid2op_act.line_change_status):
            return False
        # anything left should be a redispatching action
        return True

    if __name__ == "__main__":
        train_iter = 1000
        env_name = "l2rpn_case14_sandbox"
        env = make(env_name)

        agent_name = "test_agent"
        save_path = "saved_agent_DDDQN_{}".format(train_iter)
        shutil.rmtree(save_path, ignore_errors=True)
        logs_dir = "tf_logs_DDDQN"

        # attributes of the observation fed to the neural network
        li_attr_obs_X = ["gen_p", "gen_v", "load_p", "load_q"]
        observation_size = NNParam.get_obs_size(env, li_attr_obs_X)
        sizes = [300, 300, 300]  # 3 hidden layers of 300 units each, why not...
        activs = ["relu" for _ in sizes]  # each followed by a relu activation, because... why not
        kwargs_archi = {'observation_size': observation_size,
                        'sizes': sizes,
                        'activs': activs,
                        "list_attr_obs": li_attr_obs_X}

        # see baselines.readthedocs.io/en/latest/utils.html#l2rpn_baselines.utils.TrainingParam
        tp = TrainingParam()
        tp.batch_size = 32  # for example...
        tp.update_tensorboard_freq = int(train_iter / 10)
        tp.save_model_each = int(train_iter / 3)
        tp.min_observation = int(train_iter / 5)

        train(env,
              name=agent_name,
              iterations=train_iter,
              save_path=save_path,
              load_path=None,  # put something else here to reload an existing agent instead of creating a new one
              logs_dir=logs_dir,
              kwargs_archi=kwargs_archi,
              training_param=tp,
              filter_action_fun=filter_action_fun)
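Once training has finished you will probably want to check what the agent learned. A minimal sketch of how that could look, assuming the evaluate helper that l2rpn_baselines modules expose next to train (the exact keyword arguments below are assumptions worth double-checking against the l2rpn-baselines documentation):

    from grid2op import make
    from l2rpn_baselines.DuelQSimple import evaluate

    env = make("l2rpn_case14_sandbox")
    # reload the weights written by the training script above and play a few episodes
    evaluate(env,
             name="test_agent",                   # same name as used for training
             load_path="saved_agent_DDDQN_1000",  # same save_path as used for training
             nb_episode=2,
             verbose=True)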
The bad news is that a lot of work remains to be done in this context; I would recommend using a more advanced RL framework (such as RLlib, or jax, or whatever you prefer; see the last notebook of grid2op). I hope that helps,

Benjamin
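As an illustration of that recommendation: the usual way to plug grid2op into frameworks like RLlib is through grid2op.gym_compat. A sketch, assuming BoxGymActSpace / BoxGymObsSpace and their attr_to_keep argument behave as described in the grid2op documentation, which restricts the action space to redispatching only:

    import grid2op
    from grid2op.gym_compat import GymEnv, BoxGymActSpace, BoxGymObsSpace

    env = grid2op.make("l2rpn_case14_sandbox")
    gym_env = GymEnv(env)
    # expose only the continuous redispatching attribute as the action space
    gym_env.action_space = BoxGymActSpace(env.action_space, attr_to_keep=["redispatch"])
    # keep a handful of continuous observation attributes
    gym_env.observation_space = BoxGymObsSpace(env.observation_space,
                                               attr_to_keep=["gen_p", "load_p", "rho"])
    # gym_env now follows the gym API and can be fed to RLlib and friends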