
Disaggregate links and stores #629

Merged — 35 commits, May 23, 2023
Conversation


@gnn gnn commented Mar 24, 2023

I'm finally done with picking the commits from stashes/egon-disaggregation-tryouts. Please check if it still works as well as on the other branch, then we can move it into dev.

gnn and others added 24 commits March 22, 2023 21:09
Doesn't change the behaviour of the code but makes it easier to
understand.
Use `loguru` for the logging. Get rid of the `"---"` lines in logging,
because log messages should contain some explicit information.
When closing over local variables, changing the outer variable also
changes the value inside the function. Making the local variable an
argument with default value prevents that and it's also a tiny bit more
efficient.
And put a type annotation on the local variable.
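The late-binding pitfall described in that commit message can be illustrated with a minimal, self-contained sketch:

```python
# Closures capture variables, not values: every function sees the
# final value of `i` after the loop has finished.
late = [lambda: i for i in range(3)]
assert [f() for f in late] == [2, 2, 2]

# Binding the variable as a default argument freezes its current value
# at definition time, which is the fix the commit describes.
early = [lambda i=i: i for i in range(3)]
assert [f() for f in early] == [0, 1, 2]
```

The default-argument form is also marginally faster, since default lookups are cheaper than free-variable (cell) lookups.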
Helps me to figure out what values are getting passed around, i.e. it
makes the code more readable for me. Also, actually makes the code
shorter.
Same reasons as in the previous commit.
And move separating spaces to the beginning of the line while at it.
The `DataFrame.append` method is deprecated in favour of `pandas.concat`
as of `pandas` version 1.4.0.
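The replacement is mechanical; a minimal sketch of the migration:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2]})
row = pd.DataFrame({"a": [3]})

# `df.append(row)` is deprecated since pandas 1.4.0 (and removed in 2.0);
# `pd.concat` produces the same result.
combined = pd.concat([df, row], ignore_index=True)
assert combined["a"].tolist() == [1, 2, 3]
```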
Use descriptive variable names for intermediate values to make the code
shorter and basically read as what it does: assert that the `DataFrame`
of rows which don't pass the sanity check, i.e. are "not sane", or
`~sane`, is empty.
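A hedged sketch of that pattern (the column name and check are placeholders; only `sane`/`~sane` come from the commit message):

```python
import pandas as pd

df = pd.DataFrame({"p_nom": [10.0, 5.0, 7.5]})

# `sane` flags rows that pass the sanity check; `df[~sane]` collects the
# offending rows, so the assertion reads as "the insane rows are empty".
sane = df["p_nom"] > 0
insane = df[~sane]
assert insane.empty, f"Sanity check failed for:\n{insane}"
```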
Instead of using a for loop to append them one at a time. Doing so
triggered a few warnings about the `DataFrame` being fragmented. The new
code should prevent that, while still inserting the same values.
Additionally, no duplicate columns are created. Instead the data in
already existing columns is overwritten, which is a plus.
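A minimal sketch of the difference (column names hypothetical): assigning a whole block of columns at once instead of one column per loop iteration avoids fragmentation warnings and overwrites existing columns instead of duplicating them.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
new_cols = pd.DataFrame({"b": [30, 40], "c": [5, 6]})

# One bulk assignment: existing column "b" is overwritten, "c" is
# appended, and the frame is not fragmented column by column.
df[new_cols.columns] = new_cols
assert df["b"].tolist() == [30, 40]
assert list(df.columns) == ["a", "b", "c"]
```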
A short docstring along with author, copyright and license information.
Using `noops` not only disables the profiling report, but also makes
`profile.enable()` and `profile.disable()` calls do nothing with as
little overhead as possible.
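One way to realize such no-ops (class and flag names here are illustrative, not the PR's actual identifiers):

```python
import cProfile


class NoopProfile:
    """Stand-in used when profiling is off: enable/disable do nothing."""

    def enable(self):
        pass

    def disable(self):
        pass


PROFILING = False  # hypothetical switch
profile = cProfile.Profile() if PROFILING else NoopProfile()

# Call sites stay unchanged whether profiling is on or off.
profile.enable()
result = sum(range(100))
profile.disable()
assert result == 4950
```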
Results are stored directly in `disaggregated_network`'s attributes.
That's what `next(clb.itertuples()).Index` does. It just gets the index
of the first row of `clb`, so `clb.index[0]` should be equivalent but
way clearer and more readable.
Also note that `clb` should only have one row anyway.
Instead of only retaining connecting components, i.e. "lines", "links"
and "transformers", which are completely contained in a cluster, retain
all components which have at least one endpoint inside the cluster. This
is what the switch from `&` to `|` does. That way, these components are
available as `cl_buses` when running `solve_partial_network`. Then, when
solving the partial network, only select those components for
disaggregation which have the left endpoint, i.e. `bus0` in the cluster
and group them by right endpoint, i.e. `bus1`. Adjust sanity checks
accordingly, as now a component doesn't have to have both of its
endpoints inside the cluster, but only needs to have one.
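The switch from `&` to `|` and the subsequent `bus0`/`bus1` handling can be sketched on a toy frame (names and data hypothetical):

```python
import pandas as pd

links = pd.DataFrame(
    {"bus0": ["c1", "c1", "x9"], "bus1": ["c1", "x9", "c1"]},
    index=["internal", "outgoing", "incoming"],
)
cluster = {"c1"}

# Old filter (&): only components completely contained in the cluster.
both = links[links.bus0.isin(cluster) & links.bus1.isin(cluster)]
assert both.index.tolist() == ["internal"]

# New filter (|): components with at least one endpoint in the cluster.
either = links[links.bus0.isin(cluster) | links.bus1.isin(cluster)]
assert either.index.tolist() == ["internal", "outgoing", "incoming"]

# For disaggregation: select components anchored at bus0 in the cluster
# and group them by their right endpoint, bus1.
groups = either[either.bus0.isin(cluster)].groupby("bus1")
assert sorted(groups.groups) == ["c1", "x9"]
```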
Differentiate between the different columns of stores and other
components, as stores use "e_.*" where other components use "p_.*".
Use a large constant to avoid NaNs as disaggregation weights.
Just putting `key in series` in parenthesis prevents `black` from
spreading it over multiple lines, which means that the whole `lambda`
expression doesn't have to be put in parenthesis, making the whole thing
less indented and a bit more readable IMHO.
Also note that it uses `{}` instead of `set()` now.
Last but not least, use a `# noqa` comment to pacify linters complaining
about an assigned lambda. A `def` would make these lines less compact
and thus slightly less readable, IMHO.
@gnn gnn requested a review from ClaraBuettner March 24, 2023 05:58
@ClaraBuettner (Contributor)

The disaggregation is currently not running, because etrago.clustering.busmap is a dictionary now (it used to be a pandas.Series). This seems to be a change in pypsa, so it is not that easy to change it directly.
It works when I update a few lines in the disaggregation function. Can I push that? Or do you have a better idea to solve this problem?
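One hedged way to bridge that API change without touching the rest of the disaggregation code (busmap contents hypothetical) is to wrap the dict back into a `pandas.Series`:

```python
import pandas as pd

# Newer PyPSA returns the busmap as a plain dict; the disaggregation
# code expects a pandas.Series.  Wrapping restores the old interface
# while keeping the same index-to-value mapping.
busmap = {"bus_1": "cluster_a", "bus_2": "cluster_a", "bus_3": "cluster_b"}
busmap_series = pd.Series(busmap)
assert busmap_series["bus_2"] == "cluster_a"
assert busmap_series.index.tolist() == ["bus_1", "bus_2", "bus_3"]
```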


    def transfer_results(self, *args, **kwargs):
-       kwargs["bustypes"] = ["generators", "storage_units"]
+       kwargs["bustypes"] = ["generators", "links", "storage_units", "stores"]
        kwargs["series"] = {
            "generators": {"p"},
Contributor

In case a PF was performed, q should also be transferred for generators and storage_units.
Could you add this?

Contributor Author

I'd only have one question: how would I check whether a PF was performed? Or can I transfer those "results" unconditionally? The thing is that this PR deals with "links" and "stores", so this bug was present before updating the disaggregation. That's why I moved this to a separate issue (#644).

Contributor

You can check the args: if args["method"]["pf_post_lopf"] is True, the PF was performed.
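A hedged sketch of that suggestion; the structure of `args` follows the comment above rather than a documented eTraGo API:

```python
# Hypothetical args dict, shaped after the review comment above.
args = {"method": {"pf_post_lopf": True}}

series = {"generators": {"p"}, "storage_units": {"p"}}
if args["method"]["pf_post_lopf"]:
    # A PF was performed, so transfer reactive power as well.
    for component in ("generators", "storage_units"):
        series[component].add("q")
assert series["generators"] == {"p", "q"}
```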

@ClaraBuettner (Contributor)

> The disaggregation is currently not running, because etrago.clustering.busmap is a dictionary now (it used to be a pandas.Series). This seems to be a change in pypsa, so it is not that easy to change it directly. It works when I update a few lines in the disaggregation function. Can I push that? Or do you have a better idea to solve this problem?

I pushed my solution in 73b9e3b; we can also undo it if it's not fine with you.

@ClaraBuettner (Contributor)

I made the changes for the busmap on a sub-branch of this one: features/disaggregate-links-and-stores-update-busmap

gnn and others added 9 commits April 21, 2023 13:14
This partially reverts commit 73b9e3b. The `.busmap` attribute isn't
really necessary, as `.buses.loc[:, "cluster"]` returns the same
`Series`, i.e. the same indices, values and mapping of index to value.
The only difference is in the order of the entries, which shouldn't
matter.
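The claimed equivalence can be sketched on toy data (bus names hypothetical): the `"cluster"` column carries the same index-to-value mapping as a separate busmap `Series`, possibly in a different order.

```python
import pandas as pd

buses = pd.DataFrame({"cluster": ["a", "b", "a"]}, index=["b1", "b2", "b3"])
busmap = pd.Series({"b3": "a", "b1": "a", "b2": "b"})

# After sorting, `buses.loc[:, "cluster"]` and the busmap agree on both
# indices and values; only the entry order differed.
assert buses.loc[:, "cluster"].sort_index().tolist() == busmap.sort_index().tolist()
assert list(buses.sort_index().index) == list(busmap.sort_index().index)
```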
This fixes a bug in the original `p0`, `p1` disaggregation
implementation for `links` in commit 988d19c. For components having a
`bus0` and a `bus1` instead of just a `bus`, `bus0` is taken as
corresponding to `bus`. So filtering out these components should be done
by looking at `bus0`, not `bus1`.
Nothing in the filter uses the loop variable, so the filtering can be
moved out of the loop. Also, since `cl_buses` is only used to determine
the value of `clb` for the duration of the loop and nowhere else,
filtering can be done directly when assigning `cl_buses` and `clb` can
be determined directly by querying the filtered DataFrame.
Filtering before determining the groups massively speeds up the
disaggregation.
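A generic sketch of hoisting the loop-invariant filter (column names and filter condition are placeholders; `cl_buses` and `clb` follow the commit message):

```python
import pandas as pd

buses = pd.DataFrame(
    {"cluster": ["a", "a", "b"], "carrier": ["AC", "gas", "AC"]},
    index=["b1", "b2", "b3"],
)

# The filter does not depend on the loop variable, so it is applied once
# when building `cl_buses` instead of once per iteration.
cl_buses = buses[buses.carrier == "AC"]
assert cl_buses.index.tolist() == ["b1", "b3"]

for cluster, clb in cl_buses.groupby("cluster"):
    # `clb` is already pre-filtered; no per-iteration filtering needed.
    assert (clb.carrier == "AC").all()
```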
For links, the `query` contains references to `bus1`, which has a
different meaning depending on whether we are dealing with a cluster or
the partial network the cluster represents. For clusters, the query is
correct, because the group was generated by looking up the `bus1` values
of links attached to the cluster. But to find the corresponding links in
the partial network, one has to look for those links whose `bus1` value
matches the index of one of the buses whose `"cluster"` entry is equal
to the group's `bus1` value.
Or in other words, since the group's `bus1` value identifies a cluster,
one has to

  - first find all buses corresponding to that cluster,
  - then find all links whose `bus1` value points to one of these buses.
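The two lookup steps can be sketched as follows (all names and data hypothetical):

```python
import pandas as pd

buses = pd.DataFrame({"cluster": ["c1", "c1", "c2"]}, index=["b1", "b2", "b3"])
links = pd.DataFrame({"bus1": ["b1", "b3", "b2"]}, index=["l1", "l2", "l3"])

group_bus1 = "c1"  # the group's bus1 value identifies a cluster

# 1. Find all buses corresponding to that cluster.
cluster_buses = buses.index[buses["cluster"] == group_bus1]

# 2. Find all links whose bus1 points to one of these buses.
matching = links[links["bus1"].isin(cluster_buses)]
assert matching.index.tolist() == ["l1", "l3"]
```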
This is the correct way anyway and fixes a bug, since the values stored
in each group (should) uniquely identify a cluster. The bug just never
surfaced, as these values are usually also found in the partial network.
That isn't the case for the values of `bus`, `bus0` and `bus1`, though:
these point to other cluster or partial network buses and thus differ
depending on whether the attribute belongs to a cluster or a partial
network component. That's how the bug was finally caught.
It was obviously meant that way, so this is just a bugfix.
Add punctuation and a unit for better readability, too.
This code prints the differences of the sum of all values of a certain
attribute or timeseries in the clustered network to the sum of the
corresponding disaggregated values in the original network.
Usually these differences should be close to zero. For the "p0" and "p1"
timeseries of `links` they aren't, but that's OK, as they include values
from the CH4 links, which aren't disaggregated.
It's probably still a good idea to print them in order to be able to
check whether the differences are in the correct ballpark.
Maybe the check can later be extended with filtering so that they are
close to zero again.
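A minimal sketch of such a check (data and variable names hypothetical): sum a timeseries over snapshots and components in both networks and print the difference.

```python
import pandas as pd

# Clustered totals vs. the disaggregated values of the same timeseries.
clustered_p0 = pd.DataFrame({"link_a": [10.0, 12.0]})
disaggregated_p0 = pd.DataFrame({"l1": [4.0, 5.0], "l2": [6.0, 7.0]})

difference = clustered_p0.sum().sum() - disaggregated_p0.sum().sum()
print(f"p0 difference: {difference:.2f} MW")
assert difference == 0.0
```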