
Disaggregate links and stores #629

Merged — 35 commits, May 23, 2023
Conversation


@gnn gnn commented Mar 24, 2023

I'm finally done with picking the commits from stashes/egon-disaggregation-tryouts. Please check if it still works as well as on the other branch, then we can move it into dev.

gnn and others added 24 commits March 22, 2023 21:09
Doesn't change the behaviour of the code but makes it easier to
understand.
Use `loguru` for the logging. Get rid of the `"---"` lines in logging,
because log messages should contain some explicit information.
When closing over local variables, changing the outer variable also
changes the value inside the function. Making the local variable an
argument with default value prevents that and it's also a tiny bit more
efficient.
And put a type annotation on the local variable.
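The late-binding pitfall described in that commit message can be illustrated with a minimal, self-contained sketch:

```python
# Closures capture variables, not values: every function sees the
# final value of `i` after the loop has finished.
late = [lambda: i for i in range(3)]
assert [f() for f in late] == [2, 2, 2]

# Binding the variable as a default argument freezes its current value
# at definition time, which is the fix the commit describes.
early = [lambda i=i: i for i in range(3)]
assert [f() for f in early] == [0, 1, 2]
```

The default-argument form is also marginally faster, since default lookups are cheaper than free-variable (cell) lookups.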
Helps me to figure out what values are getting passed around, i.e. it
makes the code more readable for me. Also, actually makes the code
shorter.
Same reasons as in the previous commit.
And move separating spaces to the beginning of the line while at it.
The `DataFrame.append` method is deprecated in favour of `pandas.concat`
as of `pandas` version 1.4.0.
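The replacement is mechanical; a minimal sketch of the migration:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2]})
row = pd.DataFrame({"a": [3]})

# `df.append(row)` is deprecated since pandas 1.4.0 (and removed in 2.0);
# `pd.concat` produces the same result.
combined = pd.concat([df, row], ignore_index=True)
assert combined["a"].tolist() == [1, 2, 3]
```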
Use descriptive variable names for intermediate values to make the code
shorter and basically read as what it does: assert that the `DataFrame`
of rows which don't pass the sanity check, i.e. are "not sane", or
`~sane`, is empty.
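A hedged sketch of that pattern (the column name and check are placeholders; only `sane`/`~sane` come from the commit message):

```python
import pandas as pd

df = pd.DataFrame({"p_nom": [10.0, 5.0, 7.5]})

# `sane` flags rows that pass the sanity check; `df[~sane]` collects the
# offending rows, so the assertion reads as "the insane rows are empty".
sane = df["p_nom"] > 0
insane = df[~sane]
assert insane.empty, f"Sanity check failed for:\n{insane}"
```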
Instead of using a for loop to append them one at a time. Doing so
triggered a few warnings about the `DataFrame` being fragmented. The new
code should prevent that, while still inserting the same values.
Additionally, no duplicate columns are created. Instead the data in
already existing columns is overwritten, which is a plus.
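A minimal sketch of the difference (column names hypothetical): assigning a whole block of columns at once instead of one column per loop iteration avoids fragmentation warnings and overwrites existing columns instead of duplicating them.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
new_cols = pd.DataFrame({"b": [30, 40], "c": [5, 6]})

# One bulk assignment: existing column "b" is overwritten, "c" is
# appended, and the frame is not fragmented column by column.
df[new_cols.columns] = new_cols
assert df["b"].tolist() == [30, 40]
assert list(df.columns) == ["a", "b", "c"]
```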
A short docstring along with author, copyright and license information.
Using `noops` not only disables the profiling report, but also makes
`profile.enable()` and `profile.disable()` calls do nothing with as
little overhead as possible.
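One way to realize such no-ops (class and flag names here are illustrative, not the PR's actual identifiers):

```python
import cProfile


class NoopProfile:
    """Stand-in used when profiling is off: enable/disable do nothing."""

    def enable(self):
        pass

    def disable(self):
        pass


PROFILING = False  # hypothetical switch
profile = cProfile.Profile() if PROFILING else NoopProfile()

# Call sites stay unchanged whether profiling is on or off.
profile.enable()
result = sum(range(100))
profile.disable()
assert result == 4950
```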
Results are stored directly in `disaggregated_network`'s attributes.
That's what `next(clb.itertuples()).Index` does. It just gets the index
of the first row of `clb`, so `clb.index[0]` should be equivalent but
way clearer and more readable.
Also note that `clb` should only have one row anyway.
Instead of only retaining connecting components, i.e. "lines", "links"
and "transformers", which are completely contained in a cluster, retain
all components which have at least one endpoint inside the cluster. This
is what the switch from `&` to `|` does. That way, these components are
available as `cl_buses` when running `solve_partial_network`. Then, when
solving the partial network, only select those components for
disaggregation which have the left endpoint, i.e. `bus0` in the cluster
and group them by right endpoint, i.e. `bus1`. Adjust sanity checks
accordingly, as now a component doesn't have to have both of its
endpoints inside the cluster, but only needs to have one.
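The switch from `&` to `|` and the subsequent `bus0`/`bus1` handling can be sketched on a toy frame (names and data hypothetical):

```python
import pandas as pd

links = pd.DataFrame(
    {"bus0": ["c1", "c1", "x9"], "bus1": ["c1", "x9", "c1"]},
    index=["internal", "outgoing", "incoming"],
)
cluster = {"c1"}

# Old filter (&): only components completely contained in the cluster.
both = links[links.bus0.isin(cluster) & links.bus1.isin(cluster)]
assert both.index.tolist() == ["internal"]

# New filter (|): components with at least one endpoint in the cluster.
either = links[links.bus0.isin(cluster) | links.bus1.isin(cluster)]
assert either.index.tolist() == ["internal", "outgoing", "incoming"]

# For disaggregation: select components anchored at bus0 in the cluster
# and group them by their right endpoint, bus1.
groups = either[either.bus0.isin(cluster)].groupby("bus1")
assert sorted(groups.groups) == ["c1", "x9"]
```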
Differentiate between the different columns of stores and other
components, as stores use "e_.*" where other components use "p_.*".
Use a large constant to avoid NaNs as disaggregation weights.
Just putting `key in series` in parenthesis prevents `black` from
spreading it over multiple lines, which means that the whole `lambda`
expression doesn't have to be put in parenthesis, making the whole thing
less indented and a bit more readable IMHO.
Also note that it uses `{}` instead of `set()` now.
Last but not least, use a `# noqa` comment to pacify linters complaining
about an assigned lambda. A `def` would make these lines less compact
and thus slightly less readable, IMHO.
@gnn gnn requested a review from ClaraBuettner March 24, 2023 05:58
@ClaraBuettner (Contributor)

The disaggregation is currently not running, because etrago.clustering.busmap is a dictionary now (it used to be a pandas.Series). This seems to be a change in pypsa, so it is not that easy to change it directly.
It works when I update a few lines in the disaggregation function. Can I push that? Or do you have a better idea to solve this problem?
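One hedged way to bridge that API change without touching the rest of the disaggregation code (busmap contents hypothetical) is to wrap the dict back into a `pandas.Series`:

```python
import pandas as pd

# Newer PyPSA returns the busmap as a plain dict; the disaggregation
# code expects a pandas.Series.  Wrapping restores the old interface
# while keeping the same index-to-value mapping.
busmap = {"bus_1": "cluster_a", "bus_2": "cluster_a", "bus_3": "cluster_b"}
busmap_series = pd.Series(busmap)
assert busmap_series["bus_2"] == "cluster_a"
assert busmap_series.index.tolist() == ["bus_1", "bus_2", "bus_3"]
```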


    def transfer_results(self, *args, **kwargs):
-       kwargs["bustypes"] = ["generators", "storage_units"]
+       kwargs["bustypes"] = ["generators", "links", "storage_units", "stores"]
        kwargs["series"] = {
            "generators": {"p"},
Contributor

In case a PF was performed, q should also be transferred for generators and storage_units.
Could you add this?

Contributor Author

I'd only have one question: how would I check whether a PF was performed? Or can I transfer those "results" unconditionally? The thing is that this PR deals with "links" and "stores", so this bug was present before updating the disaggregation. That's why I moved this to a separate issue (#644).

Contributor

You can check the args: if args["method"]["pf_post_lopf"] is True, the PF was performed.
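A hedged sketch of that suggestion; the structure of `args` follows the comment above rather than a documented eTraGo API:

```python
# Hypothetical args dict, shaped after the review comment above.
args = {"method": {"pf_post_lopf": True}}

series = {"generators": {"p"}, "storage_units": {"p"}}
if args["method"]["pf_post_lopf"]:
    # A PF was performed, so transfer reactive power as well.
    for component in ("generators", "storage_units"):
        series[component].add("q")
assert series["generators"] == {"p", "q"}
```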

@ClaraBuettner (Contributor)

> The disaggregation is currently not running, because etrago.clustering.busmap is a dictionary now (it used to be a pandas.Series). This seems to be a change in pypsa, so it is not that easy to change it directly. It works when I update a few lines in the disaggregation function. Can I push that? Or do you have a better idea to solve this problem?

I pushed my solution in 73b9e3b; we can also undo it if it's not fine with you.

@ClaraBuettner (Contributor)

I made the changes for the busmap on a sub-branch of this one: features/disaggregate-links-and-stores-update-busmap

gnn and others added 9 commits April 21, 2023 13:14
This partially reverts commit 73b9e3b. The `.busmap` attribute isn't
really necessary, as `.buses.loc[:, "cluster"]` returns the same
`Series`, i.e. the same indices, values and mapping of index to value.
The only difference is in the order of the entries, which shouldn't
matter.
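The claimed equivalence can be sketched on toy data (bus names hypothetical): the `"cluster"` column carries the same index-to-value mapping as a separate busmap `Series`, possibly in a different order.

```python
import pandas as pd

buses = pd.DataFrame({"cluster": ["a", "b", "a"]}, index=["b1", "b2", "b3"])
busmap = pd.Series({"b3": "a", "b1": "a", "b2": "b"})

# After sorting, `buses.loc[:, "cluster"]` and the busmap agree on both
# indices and values; only the entry order differed.
assert buses.loc[:, "cluster"].sort_index().tolist() == busmap.sort_index().tolist()
assert list(buses.sort_index().index) == list(busmap.sort_index().index)
```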
This fixes a bug in the original `p0`, `p1` disaggregation
implementation for `links` in commit 988d19c. For components having a
`bus0` and a `bus1` instead of just a `bus`, `bus0` is taken as
corresponding to `bus`. So filtering out these components should be done
by looking at `bus0`, not `bus1`.
Nothing in the filter uses the loop variable, so the filtering can be
moved out of the loop. Also, since `cl_buses` is only used to determine
the value of `clb` for the duration of the loop and nowhere else,
filtering can be done directly when assigning `cl_buses` and `clb` can
be determined directly by querying the filtered DataFrame.
Filtering before determining the groups massively speeds up the
disaggregation.
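A generic sketch of hoisting the loop-invariant filter (column names and filter condition are placeholders; `cl_buses` and `clb` follow the commit message):

```python
import pandas as pd

buses = pd.DataFrame(
    {"cluster": ["a", "a", "b"], "carrier": ["AC", "gas", "AC"]},
    index=["b1", "b2", "b3"],
)

# The filter does not depend on the loop variable, so it is applied once
# when building `cl_buses` instead of once per iteration.
cl_buses = buses[buses.carrier == "AC"]
assert cl_buses.index.tolist() == ["b1", "b3"]

for cluster, clb in cl_buses.groupby("cluster"):
    # `clb` is already pre-filtered; no per-iteration filtering needed.
    assert (clb.carrier == "AC").all()
```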
For links, the `query` contains references to `bus1`, which has a
different meaning depending on whether we are dealing with a cluster or
the partial network the cluster represents. For clusters, the query is
correct, because the group was generated by looking up the `bus1` values
of links attached to the cluster. But to find the corresponding links in
the partial network, one has to look for those links whose `bus1` value
matches the index of one of the buses whose `"cluster"` entry is equal
to the group's `bus1` value.
Or in other words, since the group's `bus1` value identifies a cluster,
one has to

  - first find all buses corresponding to that cluster,
  - then find all links whose `bus1` value points to one of these buses.
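The two lookup steps can be sketched as follows (all names and data hypothetical):

```python
import pandas as pd

buses = pd.DataFrame({"cluster": ["c1", "c1", "c2"]}, index=["b1", "b2", "b3"])
links = pd.DataFrame({"bus1": ["b1", "b3", "b2"]}, index=["l1", "l2", "l3"])

group_bus1 = "c1"  # the group's bus1 value identifies a cluster

# 1. Find all buses corresponding to that cluster.
cluster_buses = buses.index[buses["cluster"] == group_bus1]

# 2. Find all links whose bus1 points to one of these buses.
matching = links[links["bus1"].isin(cluster_buses)]
assert matching.index.tolist() == ["l1", "l3"]
```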
This is the correct way anyway and fixes a bug, since the values stored
in each group (should) uniquely identify a cluster. The bug just never
surfaced, as these values are usually also found in the partial network.
That isn't the case for the values of `bus`, `bus0` and `bus1`, though:
these point to other cluster or partial network buses and thus differ
depending on whether the attribute belongs to a cluster or a partial
network component. That's how the bug was finally caught.
It was obviously meant that way, so this is just a bugfix.
Add punctuation and a unit for better readability, too.
This code prints the differences of the sum of all values of a certain
attribute or timeseries in the clustered network to the sum of the
corresponding disaggregated values in the original network.
Usually these differences should be close to zero. For the "p0" and "p1"
timeseries of `links` they aren't, but that's OK, as they include values
from the CH4 links, which aren't disaggregated.
It's probably still a good idea to print them in order to be able to
check whether the differences are in the correct ballpark.
Maybe the check can later be extended with filtering so that they are
close to zero again.
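A minimal sketch of such a check (data and variable names hypothetical): sum a timeseries over snapshots and components in both networks and print the difference.

```python
import pandas as pd

# Clustered totals vs. the disaggregated values of the same timeseries.
clustered_p0 = pd.DataFrame({"link_a": [10.0, 12.0]})
disaggregated_p0 = pd.DataFrame({"l1": [4.0, 5.0], "l2": [6.0, 7.0]})

difference = clustered_p0.sum().sum() - disaggregated_p0.sum().sum()
print(f"p0 difference: {difference:.2f} MW")
assert difference == 0.0
```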