Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge and write periods #45

Merged
merged 15 commits into from
Aug 30, 2023
Merged

Merge and write periods #45

merged 15 commits into from
Aug 30, 2023

Conversation

FelixMau
Copy link
Contributor

@FelixMau FelixMau commented Aug 11, 2023

Closes #40
Check on periods and write them to the Data a method within Datapackage has been created. The Method

  1. Checks if all given sequences are equal and of same length (if not equal a warning will be raised)
  2. Calls another function for every element
  3. Grouping the Instances of the element via the given name
  4. Then aggregating every column based on what the column contains, there are 4 types:
    4.1. If the contents are dicts the first dicts is being used.
    4.2. If the contents are lists or pandas sequences (plumbin._Sequence is not allowed) check if length the sum of found lists
    is equal to sequence length those sequences are added to timeseries data and foreign keys are written and updated
    4.3 If all entries in the column are equal, only unique values is written
    4.4 If the entries in columns are not all equal, a list with its values is written
  • Run mock test
  • Run hack-a-thon to package creation

When calling yearly_scalars_to_periodic_values it will rewrite values in elements.
It will group for "name" column and then process to aggregate.
Aggregation is happening as follows:
1: If the Entries are dicts (like input_parameter) it will take the first entry and leave it unchanged
2: Elif the Entries are lists or sequences it will check wheter they are sequences for the full period and write them to sequences, update foreign keys and write the foreign key.
3: Elif the entries are all equal the first entry will be written
4: Elif the entries are not equal they are written as a list
Also catching multiindex columns and Changing Typemapping to better type for Gas
@FelixMau FelixMau requested a review from nailend August 11, 2023 15:39
@FelixMau
Copy link
Contributor Author

FelixMau commented Aug 11, 2023

Checks are still failing but I cannot really fix them until a propper Datapackage can be created and has been tested valid. Also #44 needs to be closed before better tests can be written, I'll work on #44 on Monday and then start working on tests again from this point onwards.

@FelixMau FelixMau requested a review from henhuy August 14, 2023 11:49
@henhuy
Copy link
Contributor

henhuy commented Aug 14, 2023

I would have merged period scalars before parametrizing facades, aka one step before

for component_data in process_data.scalars.to_dict(orient="records"):

This would preserve facade internal checks. Additionally, we would not have to create more facades than needed (for each period) and aggregate them later into single facades (which is also confusing IMO).
Could you refactor code, so that period-aggregation is done before L372 and period scalars go into parametrization? I think as you already prepared the functions this is easy...

Aggregation leads to empty fields that can be sequences. Fields that may be sequences are plumbed as `_Sequence` and afterwards saved in a wrong manner. When parametrizing the decorator should be updated to not plug values into plumbing and save them as they are.
@FelixMau
Copy link
Contributor Author

Moving aggregation functionality to begining of build_datapackage for clean pipeline. Yet the moving leads to an issue related to saving the preiodic values.

When parametrizing the components and periodic values are found we discussed to save them as lists of full length into the field. When saving the datapackage the full length list would be saved, taking up a lot of memory and leading to humanly unreadable data.

I have two ideas for solving the issue:

  1. When parametrizing the decorated new_init can be extended to not plug values into plumbing and save them as they are. @nailend Do have a draft already how and wehre multiperiods are read in or have another Idea on this issue?
  2. Check whether fields are List (or _Sequence) on as_dict call. If it is a list write the unique values of the list.

@nailend
Copy link
Contributor

nailend commented Aug 15, 2023

When parametrizing the components and periodic values are found we discussed to save them as lists of full length into the field. When saving the datapackage the full length list would be saved, taking up a lot of memory and leading to humanly unreadable data.

No, we decided to move the "to full length" extraction into tabular. In the oemof.adapter periodic values shall be written in a list with periodic-values only. e.g. [1,2,3]. So just check if the values are unique and if not write them in a list.

@nailend Do have a draft already how and wehre multiperiods are read in or have another Idea on this issue?

Not yet, but this shouldn't be a dependency for this.

@FelixMau FelixMau mentioned this pull request Aug 16, 2023
@nailend
Copy link
Contributor

nailend commented Aug 25, 2023

The deserialization of periodic values in oemof.tabular will be taken care of in PR#124

@nailend nailend merged commit 996c3d6 into dev Aug 30, 2023
3 checks passed
@nailend nailend deleted the merge-and-write-periods branch August 30, 2023 15:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add multi-period to write_datapackage
3 participants