Features/#450 add metadata #568

fwitte · 2021-12-13T10:01:22Z

Contribute to #450.

Before merging into `dev`-branch, please make sure that

- the CHANGELOG.rst was updated.
- new and adjusted code is formatted using black and isort.
- the Dataset-version is updated when existing datasets are adjusted.
- the branch was merged into the `continuous-integration/run-everything-over-the-weekend`-branch.
- the workflow runs successfully in test mode.
- the workflow runs successfully in Everything mode.

fwitte · 2021-12-13T10:04:58Z

nesnoj · 2021-12-13T10:14:58Z

Resolve #450.

@fwitte This does not resolve #450 but rather contributes to that issue (which is great 😸), please mention #450 but do not let github close it automatically. Thank you!

- Split BGR data into four individual sources (report, structural data 1 and 2 as well as data bundle containing shapefiles)
- Add ffe data for industrial gas demand

- Add pypsa requirement
- Create a pypsa network and read the component metadata
- Add comment on egon_etrago table creation with the respective metadata

Similar to the sources dict in metadata.py, I added a contributor dict, so full name and GitHub handle are easily accessible. Import the contributors to add them to the metadata. The dict can be updated according to the work done in the respective section. This might need a .copy() to avoid manipulating the original dict; I am not sure, however.

fwitte · 2021-12-15T06:53:34Z

I added a `contributors` function to the metadata.py file, which you can import to select the names of the contributors.

eGon-data/src/egon/data/metadata.py

Lines 556 to 606 in 46a1605

    
    def contributors():
        return {
            "am": {
                "title": "Aadit Malla",
                "email": "https://github.com/aadit879",
            },
            "an": {
                "title": "Amélia Nadal",
                "email": "https://github.com/AmeliaNadal",
            },
            "cb": {
                "title": "Clara Büttner",
                "email": "https://github.com/ClaraBuettner",
            },
            "ce": {
                "title": "Carlos Epia",
                "email": "https://github.com/CarlosEpia",
            },
            "fw": {
                "title": "Francesco Witte",
                "email": "https://github.com/fwitte",
            },
            "gp": {
                "title": "Guido Pleßmann",
                "email": "https://github.com/gplssm",
            },
            "ik": {
                "title": "Ilka Cußmann",
                "email": "https://github.com/IlkaCu",
            },
            "ja": {
                "title": "Jonathan Amme",
                "email": "https://github.com/nesnoj",
            },
            "je": {
                "title": "Jane Doe",
                "email": "https://github.com/JaneDoe",
            },
            "ke": {
                "title": "Katharina Esterl",
                "email": "https://github.com/KathiEsterl",
            },
            "sg": {
                "title": "Stephan Günther",
                "email": "https://github.com/gnn",
            },
            "um": {
                "title": "Ulf Müller",
                "email": "https://github.com/ulfmueller",
            },
        }

~~An example usage can be found here:~~ See comment below

eGon-data/src/egon/data/datasets/etrago_setup.py

Lines 373 to 381 in 46a1605

    
    co = contributors()
    contributor_list = [
        co['an'].update(
            {"comment": "Add CH4 storage"}
        ),
        co['fw'].update(
            {"comment": "Add H2 storage"}
        ),
    ]
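The `.copy()` question raised earlier in this thread can be illustrated with a short sketch. `dict.update` changes the dict in place and returns `None`, so a list built directly from `update(...)` calls ends up holding only `None` values. The following is a minimal sketch, assuming a cut-down two-entry `contributors()`; the helper `with_comment` is hypothetical and not part of `metadata.py`.

```python
def contributors():
    # Cut-down stand-in for the function in metadata.py (two entries only).
    return {
        "an": {
            "title": "Amélia Nadal",
            "email": "https://github.com/AmeliaNadal",
        },
        "fw": {
            "title": "Francesco Witte",
            "email": "https://github.com/fwitte",
        },
    }


def with_comment(contributor, comment):
    # Hypothetical helper: build a new dict instead of mutating the original.
    return {**contributor, "comment": comment}


co = contributors()

# dict.update() changes the dict in place and returns None.
update_result = co["an"].copy().update({"comment": "Add CH4 storage"})

contributor_list = [
    with_comment(co["an"], "Add CH4 storage"),
    with_comment(co["fw"], "Add H2 storage"),
]
```

Building a new dict per entry keeps the shared contributor data untouched, so the same contributor can carry different comments in different datasets.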

Please add any people I forgot to the list. I took the names from https://github.com/openego/eGon-data/graphs/contributors, but that list does not seem to be complete.

There were a lot of `datetime.now()` calls in the module. But `now` isn't a member of the `datetime` module; it is a member of the `datetime` class inside the `datetime` module. This is why I don't like modules with members named exactly like the module. It's an accident waiting to happen.
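The module/class name clash can be shown in a few lines. This is only an illustrative sketch; the alias `datetime_module` is chosen here for clarity and does not appear in the repository.

```python
import datetime as datetime_module
from datetime import datetime

# `now` is defined on the class, not on the module.
module_has_now = hasattr(datetime_module, "now")
class_has_now = hasattr(datetime, "now")

# Works: call the classmethod on the class.
timestamp = datetime.now()

# Equivalent spelling that keeps module and class visibly apart.
same_kind = datetime_module.datetime.now()
```

With a plain `import datetime`, the call `datetime.now()` raises an `AttributeError`, which is the accident described above.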

Insert an additional blank line below the imports and remove a blank line at the beginning of an indented block.

Having spaces, and important separating characters in general, at the beginning of the line makes it easier to spot if one has been accidentally missed. While this wasn't a problem here, in the following commit I spotted at least four instances of a missing space between words, solely by putting the space at the beginning of continuation lines.
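A small sketch of why leading separators are easier to check. The strings are made up for illustration and are not taken from `metadata.py`.

```python
# Trailing spaces: the missing space after "value" hides at the line end.
trailing = (
    "A very long string value"
    "that has to be split "
    "over multiple lines."
)

# Leading spaces: every continuation line visibly starts with a separator,
# so a missing one stands out in the left column.
leading = (
    "A very long string value"
    " that has to be split"
    " over multiple lines."
)
```

In the first variant the words "value" and "that" run together, and the ragged right edge makes that hard to notice during review.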

Use a parenthesized expression to break long string values in dictionaries into multiple lines, i.e. use

```python
{
    "first-key": (
        "-----------------------------------------------------"
        "\nA very long string value that has"
        " to be split over multiple lines."
    ),
    "second-key": "Did you spot the second dictionary entry?",
}
```

in favour of

```python
{
    "first-key": "-----------------------------------------------------"
    "\nA very long string value that has "
    "to be split over multiple lines.",
    "second-key": "Did you spot the second dictionary entry?",
}
```

The first version separates keys and values more clearly, making it easier to spot all keys and to read the string value as a whole. Using the second version, it's harder to see on which line a new entry with a new key begins, especially if one doesn't start continuation lines with a space or other separating character. See the message of the previous commit for details on why to put spaces and other important separators at the beginning of continuation lines.

Note also that, while most of the lines were within the 79 character limit, so not too long, they were reformatted because of the reasons above. And since they had to be touched anyway, they got wrapped at 72 characters, as that's what PEP8 suggests for free flowing text like comments and long strings.

This is the length that PEP8 states for free flowing text like comments or long strings. Since most strings don't start at the beginning of the line, I employed some leeway to go a few characters over in favour of readability and sometimes I wrapped lines early for the same reason, mostly at sentence boundaries. See the previous two commits for details on this particular way of wrapping strings.

The previous URL yields a 404 error, while the new one works, at least at the time of writing this. Noticed while trying to figure out whether the session id suffix and the "nn=" URL parameter were really necessary.

The `os.listdir` function can handle `Path` objects just fine.

Use this to save an indentation level. The rest of the code after the `continue` is exactly the same as before, just indented one level less. This isn't really important now, but since the code will change to get rid of the test altogether, it's a precursor to not mixing getting rid of an indentation level and other code changes into one commit later on.
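The shape of this change can be sketched as follows. The function names and the `.json` check are hypothetical stand-ins; the actual test in the repository is not shown in this thread.

```python
def json_names_nested(names):
    # Hypothetical "before" shape: the body sits inside the `if`.
    result = []
    for name in names:
        if name.endswith(".json"):
            stem = name[: -len(".json")]
            result.append(stem)
    return result


def json_names_flat(names):
    # "After" shape: skip early, the body moves one level to the left.
    result = []
    for name in names:
        if not name.endswith(".json"):
            continue
        stem = name[: -len(".json")]
        result.append(stem)
    return result


names = ["grid.json", "notes.txt", "demand.json"]
nested = json_names_nested(names)
flat = json_names_flat(names)
```

Both functions return the same result; only the indentation of the loop body differs, which keeps a later commit that removes the test free of whitespace-only noise.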

It's not the table that gets "uploaded", but the metadata comment. And things don't (always) get "uploaded" to a database, but they get stored in the database, because the database might be on the same machine. Last but not least, there's no reason to use an exclamation mark in log messages. It just makes long streams of log messages harder to read if they aren't terminated consistently. Log levels usually are enough to signify the importance of a message.

Prefix the results of the `os.listdir(path)` call with `path` because the call only returns a list of the names found under `path`, without the containing directory.
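A minimal sketch of the behaviour, using a temporary directory as a stand-in for the metadata directory (the file name `example.json` is made up):

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp)
    (path / "example.json").write_text("{}")

    # os.listdir accepts a Path and returns bare names only.
    names = os.listdir(path)

    # Prefix each name with the directory to get a usable file path.
    full_paths = [path / name for name in names]
    all_exist = all(p.is_file() for p in full_paths)
```

Opening a bare name from `names` would only work if the current working directory happened to be `path`.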

Reformat them via `python -m json.tool --indent=2`, with any Python version >= 3.9. The files were all in one line, which isn't exactly human readable or easy to edit.
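For reference, roughly the same reformatting can be done from within Python. This is a sketch with a made-up one-line document, not one of the actual metadata files.

```python
import json

# Made-up single-line JSON, standing in for one of the metadata files.
one_line = '{"name": "example", "resources": [{"path": "a.csv"}]}'

# Parse and re-serialise with a two-space indent.
pretty = json.dumps(json.loads(one_line), indent=2)
```

The content is unchanged by the round trip; only the layout differs.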

I didn't find a proper solution to set the start and end of temporal:timeseries, as it's a 24h timeseries of random days that doesn't have a specific date.

gnn · 2023-03-28T17:13:52Z

src/egon/data/metadata.py

+        "pipeline_classification": {
+                    "title": "Technical pipeline characteristics for high pressure pipelines",
+                    "description": "Parameters for the classification of gas pipelines, "
+                    "the whole documentation could is available at: "


Is that "could" a quote or is that a typo that should have been something else?

Can anyone answer this question?

I vote for typo and not necessary.

This doesn't change the way it's imported but now we can put data files into the package's directory.

Get rid of the hardcoded "json_metadata" directory name and instead use `__name__` to get the name of the current module, which now is also a package, then use `importlib_resources.files` to get a `Path` to the files inside the package, then use `.glob("*.json")` to get all the files ending with ".json" below that path. This simplifies the code because we no longer have to:

- import `Path`, `os` or `egon.data`'s `__path__`,
- generate the path to the files by hardcoding their directory,
- filter for files ending in ".json" and
- manually prefix the filenames with the path under which they were found.

There is one important change though: since `.glob("*.json")` returns a generator of `Path` objects, the filename component has to be pulled out of the `Path` via the `name` attribute before `split`ting on `"."`.

Last but not least, add a missing newline at the end of "src/egon/data/datasets/zensus_vg250.py", courtesy of the `end-of-file-fixer` pre-commit hook.
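The `glob` and `name` handling can be sketched like this. A temporary directory stands in for the `Path` that `importlib_resources.files` would return, and unpacking the name into `schema` and `table` is an assumption made for illustration, not necessarily what the repository code does with the parts.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the Path pointing at the package's directory.
    package_dir = Path(tmp)
    (package_dir / "demand.egon_heat_idp_pool.json").write_text("{}")
    (package_dir / "__init__.py").write_text("")

    tables = []
    for file in package_dir.glob("*.json"):
        # `file` is a Path, so take `.name` before splitting on ".".
        schema, table, _suffix = file.name.split(".")
        tables.append((schema, table))
```

Because `glob` already filters by pattern and yields full paths, no separate suffix check or path prefixing is needed.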

No need to create a new dialect instance for every file.

gnn · 2023-03-28T18:33:41Z

src/egon/data/json_metadata/demand.egon_heat_idp_pool.json

+      "start": "2011-01-01 00:00",
+      "end": "2011-01-01 00:00",


Shouldn't these two timestamps differ? Like go from "2011-01-01 00:00" to "2011-01-02 00:00" or something like that?

Yes, that's true, it should be the end of the year. Sorry about that.

ClaraBuettner · 2023-09-26T12:09:18Z

What is the current status of this PR? @gnn should someone else review this?

ClaraBuettner and others added 2 commits December 13, 2021 10:03

Add function for list of sources

f981236

Add BGR datasource

e21500a

IlkaCu and others added 20 commits December 13, 2021 11:55

Add osm, mastr, hotmaps to sources dict

4ab5946

Merge branch 'features/#450-add-metadata' of https://github.com/opene…

648ab00

…go/eGon-data into features/#450-add-metadata

Insert individual sources for BGR and openffe gas

a4eb6bc

- Split BGR data into four individual sources (report, structural data 1 and 2 as well as data bundle containing shapefiles)
- Add ffe data for industrial gas demand

Sort sources alphabetically

a994a6e

Add demandregio, nep, era5, tyndp, peta to sources list

c6612fc

Add authors

0d886f8

Merge branch 'features/#450-add-metadata' of https://github.com/opene…

9213ac8

…go/eGon-data into features/#450-add-metadata

Add seenergies and egon-data to sources dict

0d78d2b

Add missing }

e6edf5e

Add metadata for scenario capacities

99262f6

Add licenses in list

80df673

Add documentation to add_metadata

c14d16b

Add Scigrid_gas, Einspeiseatlas and pipepline_classification to sourc…

60d13a1

…es dict

Add egon-data to metadata soutces

6262699

Update metadata

68a9165

Merge branch 'features/#450-add-metadata' of https://github.com/opene…

be8a0f8

…go/eGon-data into features/#450-add-metadata

Add metadata to renewable_feedin

7545890

Update metadata for district_heating areas

e04a3b1

Define method to retrieve etrago table metadata

55ff8a2

- Add pypsa requirement
- Create a pypsa network and read the component metadata
- Add comment on egon_etrago table creation with the respective metadata

Add metadata for zenszs map vg250

775cd54

fwitte mentioned this pull request Dec 14, 2021

Define method to retrieve etrago table metadata #569

Merged

6 tasks

fwitte added 3 commits December 14, 2021 15:54

Add Storage key to components dict

edee1f0

Merge remote-tracking branch 'origin/features/#449-add-etrago-metadat…

41aab41

…a' into features/#450-add-metadata

fwitte self-assigned this Dec 20, 2021

gnn and others added 17 commits March 18, 2023 16:57

Run black on "src/egon/data/metadata.py"

3bc50ac

Insert an additional blank line below the imports and remove a blank line at the beginning of an indented block.

Run isort on "src/egon/data/metadata.py"

1294a0f

Fix "zensus" URL

31b3f50

The previous URL yields a 404 error, while the new one works, at least at the time of writing this. Noticed while trying to figure out whether the session id suffix and the "nn=" URL parameter were really necessary.

Don't convert Path object to string

c0f3485

The `os.listdir` function can handle `Path` objects just fine.

Replace string concatenation with an f-string

b84d5ac

Fix incomplete file paths

c0e244e

Prefix the results of the `os.listdir(path)` call with `path` because the call only returns a list of the names found under `path`, without the containing directory.

Reformat "json_metadata/*.json" files

86fa23e

Reformat them via `python -m json.tool --indent=2`, with any Python version >= 3.9. The files were all in one line, which isn't exactly human readable or easy to edit.

Removed parts of metadata which lead to omi error

9ae851c

I didn't find a proper solution to set the start and end of temporal:timeseries, as it's a 24h timeseries of random days that doesn't have a specific date.

Readd start and end information

4d53166

Replace start and end with isodates for the whole year

1275fa8

Move duplicate metadata entries into variables

dfe1d8e

gnn reviewed Mar 28, 2023

gnn added 5 commits March 28, 2023 19:22

Convert 'egon.data.metadata' into a package

7a2a21d

This doesn't change the way it's imported but now we can put data files into the package's directory.

Move dialect creation out of the loop

43c6420

No need to create a new dialect instance for every file.

Fix typo in filename: "populaiton" -> "population"

15e6753

Fix typo in filename: "fitered" -> "filtered"

7b8abe0

gnn reviewed Mar 28, 2023

Fix end timestep

7c83cb2

khelfen mentioned this pull request Apr 21, 2023

Features/#450 add metadata kh #1122

Open

6 tasks

ClaraBuettner added 3 commits October 10, 2023 15:37

Merge branch 'dev' into features/#450-add-metadata

12f6665

Import missing package

c1f0966

Import json package

bb1f08c

fwitte commented Dec 13, 2021 • edited by AmeliaNadal

List of data sources
