-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Features/#450 add metadata #568
base: dev
Are you sure you want to change the base?
Conversation
List of data sourcesPlease mark, which sources we already included @AmeliaNadal, @ClaraBuettner, @IlkaCu.
|
…go/eGon-data into features/#450-add-metadata
- Split BGR data into four individual sources (report, structural data 1 and 2 as well as data bundle containing shapefiles) - Add ffe data for industrial gas demand
…go/eGon-data into features/#450-add-metadata
…go/eGon-data into features/#450-add-metadata
- Add pypsa requirement - Create a pypsa network and read the component metadata - Add comment on egon_etrago table creation with the respective metadata
…a' into features/#450-add-metadata
Similar to the sources dict in metadata.py I added a contributor dict, so full name and github handle are easily accesible. Import the contributors to add them to the metadata. It is possible to update the dict according to the work done in the respective section. Might need a .copy() to not manipulate the original dict, I am not sure however.
I added a eGon-data/src/egon/data/metadata.py Lines 556 to 606 in 46a1605
eGon-data/src/egon/data/datasets/etrago_setup.py Lines 373 to 381 in 46a1605
Please add people I forgot in the list, I looked here for that list: https://github.com/openego/eGon-data/graphs/contributors, but the list does not seem to be complete. |
There where a lot of `datetime.now()` calls in the module. But `now` isn't a member of the `datetime` module, but a member of the `datetime` class inside the `datetime` module. This is why I don't like modules with members named exactly like the module. It's an accident waiting to happen.
Insert an additional blank line below the imports and remove a blank line at the beginning of an indented block.
Having spaces, and important separating characters in general, at the beginning of the line makes it easier to spot if one has been accidentally missed. While this wasn't a problem here, in the following commit I spotted at least four instances of a missing space between words, solely by putting the space at the beginning of continuation lines.
Use a parenthesized expression to break long string values in dictionaries into multiple lines, i.e. use ```python { "first-key": ( "-----------------------------------------------------" "\nA very long string value that has" " to be split over multiple lines." ), "second-key": "Did you spot the second dictionary entry?", } ``` in favour of ```python { "first-key": "-----------------------------------------------------" "\nA very long string value that has " "to be split over multiple lines.", "second-key": "Did you spot the second dictionary entry?", } ``` The first version separates keys and values more clearly, making it easier to spot all keys and to read the string value as a whole. Using the second version, it's harder to see on which line a new entry with a new key begins, especially if one doesn't start continuation lines with a space or other separating character. See the message of the previous commit for details on why to put spaces and other important separators at the beginning of continuation lines. Note also that, while most of the lines were within the 79 character limit, so not too long, they where reformatted because of the reasons above. And since they had to be touched anyways, they got wrapped at 72 characters, as that's what PEP8 suggests for free flowing text like comments and long strings.
This is the length that PEP8 states for free flowing text like comments or long strings. Since most strings don't start at the beginning of the line, I employed some leeway to go a few characters over in favour of readability and sometimes I wrapped lines early for the same reason, mostly at sentence boundaries. See the previous two commits for details on this particular way of wrapping strings.
The previous URL yields a 404 error, while the new one works. At least at the time of writing this. Noticed while trying to figure out whether the session id suffix and the "nn=" URL parameter where really necessary.
The `os.listdir` function can handle `Path` objects just fine.
Use this to save an indentation level. The rest of the code after the `continue` is exactly the same as before, just indented one level less. This isn't really important now, but since the code will change to get rid of the test altogether, it's a precursor to not mixing getting rid of an indentation level and other code changes into one commit later on.
It's not the table that gets "uploaded", but the metadata comment. And things don't (always) get "uploaded" to a database, but they get stored in the database, because the database might be on the same machine. Last but not least, there's no reason to use use an exclamation mark in log messages. It just makes long streams of log messages harder to read, if they aren't terminated consistently. Log levels usually are enough to signify the importance of message.
Prefix the results of the `os.listdir(path)` call with `path` because the call only returns a list of the names found under `path`, without the containing directory.
Reformat them via `python -m json.tool --indent=2`, with any Python version >= 3.9. The files where all in one line, which isn't exactly human readable or easy to edit.
I didnt find a proper solution to set start and end of temporal:timeseries as its 24h timeseries of random days, not having a specific date.
src/egon/data/metadata.py
Outdated
"pipeline_classification": { | ||
"title": "Technical pipeline characteristics for high pressure pipelines", | ||
"description": "Parameters for the classification of gas pipelines, " | ||
"the whole documentation could is available at: " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that "could" a quote or is that a typo that should have been something else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can anyone answer this question?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I vote for typo and not necessary.
This doesn't change the way it's imported but now we can put data files into the package's directory.
Get rid of the hardcoded "json_metadata" directory name and instead use `__name__` to get the name of the current module, which now is also a package, then use `importlib_resources.files` to get a `Path` to the files inside the package, then use `.glob("*.json")` to get all the files ending with ".json" below that path. This simplifies the code because we no longer have to: - import `Path`, `os` or `egon.data`'s `__path__`, - generate the path to the files by hardcoding their directory, - filter for files ending in ".json" and - manually prefix the filenames with the path under which they where found. There is one important change though: since `.glob("*.json")` returns a generator of `Path` objects, the filename component has to be pulled out of the `Path` via the `name` attribute before `split`ting on `"."`. Last but not least, add a missing newline at the end of "src/egon/data/datasets/zensus_vg250.py", courtesy of the `end-of-file-fixer` pre-commit hook.
No need to create a new dialect instance for every file.
"start": "2011-01-01 00:00", | ||
"end": "2011-01-01 00:00", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't these two timestamps differ? Like go from "2011-01-01 00:00" to "2011-01-02 00:00" or something like that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes thats true, should be the end of the year. Sorry about that
What is the current status of this PR? @gnn should someone else review this? |
Contribute to #450.
Before merging into
dev
-branch, please make sure thatCHANGELOG.rst
was updated.black
andisort
.Dataset
-version is updated when existing datasets are adjusted.continuous-integration/run-everything-over-the-weekend
- branch.test mode
.Everything
mode.