-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Save datapackage #37
Save datapackage #37
Conversation
Adjusted Naming of foreign_keys to be the same as in tabular, as they are named "foreignKeys" Changed get_foreign_keys, the bus names where actually the entries that would go into "bus.csv" and not the correct fields. For Building the Package datapackage.Package is used to infer datatypes. 1. Save elements and sequences with defined Naming 2. Build Package 3. Add Foreign keys to descriptors 4. Use updated descriptors to create new Package and save that package as json. Then Foreign keys are added and the
After updating tests and testing #31 and #22 may be closed. |
Adding more comments and Extend doc strings.
I don't have much time currently... @nailend could you check? sorry |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The functionality to write the csv files seems to work fine. I also like the mock package, didn't know about this. Thanks a lot. I got a few remarks and questions. What about checking the json? This seems to be the neck breaker, and why did you decide to use the frictionless infer
method?
check_if_csv_dirs_equal( | ||
"_files/build_datapackage_goal", "_files/build_datapackage_test" | ||
) | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about checking the json?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that is still missing, thank you! Will add once a propper goal datapackage
is established
package.infer(pattern="**/*.csv") | ||
|
||
# Add foreign keys from self to Package | ||
for i, resource in enumerate(package.descriptor["resources"]): | ||
if resource["name"] in self.foreignKeys.keys(): | ||
resource["schema"].update( | ||
{"foreignKeys": self.foreignKeys[resource["name"]]} | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't you use oemof.tabular.datapackage.building.infer_metadata
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation of infer metadata is only allowing for foreign keys with names "bus", "from_bus", "to_bus" and "profile" and might be sufficient but since we tried to keep it more open and implemented a sophisticated foreign keys approach this would reduce functionality a lot as Far as I understood.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The oemof.tabular.datapackage.building.infer_metadata
functionality also provides a different more general approach to derive foreign keys that I think is not strict enough for our Project, what do you think?
I am a bit confused. This PR is going to close #22, right? Why do you want to merge it into |
Saving Datapackage
For Building the Package
datapackage.Package
is used to infer datatypes.Then Foreign keys are added and the
datapackage.json
is saved to destinationUnrelated changes:
Adjusted Naming of foreign_keys to be the same as in tabular, as they are named
foreignKeys
Changed get_foreign_keys: the bus names where actually the entries that would go into
bus.csv
and not the correct fields.