Skip to content
This repository has been archived by the owner on May 31, 2024. It is now read-only.

Add NewareNDA extractor entry #78

Closed
wants to merge 2 commits into from
Closed

Add NewareNDA extractor entry #78

wants to merge 2 commits into from

Conversation

ml-evs
Copy link
Member

@ml-evs ml-evs commented Mar 10, 2024

This PR adds the NewareNDA Python package as an extractor of our neware-nda file type.

Perhaps of interest to @d-cogswell, the author of the package (feel free to suggest any changes).

In terms of implementation, I have added several very similar CLI invocations for each of the support output file formats, which could maybe be merged/templated after a future schema update.

We're also missing example files for the neware-nda file type (which is in fact two types, .nda and .ndax -- could also be separated...). There's an open PR and issue about adding test data at Solid-Energy-Systems/NewareNDA#48 that we could poach from to set up our own validation pipelines here.

@ml-evs ml-evs added the extractor Create a new / modify an existing extractor. label Mar 10, 2024
Copy link
Collaborator

@PeterKraus PeterKraus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's implement the neware-nda and neware-ndax filetypes here, too, so that it's self consistent.

Comment on lines +23 to +24
- method: cli
command: NewareNDA-cli.py {{ input_path }} --format csv {{ output_path }}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, what is the python object generated from NewareNDA.read() below? Is it a dict, or some other object? I think we should probably have the analogous CLI usage (json?) for the python usage for now, instead of having 10 equivalent ones with an output format.

Also, we have output_type in the template: https://marda-alliance.github.io/metadata_extractors_schema/main/mme_schema/UsageTemplate.html

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It returns a dataframe at the moment, the CLI then just calls the various df.to_<x>() functions that pandas has.

output_type requires a MaRDA filetype itself right? I think it's also single-valued (at the moment) so not sure how useful it is here

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, then we probably need to mention pandas somewhere. I'll think about how to add that into schema (also for the xarray requirement in yadg) in another PR.

And yes, we should probably add those as FileTypes into the registry anyway.

@ml-evs
Copy link
Member Author

ml-evs commented May 28, 2024

This needs to be remade at the yard

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
extractor Create a new / modify an existing extractor.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants