develop methods to access data via a store #11

Open
jeanetteclark opened this issue Jul 3, 2024 · 2 comments

Comments

@jeanetteclark
Contributor

to run data quality checks, the Python checks need to read data files directly on disk. This should be written so that the check code would not have to change if the method used to access the files changes.

my plan is to implement this as an interface with the following components:

  • ObjectStore interface that defines get_object which will return a stream
  • HashStore implementation of the ObjectStore interface which accesses a hashstore backend
  • StoreManager that manages the backend store

so the calling code in the checks would look something like this, where store_configuration is a hashmap passed over from Java that holds configuration information, including the store type:

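# store_configuration: hashmap passed over from Java, includes the store type
store_manager = StoreManager(store_configuration)
# get_object returns a stream for the object identified by the pid
obj = store_manager.get_object("pid")
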
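A minimal sketch of how the three components in the plan could fit together, to make the shape of the interface concrete. This is not the code on the branch. `LocalDirStore` is a hypothetical stand-in backend that reads one file per pid from a directory; a real `HashStore` implementation would call into the hashstore backend instead. The configuration keys `store_type` and `store_path` are also assumptions.

```python
import os
from abc import ABC, abstractmethod
from typing import BinaryIO, Dict, Mapping, Type


class ObjectStore(ABC):
    """Interface from the plan: a single method that returns a stream."""

    @abstractmethod
    def get_object(self, pid: str) -> BinaryIO:
        ...


class LocalDirStore(ObjectStore):
    """Hypothetical stand-in backend: one file per pid under a directory.

    A HashStore implementation would replace the body of get_object
    with a call into the hashstore backend.
    """

    def __init__(self, config: Mapping[str, str]):
        self.root = config["store_path"]  # assumed configuration key

    def get_object(self, pid: str) -> BinaryIO:
        # Return an open binary stream; the caller is responsible for closing it.
        return open(os.path.join(self.root, pid), "rb")


class StoreManager:
    """Picks a backend from the configuration and delegates get_object to it."""

    # Assumed registry mapping a "store_type" value to an implementation.
    _backends: Dict[str, Type[ObjectStore]] = {"local_dir": LocalDirStore}

    def __init__(self, store_configuration: Mapping[str, str]):
        store_type = store_configuration["store_type"]  # assumed configuration key
        try:
            backend = self._backends[store_type]
        except KeyError:
            raise ValueError(f"unknown store type: {store_type!r}") from None
        self._store = backend(store_configuration)

    def get_object(self, pid: str) -> BinaryIO:
        return self._store.get_object(pid)
```

With this layout the check code only ever touches `StoreManager.get_object`, so swapping the backend means registering another `ObjectStore` subclass and changing the store type in the configuration.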
@jeanetteclark
Contributor Author

this work is mostly done but still needs to be used in the check code. Leaving it on branch feature-store-interface and keeping the issue open until the corresponding work in metadig-checks (re-writing data checks to use it) and metadig-engine (adding methods to pass pids to checks) is completed

@jeanetteclark
Contributor Author

implementation in check code has been completed, see metadig-checks branch feature-data-quality for the drafted data.suite and its corresponding checks. this is all just awaiting full integration testing on k8s and review

Projects
Status: In Review