-
-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add GenericHDF5Reader #6356
base: master
Are you sure you want to change the base?
Add GenericHDF5Reader #6356
Conversation
Hello @gjover, hello to Alba! Could you also provide some differently structure hfd5s for testing? If you can not share them publicly, feel free to email them. In the future, this addition will also need some test files that we will keep in the repository for automated testing, but those should be prepared with special care, so they are very small. |
/rebase |
/rebase |
Thanks for the contribution. I did not test, so I only commented on the code structure. As you saw when implementing this, the current File is reader-agnostic. We tried making it such that it does not know anything about specific formats, and we would like to keep it that way. Therefore, any reader-specific interface should be packaged with the reader. A reader could define a variable that lists the class that defines UI for that specific reader. Then, OWFile could check whether that variable exists; if it does, it could call that UI. That UI class should return reader properties that should then be passed on to the reader class when the actual reading is done (and, of course, stored in OWFile settings). So, for HDF5 support, any addition to OWFile should be tiny and only call what is specified in the reader class (in practice, the contents of |
Also, please rebase your branch to the current orange3/master so that there are no merge commits in the PR. If you have problems with that, we can do it for you, but then you'll have to carefully synchronize your checkout-out branch. |
I am thinking about how to go forward. One option is that you propose how complex readers should interact with file reading widgets (in general, not OWFile specific), either in code or as a spec. Or, perhaps more efficiently, we could structure the interoperation between readers of reading widgets together, and then you could implement just the reader part following that interface while we implement the OWFile part. |
Sure, I agree. As commented before, we added a hidden options_box for these readers that may use them. In that way OWFile would keep agnostic but would allow the Reader to set the options box when needed. |
I have remove the option box and use the sheet mechanism instead. |
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## master #6356 +/- ##
==========================================
- Coverage 88.13% 88.10% -0.04%
==========================================
Files 322 322
Lines 70577 70619 +42
==========================================
+ Hits 62206 62218 +12
- Misses 8371 8401 +30 |
I tried this on a few files, it seems to work well as a fallback reader. I did find it odd that the indexes of multidimensional datasets end up in features, in orange-spectroscopy we usually put this into Metas. I squashed the history into two commits for easier reading: https://github.com/stuart-cls/orange3/tree/generic-hdf5-squash . I don't think the @markotoplak I don't want to hijack this PR too much, but I tried to combine this with an implementation of the Orange on-disk format from dask to see if we could have subclasses register that they are better able to handle a particular HDF5 format (we have a few of these in orange-spectroscopy). See orange-hdf5 and base-hdf5 It works, but this approach is flawed because you lose (at least) the sheets functionality on your subclasses, which is a problem for these metareaders in orange-spectroscopy as well. |
@stuart-cls, could you open a PR with your orange-hdf5 reader/writer? I checked it quickly, and it looks good. We do need something similar in the distro. I also like that you handle variable-length strings (I have no idea how I missed what you did when I was programming that part for Dask). Thanks! |
@gjover, just to double check: your changes here in the File widget are not necessary for the reader, right? If so, let's merge this reader into github.com/quasars/orange-spectroscopy for now. Could you open a PR there, please? If you don't have time, I can take your change set and open the PR myself. |
After we have these parts, we can also think about the meta reader that Stuart suggested. |
Issue
Need for a generic HDF5 Reader that reads a dataset of choice from an hdf5 file.
Description of changes
Add GenericHDF5Reader
Add an options box in OWFile to allow readers to define additional options
Includes