Replies: 9 comments 1 reply
-
I think you could try the PTIR Studio format. The files are HDF5 inside; they just have the .ptir extension.
-
Thanks, @borondics! HDF5 is perfect. Do you know whether they have an io-library or maybe a specification for the format? I just opened
-
I don't think there is a description. However, if you check the test files here, you will see how to structure it. If you write a writer function, it would be nice to push that code to Orange Spectroscopy too.
-
Ok, thanks! I will share the writer for .ptir if I write one :) Would it also be nice to have a simple binary format for passing any table to Orange? For example, a flat HDF5 file where each (key, value) pair holds a column's name and its values (a 1D array)? Or maybe a more specific one for the add-on with key
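
To make the flat-file idea concrete, here is a minimal sketch using h5py and NumPy. The helper names `write_flat_table` and `read_flat_table` and the column names are hypothetical, chosen only for illustration; this is not an existing Orange or Orange Spectroscopy format, and it assumes purely numeric columns.

```python
import os
import tempfile

import h5py
import numpy as np


def write_flat_table(path, columns):
    """Write each (name, 1D array) pair as one top-level HDF5 dataset."""
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError("all columns must have the same number of rows")
    with h5py.File(path, "w") as f:
        for name, values in columns.items():
            f.create_dataset(name, data=np.asarray(values))


def read_flat_table(path):
    """Read every top-level dataset back into a dict of NumPy arrays."""
    with h5py.File(path, "r") as f:
        return {name: f[name][()] for name in f.keys()}


# Hypothetical column names, chosen only for illustration.
table = {
    "map_x": np.array([0.0, 1.0, 0.0, 1.0]),
    "map_y": np.array([0.0, 0.0, 1.0, 1.0]),
    "absorbance_1000": np.array([0.12, 0.15, 0.11, 0.18]),
}

path = os.path.join(tempfile.mkdtemp(), "flat_table.h5")
write_flat_table(path, table)
restored = read_flat_table(path)
```

String or categorical columns would need extra handling (for example, an explicit string dtype), which this sketch leaves out.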
-
Thanks for raising this, I think about this problem often! Let's keep this issue open until a satisfactory solution for hyperspectral data is found. Regarding a more general binary representation of Orange data Tables, I suggest looking at @markotoplak's work on an "Orange On-disk Format" in HDF5, although I think it's not stabilized yet.
-
Hi, HDF5 is a good choice, as is Zarr, if you plan to stay in Python. Both are chunked and have a range of compression features built in, so they are great for out-of-core processing: you only read the piece of the data you require. They are Dask compatible too, for parallel reads and writes. The Photothermal PTIR format is (I believe) designed to be compatible with USID, so https://pycroscopy.github.io/pyUSID/ would be a good place to start. I suggest not trying to export to another vendor's format, since if they change anything you need to keep in step with them, and it may break your structure. I'm also interested in TileDB Embedded, which looks proprietary, but the file format is open source. I've been told that the TileDB format can stream Apache Arrow format, which is very interesting. TileDB is cross-platform like HDF5, but Zarr is really only available for Python right now. https://zarr.readthedocs.io/en/stable/
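
As a rough illustration of the chunking and partial-read point, here is a small h5py sketch. The cube is synthetic, and the dataset name `intensity`, the chunk shape, and the gzip level are assumptions picked for the example, not part of any vendor or Orange format.

```python
import os
import tempfile

import h5py
import numpy as np

# Synthetic cube standing in for a hyperspectral image: (rows, cols, wavenumbers).
rng = np.random.default_rng(0)
cube = rng.random((64, 64, 200)).astype(np.float32)

path = os.path.join(tempfile.mkdtemp(), "cube.h5")
with h5py.File(path, "w") as f:
    f.create_dataset(
        "intensity",           # hypothetical dataset name
        data=cube,
        chunks=(16, 16, 200),  # one chunk holds full spectra for a 16x16 pixel tile
        compression="gzip",
        compression_opts=4,
    )

with h5py.File(path, "r") as f:
    dset = f["intensity"]
    # Slicing reads only the chunks that overlap the requested region.
    tile = dset[0:16, 0:16, :]
    spectrum = dset[10, 20, :]
```

The chunk shape is the main design choice: chunks that keep whole spectra together favour per-pixel access, while chunks that are thin along the spectral axis would favour reading single-wavenumber images.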
-
We could also think about NeXus. It is based on HDF5 and can hold lots of metadata. https://github.com/nexusformat
-
Thank you, @borondics and @AlexHenderson, for the suggestions! I will then try out pure HDF5, TileDB Embedded, and NeXus for storing hyperspectral images and report my findings here. I would also like to see @markotoplak's work on the "Orange On-disk Format" in HDF5, but I cannot find it. @stuart-cls, could you please send a link to it?
-
@pisarik, check the
-
I would like to export hyperspectral images from a database to Orange. The images are stored in an internal format, so I first need to assemble a hyperspectral image, then save it and open it in Orange. From the `io` module, I saw that an ASCII format is supported, but text files are not well suited to storing images: they roughly double the size of the images, and a laptop's memory is precious :) Do you know of an open, simple, binary format for hyperspectral images? Should we perhaps implement one that is ASCII-like, but binary? It would also be nice to add a description of the export format for Orange to the documentation.
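
To put a rough number on the size difference, here is a small standard-library sketch comparing the same values stored as text and as raw binary. The data is synthetic, and the exact ratio depends on how many digits the text writer emits, so treat it as an illustration rather than a measurement of any Orange reader or writer.

```python
import array
import random

random.seed(0)
# 10,000 synthetic intensity values standing in for part of a hyperspectral image.
values = [random.random() for _ in range(10_000)]

# Text: one value per line, the way a plain ASCII table would store it.
# repr() keeps enough digits to round-trip a float64.
ascii_bytes = "\n".join(repr(v) for v in values).encode("ascii")

# Binary: the same values as raw float64 and as raw float32.
binary64 = array.array("d", values).tobytes()
binary32 = array.array("f", values).tobytes()

print(len(ascii_bytes), len(binary64), len(binary32))
```

Writing fewer digits shrinks the text file but loses precision, whereas the binary sizes are fixed at 8 or 4 bytes per value.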