[Feature] Recommended workflow for storing results returned as numpy.ndarray? #544

Several functions that operate on `AnalogSignal` data return simple `numpy.ndarray`s, e.g. the spectral and correlation measures. Most users will probably want to store the results in some way. For the `AnalogSignal` outputs, Neo provides easy-to-use saving to disk. For the spectral and correlation measures, however, Neo does not offer this (yet). Are there any best practices for what to do with these "pure" results, or plans towards implementing a `Spectrum` data model including I/O?
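
To make the question concrete, here is a minimal sketch of the current ad-hoc workflow. It assumes that Elephant's `elephant.spectral.welch_psd` is one of the functions in question; the file names and metadata fields are invented for illustration.

```python
# Hedged sketch: welch_psd is used as an example of a function returning
# plain array results; file names and metadata keys are invented.
import json
import numpy as np
import quantities as pq
from neo import AnalogSignal
from elephant.spectral import welch_psd

# A toy signal; real data would come from a Neo IO.
signal = AnalogSignal(np.random.randn(10000, 1), units="mV",
                      sampling_rate=1000 * pq.Hz)

freqs, psd = welch_psd(signal)  # array-like results, no Neo container

# Ad-hoc storage: raw arrays in one NPZ file, metadata in a JSON sidecar.
np.savez("welch_psd.npz", freqs=np.asarray(freqs), psd=np.asarray(psd))
with open("welch_psd.json", "w") as f:
    json.dump({"method": "welch_psd", "freq_units": "Hz"}, f)
```
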
Comments

**mdenker:** Hi Joscha, we are planning to move to an alternative representation, building something like Neo, not for input data but for analysis results. The idea here is that a minimal number of objects are able to represent the analysis results, certain key metadata, and additional info like Neo annotations, and of course support serialization to disk (maybe even the option to temporarily dump objects to disk, similar to Neo's lazy loading, to deal with large analysis results). These objects would not become part of Neo, since structurally this would not fit; however, it is possible to draw links between the tools nevertheless. An early prototype of how this could look for a […].

Implementing such objects would further simplify the interoperability with a companion project, alpaca (first release pending within the next weeks, https://alpaca-prov.readthedocs.io/en/latest/), which captures the provenance of an analysis workflow. We had prioritized this work on provenance over the data objects; however, we are confident that the data objects will be on the agenda this year (together with a new object to represent an experimental trial, which is what we are currently working on). I hope this goes in the direction of what you are thinking. Of course, we are very open to any ideas, suggestions, and contributions on this topic!
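
A hypothetical sketch of such a minimal result object, based only on the description above (named arrays plus Neo-style annotations, with serialization to disk); the class name and layout are invented and are not Neo or Elephant API:

```python
# Hypothetical sketch only; nothing here is existing Neo/Elephant API.
import json
import numpy as np


class AnalysisResult:
    """One analysis result: named arrays plus free-form annotations."""

    def __init__(self, arrays, annotations=None):
        self.arrays = arrays                  # e.g. {"freqs": ..., "psd": ...}
        self.annotations = annotations or {}  # metadata, like Neo annotations

    def save(self, basename):
        # Arrays go into one NPZ file; annotations ride along as JSON.
        np.savez(basename + ".npz", **self.arrays)
        with open(basename + ".json", "w") as f:
            json.dump(self.annotations, f)

    @classmethod
    def load(cls, basename):
        # NpzFile reads each array only on first access, so keeping the
        # handle open would allow lazy loading; we load eagerly for brevity.
        with np.load(basename + ".npz") as npz:
            arrays = {name: npz[name] for name in npz.files}
        with open(basename + ".json") as f:
            return cls(arrays, json.load(f))
```
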
**Joscha:** Hi @mdenker, great to hear that there are some plans for this. I agree that this is probably out of scope for Neo. In general I like the direction that the […] is taking. As data analysis in electrophysiology is often exploratory and constantly changing, it may already be great to offer something a little more rigid than saving a complete workspace in MATLAB, but not too much. Often an analysis result is not much more than a couple of numpy arrays plus metadata, which could be stored in simple, future-proof formats such as JSON and (flat) HDF5 or NPY. If I understood it correctly, alpaca is basically almost doing that already, except for serializing the arrays. Correct? From an architectural point of view, I'm not sure if each analysis method needs to implement its own result class inheriting from the […].
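
To make the "arrays plus metadata in flat, future-proof files" idea concrete, here is a sketch of the flat-HDF5 variant; it assumes h5py, and all dataset and attribute names are invented:

```python
# Sketch of flat-HDF5 storage: arrays as datasets, metadata as attributes.
import h5py
import numpy as np

freqs = np.linspace(0.0, 500.0, 1025)
psd = np.random.rand(1025)

with h5py.File("psd_result.h5", "w") as f:
    f.create_dataset("freqs", data=freqs)  # arrays as flat datasets
    f.create_dataset("psd", data=psd)
    f.attrs["method"] = "welch_psd"        # metadata as HDF5 attributes
    f.attrs["freq_units"] = "Hz"
```
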
**Joscha:** Thinking about it, maybe a generic result class would already be sufficient. Edit: I stumbled upon https://github.com/lidatong/dataclasses-json, which may be useful in this context.
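
A short sketch of how dataclasses-json could cover the metadata side (the arrays themselves would still be stored separately, e.g. as NPY); the class and field names are invented:

```python
# Sketch using dataclasses-json for metadata only; names are invented.
from dataclasses import dataclass
from dataclasses_json import dataclass_json


@dataclass_json
@dataclass
class SpectrumMeta:
    method: str
    freq_units: str
    psd_units: str


meta = SpectrumMeta(method="welch_psd", freq_units="Hz", psd_units="mV**2/Hz")
as_json = meta.to_json()                     # serialize metadata to JSON
roundtrip = SpectrumMeta.from_json(as_json)  # round-trip back to the dataclass
```
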
**mdenker:** Hi, and thanks for all your great comments, ideas, and suggestions. I agree with your idea of having a generic result class. (Regarding alpaca: it is aimed merely at tracking the provenance and data flow of inputs and outputs during a script execution, and does not get involved with the structure or serialization of the data as such. However, such approaches could be seen as synergistic in this discussion.)