Bonsai GUI export to NWB #2041

Open
rly opened this issue Oct 17, 2024 · 2 comments

rly commented Oct 17, 2024

An NWB user asked for NWB export in Bonsai: hdmf-dev/hdmf#1196

The NWB team is willing to help with such a feature, but we do not know where to start. Could you please provide guidance and support? Could we meet with you over a zoom call? cc @justidy1

glopesdev (Member) commented Oct 17, 2024

@rly Thanks for the suggestion. We have received similar requests a few times, and we are also not quite sure where to start! I am outlining our questions and concerns below, and I am happy to have a call to discuss.

  1. All writers in Bonsai are optimized for streaming. This is crucial for enabling the reactive, plug-and-play nature of logging in Bonsai, where you may want to record anything at any time, aligned on anything. So we tend to prefer writers that are general and lightweight, for example:
  • CsvWriter streams each record as a row into a text file, where columns are the type attributes (generated automatically);
  • MatrixWriter streams a sequence of 1D/2D matrices into a binary file; the format is assumed to be fixed-size flat binary;
  • VideoWriter writes a sequence of frames into a video file, using parallel online compression algorithms;
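
To make the fixed-size flat-binary idea concrete, here is a minimal Python sketch. It is not Bonsai's MatrixWriter implementation: the function names, the 2x3 frame shape, the float32 element type, and the row-major layout are all assumptions chosen for illustration. Because every frame has the same size, no header or index is needed; frames are appended as raw bytes, and frame `i` sits at a fixed byte offset.

```python
import io
import struct

ROWS, COLS = 2, 3                    # assumed fixed frame shape
FRAME_FMT = f"<{ROWS * COLS}f"       # little-endian float32, row-major (assumption)
FRAME_SIZE = struct.calcsize(FRAME_FMT)


def append_frame(stream, frame):
    """Append one ROWS x COLS frame (a list of rows) as raw bytes."""
    flat = [value for row in frame for value in row]
    stream.seek(0, io.SEEK_END)      # always write at the end of the stream
    stream.write(struct.pack(FRAME_FMT, *flat))


def read_frame(stream, index):
    """Read frame `index` back by seeking to its fixed byte offset."""
    stream.seek(index * FRAME_SIZE)
    flat = struct.unpack(FRAME_FMT, stream.read(FRAME_SIZE))
    return [list(flat[r * COLS:(r + 1) * COLS]) for r in range(ROWS)]


def frame_count(stream):
    """The number of frames is implied by the stream length alone."""
    return stream.seek(0, io.SEEK_END) // FRAME_SIZE
```

The writer never needs to know how many frames will arrive, which is what makes this kind of format friendly to open-ended recordings.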

Generic writers are informative in Bonsai because the type system can be leveraged for compile-time code generation, as in the case of CSV, so that the type metadata can inform the structure of the output file before streaming starts! For example, in the case of NWB we could easily infer the attributes of a table and use them to initialize the required metadata.
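
As a rough illustration of type metadata shaping the output before any data arrives, here is a small Python sketch. It uses runtime reflection on a dataclass rather than the compile-time code generation Bonsai performs, and the `Sample` record type and `TypedCsvWriter` class are hypothetical names invented for this example.

```python
import csv
import io
from dataclasses import astuple, dataclass, fields


@dataclass
class Sample:
    """Hypothetical record type; the field names become the CSV columns."""
    timestamp: float
    x: int
    y: int


class TypedCsvWriter:
    """Writes the header from the record type, then streams one row per record."""

    def __init__(self, stream, record_type):
        self._record_type = record_type
        self._writer = csv.writer(stream, lineterminator="\n")
        # The header is known from the type alone, before any record arrives.
        self._writer.writerow([f.name for f in fields(record_type)])

    def write(self, record):
        if not isinstance(record, self._record_type):
            raise TypeError(f"expected {self._record_type.__name__}")
        self._writer.writerow(astuple(record))


out = io.StringIO()
writer = TypedCsvWriter(out, Sample)   # header is written here
writer.write(Sample(0.5, 1, 2))        # rows are streamed as they arrive
```

An NWB writer could presumably use the same information to declare table columns up front, though how that maps onto NWB's schema is an open question here.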

  2. Despite the above, we have never really understood HDF5 and NWB to be formats amenable to streaming. We hesitate to cache or queue data in a writer and convert it after the fact, since we use Bonsai for both very long and very short experiments, and for both high-throughput and low-throughput data, so there is little way to know beforehand how many records we will produce, or when the recording will need to start and stop. Any hints on whether this is possible (or could be made possible) would be really helpful.
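
For context on the streaming question: HDF5 datasets can be created with chunked storage and an unlimited dimension, so a dataset can be extended while recording. Whether the NWB tooling exposes this conveniently is something the NWB maintainers would need to confirm. The sketch below does not use HDF5 at all; it is a stdlib-only Python illustration of the general pattern (buffer one fixed-size chunk, flush it, grow the stored length), and every name in it is hypothetical.

```python
class ChunkedAppender:
    """Appends records of unknown total count, flushing fixed-size chunks.

    `flush` is a caller-supplied function taking (offset, chunk). In a real
    writer it would extend the on-disk dataset and write the chunk at `offset`.
    """

    def __init__(self, chunk_size, flush):
        self._chunk_size = chunk_size
        self._flush = flush
        self._buffer = []
        self.length = 0              # records handed to storage so far

    def append(self, record):
        self._buffer.append(record)
        if len(self._buffer) == self._chunk_size:
            self._flush_buffer()

    def close(self):
        # Flush the final, possibly partial, chunk when recording stops.
        if self._buffer:
            self._flush_buffer()

    def _flush_buffer(self):
        self._flush(self.length, list(self._buffer))
        self.length += len(self._buffer)
        self._buffer.clear()


stored = []
appender = ChunkedAppender(4, lambda offset, chunk: stored.append((offset, chunk)))
for i in range(10):                  # the writer never learns the total in advance
    appender.append(i)
appender.close()
```

Memory use is bounded by one chunk regardless of how long the recording runs, which is the property a streaming NWB writer would need.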

  3. In terms of interfacing with NWB libraries directly, I can pick up from the suggestions in the original issue:

     1. use Python.NET to call functions in PyNWB

While I can see the convenience of this solution, I think it would be very unappealing to the general Bonsai community. The Python scripting package is a very powerful tool for unblocking certain advanced applications, but it is definitely not a dependency that regular users expect to have, and it would greatly complicate the deployment process. Furthermore, because the Python package still depends heavily on the GIL (even with the free-threaded build introduced in Python 3.13, the transition will be slow), this would kill the implicit parallelism that is one of the core performance features of Bonsai.

     2. use SWIG to wrap the AqNWB C++ API for C#, or

I am not sure what AqNWB is, but as long as it allows for threaded parallelism, this sounds reasonable. I probably wouldn't use SWIG, but we have other solutions we could recommend to make this possible.

     3. use HDF5 directly.

This seems the most pragmatic approach, since you immediately gain a C# wrapper (there are a number of them on nuget.org), and it would give us a chance to build up support for the NWB standard in a targeted way, with optimizations specific to Bonsai.

  4. Alternatively, we could start by exploring ways to stream NWB files into Bonsai, since there already seems to be a streaming IO interface for reading NWB? Maybe we could use that to better understand the options while we analyze the case for writing.

  5. One final note: from reading through different applications of NWB, it seems that it might be used primarily for processed data rather than raw data, i.e. I guess you probably wouldn't store raw video or ephys data directly in NWB format? Would it be useful to prioritize applications for exporting experiment metadata into NWB?

rly (Author) commented Nov 20, 2024

cc @oruebel
