[Post-v1.0] Custom data interfaces as plugins #373

Closed
bendichter opened this issue Sep 18, 2023 · 7 comments

Comments

@bendichter
Collaborator

bendichter commented Sep 18, 2023

I wanted to create a ticket to discuss plugins. This is not for the MVP, since there are several features that need to take higher priority, but I wanted to start discussing how it would be implemented. The idea would be that a developer could define a custom interface, and in the GUIDE UI, a user could find that interface and add it to their app. This could be as simple as copying/pasting a GitHub repo URL.

This would potentially provide two big features:

  1. Provide GUIDE functionality for users that have custom code in their conversion pipelines. We now have 35 different labs that we have built pipelines for and every single one has required at least a little custom code, making it impossible for them to use GUIDE to do their entire conversion. It can in most cases do the most involved parts, so GUIDE is still valuable for these groups, but it would be better if we could easily add the custom interfaces to GUIDE so that they could use these pipelines without needing to interact with Python at all. We could create a screen where a user can input the URL of a repo. Then we could pip install that repo and have some way to indicate where the interfaces are (maybe as an entry point?). Then the metadata forms might be tricky. We would need to either 1. automatically generate all of the metadata forms from the JSON schema, 2. manually define these forms within the plugin repo, 3. not allow any altering of metadata, or 4. only allow altering of metadata through a precisely structured YAML file.

  2. Reduce the size of the initial download and duration of initial installation by moving some less-commonly-used interfaces to plugins.
    The installation package is currently very large and takes a long time to install. Something that might help would be to remove some of the larger and/or less commonly used interfaces from the default package, and allow users to install them as needed. This would presumably speed up the initial installation, and would reduce the size of the application on the client's computer.

This feature could start as a simple unchecked GitHub repo. Then we could add features for automatically testing the interface to ensure it uses the proper classes and methods, etc., and finally we could create a registry of repos so they are easy to explore and discover within the GUIDE app itself.
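
A minimal sketch of the entry-point idea and of the "automatically testing the interface" step mentioned above. Everything named here is an assumption for illustration: the entry-point group `nwb_guide.interfaces` is hypothetical, and the required method names are only a guess at a minimal contract (`add_to_nwbfile` comes up later in this thread; `get_metadata` is assumed). `importlib.metadata.entry_points(group=...)` is the standard-library call and requires Python 3.10+.

```python
from importlib.metadata import entry_points

# Hypothetical entry-point group that a plugin package would declare
# in its packaging metadata; not an existing GUIDE convention.
PLUGIN_GROUP = "nwb_guide.interfaces"

# Assumed minimal contract for a plugin interface class.
REQUIRED_METHODS = ("get_metadata", "add_to_nwbfile")


def missing_methods(candidate):
    """Return the required method names that `candidate` does not provide."""
    return [
        name
        for name in REQUIRED_METHODS
        if not callable(getattr(candidate, name, None))
    ]


def load_plugin_interfaces(eps):
    """Load each entry point and sort it into accepted or rejected.

    `eps` is any iterable of objects with a `.name` attribute and a
    `.load()` method, such as importlib.metadata.EntryPoint instances.
    """
    accepted, rejected = {}, {}
    for ep in eps:
        try:
            candidate = ep.load()
        except Exception as exc:  # a broken plugin must not crash the app
            rejected[ep.name] = f"could not be imported: {exc}"
            continue
        missing = missing_methods(candidate)
        if missing:
            rejected[ep.name] = "missing methods: " + ", ".join(missing)
        else:
            accepted[ep.name] = candidate
    return accepted, rejected


def discover_installed_plugins():
    """Check every plugin registered under PLUGIN_GROUP (Python 3.10+)."""
    return load_plugin_interfaces(entry_points(group=PLUGIN_GROUP))
```

The rejected dictionary keeps a reason per plugin, so the UI could tell the user why a repo was not usable instead of silently dropping it.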

@bendichter bendichter changed the title from "plugins" to "data interfaces as plugins" Sep 18, 2023
@CodyCBakerPhD
Collaborator

This has been on our radar in the background

> We would need to either 1. automatically generate all of the metadata forms from the JSON schema, 2. manually define these forms within the plugin repo, 3. not allow any altering of metadata, or 4. only allow altering of metadata through a precisely structured YAML file.

I think this is exactly the workflow we could use

1.1: try to automatically generate a form based on JSON schema using default components. If it fails for certain reasons (unknown/unsupported components for a field) then proceed to

1.2: search for, or ahead of (1.1), import custom components or pages in some structured way

1.4: moving this above (1.3), allow manual copy/paste of dictionary/YAML text or import/export of .yaml/.json files for the custom metadata - but this would only be allowed as long as the plugin interface has a useful JSON schema for the metadata

1.3: Final case, if no JSON schema for the interface metadata has been specified, there's nothing we can do via NeuroConv. Most likely they hard-coded metadata values into the neurodata types of the add_to_nwbfile method
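
A rough sketch of the fallback order in steps 1.1 to 1.3, under stated assumptions: the set of field types with default form components and the strategy names returned are hypothetical, and only the top-level `properties` / `type` keywords of a JSON schema are inspected.

```python
# Hypothetical set of JSON schema types that have a default form component.
SUPPORTED_TYPES = {"string", "number", "integer", "boolean", "object"}


def unsupported_fields(schema, prefix=""):
    """Return dotted paths of properties that no default component can render."""
    bad = []
    for name, sub in schema.get("properties", {}).items():
        path = f"{prefix}{name}"
        field_type = sub.get("type")
        # A missing type, or a list of types, is treated as unsupported here.
        if not isinstance(field_type, str) or field_type not in SUPPORTED_TYPES:
            bad.append(path)
        elif field_type == "object":
            bad.extend(unsupported_fields(sub, prefix=f"{path}."))
    return bad


def choose_metadata_strategy(schema, custom_components=()):
    """Pick how the metadata of a plugin interface could be edited.

    `custom_components` holds the dotted field paths for which the plugin
    ships its own component (step 1.2).
    """
    if not schema or not schema.get("properties"):
        return "none"  # step 1.3: no usable schema, nothing can be done
    bad = unsupported_fields(schema)
    if not bad:
        return "auto_form"  # step 1.1: default components cover every field
    if all(path in custom_components for path in bad):
        return "custom_components"  # step 1.2: the plugin fills the gaps
    return "file_import"  # step 1.4: fall back to YAML/JSON text or files
```

For example, a schema whose only unsupported field is `Ecephys.probe` would yield `"custom_components"` when the plugin declares a component for that path, and `"file_import"` otherwise.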

> Reduce the size of the initial download and duration of initial installation by moving some less-commonly-used interfaces to plugins.
> Something that might help would be to remove some of the larger and/or less commonly used interfaces from the default package, and allow users to install them as needed. This would presumably speed up the initial installation, and would reduce the size of the application on the client's computer.

This is a hard technological problem; we've been trying out various solutions. Not all pathways have been exhausted yet

The local installation endpoint idea of #300 works well in the dev environment, but is not possible with the distributed releases (https://stackoverflow.com/questions/66590199/can-you-use-pip-from-within-a-pyinstaller-executable, https://stackoverflow.com/questions/75719852/does-pyinstaller-support-installing-dependent-packages-via-pip-on-first-use).

It might make more sense for plugins to be a 'dev mode' feature anyway, what do you think?

But this would not address making the distribution smaller/faster
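
A small sketch of how a "dev mode only" plugin installer could be gated, assuming the backend is bundled with PyInstaller. The `sys.frozen` and `sys._MEIPASS` attributes are what PyInstaller sets at run time in a bundled app; the function names and the error message are hypothetical.

```python
import sys


def is_frozen_build():
    """True when running from a PyInstaller bundle rather than a dev environment."""
    return bool(getattr(sys, "frozen", False)) and hasattr(sys, "_MEIPASS")


def pip_install_command(requirement):
    """Build the pip command for a plugin, refusing inside a bundled release.

    In a bundle, sys.executable is the packaged application itself, not a
    Python interpreter, so `-m pip` cannot be run through it.
    """
    if is_frozen_build():
        raise RuntimeError(
            "Plugin installation is only available in dev mode."
        )
    return [sys.executable, "-m", "pip", "install", requirement]
```

The returned list can be passed to `subprocess.run` in the dev environment; the frontend could call `is_frozen_build()` first and hide the plugin screen in distributed releases.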

> The installation package is currently very large and takes a long time to install.

I will point out the summary statistics. A quick test of two fresh environments on my system:

neuroconv[minimal]: 0.59 GB
neuroconv[full]: 1.08 GB

So almost twice as large to include instant native support for any format from the get go, even though most users will traverse only a small subset of that.
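
For reference, the ratio of the two figures is 1.08 / 0.59 ≈ 1.83. How the sizes above were measured is not stated; one way to reproduce such a number is to sum the file sizes under each environment folder, as in this sketch (it assumes 1 GB = 10^9 bytes and skips symlinks so linked files are not counted twice).

```python
import os


def directory_size_gb(root):
    """Total size of all regular files under `root`, in GB (10**9 bytes)."""
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            if not os.path.islink(path):
                total_bytes += os.path.getsize(path)
    return total_bytes / 1e9
```

Calling it on the root folder of each fresh environment gives two numbers that can be compared directly.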

I will also plead for a fair baseline comparison when reporting that the installation is 'slow': let's try to get some values of how long ours takes on various systems and crucially compare that to the equivalent process of a

(i) fresh download of conda
(ii) creation of the GUIDE environment from our .yml files
(iii) npm ci

obviously our single executable is more convenient than that, but it essentially does the same thing all pre-bundled (which doesn't necessarily reduce size or make it happen faster)

@garrettmflynn
Member

garrettmflynn commented Sep 18, 2023

@CodyCBakerPhD Could we add another subfolder to the NWB_GUIDE home folder called Interfaces or Plugins (where user-defined entities could be declared, symlinked, or downloaded) to approach this problem?

Just curious since this is a shared filesystem location for both dev and production builds. I'm not sure what the solution would look like—but it might involve something like this.

@garrettmflynn
Member

We'd search for files/subfolders in this folder with a specific structure, then add them to a list of entities (passed to the frontend) that could be used for the conversion.

We'd need a way to ensure that they'd be usable, but that's where I'm hoping y'all can chime in with better Python knowledge
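
A sketch of the folder scan described above, assuming each plugin lives in its own subfolder and declares itself with a small manifest. The manifest file name `plugin.json` and its `name` / `interfaces` keys are hypothetical; no such structure has been agreed on in this thread.

```python
import json
from pathlib import Path

# Hypothetical manifest that marks a subfolder as a plugin.
MANIFEST_NAME = "plugin.json"


def find_plugin_folders(plugins_root):
    """List subfolders of `plugins_root` that contain a readable manifest.

    Symlinked folders are included, since Path.is_dir() follows symlinks.
    Folders without a manifest, or with an invalid one, are skipped.
    """
    root = Path(plugins_root)
    if not root.is_dir():
        return []
    found = []
    for child in sorted(root.iterdir()):
        manifest = child / MANIFEST_NAME
        if not (child.is_dir() and manifest.is_file()):
            continue
        try:
            info = json.loads(manifest.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            continue
        if isinstance(info, dict) and "name" in info and "interfaces" in info:
            found.append(
                {
                    "folder": str(child),
                    "name": info["name"],
                    "interfaces": info["interfaces"],
                }
            )
    return found
```

The returned list is plain JSON-serializable data, so it could be passed to the frontend as the list of available entities. It only shows that a folder is well-formed, not that the code inside is usable.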

@CodyCBakerPhD
Collaborator

CodyCBakerPhD commented Sep 18, 2023

> Could we add another subfolder to the NWB_GUIDE home folder called Interfaces or Plugins (where user-defined entities could be declared, symlinked, or downloaded) to approach this problem?

Of course, that is how we would handle exposure/registration of such elements - assuming no substantial or unfixable differences between dev and dist packaging, of course

The question is, what goes into those subfolders? We could have them compile custom PyInstaller Flask servers for each plugin, but the annoyance there is each one would still need the hefty neuroconv base and would not be able to use the central backend

That's also me assuming it's easy for us to run multiple backends simultaneously

@bendichter
Collaborator Author

@CodyCBakerPhD you are right that "slow" is not a great word for the installation, because it is imprecise and because it is an opinion, not a fact. The fact of the matter is I don't know how slow it is to install because we gave up after ~7 minutes when trying this out at Princeton.

> We could have them compile custom PyInstaller Flask servers for each plugin

Woah that sounds way more complicated than what I was thinking. We can't just have them as external pip-installed packages that we import?

@CodyCBakerPhD
Collaborator

> The fact of the matter is I don't know how slow it is to install because we gave up after ~7 minutes when trying this out at Princeton.

Yes, on a free guest WiFi connection I estimate download + install will probably take about 10 minutes, maybe 15 on an older device.

Even if we figured out how to halve the size of the Python side, that would still be around the ~7 min threshold here

Which honestly IMO is not too bad, just need instructions to download/install ahead of time for shorter workshop events.

> We can't just have them as external pip-installed packages that we import?

Not for the distributable. See above

> The local installation endpoint idea of #300 works well in the dev environment, but is not possible with the distributed releases (https://stackoverflow.com/questions/66590199/can-you-use-pip-from-within-a-pyinstaller-executable, https://stackoverflow.com/questions/75719852/does-pyinstaller-support-installing-dependent-packages-via-pip-on-first-use).
>
> It might make more sense for plugins to be a 'dev mode' feature anyway, what do you think?

@CodyCBakerPhD CodyCBakerPhD changed the title from "data interfaces as plugins" to "[Post-v1.0] Custom data interfaces as plugins" May 15, 2024
@CodyCBakerPhD
Collaborator

Summarized and moved to #847

Re-open again whenever this starts active development


3 participants