-
Notifications
You must be signed in to change notification settings - Fork 169
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consolidate schema files into a single file for each term type #877
Comments
I would just reduce the amount of IO operations, especially when a file system is involved. But that may not be a blocker given that those file can be read once and hold in memory. It seems that the file names hold additional information. Some other way to support that behavior would be to have an additional field in the object definition to denote the context where the fields are valid.
That additional 'context' field in the object definition could also contain arbitrary complex expressions which would not be allowed in a file name. |
well -- it is BIDS, so yes - files hold additional information ;) on on a serious note re
what are concerns - are they pragmatically relevant ? (e.g. is there some kind of indication that loading multiple files pragmatically impairs IO and can't otherwise be easily resolved to gain some speed up -- e.g. parallel threaded operation across files which is trivial to implement, but the question either it is really needed) |
Speed is not the issue because if I'm generating code from it I'd only do that once every time something changed and if I was creating a dictionary to look things up in I would reformat it into a bit format that reads in more quickly. My primary concern is that a filesystem is not the best way to formalize the standard into a machine readable format. I would have to do some github wizardry to selectively look at the schema files instead of this entire repository. This doesn't mean that the many files approach is impossible to work with. |
The only thing now still missing is something that glues all of the object definitions together to get a "coherent schema" (instead of the loosely coupled files). |
Okay, it sounds like there are few real drawbacks or benefits, but since single files will be easier for folks to deal with I will open a PR that consolidates them. I want to make sure that this doesn't cause problems for anyone using the schema files, so it would be great if we could tag anyone who might need to change any code that depends on them. @erdalkaraca this won't include the "glue" definitions you're talking about, and per our call the other day I think the YAMALE files will require their own issue. EDIT: I've opened #883. |
Multiple people have raised concerns with having the schema definitions separated into one file per term. I found that structure easier to work with as I started adding terms to the schema, but that doesn't mean we have to stick with it in the long run. One proposal would be to consolidate all terms for each term type (
metadata
,suffix
,datatype
) into their own file.I'd like to know how people feel about this.
Here are some potential blockers that came to mind:
EchoTime.yaml
andEchoTime_fmap.yaml
for an example). If all fields are in the same file, then we'll need some other way to support this behavior._EEGCoordSys.yaml
). It's probably not a problem when placing them in a single file, but I want to make sure everything still works that way.datatypes
files have a very different structure than themetadata
,suffixes
, orcolumns
(see [SCHEMA] Add TSV column files #827) files.datatypes
information to a new "rules" folder, and create newdatatypes
object files with just the names and definitions. We need to separate object definitions from rules anyway.Some potential benefits:
Tagging some folks who have expressed opinions on this or who I think might have an opinion: @erdalkaraca, @effigies, @yarikoptic, @CPernet, @Remi-Gau, and @Tokazama. Apologies to anyone I might have missed.
The text was updated successfully, but these errors were encountered: