"Copy" via "export" is "larger" (10x fold in this silly example) than original! #1187
Comments
@yarikoptic Your dandi pytest fixture writes NWB files without caching the spec. The export call caches the spec by default. I believe that explains all of the diff. If you want to export without caching the spec, you currently cannot do that using pynwb, but we are going to remedy that in a quick bugfix to pynwb.
coolio, thanks @rly for the quick response!
❯ /tmp/simple2.py /tmp/simple2.nwb /tmp/simple2-copy.nwb && ls -l /tmp/simple2.nwb /tmp/simple2-copy.nwb
Copying /tmp/simple2.nwb /tmp/simple2-copy.nwb using pywnb 2.5.0.post0.dev15
Now reading /tmp/simple2-copy.nwb
/tmp/simple2.py /tmp/simple2.nwb /tmp/simple2-copy.nwb 3.32s user 2.43s system 229% cpu 2.510 total
-rw-rw-r-- 1 yoh yoh 19664 Sep 6 14:48 /tmp/simple2-copy.nwb
-rw-rw-r-- 1 yoh yoh 19664 Sep 5 15:18 /tmp/simple2.nwb
❯ diff -Naur <(h5dump /tmp/simple2.nwb) <(h5dump /tmp/simple2-copy.nwb)
--- /proc/self/fd/18 2024-09-06 14:48:31.938598041 -0400
+++ /proc/self/fd/19 2024-09-06 14:48:31.938598041 -0400
@@ -1,4 +1,4 @@
-HDF5 "/tmp/simple2.nwb" {
+HDF5 "/tmp/simple2-copy.nwb" {
GROUP "/" {
ATTRIBUTE "namespace" {
DATATYPE H5T_STRING {
@@ -45,7 +45,7 @@
}
DATASPACE SCALAR
DATA {
- (0): "154bbc4f-4276-47db-bac9-f7cdc8880aa4"
+ (0): "c8b730fc-f3bf-4619-8069-c66f5ff0a9aa"
}
}
GROUP "acquisition" {
@@ -183,7 +183,7 @@
}
DATASPACE SCALAR
DATA {
- (0): "db410d65-a49a-4bd8-8ec9-ad6076d272e7"
+ (0): "eb09c10a-6ac9-461b-bb44-5bccd2551a3b"
}
}
DATASET "date_of_birth" {
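The diff above differs only in the file name on the header line and in UUID-like identifier values. As a minimal sketch (not part of the original thread), the two h5dump outputs could be normalized before diffing so that such regenerated identifiers do not show up as changes. The function names and placeholder strings here are hypothetical, and the inputs are assumed to be h5dump text already captured as strings.

```python
import difflib
import re

# Matches lowercase UUID-like values such as the ones in the diff above.
UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)
# Matches the h5dump header line that names the dumped file.
HEADER_RE = re.compile(r'^HDF5 ".*" \{$', re.MULTILINE)


def normalize(dump_text):
    """Replace UUID-like values and the file name with fixed placeholders."""
    text = UUID_RE.sub("<uuid>", dump_text)
    return HEADER_RE.sub('HDF5 "<file>" {', text)


def diff_ignoring_uuids(a_text, b_text):
    """Return unified-diff lines between two dumps after normalization."""
    a_lines = normalize(a_text).splitlines()
    b_lines = normalize(b_text).splitlines()
    return list(difflib.unified_diff(a_lines, b_lines, lineterm=""))
```

An empty result means the two dumps differ only in the file name and identifier values; anything left over is a structural or data difference worth inspecting.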
Now I wonder: how can I discover whether the original file had the spec cached, so that I export without caching only if the original did not have it cached?
The spec is cached in the hdf5 nwb file if the root hdf5 file contains an attribute named |
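Since this thread already relies on h5dump output, one rough way to inspect the root-level attributes is to parse that text. The sketch below is an assumption-laden illustration, not a pynwb or hdmf API. It tracks brace depth line by line, assumes every block opens on a line ending in `{` and closes on a line containing only `}`, and collects attribute names found directly under the root group. The attribute name to look for is left to the caller.

```python
import re

# An h5dump attribute opener looks like: ATTRIBUTE "namespace" {
ATTR_RE = re.compile(r'^ATTRIBUTE "(.*)" \{$')


def root_attribute_names(h5dump_text):
    """Return names of attributes attached directly to the root group.

    Depth 0 is outside the HDF5 block, depth 1 is inside it, and depth 2
    is inside GROUP "/", where root-level attributes are declared.
    """
    names = []
    depth = 0
    for raw in h5dump_text.splitlines():
        line = raw.strip()
        if depth == 2:
            match = ATTR_RE.match(line)
            if match:
                names.append(match.group(1))
        if line.endswith("{"):
            depth += 1
        elif line == "}":
            depth -= 1
    return names
```

Checking `name in root_attribute_names(text)` on both the original and the copy would then show whether the attribute in question is present in one file but not the other.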
Alternatively, you can run |
I believe this issue has been resolved. @yarikoptic please reopen if not.
Follow up to
as initially observed while troubleshooting it for
If we use the same script as provided in #1186 with a non-broken hdmf 3.14.3, we get
so you can see that the "copied" file is 189k while the original is just 19k. Is that expected/desired/unavoidable?
Output of
diff -Naur <(h5dump /tmp/simple2.nwb) <(h5dump /tmp/simple2-copy.nwb)
: http://www.oneukrainian.com/tmp/simple2-h5dump.diff
Original file is produced using this pytest fixture: https://github.com/dandi/dandi-cli/blob/HEAD/dandi/tests/fixtures.py#L101
PS: feel welcome to reassign to pynwb if the issue is there.