[Bug]: Weird representation of tags in TimeIntervals #1990

Closed
3 tasks done
gviejo opened this issue Nov 13, 2024 · 3 comments
Labels: category: bug (errors in the code or code behavior); category: question (questions about code or code behavior); priority: medium (non-critical problem and/or affecting only a small set of NWB users)

gviejo commented Nov 13, 2024

What happened?

Hello, I am trying to parse an NWB file with pynapple. I noticed that the tags of a TimeIntervals object are weirdly formatted.
The file is this one: https://dandiarchive.org/dandiset/000939/0.240528.1542/files?location=sub-A3707&page=1
When opening the file and getting the tags for the object 'nrem', for example, the tags appear differently depending on how I access them. Also, they should probably appear as an array of strings.

First steps

import pynapple as nap
a = nap.load_file("/mnt/home/gviejo/Downloads/sub-A3707_behavior+ecephys.nwb")

If I do
1.

a.nwb.objects[a.key_to_id['nrem']].tags[:]

I get:

array(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '1', '0', '1',
       '1', '1', '2', '1', '3', '1', '4', '1', '5', '1', '6', '1', '7',
       '1', '8', '1', '9', '2', '0', '2', '1', '2', '2', '2', '3', '2',
       '4', '2', '5', '2', '6', '2', '7', '2', '8', '2', '9', '3', '0',
       '3', '1', '3', '2', '3', '3', '3', '4', '3', '5', '3', '6', '3',
       '7', '3', '8', '3', '9', '4', '0', '4', '1', '4', '2', '4', '3',
       '4', '4', '4', '5', '4', '6', '4', '7', '4', '8', '4', '9', '5',
       '0', '5', '1', '5', '2', '5', '3', '5', '4', '5', '5', '5', '6',
       '5', '7', '5', '8', '5', '9', '6', '0'], dtype=object)

2.

a.nwb.objects[a.key_to_id['nrem']]['tags'][:]

I get:

[array(['0'], dtype=object),
 array(['1'], dtype=object),
 array(['2'], dtype=object),
 array(['3'], dtype=object),
 array(['4'], dtype=object),
 array(['5'], dtype=object),
 array(['6'], dtype=object),
 array(['7'], dtype=object),
 array(['8'], dtype=object),
 array(['9'], dtype=object),
 array(['1', '0'], dtype=object),
 array(['1', '1'], dtype=object),
 array(['1', '2'], dtype=object),
 array(['1', '3'], dtype=object),
 array(['1', '4'], dtype=object),
 array(['1', '5'], dtype=object),
 array(['1', '6'], dtype=object),
 array(['1', '7'], dtype=object),
 array(['1', '8'], dtype=object),
 array(['1', '9'], dtype=object),
 array(['2', '0'], dtype=object),
 array(['2', '1'], dtype=object),
 array(['2', '2'], dtype=object),
 array(['2', '3'], dtype=object),
 array(['2', '4'], dtype=object),
 array(['2', '5'], dtype=object),
 array(['2', '6'], dtype=object),
 array(['2', '7'], dtype=object),
 array(['2', '8'], dtype=object),
 array(['2', '9'], dtype=object),
 array(['3', '0'], dtype=object),
 array(['3', '1'], dtype=object),
 array(['3', '2'], dtype=object),
 array(['3', '3'], dtype=object),
 array(['3', '4'], dtype=object),
 array(['3', '5'], dtype=object),
 array(['3', '6'], dtype=object),
 array(['3', '7'], dtype=object),
 array(['3', '8'], dtype=object),
 array(['3', '9'], dtype=object),
 array(['4', '0'], dtype=object),
 array(['4', '1'], dtype=object),
 array(['4', '2'], dtype=object),
 array(['4', '3'], dtype=object),
 array(['4', '4'], dtype=object),
 array(['4', '5'], dtype=object),
 array(['4', '6'], dtype=object),
 array(['4', '7'], dtype=object),
 array(['4', '8'], dtype=object),
 array(['4', '9'], dtype=object),
 array(['5', '0'], dtype=object),
 array(['5', '1'], dtype=object),
 array(['5', '2'], dtype=object),
 array(['5', '3'], dtype=object),
 array(['5', '4'], dtype=object),
 array(['5', '5'], dtype=object),
 array(['5', '6'], dtype=object),
 array(['5', '7'], dtype=object),
 array(['5', '8'], dtype=object),
 array(['5', '9'], dtype=object),
 array(['6', '0'], dtype=object)]

The correct output should be:

['0' '1' '2' '3' '4' '5' '6' '7' '8' '9' '10' '11' '12' '13' '14' '15'
 '16' '17' '18' '19' '20' '21' '22' '23' '24' '25' '26' '27' '28' '29'
 '30' '31' '32' '33' '34' '35' '36' '37' '38' '39' '40' '41' '42' '43'
 '44' '45' '46' '47' '48' '49' '50' '51' '52' '53' '54' '55' '56' '57'
 '58' '59' '60']

Steps to Reproduce

import pynapple as nap
import numpy as np

a = nap.load_file("/mnt/home/gviejo/Downloads/sub-A3707_behavior+ecephys.nwb")

a.nwb.objects[a.key_to_id['nrem']]['tags'][:]
a.nwb.objects[a.key_to_id['nrem']].tags[:]

Traceback

No response

Operating System

Linux

Python Executable

Python

Python Version

3.10

Package Versions

PyNWB 2.8.1

stephprince commented Nov 14, 2024

Hi @gviejo, thanks for the issue.

> I noticed that the tag of a TimeIntervals objects is weirdly formatted.

Unfortunately, in the example you shared this appears to be a bug with how the data was originally written to the file. It seems the tags were incorrectly parsed into separate strings when the data was written. We can contact the authors of this dandiset to determine where this issue might have originated in PyNWB/MatNWB.
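Until the file is rewritten, a reader-side stopgap is possible. The sketch below rejoins the single-character fragments of each row into one tag string. It assumes every interval originally carried exactly one tag (as the expected output in the report suggests); a row that truly held several tags would be merged incorrectly. The helper name `rejoin_split_tags` and the sample rows are hypothetical, not part of PyNWB or pynapple.

```python
def rejoin_split_tags(rows):
    """Join the fragments of each row back into a single tag string.

    `rows` is any sequence of per-row sequences of strings, such as the
    list of small arrays returned by ['tags'][:] in the report.
    Assumption: each row originally held exactly one tag.
    """
    return ["".join(str(part) for part in row) for row in rows]


# Hypothetical sample mimicking the per-row output shown in the report.
per_row = [["0"], ["9"], ["1", "0"], ["6", "0"]]
fixed = rejoin_split_tags(per_row)
print(fixed)  # ['0', '9', '10', '60']
```

With the real file, the same helper could be applied to the result of `a.nwb.objects[a.key_to_id['nrem']]['tags'][:]`, since iterating each small object array yields its string fragments.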

> When opening and getting the tags for the object 'nrem' for example, the tags appears differently depending on how I access it.

These differences are due to how ragged arrays are stored, and .tags[:] is accessing the raw VectorData object. This is potentially confusing behavior and ideally .tags[:] and ['tags'][:] should return the same output. We've opened up an issue in hdmf to address this: hdmf-dev/hdmf#1210

I believe the intended behavior would be what is returned currently by ['tags'][:], since this representation will handle any ragged arrays where there might be multiple tags per row in the DynamicTable.
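To illustrate why the two access paths differ, here is a minimal pure-Python sketch of the general ragged-array layout: one flat list of values plus one cumulative end offset per row. The names `flat_tags`, `end_offsets`, and `rows_from_ragged` are hypothetical stand-ins, not the hdmf API; the flat list plays the role of the raw VectorData that `.tags[:]` exposes, and the reconstructed rows correspond to what `['tags'][:]` returns.

```python
# Flat storage: every tag fragment from every row, concatenated in order.
flat_tags = ["0", "1", "2", "1", "0", "1", "1"]

# Cumulative end offsets, one per table row: row i spans
# flat_tags[end_offsets[i - 1]:end_offsets[i]] (with 0 as the first start).
end_offsets = [1, 2, 3, 5, 7]


def rows_from_ragged(flat, ends):
    """Rebuild per-row lists from flat values and cumulative end offsets."""
    rows = []
    start = 0
    for end in ends:
        rows.append(flat[start:end])
        start = end
    return rows


rows = rows_from_ragged(flat_tags, end_offsets)
print(rows)  # [['0'], ['1'], ['2'], ['1', '0'], ['1', '1']]
```

Reading only the flat list loses the row boundaries, which is why the raw view in the report looks like one long run of single characters.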

stephprince self-assigned this Nov 14, 2024
stephprince added the category: bug, priority: medium, and category: question labels Nov 14, 2024
gviejo commented Nov 15, 2024

Thanks. Yes I think it would be nice to update the dandiset if possible.

stephprince commented

I will close this issue for now in favor of the relevant hdmf issue and dandiset issue discussion post. I emailed the corresponding author of the dandiset and they plan to upload an updated version in January.
