Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[IO] added support for new pattern of datetime when working with Neuralynx #1370

Conversation

Antho2422
Copy link
Contributor

@Antho2422 Antho2422 commented Jan 18, 2024

… data in version 2.2.2 of Pegasus. This is fixing the error when trying to parse the date time with the wrong format.

@zm711
Copy link
Contributor

zm711 commented Jan 18, 2024

@PeterNSteinmetz, any chance you could look at this? I have nothing to do with neuralynx. Seems like we can't specifically get a test file so there might not be much we can do yet. Have you worked with the Pegasus version of neurolynx at all?

@PeterNSteinmetz
Copy link
Contributor

@zm711 From what I see in the changes they are applied to the correct two places for a new type. It would of course be better to have a test file from the author.

I see that we already have a ephy_testing_data/neuralynx/Pegasus_v2.11 directory with a .nev file. Does this test properly against that data file? Running this test looks like it would require adding it to TestNeuralynxRawIO.entities_to_test in file test/rawiotest/testneuralynxrawio.py .

@Antho2422
Copy link
Contributor Author

I don't know if a header file is sufficient but I may be able to provide one if needed

@PeterNSteinmetz
Copy link
Contributor

@Antho2422 What do you mean by a header file? Normally Neuralynx files are .Ncs, .Nse, .Nse, .Nst, or .Ntt. These each contain a 16k text header at the start.

To keep test files short one can usually truncate them to just a few records of the type. So for example, a 5 record .Nse file would be 16384 + 5 * 1044 bytes long. The length of records for each type can be found in https://neuralynx.com/_software/NeuralynxDataFileFormats.pdf .

@Antho2422
Copy link
Contributor Author

@PeterNSteinmetz Yes nevermind I was not aware of that, in our data we extract the header from the .ncs to have also a .txt file containing all the infos in the header. I did my testing on that.

But I will check if I can be able to share some small .ncs dataset.

Thank you,
Anthony

@PeterNSteinmetz
Copy link
Contributor

@Antho2422 Yes, even just a single small .Ncs trimmed to 5 or 10 records will work. I can do the trimming if you get it to me.

We just like to have an actual sample of what you are working with to put in the public ephy_testing_data repository so the automated tests can run against it.

@alejoe91 alejoe91 added this to the 0.13.0 milestone Jan 24, 2024
@apdavison apdavison changed the title [IO] added support for new pattern of datetime when working with Neur… [IO] added support for new pattern of datetime when working with Neuralynx Jan 26, 2024
@apdavison apdavison modified the milestones: 0.13.0, 0.14.0 Jan 26, 2024
@Antho2422
Copy link
Contributor Author

Hi all,

Sorry for the wait. Here is a small dataset for testing : https://instituteicm-my.sharepoint.com/:f:/g/personal/anthony_pinto_icm-institute_org/EhK9xHJBvNJNjyk2ZDl1PrEBwXlj0HylxlXla8D8j4j-zw?e=N66jTs

Keep me updated if this is sufficient.

Thank you,
Anthony

@PeterNSteinmetz
Copy link
Contributor

@Antho2422 Thanks I have downloaded your test data. On first blush it looks like the important parts of the header are the same as some test data we have, but I will likely include some files from your set for testing purposes.

@Antho2422
Copy link
Contributor Author

Hi @PeterNSteinmetz,
Do I need to modify something in order to get this merged or is it all good ? :)

@zm711
Copy link
Contributor

zm711 commented Feb 13, 2024

@Antho2422, we need to make sure we have the appropriate test files on our test repo (gin). Then we would run tests and then we can do a final review for merge. I need to setup an account otherwise we have to wait for someone else to move your files onto the test server. Also we ran some formatting since you opened your PR so you now have conflicts that need to be fixed. Could you try to fix those for us?

@zm711
Copy link
Contributor

zm711 commented Feb 13, 2024

Hey @Antho2422, I know you're almost done, but I should've told you. We started using black. So if you just run your files with black it should fix all conflicts without having to go through everything.

@Antho2422
Copy link
Contributor Author

@zm711 sorry I am not used to resolve merge conflict via github I am not sure how to resolve this last one..

@zm711
Copy link
Contributor

zm711 commented Feb 14, 2024

I'm a little busy for the next couple days, but I'll come back and try to help as soon as I can :)

@PeterNSteinmetz
Copy link
Contributor

@zm711 I have now begun a series of pull requests #1403 and #1404 plus others to address this and more generalized header parsing. I would suggest we merge this before those. This change does allow the header of these Pegasus files to load.

However, I have been checking and despite loading the header, these files are causing problems with finding blocks of continuous records of data. That is likely due to some strangeness in how Pegasus is dealing with sampling rate and timestamps in the records in the file. I am investigating presently.

@zm711
Copy link
Contributor

zm711 commented Feb 15, 2024

@PeterNSteinmetz, thanks. I think you're the resident neuralynx person, but if you normally work with someone else on the team feel free to ping them. I don't use neuralynx at all, so I would love to make sure I understand what is going on before I merge anything. Were you able to add all the test files to gin?

@zm711
Copy link
Contributor

zm711 commented Feb 15, 2024

@Antho2422 and @PeterNSteinmetz. Tests are failing, do you guys know why?

@PeterNSteinmetz
Copy link
Contributor

@zm711 No I don't. When I check out the master branch here it runs all the tests fine.

@PeterNSteinmetz
Copy link
Contributor

@zm711 The testing file additions are a problem. I don't know how we are supposed to add them, etc. See my thread in the google group.

@PeterNSteinmetz
Copy link
Contributor

@Antho2422 So here is the issue that I am now seeing with the Pegasus v2.2.2 files you provided. If one looks carefully at the records in the file, Pegasus is causing little jumps in the timestamps about 20-30 microseconds in length. What this means for this code is that the NeuralyxRawIO will create multiple Segments because a Segment is one continuous block of samples equally spaced in time.

For example, the ..._ieeg_e01c02.ncs file provided ends up having 13 segments. If you are fine with that then no further changes will be needed.

How have you been viewing the data previously?

@zm711
Copy link
Contributor

zm711 commented Feb 16, 2024

@zm711 The testing file additions are a problem. I don't know how we are supposed to add them, etc. See my thread in the google group.

I haven't set up my gin either :( . Sorry. I try to set it up soon so I can work on this. But I agree on your google group comment that the instructions are hard to follow/parse.

@PeterNSteinmetz
Copy link
Contributor

@Antho2422 One other thing. When adding these test files we would like to know how the attribution in the README.txt should read. Please see other examples for other type subdirectories.

@zm711 For the merge of this one, I guess we have to wait until Anthony has finished with the checks, hopefully in a few days.

@Antho2422
Copy link
Contributor Author

Antho2422 commented Feb 16, 2024

@Antho2422 So here is the issue that I am now seeing with the Pegasus v2.2.2 files you provided. If one looks carefully at the records in the file, Pegasus is causing little jumps in the timestamps about 20-30 microseconds in length. What this means for this code is that the NeuralyxRawIO will create multiple Segments because a Segment is one continuous block of samples equally spaced in time.

For example, the ..._ieeg_e01c02.ncs file provided ends up having 13 segments. If you are fine with that then no further changes will be needed.

How have you been viewing the data previously?

Hi @PeterNSteinmetz and @zm711 ! Thank you for working on that. I did not notice those jumps you mentioned but tbh I'm not really familiar with the data structure of Neuralynx. I don't know if this can cause further issues but I normally just concatenate the segments when working with multiple segments.

@Antho2422
Copy link
Contributor Author

@Antho2422 One other thing. When adding these test files we would like to know how the attribution in the README.txt should read. Please see other examples for other type subdirectories.

@zm711 For the merge of this one, I guess we have to wait until Anthony has finished with the checks, hopefully in a few days.

@PeterNSteinmetz I will do that right now

@Antho2422
Copy link
Contributor Author

@PeterNSteinmetz here is the README file. Please tell me if this is enough or if I need to add more information to it.

README.txt

@PeterNSteinmetz
Copy link
Contributor

PeterNSteinmetz commented Feb 16, 2024

@Antho2422 If you are going to mention the recording from a patient, it raises a question. Has this data been de-identified or anonymized and approved as such by the local IRB?

Otherwise we should likely de-identify fully. I see that the dates have been set to 1970. I have already shortened the file name on the first two which I will use and can change them further. I would suggest we also change or remove the -FileUUID and -Session UUID fields. I can readily do that and change the README.txt to state the data have been de-identified. We could also say it has been anonymized if your lab destroys all records of the correspondence between these files and the patient. Otherwise de-identification is likely good enough.

This pull request now just waits on completing the changes you were making and fixing the tests issue.

@samuelgarcia
Copy link
Contributor

@Antho2422 we do not have permission to open publically data from epileptic patient in France!!
Unless you have a clear CPP for that (french autorisation ), which I think this is not the case.

@Antho2422
Copy link
Contributor Author

@samuelgarcia @PeterNSteinmetz The data have been fully anonymized. I can verify if you want but I got the authorization to share it. I will double check and get the necessary documents if needed

@PeterNSteinmetz
Copy link
Contributor

@Antho2422 The presence of the file and session UUIDs seem to me a possible way to re-identify the patient and that would violate anonymization. But if you are confident with the members of the lab that it is all right to share, I am fine with that.

@PeterNSteinmetz
Copy link
Contributor

I think the top priority here should be on getting the checks to pass on this PR. The question of the test and the test data can be left for a different PR. That will get Anthony up and running using the master branch.

datetime1_regex=r"-TimeCreated (?P<date>\S+) (?P<time>\S+)",
datetime2_regex=r"-TimeClosed (?P<date>\S+) (?P<time>\S+)",
filename_regex=r'-OriginalFileName "?(?P<filename>\S+)"?',
datetimeformat="%Y/%m/%d %H:%M:%S.%f",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test seems to be failing because you are asking for fractional seconds. But one of our current test data sets doesn't have this fraction. Are you sure that these datasets all have a fractional second provided. or sometimes they don't. In which case this may need to be different @Antho2422 ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem was that Pegasus v2.2.2 uses different time formats in the opening and closing timestamps. PR#1408 fixes this and should replace this PR. I suggest closing this one for the time being and merging that one.

@zm711
Copy link
Contributor

zm711 commented Feb 27, 2024

@Antho2422, Thanks so much for starting this. As Peter explained there was a problem with the closing time for the header that he fixed in #1408, so that PR was merged. We will close this one, but we really appreciate you getting this process started to improve Neuralynx.

@zm711 zm711 closed this Feb 27, 2024
@zm711 zm711 modified the milestones: 0.14.0, 0.13.4 Oct 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants