-
Notifications
You must be signed in to change notification settings - Fork 249
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[IO] added support for new pattern of datetime when working with Neuralynx #1370
[IO] added support for new pattern of datetime when working with Neuralynx #1370
Conversation
…alynx data in version 2.2.2 of Pegasus
@PeterNSteinmetz, any chance you could look at this? I have nothing to do with neuralynx. Seems like we can't specifically get a test file so there might not be much we can do yet. Have you worked with the Pegasus version of neurolynx at all? |
@zm711 From what I see in the changes they are applied to the correct two places for a new type. It would of course be better to have a test file from the author. I see that we already have a ephy_testing_data/neuralynx/Pegasus_v2.11 directory with a .nev file. Does this test properly against that data file? Running this test looks like it would require adding it to TestNeuralynxRawIO.entities_to_test in file test/rawiotest/testneuralynxrawio.py . |
I don't know if a header file is sufficient but I may be able to provide one if needed |
@Antho2422 What do you mean by a header file? Normally Neuralynx files are .Ncs, .Nse, .Nse, .Nst, or .Ntt. These each contain a 16k text header at the start. To keep test files short one can usually truncate them to just a few records of the type. So for example, a 5 record .Nse file would be 16384 + 5 * 1044 bytes long. The length of records for each type can be found in https://neuralynx.com/_software/NeuralynxDataFileFormats.pdf . |
@PeterNSteinmetz Yes nevermind I was not aware of that, in our data we extract the header from the .ncs to have also a .txt file containing all the infos in the header. I did my testing on that. But I will check if I can be able to share some small .ncs dataset. Thank you, |
@Antho2422 Yes, even just a single small .Ncs trimmed to 5 or 10 records will work. I can do the trimming if you get it to me. We just like to have an actual sample of what you are working with to put in the public ephy_testing_data repository so the automated tests can run against it. |
Hi all, Sorry for the wait. Here is a small dataset for testing : https://instituteicm-my.sharepoint.com/:f:/g/personal/anthony_pinto_icm-institute_org/EhK9xHJBvNJNjyk2ZDl1PrEBwXlj0HylxlXla8D8j4j-zw?e=N66jTs Keep me updated if this is sufficient. Thank you, |
@Antho2422 Thanks I have downloaded your test data. On first blush it looks like the important parts of the header are the same as some test data we have, but I will likely include some files from your set for testing purposes. |
Hi @PeterNSteinmetz, |
@Antho2422, we need to make sure we have the appropriate test files on our test repo (gin). Then we would run tests and then we can do a final review for merge. I need to setup an account otherwise we have to wait for someone else to move your files onto the test server. Also we ran some formatting since you opened your PR so you now have conflicts that need to be fixed. Could you try to fix those for us? |
Hey @Antho2422, I know you're almost done, but I should've told you. We started using black. So if you just run your files with black it should fix all conflicts without having to go through everything. |
@zm711 sorry I am not used to resolve merge conflict via github I am not sure how to resolve this last one.. |
I'm a little busy for the next couple days, but I'll come back and try to help as soon as I can :) |
@zm711 I have now begun a series of pull requests #1403 and #1404 plus others to address this and more generalized header parsing. I would suggest we merge this before those. This change does allow the header of these Pegasus files to load. However, I have been checking and despite loading the header, these files are causing problems with finding blocks of continuous records of data. That is likely due to some strangeness in how Pegasus is dealing with sampling rate and timestamps in the records in the file. I am investigating presently. |
@PeterNSteinmetz, thanks. I think you're the resident neuralynx person, but if you normally work with someone else on the team feel free to ping them. I don't use neuralynx at all, so I would love to make sure I understand what is going on before I merge anything. Were you able to add all the test files to gin? |
@Antho2422 and @PeterNSteinmetz. Tests are failing, do you guys know why? |
@zm711 No I don't. When I check out the master branch here it runs all the tests fine. |
@zm711 The testing file additions are a problem. I don't know how we are supposed to add them, etc. See my thread in the google group. |
@Antho2422 So here is the issue that I am now seeing with the Pegasus v2.2.2 files you provided. If one looks carefully at the records in the file, Pegasus is causing little jumps in the timestamps about 20-30 microseconds in length. What this means for this code is that the NeuralyxRawIO will create multiple Segments because a Segment is one continuous block of samples equally spaced in time. For example, the ..._ieeg_e01c02.ncs file provided ends up having 13 segments. If you are fine with that then no further changes will be needed. How have you been viewing the data previously? |
I haven't set up my gin either :( . Sorry. I try to set it up soon so I can work on this. But I agree on your google group comment that the instructions are hard to follow/parse. |
@Antho2422 One other thing. When adding these test files we would like to know how the attribution in the README.txt should read. Please see other examples for other type subdirectories. @zm711 For the merge of this one, I guess we have to wait until Anthony has finished with the checks, hopefully in a few days. |
Hi @PeterNSteinmetz and @zm711 ! Thank you for working on that. I did not notice those jumps you mentioned but tbh I'm not really familiar with the data structure of Neuralynx. I don't know if this can cause further issues but I normally just concatenate the segments when working with multiple segments. |
@PeterNSteinmetz I will do that right now |
@PeterNSteinmetz here is the README file. Please tell me if this is enough or if I need to add more information to it. |
@Antho2422 If you are going to mention the recording from a patient, it raises a question. Has this data been de-identified or anonymized and approved as such by the local IRB? Otherwise we should likely de-identify fully. I see that the dates have been set to 1970. I have already shortened the file name on the first two which I will use and can change them further. I would suggest we also change or remove the -FileUUID and -Session UUID fields. I can readily do that and change the README.txt to state the data have been de-identified. We could also say it has been anonymized if your lab destroys all records of the correspondence between these files and the patient. Otherwise de-identification is likely good enough. This pull request now just waits on completing the changes you were making and fixing the tests issue. |
@Antho2422 we do not have permission to open publically data from epileptic patient in France!! |
@samuelgarcia @PeterNSteinmetz The data have been fully anonymized. I can verify if you want but I got the authorization to share it. I will double check and get the necessary documents if needed |
@Antho2422 The presence of the file and session UUIDs seem to me a possible way to re-identify the patient and that would violate anonymization. But if you are confident with the members of the lab that it is all right to share, I am fine with that. |
I think the top priority here should be on getting the checks to pass on this PR. The question of the test and the test data can be left for a different PR. That will get Anthony up and running using the master branch. |
datetime1_regex=r"-TimeCreated (?P<date>\S+) (?P<time>\S+)", | ||
datetime2_regex=r"-TimeClosed (?P<date>\S+) (?P<time>\S+)", | ||
filename_regex=r'-OriginalFileName "?(?P<filename>\S+)"?', | ||
datetimeformat="%Y/%m/%d %H:%M:%S.%f", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test seems to be failing because you are asking for fractional seconds. But one of our current test data sets doesn't have this fraction. Are you sure that these datasets all have a fractional second provided. or sometimes they don't. In which case this may need to be different @Antho2422 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem was that Pegasus v2.2.2 uses different time formats in the opening and closing timestamps. PR#1408 fixes this and should replace this PR. I suggest closing this one for the time being and merging that one.
@Antho2422, Thanks so much for starting this. As Peter explained there was a problem with the closing time for the header that he fixed in #1408, so that PR was merged. We will close this one, but we really appreciate you getting this process started to improve Neuralynx. |
… data in version 2.2.2 of Pegasus. This is fixing the error when trying to parse the date time with the wrong format.