
Make the dataset BIDS compatible #1

Closed
jcohenadad opened this issue Dec 16, 2022 · 23 comments

@jcohenadad
Member

Currently, the naming with the "_mean" suffix is not BIDS compatible. I think there are ways to describe the processing applied to a time series.

@MerveKaptan
Collaborator

MerveKaptan commented Dec 28, 2022

What about this?

data_leipzig_rest
   ├── dataset_description.json
   ├── derivatives
   │   ├── labels
   │   │   └── sub-leipzigR01
   │   │       └── func
   │   │           └── sub-leipzigR01_task-rest_desc-mocomeanseg.nii.gz  <---- Manual spinal cord segmentation
   │   └── moco
   │       └── sub-leipzigR01
   │           └── func
   │               ├── sub-leipzigR01_task-rest_desc-moco_bold.nii.gz     <---- 20 motion-corrected EPI volumes
   │               └── sub-leipzigR01_task-rest_desc-mocomean_bold.nii.gz <---- Mean of motion-corrected volumes 
   ├── participants.json
   ├── participants.tsv
   └── task-rest_bold.json

Source: https://hackmd.io/@effigies/bids-derivatives-readme
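
As a side note, the top-level dataset_description.json would also need to mark the dataset as a derivative. A minimal sketch, assuming the whole dataset is shared as a BIDS derivative; the dataset name, BIDS version, pipeline name, and pipeline version below are placeholders, not the actual values for this project:

{
    "Name": "data_leipzig_rest",
    "BIDSVersion": "1.8.0",
    "DatasetType": "derivative",
    "GeneratedBy": [
        {
            "Name": "moco-pipeline",
            "Version": "0.0.0"
        }
    ]
}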

@jcohenadad
Member Author

Looks like a good start! To be closer to the official examples, I would do:

  • labels --> label
  • sub-leipzigR01_task-rest_desc-mocomeanseg.nii.gz --> sub-leipzigR01_task-rest_desc-spinalcord_mask.nii.gz

@MerveKaptan
Collaborator

Okay, perfect, thank you! I will change it and re-upload the data.

@MerveKaptan
Collaborator

Dear all,
As you know, our dataset consists of derivatives only. Therefore, OpenNeuro in its current form would not accept it, but they are updating their platform, and this is the reply I got from them:

Thanks for reaching out. This is quite timely, as we've been working on a validator for derivatives that would allow us to host derivatives-only datasets. The current plan is to roll it out as an option that can be enabled on a per-dataset basis by an admin, and we hope to do that within the next month or so. If you'd like, I can get back in touch when we're ready and we can get it set up for your dataset; this would help us test out our processes for non-admin users.
What I would recommend for the data organization would be to treat it as one coherent dataset, and not two parallel datasets:

data_leipzig_rest
├── dataset_description.json
├── sub-leipzigR01
│   └── func
│       ├── sub-leipzigR01_task-rest_desc-moco_bold.json       <---- sidecar json file containing imaging parameters
│       ├── sub-leipzigR01_task-rest_desc-moco_bold.nii.gz     <---- 20 motion-corrected EPI volumes                 
│       ├── sub-leipzigR01_task-rest_desc-mocomean_bold.nii.gz <---- Mean of motion-corrected volumes
│       └── sub-leipzigR01_task-rest_desc-spinalcord_mask.nii.gz  <---- Manual spinal cord segmentation
├── participants.json
├── participants.tsv
└── task-rest_bold.json

In the meantime, you can test out the new validator by using the version published to https://deno.land/x/bids_validator:
deno run --allow-read --allow-env https://deno.land/x/bids_validator/bids-validator.ts path/to/dataset
We'd be happy to hear any issues you run into. You can open them here: https://github.com/bids-standard/bids-validator/issues
Best,
Chris

What do you think?
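
For reference, a minimal sketch of what the sidecar sub-leipzigR01_task-rest_desc-moco_bold.json could contain; the field names are standard BIDS metadata for BOLD data, but the values below are placeholders rather than the actual Leipzig acquisition parameters:

{
    "TaskName": "rest",
    "RepetitionTime": 2.0,
    "EchoTime": 0.03,
    "FlipAngle": 80
}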

@jcohenadad
Member Author

this is the reply I got from them:

could you please cross-ref the GH conversation

@jcohenadad
Member Author

As you know, our dataset consists of derivatives only.

maybe we should consider having the images NOT in the derivatives, but in the common location?

@MerveKaptan
Collaborator

As you know, our dataset consists of derivatives only.

maybe we should consider having the images NOT in the derivatives, but in the common location?

this is the reply I got from them:

could you please cross-ref the GH conversation

Hi Julien,

I am so sorry, I do not understand what I should cross-reference.

@jcohenadad
Member Author

jcohenadad commented Aug 8, 2023

@rohanbanerjee pls help with #1 (comment). Always put hyperlinks so we can go through the various GH repos -- GH is a public communication platform.

@MerveKaptan
Collaborator

As you know, our dataset consists of derivatives only.

maybe we should consider having the images NOT in the derivatives, but in the common location?

I think we can use the 20 volumes as source data and then mask and mean as derivatives! We would need to change the naming of sub-leipzigR01_task-rest_desc-moco_bold.nii.gz a bit to make it compatible as source data!

alternatively, we can think about sharing data here: https://data.mendeley.com/

@jcohenadad
Member Author

I think we can use the 20 volumes as source data and then mask and mean as derivatives!

can we consider using the mean in the source data (even though I know this is not really a source...)

alternatively, we can think about sharing data here: https://data.mendeley.com/

no, let's stick with OpenNeuro

@MerveKaptan
Collaborator

MerveKaptan commented Aug 8, 2023

Good questions!

I think we can use the 20 volumes as source data and then mask and mean as derivatives!

can we consider using the mean in the source data (even though I know this is not really a source...)

Technically, we can! I had shared one-volume EPI data as the source data (as we just acquired one volume for that specific acquisition). Would you want me to ask these questions to OpenNeuro developers and get back to you?

alternatively, we can think about sharing data here: https://data.mendeley.com/

no, let's stick with OpenNeuro

sure, I also think that would be better!

@jcohenadad
Member Author

Would you want me to ask these questions to OpenNeuro developers and get back to you?

yes, ask the question, but instead of getting back to me, please cross-ref the conversation here -- again, GH is a public communication platform and we should all be able to see and participate in the discussion thread with OpenNeuro -- if this is unclear, @rohanbanerjee please chat with @MerveKaptan to clarify

@rohanbanerjee
Collaborator

Sure, we will open an issue on OpenNeuro GH and post the link here.

@MerveKaptan
Collaborator

Sure, we will open an issue on OpenNeuro GH and post the link here.

Thank you both for clarifying! I was not using GH previously but rather their ticket system: https://openneuro.freshdesk.com/support/tickets/1525
Now we will move it to GitHub as Rohan suggested!

@jcohenadad
Member Author

Thank you both for clarifying! I was not using GH previously but rather their ticket system: https://openneuro.freshdesk.com/support/tickets/1525
Now we will move it to GitHub as Rohan suggested!

no -- if they have a ticket system, then you should use their ticket system, or whatever they ask users to use. I was suggesting GH because I thought they were using GH issues as their user-facing ticket system. On the other hand, I am not able to see your ticket, which is not great... communication alternatives are neurostars.org (or maybe a GH issue if they also have that).

Anyhow, at this point it is rather up to us to decide what to do with this dataset and how to convert it to BIDS, right?

@MerveKaptan
Collaborator

Hi @jcohenadad,

I think both are fine. As you suggested, I have moved this to GitHub.

Actually, talking to Chris from the OpenNeuro team helped a lot, and I finally have an idea of how we can organize the data!

What we need to do is have a source dataset, which will be the 20-volume time series, and we can basically organize the derivatives however we would like!

@MerveKaptan
Collaborator

Hi @jcohenadad,

What do you think? Once we decide on an organization, I can re-organize the data as we want to.

  • I can move the 20 motion-corrected volumes to source data and keep the derivatives as they are.
  • Alternatively, I can move both derivatives into one folder (instead of separating them into moco mean and spinal segmentation).

Please do let me know!

@jcohenadad
Member Author

the original motivation for having the moco dataset in source is because of #1 (comment). Now, if #1 (comment) is fixed, then we can put everything under derivatives/; if it is not, then you split between source (moco data) and derivatives (mean and seg).

@MerveKaptan
Collaborator

Thank you, @jcohenadad! I missed this reply for some reason. Okay, I will ask the OpenNeuro team again and act accordingly.

@MerveKaptan
Collaborator

Dear @jcohenadad & @rohanbanerjee,

FYI, I have been testing out a few things, and I believe it will be easier and quicker to go with the following solution: "split between source (moco data) and derivatives (mean and seg)". I will reorganize the data and start uploading.
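
To make that concrete, a sketch of what the split could look like, reusing the file names from the trees above; the derivatives subfolder name (label) is carried over from the earlier discussion and is not final:

data_leipzig_rest
├── dataset_description.json
├── derivatives
│   └── label
│       └── sub-leipzigR01
│           └── func
│               ├── sub-leipzigR01_task-rest_desc-mocomean_bold.nii.gz    <---- Mean of motion-corrected volumes
│               └── sub-leipzigR01_task-rest_desc-spinalcord_mask.nii.gz  <---- Manual spinal cord segmentation
├── sub-leipzigR01
│   └── func
│       ├── sub-leipzigR01_task-rest_desc-moco_bold.json       <---- sidecar json file containing imaging parameters
│       └── sub-leipzigR01_task-rest_desc-moco_bold.nii.gz     <---- 20 motion-corrected EPI volumes
├── participants.json
├── participants.tsv
└── task-rest_bold.json

The reorganized tree can then be checked with the deno validator command Chris shared above.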

That being said, for some sites, we would still need the sidecar .json files, as mentioned here.

Thank you!
Merve

@MerveKaptan
Collaborator

Hi @jcohenadad & @rohanbanerjee,

Another question about the BIDS organization. Currently, the latest BIDS version does not have an option for multisite data; please see here.

We can either treat each dataset separately or combine all the subjects into one dataset and add a site column to the participants.tsv file, as suggested in the link above.

I believe the second option will be neater, as it is one project. Would you agree?
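
For illustration, a minimal sketch of what the combined participants.tsv could look like with a site column; the subject IDs other than sub-leipzigR01 and the site labels are made-up placeholders:

participant_id    site
sub-leipzigR01    leipzig
sub-leipzigR02    leipzig
sub-siteB01       siteB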

Thank you,
Merve

@jcohenadad
Member Author

Currently, the latest BIDS version does not have an option for multisite data; please see here.

it does, see here and an example for the spine-generic project.

I believe the second option will be neater, as it is one project. Would you agree?

👍

@rohanbanerjee
Collaborator

The dataset is BIDS-compliant now and has been uploaded to OpenNeuro. Closing the issue.
