Move GmsegChallenge outside of sct-testing-large #195

Open
3 of 4 tasks
valosekj opened this issue Dec 13, 2022 · 6 comments

@valosekj
Member

valosekj commented Dec 13, 2022

After a discussion with @jcohenadad, we decided to move GmsegChallenge subjects outside of sct-testing-large.

This will include the following steps:

  • modification of sct-testing-large dataset - removal of GmsegChallenge nii files and entries from participants.tsv
  • creation of a new git-annex gmseg-challenge dataset
  • adding all 80 subjects to the gmseg-challenge dataset (currently, there are only 50 GmsegChallenge subjects in sct-testing-large)
  • update of the G Sheets dataset table
valosekj added the `dataset: sct-testing-large` label Dec 13, 2022
valosekj self-assigned this Dec 28, 2022
@valosekj
Member Author

TL;DR: sct-testing-large contains SC and GM segmentations, but the originally provided data contains WM segmentations from 4 raters.


sct-testing-large currently contains 50 of the 80 GmsegChallenge subjects; 10 UCL and 20 Montreal subjects are missing:

see participants.tsv
$ column_tab participants.tsv | grep GmsegChallenge                                                        
sub-uclGmsegChallenge001      n/a  n/a  unknown           unknown        n/a            n/a           HC                                 ucl_gmsegChallenge16_01                  uclGmsegChallenge
sub-uclGmsegChallenge002      n/a  n/a  unknown           unknown        n/a            n/a           HC                                 ucl_gmsegChallenge16_02                  uclGmsegChallenge
sub-uclGmsegChallenge003      n/a  n/a  unknown           unknown        n/a            n/a           HC                                 ucl_gmsegChallenge16_03                  uclGmsegChallenge
sub-uclGmsegChallenge004      n/a  n/a  unknown           unknown        n/a            n/a           HC                                 ucl_gmsegChallenge16_04                  uclGmsegChallenge
sub-uclGmsegChallenge005      n/a  n/a  unknown           unknown        n/a            n/a           HC                                 ucl_gmsegChallenge16_05                  uclGmsegChallenge
sub-uclGmsegChallenge006      n/a  n/a  unknown           unknown        n/a            n/a           HC                                 ucl_gmsegChallenge16_06                  uclGmsegChallenge
sub-uclGmsegChallenge007      n/a  n/a  unknown           unknown        n/a            n/a           HC                                 ucl_gmsegChallenge16_07                  uclGmsegChallenge
sub-uclGmsegChallenge008      n/a  n/a  unknown           unknown        n/a            n/a           HC                                 ucl_gmsegChallenge16_08                  uclGmsegChallenge
sub-uclGmsegChallenge009      n/a  n/a  unknown           unknown        n/a            n/a           HC                                 ucl_gmsegChallenge16_09                  uclGmsegChallenge
sub-uclGmsegChallenge010      n/a  n/a  unknown           unknown        n/a            n/a           HC                                 ucl_gmsegChallenge16_10                  uclGmsegChallenge
sub-vanderbiltGmChallenge001  n/a  n/a  unknown           unknown        n/a            n/a           HC                                 vanderbilt_gmsegChallenge16_01           vanderbiltGmsegChallenge
sub-vanderbiltGmChallenge002  n/a  n/a  unknown           unknown        n/a            n/a           HC                                 vanderbilt_gmsegChallenge16_02           vanderbiltGmsegChallenge
sub-vanderbiltGmChallenge003  n/a  n/a  unknown           unknown        n/a            n/a           HC                                 vanderbilt_gmsegChallenge16_03           vanderbiltGmsegChallenge
sub-vanderbiltGmChallenge004  n/a  n/a  unknown           unknown        n/a            n/a           HC                                 vanderbilt_gmsegChallenge16_04           vanderbiltGmsegChallenge
sub-vanderbiltGmChallenge005  n/a  n/a  unknown           unknown        n/a            n/a           HC                                 vanderbilt_gmsegChallenge16_05           vanderbiltGmsegChallenge
sub-vanderbiltGmChallenge006  n/a  n/a  unknown           unknown        n/a            n/a           HC                                 vanderbilt_gmsegChallenge16_06           vanderbiltGmsegChallenge
sub-vanderbiltGmChallenge007  n/a  n/a  unknown           unknown        n/a            n/a           HC                                 vanderbilt_gmsegChallenge16_07           vanderbiltGmsegChallenge
sub-vanderbiltGmChallenge008  n/a  n/a  unknown           unknown        n/a            n/a           HC                                 vanderbilt_gmsegChallenge16_08           vanderbiltGmsegChallenge
sub-vanderbiltGmChallenge009  n/a  n/a  unknown           unknown        n/a            n/a           HC                                 vanderbilt_gmsegChallenge16_09           vanderbiltGmsegChallenge
sub-vanderbiltGmChallenge010  n/a  n/a  unknown           unknown        n/a            n/a           HC                                 vanderbilt_gmsegChallenge16_10           vanderbiltGmsegChallenge
sub-vanderbiltGmChallenge011  n/a  n/a  unknown           unknown        n/a            n/a           HC                                 vanderbilt_gmsegChallenge16_11           vanderbiltGmsegChallenge
sub-vanderbiltGmChallenge012  n/a  n/a  unknown           unknown        n/a            n/a           HC                                 vanderbilt_gmsegChallenge16_12           vanderbiltGmsegChallenge
sub-vanderbiltGmChallenge013  n/a  n/a  unknown           unknown        n/a            n/a           HC                                 vanderbilt_gmsegChallenge16_13           vanderbiltGmsegChallenge
sub-vanderbiltGmChallenge014  n/a  n/a  unknown           unknown        n/a            n/a           HC                                 vanderbilt_gmsegChallenge16_14           vanderbiltGmsegChallenge
sub-vanderbiltGmChallenge015  n/a  n/a  unknown           unknown        n/a            n/a           HC                                 vanderbilt_gmsegChallenge16_15           vanderbiltGmsegChallenge
sub-vanderbiltGmChallenge016  n/a  n/a  unknown           unknown        n/a            n/a           HC                                 vanderbilt_gmsegChallenge16_16           vanderbiltGmsegChallenge
sub-vanderbiltGmChallenge017  n/a  n/a  unknown           unknown        n/a            n/a           HC                                 vanderbilt_gmsegChallenge16_17           vanderbiltGmsegChallenge
sub-vanderbiltGmChallenge018  n/a  n/a  unknown           unknown        n/a            n/a           HC                                 vanderbilt_gmsegChallenge16_18           vanderbiltGmsegChallenge
sub-vanderbiltGmChallenge019  n/a  n/a  unknown           unknown        n/a            n/a           HC                                 vanderbilt_gmsegChallenge16_19           vanderbiltGmsegChallenge
sub-vanderbiltGmChallenge020  n/a  n/a  unknown           unknown        n/a            n/a           HC                                 vanderbilt_gmsegChallenge16_20           vanderbiltGmsegChallenge
sub-zurichGmsegChallenge001   n/a  n/a  unknown           unknown        n/a            n/a           HC                                 zurich_gmsegChallenge16_01               zurichGmsegChallenge
sub-zurichGmsegChallenge002   n/a  n/a  unknown           unknown        n/a            n/a           HC                                 zurich_gmsegChallenge16_02               zurichGmsegChallenge
sub-zurichGmsegChallenge003   n/a  n/a  unknown           unknown        n/a            n/a           HC                                 zurich_gmsegChallenge16_03               zurichGmsegChallenge
sub-zurichGmsegChallenge004   n/a  n/a  unknown           unknown        n/a            n/a           HC                                 zurich_gmsegChallenge16_04               zurichGmsegChallenge
sub-zurichGmsegChallenge005   n/a  n/a  unknown           unknown        n/a            n/a           HC                                 zurich_gmsegChallenge16_05               zurichGmsegChallenge
sub-zurichGmsegChallenge006   n/a  n/a  unknown           unknown        n/a            n/a           HC                                 zurich_gmsegChallenge16_06               zurichGmsegChallenge
sub-zurichGmsegChallenge007   n/a  n/a  unknown           unknown        n/a            n/a           HC                                 zurich_gmsegChallenge16_07               zurichGmsegChallenge
sub-zurichGmsegChallenge008   n/a  n/a  unknown           unknown        n/a            n/a           HC                                 zurich_gmsegChallenge16_08               zurichGmsegChallenge
sub-zurichGmsegChallenge009   n/a  n/a  unknown           unknown        n/a            n/a           HC                                 zurich_gmsegChallenge16_09               zurichGmsegChallenge
sub-zurichGmsegChallenge010   n/a  n/a  unknown           unknown        n/a            n/a           HC                                 zurich_gmsegChallenge16_10               zurichGmsegChallenge
sub-zurichGmsegChallenge011   n/a  n/a  unknown           unknown        n/a            n/a           HC                                 zurich_gmsegChallenge16_11               zurichGmsegChallenge
sub-zurichGmsegChallenge012   n/a  n/a  unknown           unknown        n/a            n/a           HC                                 zurich_gmsegChallenge16_12               zurichGmsegChallenge
sub-zurichGmsegChallenge013   n/a  n/a  unknown           unknown        n/a            n/a           HC                                 zurich_gmsegChallenge16_13               zurichGmsegChallenge
sub-zurichGmsegChallenge014   n/a  n/a  unknown           unknown        n/a            n/a           HC                                 zurich_gmsegChallenge16_14               zurichGmsegChallenge
sub-zurichGmsegChallenge015   n/a  n/a  unknown           unknown        n/a            n/a           HC                                 zurich_gmsegChallenge16_15               zurichGmsegChallenge
sub-zurichGmsegChallenge016   n/a  n/a  unknown           unknown        n/a            n/a           HC                                 zurich_gmsegChallenge16_16               zurichGmsegChallenge
sub-zurichGmsegChallenge017   n/a  n/a  unknown           unknown        n/a            n/a           HC                                 zurich_gmsegChallenge16_17               zurichGmsegChallenge
sub-zurichGmsegChallenge018   n/a  n/a  unknown           unknown        n/a            n/a           HC                                 zurich_gmsegChallenge16_18               zurichGmsegChallenge
sub-zurichGmsegChallenge019   n/a  n/a  unknown           unknown        n/a            n/a           HC                                 zurich_gmsegChallenge16_19               zurichGmsegChallenge
sub-zurichGmsegChallenge020   n/a  n/a  unknown           unknown        n/a            n/a           HC                                 zurich_gmsegChallenge16_20               zurichGmsegChallenge 
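
The per-site counts in the listing above can be double-checked with a short script. This is only a sketch: it assumes the first column of participants.tsv is named `participant_id` (the BIDS convention; the header row is not shown in this thread), and the `pathology` column name in the example is made up for illustration. Note that the Vanderbilt IDs use `GmChallenge` rather than `GmsegChallenge`, so the pattern accepts both spellings.

```python
import csv
import io
import re
from collections import Counter

# Matches IDs such as sub-uclGmsegChallenge001 and sub-vanderbiltGmChallenge001.
ID_PATTERN = re.compile(r"sub-([a-z]+)Gm(?:seg)?Challenge\d+")


def count_challenge_subjects(tsv_text):
    """Count challenge subjects per site prefix in the text of a participants.tsv."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    counts = Counter()
    for row in reader:
        # Assumption: the ID column is called "participant_id" (BIDS convention).
        match = ID_PATTERN.fullmatch(row["participant_id"])
        if match:
            counts[match.group(1)] += 1
    return counts


# Tiny made-up example; the "pathology" header is hypothetical.
example = (
    "participant_id\tpathology\n"
    "sub-uclGmsegChallenge001\tHC\n"
    "sub-vanderbiltGmChallenge001\tHC\n"
    "sub-someOtherSubject01\tHC\n"
)
print(dict(count_challenge_subjects(example)))
```

Run on the real file, the counts should come out as 10 for `ucl`, 20 for `vanderbilt` and 20 for `zurich` if the listing above is complete.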

Each sct-testing-large/*GmsegChallenge* subject has a T2star .nii.gz file and a .json sidecar:

$ tree sub-uclGmsegChallenge001/anat                                                                        
sub-uclGmsegChallenge001/anat
├── sub-uclGmsegChallenge001_T2star.json
└── sub-uclGmsegChallenge001_T2star.nii.gz

And SC and GM segmentations:

$ tree derivatives/labels/sub-uclGmsegChallenge001/anat                                                     
derivatives/labels/sub-uclGmsegChallenge001/anat
├── sub-uclGmsegChallenge001_T2star_gmseg-manual.json
├── sub-uclGmsegChallenge001_T2star_gmseg-manual.nii.gz
├── sub-uclGmsegChallenge001_T2star_seg-manual.json
└── sub-uclGmsegChallenge001_T2star_seg-manual.nii.gz

However, the data I downloaded from the provided links contains differently structured derivatives. Namely, the WM seg from 4 raters is available for each subject:

$ ls -1 site1-sc01*
site1-sc01-image.nii.gz
site1-sc01-levels.txt
site1-sc01-mask-r1.nii.gz
site1-sc01-mask-r2.nii.gz
site1-sc01-mask-r3.nii.gz
site1-sc01-mask-r4.nii.gz
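
For scripting the conversion, the original file names in the listing above can be split into their parts. The following is a rough sketch based only on the six names shown here; the `ChallengeFile` type and `parse_challenge_name` helper are hypothetical, and the mapping from site number to site name is not given in this thread, so it is deliberately left out.

```python
import re
from typing import NamedTuple, Optional


class ChallengeFile(NamedTuple):
    site: int             # e.g. 1 for "site1"
    subject: int          # e.g. 1 for "sc01"
    kind: str             # "image", "levels" or "mask"
    rater: Optional[int]  # rater number, only present for masks


# Pattern inferred from the names in the listing, e.g. site1-sc01-mask-r3.nii.gz
NAME_PATTERN = re.compile(
    r"site(\d+)-sc(\d+)-(image|levels|mask)(?:-r(\d+))?\.(?:nii\.gz|txt)"
)


def parse_challenge_name(name):
    """Split an original challenge file name into site, subject, kind and rater."""
    match = NAME_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name}")
    site, subject, kind, rater = match.groups()
    return ChallengeFile(int(site), int(subject), kind, int(rater) if rater else None)


print(parse_challenge_name("site1-sc01-mask-r3.nii.gz"))
print(parse_challenge_name("site1-sc01-levels.txt"))
```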

This leads me to the idea of creating a new git-annex gmseg-challenge dataset (as mentioned above) but keeping the current sct-testing-large/*GmsegChallenge* subjects under sct-testing-large (instead of removing them as mentioned above).

@jcohenadad
Member

Thank you for the excellent detective work @valosekj

> However, the data I downloaded from the provided links contains differently structured derivatives. Namely, the WM seg from 4 raters is available for each subject:

Indeed, we modified the original naming to be BIDS compatible. Also, we created GM and SC segs out of the WM segmentations because this is what we trained our models on.

> but keeping the current sct-testing-large/*GmsegChallenge* subjects under sct-testing-large (#195 (comment)).

I'm not a big fan of this suggestion, because it would duplicate data, which could cause methodological issues in the future. E.g., if a student trains a model using "sct-testing-large/*GmsegChallenge*" and "gmseg-challenge", the same data could appear in both the training and testing splits. I would be inclined to discard the "sct-testing-large/*GmsegChallenge*" subjects and re-create the SC and GM labels (e.g., by using a flood-fill algorithm on the WM mask to get the SC, and then subtracting the WM mask from the SC mask to get the GM).

More details about the mask generation are in the original paper by Prados et al. (NeuroImage), in the section "GM mask delineation". However, it would be good to contact the organizer (Ferran Prados) and ask how the GM and SC masks are generated from the WM masks, because, e.g., if the GM mask touches the edge of the SC mask, a flood-fill algorithm would fail.
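
To make the flood-fill idea and its failure mode concrete, here is a minimal pure-Python sketch on a single 2D slice. It is not the method used by the challenge organizers (that is exactly the open question above); the helper names `fill_enclosed` and `sc_and_gm_from_wm` are hypothetical, and 4-connectivity is an arbitrary choice made for the example.

```python
from collections import deque


def fill_enclosed(mask):
    """Return a copy of a 2D 0/1 mask in which enclosed background becomes 1.

    Background pixels connected to the image border (4-connectivity) stay 0.
    """
    rows, cols = len(mask), len(mask[0])
    outside = [[False] * cols for _ in range(rows)]
    queue = deque()
    # Seed the flood fill with every background pixel on the image border.
    for r in range(rows):
        for c in range(cols):
            on_border = r in (0, rows - 1) or c in (0, cols - 1)
            if on_border and mask[r][c] == 0:
                outside[r][c] = True
                queue.append((r, c))
    # Spread through background pixels reachable from the border.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside_image = 0 <= nr < rows and 0 <= nc < cols
            if inside_image and not outside[nr][nc] and mask[nr][nc] == 0:
                outside[nr][nc] = True
                queue.append((nr, nc))
    return [[0 if outside[r][c] else 1 for c in range(cols)] for r in range(rows)]


def sc_and_gm_from_wm(wm):
    """SC = WM with its hole filled; GM = SC minus WM."""
    sc = fill_enclosed(wm)
    gm = [[s - w for s, w in zip(sc_row, wm_row)] for sc_row, wm_row in zip(sc, wm)]
    return sc, gm


# WM fully encloses the GM: the hole is filled and recovered as GM.
closed_wm = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
# GM reaches the cord edge: the "hole" leaks to the outside and no GM is found.
leaky_wm = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
for name, wm in (("closed", closed_wm), ("leaky", leaky_wm)):
    _, gm = sc_and_gm_from_wm(wm)
    print(name, "GM pixels:", sum(map(sum, gm)))
```

In the second toy slice the WM ring is open, so the fill leaks out and the GM comes back empty, which is the situation described above.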

@valosekj
Member Author

Thank you very much for the explanation!

> I would be inclined to discard the "sct-testing-large/*GmsegChallenge", and re-create the SC and GM labels

I agree.
Moreover, during the first check of the data, I missed the fact that the provided segmentations actually contain both WM (pixel value 2) and GM (pixel value 1):

image

Thus, the SC and GM labels can easily be re-created by simple thresholding. I will do it.
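
A minimal sketch of that thresholding with NumPy, using the pixel values mentioned above (GM = 1, WM = 2). Treating the SC as the union of GM and WM is an assumption, as is the mask containing only the values 0, 1 and 2; the `split_labels` helper is hypothetical, and loading and saving the NIfTI files is left out.

```python
import numpy as np

GM_VALUE = 1  # pixel value of GM in the provided masks (from this thread)
WM_VALUE = 2  # pixel value of WM in the provided masks (from this thread)


def split_labels(mask):
    """Split a combined GM/WM mask into binary SC, GM and WM masks."""
    mask = np.asarray(mask)
    gm = (mask == GM_VALUE).astype(np.uint8)
    wm = (mask == WM_VALUE).astype(np.uint8)
    # Assumption: the spinal cord is the union of GM and WM.
    sc = ((mask == GM_VALUE) | (mask == WM_VALUE)).astype(np.uint8)
    return sc, gm, wm


# Toy 2D slice: one GM pixel surrounded by WM.
toy = np.array([
    [0, 2, 2],
    [2, 1, 2],
    [0, 2, 0],
])
sc, gm, wm = split_labels(toy)
print(int(sc.sum()), int(gm.sum()), int(wm.sum()))
```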

@jcohenadad
Member

I'm in favor of updating our conventions (also here) for derivatives/labels, see notably: sct-pipeline/fmri-segmentation#1.

so, in this case, it would look something like this:

derivatives/label/xxx/anat/xxx_T1w_label-SC_desc-manual_mask.nii.gz
derivatives/label/xxx/anat/xxx_T1w_label-GM_desc-manual_mask.nii.gz
derivatives/label/xxx/anat/xxx_T1w_label-WM_desc-manual_mask.nii.gz


@valosekj
Member Author

valosekj commented Jan 2, 2023

> I'm in favor of updating our conventions (also here) for derivatives/labels, see notably: sct-pipeline/fmri-segmentation#1.

It sounds good!

> so, in this case, it would look something like this:
>
> derivatives/label/xxx/anat/xxx_T1w_label-SC_desc-manual_mask.nii.gz
> derivatives/label/xxx/anat/xxx_T1w_label-GM_desc-manual_mask.nii.gz
> derivatives/label/xxx/anat/xxx_T1w_label-WM_desc-manual_mask.nii.gz

I have used this convention in #199. Once we agree on that, I will update ivadomed and intranet websites.

Based on the "Derivative data types" section here, maybe we could also use derivatives/manual_masks instead of derivatives/label(s).

@jcohenadad
Member

> Based on "Derivative data types" section here, maybe we could use also derivatives/manual_masks instead of derivatives/label(s)

Excellent idea. In this case, we could get rid of the desc-manual:

derivatives/manual_labels/xxx/anat/xxx_T1w_label-SC_mask.nii.gz
derivatives/manual_labels/xxx/anat/xxx_T1w_label-GM_mask.nii.gz
derivatives/manual_labels/xxx/anat/xxx_T1w_label-WM_mask.nii.gz

And I suggest changing manual_masks to manual_labels in case we have disc labels (which are not masks).
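
As an illustration of the naming proposed in this comment, a small helper could build the derivative paths. This is only a sketch of the proposal as written here, not an agreed convention; the `manual_label_path` function and its parameters are hypothetical.

```python
from pathlib import PurePosixPath


def manual_label_path(subject, contrast, label):
    """Build a path following the proposed derivatives/manual_labels layout.

    subject:  e.g. "sub-uclGmsegChallenge001"
    contrast: e.g. "T2star"
    label:    e.g. "SC", "GM" or "WM"
    """
    file_name = f"{subject}_{contrast}_label-{label}_mask.nii.gz"
    return PurePosixPath("derivatives") / "manual_labels" / subject / "anat" / file_name


for label in ("SC", "GM", "WM"):
    print(manual_label_path("sub-uclGmsegChallenge001", "T2star", label))
```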
