`GmSegChallenge2016` dataset BIDSification #199

valosekj · 2023-01-02T19:57:20Z

This PR discusses the BIDSification of the GmSegChallenge2016 dataset.

Input datasets (training and testing) and the BIDSified dataset are located in ~/extrassd1/janvalosek/GmSegChallenge2016.

Related to #195

Input dataset structure

Training dataset: 10 subjects from each site (10 x 4 = 40 subjects). Each subject has:
- image.nii.gz representing T2star.nii.gz image
- four mask-r files with manual labels of WM (pixel value 2) and GM (pixel value 1)
- levels.txt file

Testing dataset: 10 subjects from each site (10 x 4 = 40 subjects). Each subject has:
- image.nii.gz representing T2star.nii.gz image
- levels.txt file

├── training-data-gm-sc-challenge-ismrm16-v20160302b
│   ├── license.txt
│   ├── site1-sc01-image.nii.gz
│   ├── site1-sc01-levels.txt
│   ├── site1-sc01-mask-r1.nii.gz
│   ├── site1-sc01-mask-r2.nii.gz
│   ├── site1-sc01-mask-r3.nii.gz
│   ├── site1-sc01-mask-r4.nii.gz
│   ├── ...
└── test-data-gm-sc-challenge-ismrm16-v20160401
    ├── license.txt
    ├── site1-sc11-image.nii.gz
    ├── site1-sc11-levels.txt
    ├── ...
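The input naming scheme above can be parsed mechanically. The following is a minimal sketch, not the code from curate_gmsegchallenge2016.py: the regular expression is inferred only from the file names shown in the tree, and `parse_input_name` and `InputFile` are hypothetical names introduced for illustration.

```python
import re
from typing import NamedTuple, Optional

# Pattern inferred from the names in the tree above,
# e.g. "site1-sc01-mask-r3.nii.gz" (assumption, not taken from the script).
INPUT_NAME = re.compile(
    r"^site(?P<site>\d+)-sc(?P<subject>\d+)-"
    r"(?P<kind>image|levels|mask-r(?P<rater>\d+))"
    r"\.(?:nii\.gz|txt)$"
)


class InputFile(NamedTuple):
    site: int
    subject: int
    kind: str             # "image", "levels" or "mask"
    rater: Optional[int]  # only set for mask files


def parse_input_name(filename: str) -> Optional[InputFile]:
    """Return the parsed fields, or None for files such as license.txt."""
    match = INPUT_NAME.match(filename)
    if match is None:
        return None
    rater = match.group("rater")
    kind = "mask" if rater is not None else match.group("kind")
    return InputFile(
        site=int(match.group("site")),
        subject=int(match.group("subject")),
        kind=kind,
        rater=int(rater) if rater is not None else None,
    )
```

For example, `parse_input_name("site1-sc01-mask-r3.nii.gz")` yields site 1, subject 1, kind "mask" and rater 3, while `license.txt` is skipped.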

Output dataset structure

├── code
│   └── curate_gmsegchallenge2016.py
├── dataset_description.json
├── derivatives
│   ├── dataset_description.json
│   └── manual_labels
│       ├── sub-epm001
│       │   └── anat
│       │       ├── sub-epm001_T2star_label-vertebral-levels.txt
│       │       ├── sub-epm001_T2star_label-GM_mask1.json
│       │       ├── sub-epm001_T2star_label-GM_mask1.nii.gz
│       │       ├── sub-epm001_T2star_label-GM_mask2.json
│       │       ├── sub-epm001_T2star_label-GM_mask2.nii.gz
...
│       │       ├── sub-epm001_T2star_label-SC_mask1.json
│       │       ├── sub-epm001_T2star_label-SC_mask1.nii.gz
...
│       │       ├── sub-epm001_T2star_label-WM_mask1.json
│       │       ├── sub-epm001_T2star_label-WM_mask1.nii.gz
...
├── LICENSE
├── participants.json
├── participants.tsv
├── sub-epm001
│   └── anat
│       ├── sub-epm001_T2star.json
│       └── sub-epm001_T2star.nii.gz
├── sub-epm002
│   └── anat
│       ├── sub-epm002_T2star.json
│       └── sub-epm002_T2star.nii.gz
...
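The derivative file names in the tree above follow a regular pattern. As a rough sketch (the helper names `segmentation_paths` and `levels_path` are hypothetical and not part of the curation script), the paths for one subject and one rater could be assembled like this:

```python
from pathlib import Path

LABELS = ("SC", "GM", "WM")


def _anat_dir(root, subject):
    # derivatives/manual_labels/<subject>/anat, as in the output tree
    return Path(root) / "derivatives" / "manual_labels" / subject / "anat"


def segmentation_paths(root, subject, rater):
    """Map each label to its (NIfTI, JSON sidecar) pair for one rater."""
    anat = _anat_dir(root, subject)
    paths = {}
    for label in LABELS:
        stem = f"{subject}_T2star_label-{label}_mask{rater}"
        paths[label] = (anat / f"{stem}.nii.gz", anat / f"{stem}.json")
    return paths


def levels_path(root, subject):
    """Path of the copied vertebral-levels text file."""
    return _anat_dir(root, subject) / f"{subject}_T2star_label-vertebral-levels.txt"
```

Keeping the name construction in one place makes later renames a one-line change.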

SC, WM and GM segmentations

I re-created SC, WM and GM segmentations using sct_maths from the input manual labels, which contain both WM (pixel value 2) and GM (pixel value 1) segmentations. I also created JSON sidecars for each segmentation containing the rater's name.

SCseg (-bin 0):

data-management/scripts/curate_gmsegchallenge2016.py

Lines 144 to 150 in 5c99d23

    
    os.system('sct_maths -i ' + path_file_in + ' -bin 0 -o ' + path_file_out)
    logger.info(f'Using {path_file_in} to create SCseg: {path_file_out}')
    # Create a json sidecar
    data_json = {
        "Author": rater,
        "Label": "SC-seg-manual"
    }

GMseg (-uthr 1):

data-management/scripts/curate_gmsegchallenge2016.py

Lines 170 to 176 in 5c99d23

    
    os.system('sct_maths -i ' + path_file_in + ' -uthr 1 -o ' + path_file_out)
    logger.info(f'Using {path_file_in} to create GMseg: {path_file_out}')
    # Create a json sidecar
    data_json = {
        "Author": rater,
        "Label": "GM-seg-manual"
    }

WMseg (-bin 1):

data-management/scripts/curate_gmsegchallenge2016.py

Lines 118 to 124 in 5c99d23

    
    os.system('sct_maths -i ' + path_file_in + ' -bin 1 -o ' + path_file_out)
    logger.info(f'Using {path_file_in} to create WMseg: {path_file_out}')
    # Create a json sidecar
    data_json = {
        "Author": rater,
        "Label": "WM-seg-manual"
    }
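To illustrate the intended value mapping, here is a small pure-Python sketch on a toy 2D slice. It does not call sct_maths and is not the script's code. It assumes that `-bin <thr>` sets voxels above the threshold to 1 and the rest to 0, and that `-uthr <thr>` zeroes voxels above the threshold. `split_labels` is a hypothetical name.

```python
def split_labels(mask_slice):
    """Derive SC, GM and WM masks from a slice where GM = 1 and WM = 2."""
    # SC: everything labelled (assumed equivalent of "-bin 0")
    sc = [[1 if v > 0 else 0 for v in row] for row in mask_slice]
    # GM: keep values up to 1, zero the rest (assumed equivalent of "-uthr 1")
    gm = [[v if v <= 1 else 0 for v in row] for row in mask_slice]
    # WM: only values above 1 (assumed equivalent of "-bin 1")
    wm = [[1 if v > 1 else 0 for v in row] for row in mask_slice]
    return sc, gm, wm


toy_slice = [
    [0, 2, 2, 0],
    [2, 1, 1, 2],
    [0, 2, 2, 0],
]
sc, gm, wm = split_labels(toy_slice)
```

With GM and WM being disjoint, the SC mask equals their voxel-wise sum, which gives a quick sanity check on real outputs.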

Manual raters

Based on the GM mask delineation section of Prados et al., 2017:

Rater 1 (MY) and rater 3 (GD), ...
Rater 2 (SMD) and 4 (BL) ...

I included the following raters' names in the JSON sidecars:

data-management/scripts/curate_gmsegchallenge2016.py

Lines 62 to 67 in 5c99d23

    
    rater_to_name = {
        1: 'Marios C. Yiannakas',
        2: 'Sara M. Dupont',
        3: 'Gergely David',
        4: 'Bailey Lyttle'
    }
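A sidecar with the rater's name could be written roughly as follows. This is a sketch under assumptions: `write_sidecar` is a hypothetical helper, the indent of 4 is an arbitrary choice, and the trailing newline follows the commit "Add last newline to json sidecars" rather than the excerpt above.

```python
import json

RATER_TO_NAME = {
    1: 'Marios C. Yiannakas',
    2: 'Sara M. Dupont',
    3: 'Gergely David',
    4: 'Bailey Lyttle',
}


def write_sidecar(path_json, rater, label):
    """Write a JSON sidecar naming the rater who drew the segmentation."""
    data_json = {
        "Author": RATER_TO_NAME[rater],  # rater number from the mask-rN file name
        "Label": label,                  # e.g. "GM-seg-manual"
    }
    with open(path_json, "w") as f:
        json.dump(data_json, f, indent=4)
        f.write("\n")  # end the file with a newline
```

An unknown rater number raises `KeyError`, which surfaces unexpected input files early instead of writing an incomplete sidecar.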

Additional files

dataset_description.json, participants.json, participants.tsv, and README are attached to this PR to allow easy feedback. @jcohenadad could you please add the contact person and email conversation to the README?
LICENSE is a copy of the license from the input dataset.

UPDATE 2023-01-03: The *-levels.txt files containing information about the vertebral levels were copied to derivatives/manual_labels. To make clear that these txt files contain info about vertebral levels and not about discs, I chose the following filename: *_T2star_label-vertebral-levels.txt, for example: sub-epm001_T2star_label-vertebral-levels.txt.


jcohenadad · 2023-01-02T21:55:39Z

I have been considering the creation of vertebral levels nii files using sct_label_vertebrae or sct_labels_utils.

I'm not sure these data will be used to train a DL model for disc labeling because the discs are poorly visible on these axial GRE data. So maybe it's ok to keep the disc labels as is.

jcohenadad · 2023-01-02T21:56:23Z

@valosekj the PR is still in 'draft' mode but you requested a review from me. Is it ready for review?


valosekj · 2023-01-03T16:38:11Z

> I'm not sure these data will be used to train a DL model for disc labeling because the discs are poorly visible on these axial GRE data. So maybe it's ok to keep the disc labels as is.

Okay. I kept the labeling as txt files and placed them under derivatives/manual_labels/xxx/anat/:

├── derivatives
│   └── manual_labels
│       ├── sub-epm001
│       │   └── anat
│       │       ├── sub-epm001_T2star_label-vertebral-levels.txt
│       │       ├── sub-epm001_T2star_label-GM_mask1.json
│       │       ├── sub-epm001_T2star_label-GM_mask1.nii.gz
...

To make clear that these txt files contain info about vertebral levels and not about discs, I chose the following filename: *_T2star_label-vertebral-levels.txt.

valosekj · 2023-01-03T16:45:21Z

> @valosekj the PR is still in 'draft' mode but you requested a review from me. Is it ready for review?

Sorry about that.
Based on #195 (comment), I renamed derivatives/labels to derivatives/manual_labels and got rid of the desc-manual.
Now, the PR is ready for review.

jcohenadad

I haven't tested the script to generate the separate labels but overall it looks good! thank you @valosekj 🙏

valosekj · 2023-01-03T19:16:10Z

> I haven't tested the script to generate the separate labels but overall it looks good! thank you @valosekj 🙏

Okay! Thank you. I will upload the dataset to git-annex. Then I will close this PR without merging and delete the branch.

Regarding the README; can I add Ferran Prados as a contact person?

jcohenadad · 2023-01-03T20:21:59Z

> Regarding the README; can I add Ferran Prados as a contact person?

👍

valosekj · 2023-01-05T17:28:16Z

The dataset has been uploaded to git-annex, so I am closing this PR.

valosekj added 3 commits January 2, 2023 11:43

Add curation script for GmSegChallenge2016 dataset

5c99d23

Fix typo in derivatives filenames

85d59fe

Add dataset_description.json, participants.json, participants.tsv, and README.md for GmSegChallenge2016 dataset (to allow easy feedback).

25fadf1

valosekj added the dataset: GmSegChallenge2016 label Jan 2, 2023

valosekj requested a review from jcohenadad January 2, 2023 20:04

valosekj mentioned this pull request Jan 2, 2023

Move GmsegChallenge outside of sct-testing-large #195


valosekj added 6 commits January 2, 2023 17:25

Rename derivatives/labels to derivatives/manual_masks. Get rid of `desc-manual` from segmentation file names.

8a83163

Deal with txt file containing vertebral levels.

4c4c6c7

Save manual segmentations as well as txt vertebral levels under `derivatives/manual_labels`

ca9dde3

Create a json sidecar (with scanner params) for T2star.nii.gz images

4e9d46a

Add last newline to json sidecars

aeee21c

Make array from dataset_description.json's ReferencesAndLinks (to pass bids-validator).

af8b4a1

valosekj marked this pull request as ready for review January 3, 2023 16:45

jcohenadad approved these changes Jan 3, 2023


valosekj added 2 commits January 3, 2023 16:25

Add contact person

c9d6ce9

Remove forgotten _desc-manual from comments

9cf24c9

valosekj closed this Jan 5, 2023

valosekj deleted the jv/curate_gmsegchallenge2016 branch January 5, 2023 17:28
