
Utilities for preparation and loading of PET-CT image data to be used for training tumor segmentation models.


Maastro-CDS-Imaging-Group/PET-CT-data-pipeline


Notebooks, scripts and utilities for PET-CT data processing

For segmentation of Head-and-Neck gross tumor volume using deep learning-based PET-CT fusion models.


Steps involved

Preparing the data

  1. Set the required physical volume size, measured from the top (of the head) of each scan. The physical volume used is (450 x 450 x 300) mm3 in (W,H,D) format, chosen so that the majority of the Head-and-Neck region is included. A bounding box of this size is generated for each image and written to a CSV file.

    $ python cli_generate_bboxes.py  --source_dir ...   
                                     --bbox_filepath ...  
                                     --output_phy_size 450 450 300
    
  2. Crop the images to their bounding box's physical coordinates and resample to the specified spacing/resolution. The voxel spacing used is 1mm x 1mm x 3mm, i.e. an in-plane resolution of 1mm x 1mm and a slice thickness of 3mm. A cropped and resampled version of the dataset is created and written to disk.

    $ python cli_hktr_crop_and_resample.py  --source_dir ...  
                                            --target_dir ...  
                                            --bbox_filepath ...  
                                            --new_spacing 1 1 3  
                                            --cores 24  
                                            --order 3
    

The codename used for the outputs (and related items) of this step is "crFH_rs113" (cropped keeping Full Head, resampled to 1x1x3). The images thus obtained have a physical volume of (450 x 450 x 300) mm3 and an array size of (450 x 450 x 100) voxels.
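The relation between physical volume, voxel spacing, and resulting array size stated above can be checked with a small sketch (the helper name is hypothetical, not from the repo):

```python
def output_array_size(phy_size_mm, spacing_mm):
    """Number of voxels along each axis for a physical volume
    resampled at the given spacing (W, H, D order)."""
    return tuple(round(p / s) for p, s in zip(phy_size_mm, spacing_mm))

# (450 x 450 x 300) mm3 at 1mm x 1mm x 3mm spacing
print(output_array_size((450, 450, 300), (1, 1, 3)))  # -> (450, 450, 100)
```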

Patient Dataset

Defines a dataset which fetches and returns paired input-output full volumes for a given patient (index). Performs modality-specific preprocessing and transforms. Subclass of torch.utils.data.Dataset.

Processing involved:

  1. Smoothing

    • Mainly for PET, to remove or reduce overshoot artifacts induced by PSF reconstruction.
  2. Intensity standardization

    • 2 options: Intensity clipping or histogram standardization
    • Clipping for CT is set to [-150,150] HU and for PET [0,20] SUV by default.
    • Clipping + Histogram standardization [Histogram paper | TorchIO Histogram standardization]: compute a mean histogram from the training samples, then warp each image's histogram to match this mean histogram using piece-wise linear contrast adjustment. This is done separately for PET and CT.
  3. Augmentation transform

    • Spatial: random rotation (±10 degrees), scaling (±15%), elastic distortion.
    • Intensity: contrast stretching in PET between the 30th and 95th percentile SUV range.
    • One of these 4 transforms is applied with equal probability.
  4. Rescaling intensities to the [0,1] range via min-max normalization, where the min and max values are obtained from the volume.
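Steps 2 and 4 above (clipping with the default ranges, then min-max rescaling) can be sketched as follows; this is a minimal numpy illustration, not the repo's actual preprocessing code, and smoothing (step 1, e.g. a Gaussian filter on PET) would precede it:

```python
import numpy as np

def clip_and_rescale(volume, clip_range):
    """Clip intensities to a modality-specific range, then min-max
    rescale to [0, 1] using the clipped volume's own min and max."""
    v = np.clip(volume.astype(np.float32), *clip_range)
    v_min, v_max = v.min(), v.max()
    return (v - v_min) / (v_max - v_min + 1e-8)

# Toy volumes standing in for real CT (HU) and PET (SUV) arrays
ct_volume = np.random.randint(-1000, 2000, size=(8, 8, 4))
pet_volume = np.random.rand(8, 8, 4) * 40

ct = clip_and_rescale(ct_volume, (-150, 150))  # default CT window in HU
pet = clip_and_rescale(pet_volume, (0, 20))    # default PET window in SUV
```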
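The "one of these 4 applied with equal probability" rule from step 3 might be implemented along these lines; the spatial transforms here are placeholder stand-ins (only the PET percentile contrast stretch is fleshed out), so this is a hedged sketch rather than the repo's transform code:

```python
import random
import numpy as np

# Placeholder stand-ins for the spatial augmentations named above
def random_rotation(v):    return v  # ±10 degree rotation (placeholder)
def random_scaling(v):     return v  # ±15% scaling (placeholder)
def elastic_distortion(v): return v  # elastic deformation (placeholder)

def contrast_stretch(v):
    """Stretch PET contrast between the 30th and 95th percentile SUV."""
    lo, hi = np.percentile(v, [30, 95])
    return np.clip((v - lo) / (hi - lo + 1e-8), 0.0, 1.0)

AUGMENTATIONS = [random_rotation, random_scaling,
                 elastic_distortion, contrast_stretch]

def augment(volume):
    """Apply exactly one of the four transforms, chosen uniformly."""
    return random.choice(AUGMENTATIONS)(volume)

out = augment(np.random.rand(8, 8, 4))
```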

Patch Queue

Combined with a patch sampler, the patch queue creates, stores and returns randomly sampled paired input-output patches of a given size. Code adapted from the TorchIO Queue source code. The PatchQueue class is derived from torch.utils.data.Dataset. See the GIF on the linked page for the working mechanism.
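The core operation the queue relies on, sampling a random patch at the same location from a paired image/label volume, can be sketched like this (a minimal numpy version, not the repo's actual PatchQueue or sampler):

```python
import numpy as np

def sample_paired_patch(image, label, patch_size, rng=None):
    """Sample one random patch at the same location from a paired
    image/label volume (both arrays must share the same shape)."""
    rng = rng or np.random.default_rng()
    starts = [rng.integers(0, dim - ps + 1)
              for dim, ps in zip(image.shape, patch_size)]
    slices = tuple(slice(s, s + ps) for s, ps in zip(starts, patch_size))
    return image[slices], label[slices]

img = np.random.rand(100, 100, 40)
lbl = np.zeros((100, 100, 40))
patch_img, patch_lbl = sample_paired_patch(img, lbl, (32, 32, 16))
```

A queue would repeatedly call such a sampler on volumes loaded from the patient dataset, buffering the patches for the loader.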

Patch loader

Used with the patch queue to create batches of patches for training; a regular torch DataLoader instance.


Libraries for IO and data processing

  • For DICOM: Pydicom
  • For multiple formats: SimpleITK - a complete toolkit for N-dimensional scientific image processing

Software tools


Resources

Documentation

Code repositories

Online articles
