Add decompression preprocessing step to `TotalSegmentator2D` for more efficient slice loading #705

nkaenzig · 2024-11-12T11:09:08Z

Closes #704

The slow data loading was caused by the .gz compression of the CT and mask .nii.gz files, combined with reading individual slices. Reading the first few slices from a compressed NIfTI with nibabel is fast, but the deeper the slice index, the slower the read, because every slice read triggers a sequential decompression of the file. When the .nii.gz file is unpacked beforehand and slices are read from the resulting .nii, reading deeper slices becomes much faster.
This mainly affects loading the data in a 2D fashion. With 3D loading you might read the whole CT only once, so it only has to be decompressed once.
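
For illustration, the unpacking step can be sketched with the standard library alone. This is a minimal sketch, not the code added in this PR: the function name `decompress_gz` and the `keep_original` parameter are hypothetical, and it assumes the unpacked `.nii` is written next to the `.nii.gz` file.

```python
import gzip
import shutil
from pathlib import Path


def decompress_gz(src: Path, keep_original: bool = True) -> Path:
    """Unpack e.g. `ct.nii.gz` to `ct.nii` in the same folder and return the new path.

    Hypothetical helper for illustration; not the implementation from this PR.
    """
    if src.suffix != ".gz":
        raise ValueError(f"expected a .gz file, got {src}")
    dst = src.with_suffix("")  # drops only the trailing ".gz"
    if not dst.exists():
        # Write to a temporary name first so an interrupted run
        # never leaves a truncated .nii behind.
        tmp = dst.with_name(dst.name + ".tmp")
        with gzip.open(src, "rb") as f_in, open(tmp, "wb") as f_out:
            shutil.copyfileobj(f_in, f_out)  # streams in chunks
        tmp.replace(dst)
    if not keep_original:
        src.unlink()
    return dst
```

Once the file is uncompressed, nibabel can typically index a single slice through the image's `dataobj` (for example `nib.load(path).dataobj[..., i]`, assuming slices lie along the last axis) without decompressing everything before it. The trade-off is extra disk space for the unpacked volumes.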

This PR reduces the runtime for iterating over the complete dataset from 1h to 2-3min, using a torch dataloader with 16 workers.

nkaenzig added 4 commits November 8, 2024 14:46

added multithreading to optimize mask step & added simplified class m…

6634638

…appings dict

started decompress logic

e8a2b63

decompress logic moved to prepare_data

77cd94d

add decompression preprocessing step to total segmentator dataset

834f981

nkaenzig linked an issue Nov 12, 2024 that may be closed by this pull request

Slow dataloader iterations for TotalSegmentator2D dataset #704

Closed

nkaenzig marked this pull request as ready for review November 12, 2024 11:09

ioangatop approved these changes Nov 12, 2024

nkaenzig added 4 commits November 12, 2024 13:10

updated color logic in segmentation logger callback

3533c99

fmt lint

986c923

fixed test

246c41a

fixed to_size in offline mode

4a0edb1

nkaenzig enabled auto-merge (squash) November 12, 2024 14:08

nkaenzig merged commit 50c90a3 into main Nov 12, 2024
6 checks passed

nkaenzig deleted the optimize-total-segmentator-dataset branch November 12, 2024 14:16

nkaenzig self-assigned this Nov 13, 2024
