Authors: Apavou Clément & Belkada Younes
Kaggle challenge: https://www.kaggle.com/c/mvadlmi/leaderboard
With more than 1 million new diagnoses reported every year, prostate cancer (PCa) is the second most common cancer among males worldwide, resulting in more than 350,000 deaths annually. The key to decreasing mortality is developing more precise diagnostics. Diagnosis of PCa is based on the grading of prostate tissue biopsies. These tissue samples are examined by a pathologist and scored according to the Gleason grading system. In this challenge, you will develop models for detecting PCa on images of prostate tissue samples, and estimate the severity of the disease using the most extensive multi-center dataset on Gleason grading yet available.
The grading process consists of finding and classifying cancer tissue into so-called Gleason patterns (3, 4, or 5) based on the architectural growth patterns of the tumor (Fig. below). After the biopsy is assigned a Gleason score, it is converted into an ISUP grade on a 1-5 scale. The Gleason grading system is the most important prognostic marker for PCa, and the ISUP grade has a crucial role when deciding how a patient should be treated. There is both a risk of missing cancers and a large risk of overgrading resulting in unnecessary treatment. However, the system suffers from significant inter-observer variability between pathologists, limiting its usefulness for individual patients. This variability in ratings could lead to unnecessary treatment, or worse, missing a severe diagnosis.
The goal of this challenge is to predict the ISUP grade using only histopathology images. To do so, we had to process Whole Slide Images (WSIs), which are huge gigapixel images, and cope with the limited number of patients provided in the training set.
Classes: [0, 1, 2, 3, 4, 5]
Download the dataset and extract it in the assets folder.
Choose the mode that you want:
- Classification: classify the ISUP grade of images
- Segmentation: semantic segmentation of images
- Classif_WITH_Seg: classification using a semantic segmentation model trained with the Segmentation mode
Choose a dataset and a model adapted to the mode.
Models for:
Check the available datasets in datasets.py.
Feature extractors come from the timm library.
Name method: Concatenate top patches
MODE: Classif_WITH_Seg
dataset_name: ConcatTopPatchDataset
feature_extractor_name: tresnet_xl_448
network_name: SimpleModel
Command line to train the model:
python main.py --train True --MODE Classif_WITH_Seg --dataset_name ConcatTopPatchDataset --patch_size 256 --nb_samples 16 --max_epochs 150 --batch_size 2 --accumulate_grad_batches 8 --discounted_draw False --percentage_blank 0.5 --resized_img 512 --seed_everything 6836
drawn-dream-632 is the name of the wandb run of the segmentation model trained with our framework (mode Segmentation).
Command line to create submission csv file:
python main.py --train False --MODE Classif_WITH_Seg --dataset_name ConcatTopPatchDataset --patch_size 256 --nb_samples 16 --discounted_draw False --percentage_blank 0.5 --resized_img 512 --best_model rich-jazz-915
rich-jazz-915 is the name of the wandb run that holds the weights of the model. (The name will differ if you train the model yourself.)
Model | Backbone | Area Under ROC (weighted) validation | Area Under ROC (macro) test (private leaderboard) | Run |
---|---|---|---|---|
SimpleModel | tresnet_xl_448 | 0.8126 | 0.8833 |
Top patches concatenated from a WSI. Prediction: 4, Label: 4.
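The idea behind concatenating top patches can be illustrated with a minimal NumPy sketch. This is not the code of ConcatTopPatchDataset: the function name `concat_top_patches` and the per-patch `scores` array are hypothetical. We assume here that each patch has already received a score (for example, the fraction of pixels that a segmentation model predicts as cancerous) and that `nb_samples` is a perfect square.

```python
import numpy as np


def concat_top_patches(patches: np.ndarray, scores: np.ndarray, nb_samples: int) -> np.ndarray:
    """Keep the nb_samples highest-scoring patches and tile them into a square grid.

    patches: array of shape (N, H, W, C)
    scores:  array of shape (N,), one (hypothetical) score per patch
    """
    side = int(round(nb_samples ** 0.5))
    if side * side != nb_samples:
        raise ValueError("nb_samples must be a perfect square")
    if len(patches) < nb_samples:
        raise ValueError("not enough patches to fill the grid")

    top = np.argsort(scores)[::-1][:nb_samples]   # indices of the highest scores first
    selected = patches[top]                       # (nb_samples, H, W, C)
    _, h, w, c = selected.shape

    grid = selected.reshape(side, side, h, w, c)  # (row, col, H, W, C)
    grid = grid.transpose(0, 2, 1, 3, 4)          # (row, H, col, W, C)
    return grid.reshape(side * h, side * w, c)    # one large image


# 20 candidate patches of 256x256 pixels, 16 kept: the result is a 4x4 grid.
rng = np.random.default_rng(0)
patches = rng.integers(0, 256, size=(20, 256, 256, 3), dtype=np.uint8)
scores = rng.random(20)
image = concat_top_patches(patches, scores, nb_samples=16)
print(image.shape)  # (1024, 1024, 3)
```

With `--patch_size 256 --nb_samples 16`, such a grid would measure 1024x1024 pixels before any resizing.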
Name method: Concatenate random patches
MODE: Classification
dataset_name: ConcatPatchDataset
feature_extractor_name: tresnet_xl_448
network_name: SimpleModel
Command line to train the model:
python main.py --train True --MODE Classification --dataset_name ConcatPatchDataset --patch_size 256 --nb_samples 36 --max_epochs 150 --batch_size 2 --accumulate_grad_batches 16 --discounted_draw True --seed_everything 6130
Command line to create submission csv file:
python main.py --train False --MODE Classification --dataset_name ConcatPatchDataset --patch_size 256 --nb_samples 36 --discounted_draw True --best_model denim-terrain-844
denim-terrain-844 is the name of the wandb run that holds the weights of the model. (The name will differ if you train the model yourself.)
Model | Backbone | Area Under ROC (weighted) validation | Area Under ROC (macro) test (private leaderboard), without voting | Area Under ROC (macro) test (private leaderboard), with voting | Run |
---|---|---|---|---|---|
SimpleModel | tresnet_xl_448 | 0.8034 | [0.8774, 0.92647] | 0.8641 |
Random patches concatenated from a WSI. Left: label 1, Radboud provider. Right: label 1, Karolinska provider.
MODE: Segmentation
dataset_name: PatchSegDataset
Model | Backbone | Data provider | Patch Size | Level | IoU (average over classes) validation | Run |
---|---|---|---|---|---|---|
DeepLabV3Plus | resnet152 | All | 384 | 1 | 0.7858 | |
DeepLabV3Plus | resnet34 | Radboud | 512 | 0 | 0.7029 | |
DeepLabV3Plus | resnet34 | Karolinska | 512 | 0 | 0.5958 |
Karolinska is composed of 3 classes:
- 0: background (non tissue) or unknown
- 1: benign tissue (stroma and epithelium combined)
- 2: cancerous tissue (stroma and epithelium combined)
Radboud is composed of 6 classes:
- 0: background (non tissue) or unknown
- 1: stroma (connective tissue, non-epithelium tissue)
- 2: healthy (benign) epithelium
- 3: cancerous epithelium (Gleason 3)
- 4: cancerous epithelium (Gleason 4)
- 5: cancerous epithelium (Gleason 5)
We merged the Radboud classes into 3 classes to match Karolinska:
- 0: background (non tissue) or unknown {0}
- 1: benign tissue (stroma and epithelium combined) {1,2}
- 2: cancerous tissue (stroma and epithelium combined) {3,4,5}
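The merge above can be expressed as a lookup table applied to a Radboud mask. The following is a minimal sketch, not the repository's implementation; the names `RADBOUD_TO_MERGED` and `merge_radboud_mask` are hypothetical, and we assume masks are integer NumPy arrays with labels 0 to 5.

```python
import numpy as np

# Index = original Radboud label, value = merged label.
# 0 -> 0 (background or unknown), {1, 2} -> 1 (benign), {3, 4, 5} -> 2 (cancerous)
RADBOUD_TO_MERGED = np.array([0, 1, 1, 2, 2, 2], dtype=np.uint8)


def merge_radboud_mask(mask: np.ndarray) -> np.ndarray:
    """Map a Radboud mask with labels 0..5 to the 3 merged classes."""
    if mask.min() < 0 or mask.max() > 5:
        raise ValueError("Radboud labels must be in the range 0..5")
    return RADBOUD_TO_MERGED[mask]  # fancy indexing remaps every pixel at once


mask = np.array([[0, 1, 2],
                 [3, 4, 5]], dtype=np.uint8)
print(merge_radboud_mask(mask))
# [[0 1 1]
#  [2 2 2]]
```

Karolinska masks already use these 3 labels, so under this assumption they need no remapping.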
Segmentation of a Patch 384x384 from WSI: Patch, Prediction, Ground Truth
Segmentation of a Patch 384x384 from WSI of the Radboud data provider: Patch, Prediction, Ground Truth
- Blue: background or unknown
- Red: benign tissue
- Green: Cancerous tissue
MODE: Segmentation
dataset_name: PatchSegDataset
network_name: DeepLabV3Plus
feature_extractor_name: resnet152
python main.py --train True --MODE Segmentation --dataset_name PatchSegDataset --dataset_static False --max_epochs 150 --batch_size 4 --accumulate_grad_batches 16 --nb_samples 4 --patch_size 384 --percentage_blank 0.5 --level 1 --seed_everything 4882
Model | Backbone | Data Provider | mIoU validation | Run |
---|---|---|---|---|
DeepLabV3Plus | resnet152 | Both | 0.7858 |
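For reference, the mean IoU over the 3 merged classes can be computed as in the sketch below. This is an illustration only: how the repository aggregates the metric (per batch or over the whole validation set) and how it treats classes absent from both prediction and ground truth are not stated here, so both choices in this sketch are assumptions. The function name `mean_iou` is hypothetical.

```python
import numpy as np


def mean_iou(pred: np.ndarray, target: np.ndarray, num_classes: int = 3) -> float:
    """Average the per-class intersection over union of two label maps."""
    ious = []
    for cls in range(num_classes):
        pred_c = pred == cls
        target_c = target == cls
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            continue  # assumption: skip classes absent from both maps
        intersection = np.logical_and(pred_c, target_c).sum()
        ious.append(intersection / union)
    return float(np.mean(ious)) if ious else float("nan")


pred = np.array([[0, 1],
                 [2, 2]])
target = np.array([[0, 1],
                   [1, 2]])
# Per-class IoU: class 0 = 1/1, class 1 = 1/2, class 2 = 1/2
print(round(mean_iou(pred, target), 4))  # 0.6667
```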