This code is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
This is the official repository for our recent work: [CT-based radiomic model for identifying pathological T3a upstaging in renal cell carcinoma: model development and multi-source validation]. This repository contains ready-to-use code for running inference with the proposed models.
- Feature 1: The training dataset includes a large cohort of patients with renal cell carcinoma (n = 999) and covers a wide spectrum of disease phenotypes (common and rare RCC subtypes, all ISUP/Fuhrman grades, all pathological T stages).
- Feature 2: The predictive performance of the models was externally evaluated on data from two medical centers and four TCIA datasets.
- Feature 3: The morphology model improved the performance of junior radiologists.
numpy
pandas
pyradiomics
SimpleITK
scikit-learn
tqdm
NIfTI-formatted image and mask files were used for feature extraction and model prediction. Please organize the image and mask files in the following structure.
test_data/
└── Dataset001
    ├── patient_1
    │   ├── image.nii.gz
    │   ├── tumor_mask.nii.gz
    │   └── peritumor_mask.nii.gz
    ├── patient_2
    │   ├── image.nii.gz
    │   ├── tumor_mask.nii.gz
    │   └── peritumor_mask.nii.gz
    └── patient_n
        ├── image.nii.gz
        ├── tumor_mask.nii.gz
        └── peritumor_mask.nii.gz
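Before running inference, it can help to confirm that every patient folder contains the expected files. The following is a minimal sketch using only the Python standard library. The function name `find_incomplete_patients` is hypothetical and not part of this repository, and the file names are simply those shown in the layout above.

```python
from pathlib import Path

# File names taken from the layout above; adjust them if yours differ.
REQUIRED_FILES = ("image.nii.gz", "tumor_mask.nii.gz", "peritumor_mask.nii.gz")


def find_incomplete_patients(dataset_dir, required=REQUIRED_FILES):
    """Return {patient_folder_name: [missing file names]} for incomplete folders."""
    incomplete = {}
    for patient_dir in sorted(Path(dataset_dir).iterdir()):
        if not patient_dir.is_dir():
            continue  # ignore stray files next to the patient folders
        missing = [name for name in required if not (patient_dir / name).is_file()]
        if missing:
            incomplete[patient_dir.name] = missing
    return incomplete


if __name__ == "__main__":
    # Hypothetical location; point this at your own dataset folder.
    dataset = Path("test_data") / "Dataset001"
    if dataset.is_dir():
        for patient, missing in find_incomplete_patients(dataset).items():
            print(f"{patient}: missing {', '.join(missing)}")
```

An empty result means every patient folder has an image and both masks.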
During inference, the configuration is loaded from a .json file. The structure of the file is as follows:
{
    "model": "your\\path\\to\\model.pkl",
    "extraction parameter": "your\\path\\to\\extraction.yaml",
    "dataset": "your\\path\\to\\dataset",
    "image": "image.nii.gz",
    "masks": {
        "tumor": "tumor_mask.nii.gz",
        "peritumor": "peritumor_mask.nii.gz"
    },
    "cutoff": 0.3582773836548992,
    "output": "your\\path\\to\\results\\"
}
Each of the parameters is explained as follows:
- model: The path to the pickle file of the trained model.
- extraction parameter: The path to the YAML file containing the radiomics feature extraction configuration.
- dataset: The path to the dataset folder.
- image: The name of the image file. Both .nii and .nii.gz are supported.
- masks: The names of the mask files. Each key is the prefix added to the feature names during extraction, and each value is the corresponding mask file name.
- cutoff: The cutoff value that determines whether a case is classified as T3a invasion positive or negative.
- output: The output folder that will hold the output .csv file.
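A configuration file can be checked before it is passed to the inference code. The sketch below is an illustration only: `load_config` and `validate_config` are hypothetical helper names that do not exist in this repository, and the checks are inferred from the parameter descriptions above rather than from the actual loader.

```python
import json

# Keys taken from the example configuration above.
REQUIRED_KEYS = (
    "model", "extraction parameter", "dataset",
    "image", "masks", "cutoff", "output",
)


def validate_config(config):
    """Raise if the configuration dict does not match the documented shape."""
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"missing configuration keys: {missing}")
    if not isinstance(config["masks"], dict) or not config["masks"]:
        raise ValueError("'masks' must map a feature-name prefix to a mask file name")
    if not config["image"].endswith((".nii", ".nii.gz")):
        raise ValueError("'image' must be a .nii or .nii.gz file name")
    cutoff = config["cutoff"]
    if isinstance(cutoff, bool) or not isinstance(cutoff, (int, float)):
        raise ValueError("'cutoff' must be a number")
    return config


def load_config(path):
    """Read the JSON file at `path` and validate its contents."""
    with open(path, "r", encoding="utf-8") as handle:
        return validate_config(json.load(handle))
```

Calling `load_config` on a file with a missing key raises a `KeyError` that names the missing entries, which is usually easier to act on than a failure later during feature extraction.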
from inference import ModelInference

if __name__ == '__main__':
    # Path to the JSON configuration file described above
    config_dir = '.\\inference_config.json'
    model = ModelInference(config_dir)
    model.predict()
- Import the ModelInference class from the inference module.
- Initialize an instance of the ModelInference class, passing the configuration file path (inference_config.json) as an argument.
- Call the predict method of the ModelInference class.
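The cutoff from the configuration file turns a predicted probability into a positive or negative call. The internals of ModelInference are not shown here, so the following is only a sketch of that thresholding step. The function name `classify` is hypothetical, and treating a probability equal to the cutoff as positive is an assumption; the repository may use a strict comparison instead.

```python
def classify(probability, cutoff):
    """Return 1 (T3a invasion positive) or 0 (negative).

    Assumption: a probability equal to the cutoff counts as positive.
    """
    return int(probability >= cutoff)


if __name__ == "__main__":
    cutoff = 0.3582773836548992  # value from the example configuration
    for p in (0.10, 0.36, 0.90):  # made-up probabilities, for illustration only
        print(p, classify(p, cutoff))
```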
- Use the SignatureExtractor class in the inference module and the .yaml configuration file to extract the radiomic features.
- Merge the tabular radiomics data with the label column and save it as a .csv file.
- Run train in the train_validation module to train the model and save it as a .pkl file.
- Run internal validation in the train_validation module to validate the model internally using nested cross-validation. The outer loop is 5-fold cross-validation repeated 200 times, and the inner loop is 5-fold cross-validation repeated 20 times. For a detailed explanation of nested cross-validation, please refer to "A Guide to Cross-Validation for Artificial Intelligence in Medical Imaging".
- Run external_validation in the train_validation module to validate the model externally on holdout datasets.
- During internal and external validation, the raw predictions are saved as .csv files.
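The structure of the nested cross-validation can be illustrated with a small standard-library sketch. It is not the repository's implementation: `repeated_kfold` and `count_nested_splits` are hypothetical names, the splits are not stratified, and model fitting is left out so that only the loop structure is shown. With the settings described above, the outer loop produces 200 × 5 = 1000 splits, and each of them runs 20 × 5 = 100 inner splits.

```python
import random


def repeated_kfold(indices, n_splits, n_repeats, seed=0):
    """Yield (train, test) index lists for repeated k-fold cross-validation."""
    rng = random.Random(seed)
    indices = list(indices)
    for _ in range(n_repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        # Fold i takes every n_splits-th element starting at position i,
        # so fold sizes differ by at most one.
        folds = [shuffled[i::n_splits] for i in range(n_splits)]
        for i, test in enumerate(folds):
            train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
            yield train, test


def count_nested_splits(n_samples, outer=(5, 200), inner=(5, 20)):
    """Walk the nested loops and return (outer_splits, inner_splits).

    `outer` and `inner` are (n_splits, n_repeats) pairs.
    """
    outer_splits = inner_splits = 0
    for outer_train, outer_test in repeated_kfold(range(n_samples), *outer):
        outer_splits += 1
        held_out = set(outer_test)
        for inner_train, inner_val in repeated_kfold(outer_train, *inner, seed=outer_splits):
            # Hyperparameter tuning would fit on inner_train and score on inner_val.
            # The outer test fold never takes part in tuning.
            assert held_out.isdisjoint(inner_train) and held_out.isdisjoint(inner_val)
            inner_splits += 1
        # The tuned model would then be refit on outer_train and scored on outer_test.
    return outer_splits, inner_splits


if __name__ == "__main__":
    # Small settings so the demonstration runs quickly.
    print(count_nested_splits(20, outer=(5, 2), inner=(5, 2)))
```

The key property is that the inner loop only ever sees the outer training indices, so the outer test fold gives a performance estimate that is not influenced by hyperparameter selection.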