This project utilizes the HAM10000 dataset which labels cancerous and non-cancerous human skin images to create a model with pytorch to identify and label images as cancerous or non-cancerous.
You will need the following python dependencies
pip install torch torchvision matplotlib torchsummary kaggle pillow numpy pandas tqdm scikit-learn
Make sure that you have a Kaggle API token and use download-dataset.py
to download the dataset. Alternatively use the CLI tool, kagglehub, curl, mlcroissant, or download the HAM10000 dataset manually, just make sure you unzip the dataset in the /data
directory.
Convert dataset into proper directory format using build-dataset.py
, it creates /data/train
, /data/val
, and /data/test/
(70/10/20 train/validation/test split, though this can be changed in the script) and sub folders in each of those classes so that it can be used by pytorch.
train.py
sets the conditions for training the classification model. Trained models and their “.pth” files can be found at our Hugging Face repository.
Here are some command line arguments for customising the training process:
- -m (required): Specified model for training
- 1: AlexNet
- 2: VGG
- 3: ResNet
- 4: GoogLeNet
- -s: Specifies the batch size for training. It accepts an integer value representing the size of the batch.
- -e: Specified number of epochs for training
This is the format in which you should run the training script:
python train.py -m <model> -s <batch_size> -e <epochs>
For example, this is how you would run googlenet with a batch size of 64 for 15 epochs:
python train.py -m 4 -s 64 -e 15
test.py
takes any model which you’ve trained with train.py
and evaluates the performance of the model on the images in /data/test
- -b: Base directory
- -a: Flag to test alexnet (0 or 1)
- -v: Flag to test VGG (0 or 1)
- -r: Flag to test ResNet (0 or 1)
- -g: Flag to test GoogLeNet (0 or 1)
- -m: Specifies ensemble method.
- 1: Max Probability
- 2: Average Probability
- 3: Majority Vote
- -e: Flag for using a 5 epoch model (0 or 1)
python test.py -b <base_dir> -a <alexnet_flag> -v <vgg_flag> -r <resnet_flag> -g <googlenet_flag> -m <ensemble_method> -e <epoch_flag>
EvaluateModel.py
and EvaluateModel2.py
evaluates the output of the models on single images
- -m: Specified model for evaluation
- -p: Trained parameters
- -i: Image to Evaluate
python EvaluateModel.py -m <model> -p <trained_parameters> -i <image>