Fábio Maia edited this page Apr 4, 2019 · 8 revisions

Week 4

  • Ran many experiments
    • Maybe we just need more data to obtain better performance
    • Try InceptionV3
  • Improved CLI
  • Normalized luminance and applied contrast stretching, which seemed to improve performance
    • Try more color correction techniques
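
A minimal sketch of the luminance normalization and contrast stretching mentioned above. The exact method used in the experiments is not recorded here, so everything in this block is an assumption: images are float RGB arrays in [0, 1], luminance uses the Rec. 601 weights, the target mean of 0.5 and the 2nd/98th percentile cut-offs are arbitrary, and the function names are hypothetical.

```python
import numpy as np

# Rec. 601 luma weights (an assumption; the notes do not say which weights were used).
LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114])


def luminance(rgb):
    """Per-pixel luminance of an (H, W, 3) float image."""
    return rgb @ LUMA_WEIGHTS


def normalize_luminance(rgb, target_mean=0.5):
    """Scale the image so that its mean luminance equals target_mean."""
    mean = luminance(rgb).mean()
    if mean == 0:
        return rgb.copy()  # an all-black image cannot be rescaled
    return np.clip(rgb * (target_mean / mean), 0.0, 1.0)


def stretch_contrast(image, low_pct=2.0, high_pct=98.0):
    """Linearly map the [low_pct, high_pct] percentile range onto [0, 1]."""
    lo, hi = np.percentile(image, [low_pct, high_pct])
    if hi <= lo:
        return image.copy()  # flat image, nothing to stretch
    return np.clip((image - lo) / (hi - lo), 0.0, 1.0)


# Usage on a synthetic low-contrast image.
rng = np.random.default_rng(0)
image = rng.uniform(0.2, 0.6, size=(32, 32, 3))
corrected = stretch_contrast(normalize_luminance(image))
```

Percentile cut-offs rather than the raw minimum and maximum keep a few outlier pixels from dominating the stretch; whether that matters for ISIC images would need to be checked.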

Week 3

  • Refactored data pipeline to do offline data augmentation
    • Online data augmentation changes the input 𝑥 at each iteration, and therefore changes the loss surface and the location of its minimizer 𝜃* (which makes optimization more difficult)
    • Empirically verify online vs offline data augmentation
  • Class-balanced the dataset by oversampling the minority class
  • Empirically justify and support why we need more training data than the provided 2k samples
  • There is potentially no need for validation data if we don't do early stopping
  • Improve the scripts' CLI:
    • Validate file paths
    • Write help menus
    • Set all required arguments as required
  • The layers that are cut off from the pretrained model are still present in TensorFlow's computational graph, which means it is doing unnecessary computation
  • Empirically verify if regularization is necessary
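
A minimal sketch of class balancing by oversampling the minority class, as described above. It is not the pipeline's actual code: `oversample_minority` is a hypothetical helper, samples are assumed to be file names held in plain lists, and duplicates are drawn at random with replacement.

```python
import random
from collections import Counter


def oversample_minority(samples, labels, seed=0):
    """Duplicate random items of the smaller classes until all classes match the largest."""
    rng = random.Random(seed)

    # Group samples by class label.
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)

    target = max(len(items) for items in by_class.values())

    out_samples, out_labels = [], []
    for label, items in by_class.items():
        # Draw the missing items with replacement from the same class.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        out_samples.extend(items + extra)
        out_labels.extend([label] * target)

    # Shuffle so that classes are interleaved.
    order = list(range(len(out_samples)))
    rng.shuffle(order)
    return [out_samples[i] for i in order], [out_labels[i] for i in order]


# Usage with hypothetical file names: 8 samples of class 0, 2 of class 1.
files = [f"img_{i}.jpg" for i in range(10)]
classes = [0] * 8 + [1] * 2
balanced_files, balanced_classes = oversample_minority(files, classes)
print(Counter(balanced_classes))
```

Oversampling should be applied to the training split only; duplicating images before splitting would leak copies of the same image into the validation or test data.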

Week 2

  • Refactored data pipeline:
    • Use the ISIC 2017 dataset which allows for performance comparison with top competitors
    • Crop a centered square from each image so that resizing it to fit the CNN input tensor preserves the aspect ratio (alternatively, https://github.com/keras-team/keras-preprocessing/pull/81 looks promising)
    • Data augmentation is performed on the CPU in parallel to training on the GPU
  • Introduced elastic net (L1 and L2) weight regularization
    • Disregarded dropout because it is "too magic"
  • Started defining experiments
    • Each hyperparameter has an allowed range of values, and the Cartesian product of these ranges gives every combination of values we want to try
    • We must keep each range as small as possible to limit the number of combinations and therefore the computation
    • There is no need to perform cross-validation for the purposes of hyperparameter search. I am not interested in finding the optimal hyperparameter values. Instead, I want to try them all in order to see how they affect performance, and study transfer learning in that respect by drawing conclusions from plots
  • Settle on a simple cyclical learning rate schedule, potentially one for each network architecture
  • Switch out Adam optimizer for simple SGD with Nesterov momentum
  • Start working on the custom CNN
  • Validation F1-score is always zero and I am not sure why
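
The experiment definition described above under "Started defining experiments" can be sketched with `itertools.product`. The hyperparameter names and values below are placeholders chosen for illustration, not the ranges actually used.

```python
from itertools import product

# Hypothetical search space: each hyperparameter maps to its allowed values.
search_space = {
    "learning_rate": [1e-2, 1e-3],
    "l1": [0.0, 1e-4],
    "l2": [0.0, 1e-4],
}


def expand_grid(space):
    """Return one dict per combination in the Cartesian product of the value lists."""
    names = list(space)
    value_lists = [space[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


experiments = expand_grid(search_space)
print(len(experiments))  # 2 * 2 * 2 = 8 combinations
```

The number of experiments is the product of the range sizes, so adding a third value to each of the three hyperparameters above would already raise the count from 8 to 27. This is why the ranges need to stay small.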

Week 1

  • Images from the ISIC challenges do not have a fixed size
  • Very few papers describe exactly how they process ISIC data, but the best way seems to be to crop the center into a square and resize to fit the CNN input tensor
  • It does not make sense to freeze layers in the middle of convolutional blocks, because the convolutional block is trying to progressively build up a higher level feature, so it only makes sense to freeze entire blocks
  • Transfer learning definition, strategies and taxonomy
    • In general, most strategies take some higher-level representation of the input features and build a classifier on top of that, but I think we should only consider NN classifiers because, e.g., an SVM with the kernel trick is equivalent to a feed-forward neural network with a non-linear activation function (citation needed)
  • Validation F1-score is frequently zero because FN and TN are zero, likely because of a problem in splitting the data into train, validation, and test
  • Transfer learning from models trained on ImageNet may not be adequate for binary cancer classification (experimentally visualize blocks with https://github.com/philipperemy/keract for each considered network). ImageNet is a very diverse dataset, whereas the ISIC-Archive is a very restricted subset of skin images, so very likely only the initial layers of any model trained on ImageNet are (theoretically) useful for transfer learning
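
A minimal sketch of the center-crop step noted above for variable-size ISIC images. It only computes the crop box; `center_square_box` is a hypothetical helper, and the 224-pixel input size in the comments is an assumption that depends on the network.

```python
def center_square_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


# Usage with a landscape image of 1022 x 767 pixels.
box = center_square_box(1022, 767)
print(box)  # (127, 0, 894, 767)

# With Pillow, the box could then be applied roughly like this:
#   square = image.crop(box)
#   resized = square.resize((224, 224))  # 224 is an assumed input size
```

Because the crop is already square, the subsequent resize to a square input tensor does not distort the lesion. The trade-off is that content near the long edges of the image is discarded.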