First, login into PACE-ICE via ssh <GT_USERNAME>@login-ice.pace.gatech.edu
. It will ask you for a password, which will just be your gt password. For example, for me it would be ssh [email protected]
. Then you want to allocate youself an interactive compute node with a gpu via salloc -N1 -t6:00:00 --gres=gpu:1 --cpus-per-task=15 --mem=200G
. The -t6:00:00
specifies you want 6 hours of time. The excessive number of cpus cores really helps when loading data.
You will first want to go into the scratch directory, via cd scratch
. From there you can clone the repo: git clone https://github.com/NikhilVyas7/DeepLearningProject.git
, then go into it: cd DeepLearningProject
. Once you've cloned the repository you can set up the conda
environment by running ./scripts/setup_env.sh
.
The dataset is located at this dropbox. You can simply download the dataset, scp it to bring it to pace-ice, then unzip it so that the
dataset is located in the repo with the name FloodNet
. To unzip, run
UNZIP_DISABLE_ZIPBOMB_DETECTION=TRUE unzip FloodNet.zip -d FloodNet
. The unzip disable zipbomb is required because the unzip function falsely
thinks the dataset is a zipbomb.
Once in a compute node with a lot of cores and memory,and in the conda environment, simply run python data/shrink_dataset.py FloodNet ShrunkenFloodNet
. This will very quickly create the shrunken dataset.
-
Create script to convert label image containign (1s,2s.) etc to something that can be visualized with colors ,according to
FloodNet/ColorMasks-FloodNetv1.0/ColorPalette-Values.xlsx
. -
Explore DataDistributedParallel so we can use more than 2 gpus.
-
Explore using Diffusion or GAN models to create the images from the labels
-
Explore HPO (Hyper-parameter Optimization) on the UNet to improve baseline performance
-
Explore methods of optimizing the model inference time in general