A quick-start guide to using the FAS cluster to train and evaluate DeepLabCut networks and to analyze videos with them. Some modifications have been made to accommodate parallelization and alternative directory structures at the analysis phase.
If you are unfamiliar with cluster computing, a good place to start is here and here.
This is the only part of the workflow that is done on your local computer. Make sure you have Fiji installed, as well as the `Generating_a_Training_Set` directory cloned from the `local` directory of this repo.

Edit `myconfig.py` in the `Generating_a_Training_Set` folder. Then run:

```
cd Generating_a_Training_Set
python3 Step1_SelectRandomFrames_fromVideos.py
```
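Step 1 samples random frames from your videos into a `data-YOUR_NETWORK` folder (the name is presumably derived from your `myconfig.py` settings). A quick optional check from within `Generating_a_Training_Set` — the `.png` extension here is an assumption and may differ in your setup:

```
# Count the frames Step 1 extracted (adjust the extension if your frames are not .png)
ls -d data-*/
ls data-*/*.png | wc -l
```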
- Open ImageJ or Fiji.
- Within the pop-up window, navigate to the folder with the images to be labeled (`Generating_a_Training_Set` -> `data-YOUR_NETWORK`).
- Click the first image, then click Open.
- A window named "Sequence Options" will pop up.
- Make sure two boxes are checked: "Sort names numerically" and "Use virtual stack".
- A window will pop up with all of your images in a stack (scroll bar at the bottom).
- In the toolbar (with File, Edit, etc.), click the Multi-point button (to the right of the angle button and to the left of the wand button).
- Click on the body features in the EXACT order for every image (the order is specified in `myconfig.py`, Step 2, in the `bodyparts` variable). If a point can't be determined, click in the top-left corner of the image, so that its X and Y positions are both less than 50 pixels.
- A window will pop up: "Results".
- Save the Results window as "Results.csv" in the same folder as the images you're labeling (a quick sanity check on this file is sketched below).
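Before moving on, it can be worth confirming that `Results.csv` has roughly the expected number of rows: (number of labeled frames) × (number of body parts), typically plus one header row. A minimal check from the folder containing the file:

```
# Row count of the exported ImageJ/Fiji measurements (usually includes one header row)
wc -l Results.csv
```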
```
python3 Step2_ConvertingLabels2DataFrame.py
python3 Step3_CheckLabels.py
python3 Step4_GenerateTrainingFileFromLabelledData.py
```
This directory is probably called something like `/n/holystore01/LABS/uchida_users/Users/$YOUR_RC_ID`. Since your folders are currently local, you'll have to do this with either `scp` or a client like FileZilla, e.g.

```
scp -r remote [email protected]:/n/holystore01/LABS/uchida_users/Users/$YOUR_RC_ID
```

You'll have to make sure to overwrite the default `config.py` file in the `configs` folder with your own `config.py`, which is currently only local!
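If you find yourself re-uploading often as you tweak configs, `rsync` can be a convenient alternative to repeated `scp -r` calls, since it only transfers files that changed. The paths below simply mirror the example above and are run from the same local directory:

```
# Sync only the files that changed since the last upload
rsync -avz remote/ [email protected]:/n/holystore01/LABS/uchida_users/Users/$YOUR_RC_ID/remote/
```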
In this example, I request two hours and 4 GB of RAM on the `test` partition:

```
srun --pty -p test -t 2:00:00 --mem 4G /bin/bash
```

For help on how to use the cluster, see this page: https://www.rc.fas.harvard.edu/resources/quickstart-guide/.
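Once the interactive session starts, your prompt is on a compute node rather than the login node. A quick way to confirm, using standard Slurm environment variables (nothing specific to this repo):

```
# You should see a compute-node hostname and a numeric job ID
hostname
echo $SLURM_JOB_ID
```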
First, `cd` to wherever you uploaded the contents of the `remote` directory. You may want to rename it to `DeepLabCut`, e.g.

```
cd /n/holystore01/LABS/uchida_users/Users/$YOUR_RC_ID
mv remote DeepLabCut
cd DeepLabCut
```
This folder should contain `TF1_3GPUEnv_DeeperCut.yml`. Then:

```
module load Anaconda3/5.0.1-fasrc01
module load cuda/8.0.61-fasrc01 cudnn/6.0_cuda8.0-fasrc01
conda env create -f TF1_3GPUEnv_DeeperCut.yml
```
(Thanks to Gerald Pho for this step.)
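To check that the environment built correctly, you can activate it and import TensorFlow. The environment name below is a placeholder — use whatever `name:` is set in `TF1_3GPUEnv_DeeperCut.yml` (or whatever `conda env list` reports):

```
source activate TF1_3GPUEnv_DeeperCut   # placeholder name; check `conda env list`
python3 -c "import tensorflow as tf; print(tf.__version__)"
```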
```
cd pose-tensorflow/models/pretrained
curl http://download.tensorflow.org/models/resnet_v1_50_2016_08_28.tar.gz | tar xvz
curl http://download.tensorflow.org/models/resnet_v1_101_2016_08_28.tar.gz | tar xvz
```
Since your folders are currently local, you'll have to do this with either `scp` or a client like FileZilla. For example, assuming you extracted everything to a directory titled `DeepLabCut`, then from a local command line you'd run:

```
scp -r YOURexperimentNameTheDate-trainset95shuffle1 [email protected]:/n/holystore01/LABS/uchida_users/Users/$YOUR_RC_ID/DeepLabCut/pose-tensorflow/models/
scp -r UnaugmentedDataSet_YOURexperimentNameTheDate [email protected]:/n/holystore01/LABS/uchida_users/Users/$YOUR_RC_ID/DeepLabCut/pose-tensorflow/models/
```
Make sure to edit the path in `train_requeueDC.sh` first!
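The details of `train_requeueDC.sh` are specific to this repo, but for orientation, a Slurm training script of this general shape might look like the sketch below. The partition, resources, environment name, and the model path (the line you would edit) are all illustrative placeholders, not the file's actual contents:

```
#!/bin/bash
#SBATCH -p gpu               # partition name is cluster-specific
#SBATCH --gres=gpu:1
#SBATCH -t 0-12:00
#SBATCH --mem=8G
#SBATCH --requeue            # let Slurm requeue the job if it is preempted

module load Anaconda3/5.0.1-fasrc01
module load cuda/8.0.61-fasrc01 cudnn/6.0_cuda8.0-fasrc01
source activate TF1_3GPUEnv_DeeperCut   # placeholder environment name

# This is the sort of path you would point at your own model folder:
cd models/YOURexperimentNameTheDate-trainset95shuffle1/train
python3 ../../../train.py
```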
```
cd pose-tensorflow
sbatch train_requeueDC.sh
```
If this is working properly, it will take ~10 hours to run.
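You can keep an eye on the job from a login node with standard Slurm tools; by default Slurm writes the job's output to `slurm-<jobid>.out` in the submission directory, unless `train_requeueDC.sh` overrides that:

```
squeue -u $USER                                   # is the job pending or running?
sacct -j <jobid> --format=JobID,State,Elapsed     # state and runtime so far
tail -f slurm-<jobid>.out                         # follow the training log
```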
```
cd ../evaluation-tools
sbatch evaluateDC.sh
```

Evaluation metrics will be printed to STDOUT, and images will be saved in the `evaluation-tools` directory in a new folder called `LabeledImages_...`.
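To inspect the labeled images, you can pull that folder back to your local machine. This just mirrors the upload examples above; the `LabeledImages_*` pattern stands in for whatever the evaluation script actually names the folder:

```
# Run from a local terminal
scp -r [email protected]:/n/holystore01/LABS/uchida_users/Users/$YOUR_RC_ID/DeepLabCut/evaluation-tools/LabeledImages_* .
```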
Edit `myconfig_analysis.py` in the `configs` folder within `remote`. If you do this locally (recommended), don't forget to re-upload to the cluster!
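For example, from a local terminal (following the path conventions used earlier, and assuming you renamed the top-level directory to `DeepLabCut` on the cluster):

```
# Re-upload only the edited analysis config
scp configs/myconfig_analysis.py [email protected]:/n/holystore01/LABS/uchida_users/Users/$YOUR_RC_ID/DeepLabCut/configs/
```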
```
cd ../analysis-tools
sbatch analyzeDC.sh
```
This step can easily be parallelized, making it ideal to run on the cluster! For example, say you want to run each recording session as its own job. This is more efficient when there are many short trials that don't each deserve their own job or job-array task, because of the load that would place on the Slurm scheduler. Use:
```
cd ../analysis-tools/parallel-session
sbatch analyze_all.sh
```
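The exact contents of `analyze_all.sh` are tied to this repo's directory layout, but the general pattern — loop over session directories on the submission node and hand each one to its own batch job — looks roughly like this sketch. The `$SESSION_ROOT` layout and the `analyze_session.sh` helper are hypothetical names, not the repo's actual files:

```
#!/bin/bash
# Submit one analysis job per recording session (illustrative only).
SESSION_ROOT=/path/to/your/recordings     # hypothetical top-level data directory

for session in "$SESSION_ROOT"/*/ ; do
    # Pass the session directory to a per-session batch script as an argument
    sbatch analyze_session.sh "$session"
done
```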
If you had relatively few sessions but many trials, each named something like `path-to-file/trial_$trial-num`, and each trial was relatively long, it would be more efficient to submit each trial as a task within a job array. For example, try:

```
cd ../analysis-tools/parallel-trial
sbatch analyze_all_array.sh
```
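Again, `analyze_all_array.sh` itself is specific to this repo, but the core Slurm job-array mechanics are standard: request an array with `#SBATCH --array` and use `$SLURM_ARRAY_TASK_ID` inside the script to pick out one trial per task. A minimal sketch, with illustrative paths and a hypothetical per-trial analysis call:

```
#!/bin/bash
#SBATCH -p shared                 # partition name is cluster-specific
#SBATCH -t 0-04:00
#SBATCH --mem=8G
#SBATCH --array=1-100             # one task per trial; match this to your trial count

# Each array task analyzes one trial, named trial_1, trial_2, ... (illustrative layout)
TRIAL_DIR=/path/to/your/session/trial_${SLURM_ARRAY_TASK_ID}

python3 AnalyzeVideos.py "$TRIAL_DIR"   # hypothetical per-trial analysis call
```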
However, note that these parallelization methods all depend on your directory structure and naming conventions. The code is provided to give you an idea of how to do this; it should not be expected to work out of the box.