Sub-folders containing model-specific information #4
base: main
Changes from all commits
f302d30
cb9d289
1224e1c
cc17cee
d3943ce
c744b73
def9a48
e61d8e2
fc1bae9
e45d86a
c899181
1ee798e
03fd821
9070dec
c3780ee
cf8584b
06cfeee
26b1211
a95b060
0172cb3
d41bbd6
3b71f6d
@@ -0,0 +1,3 @@
*.pth
*.h5
*.pt
@@ -0,0 +1,328 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/github/heinsense2/AIO_CaseStudy/blob/main/notebooks/Training_on_FathomNet_Custom_Data.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "jpTpijONjCPL"
},
"source": [
"## Custom training using YOLOv5 on Fathomnet custom dataset\n", | ||
"\n", | ||
"This notebook explains how to train a custom dataset using YOLOv5 to recognize different marine species presnt in the Monterey bay. This notebook serves as a guideline to produce the results presented in the paper\n", | ||
"\n", | ||
" *Demystifying image-based machine learning: a practical guide to automated analysis of imagery using modern machine learning tools*, \n", | ||
"\n", | ||
"\n", | ||
"The data is prepared using code available [here](https://github.com/heinsense2/AIO_CaseStudy/tree/main/data/scripts).\n", | ||
"\n", | ||
"NOTE: If you wish to use this notebook, you will need to make changes to refer to the datast locations and python script parameters amoung others.\n", | ||
"\n", | ||
"Here are the relevant steps:\n", | ||
"\n", | ||
"* Create the dataset and annotations (labels). Organize directories.\n", | ||
"* Export dataset to YOLOv5\n", | ||
"* Train YOLOv5 to recognize the objects (marine animals) in our dataset\n", | ||
"* Evaluate our YOLOv5 model's performance\n", | ||
"* Run inference to view the model at work\n" | ||
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "uruljSVEk_hc"
},
"source": [
"# 1. Install requirements"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "xaA_HZi8lK4U"
},
"outputs": [],
"source": [
"# Clone YOLOv5\n",
"!git clone https://github.com/ultralytics/yolov5 # clone repo\n",
"%cd yolov5\n",
"%pip install -qr requirements.txt # install dependencies\n",
"%pip install torch==1.8.1 torchvision==0.9.1\n",
"\n",
"import torch\n",
"import os\n",
"from IPython.display import Image, clear_output # to display images\n",
"\n",
"print(f\"Setup complete. Using torch {torch.__version__} ({torch.cuda.get_device_properties(0).name if torch.cuda.is_available() else 'CPU'})\")\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "KXnRWZ2nnQuD"
},
"source": [
"# 2. Assemble Dataset\n", | ||
"\n", | ||
"To train our model, we need to assemble a dataset of representative images with bounding boxes around the objects we want to detect. Our dataset must be in YOLOv5 format.\n", | ||
"\n", | ||
"The Fathomnet data is downloaded and prepared using code available [here](https://github.com/heinsense2/AIO_CaseStudy/tree/main/data/scripts).\n", | ||
"\n", | ||
"\n", | ||
"When usig Google Colab, it is recommended to have the data available on Google Drive. So we need to first mount our Google Drive.\n" | ||
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Vncfn_fqoeLW"
},
"outputs": [],
"source": [
"# Mount Google Drive\n",
"from google.colab import drive\n",
"drive.mount('/content/gdrive')\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "07QSJM5iuvar"
},
"outputs": [],
"source": [
"# List the directory where the data resides\n",
"# For example:\n",
"#!ls \"/content/gdrive/My Drive/data\""
]
},
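{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before training, it helps to see what YOLOv5 expects. Below is a minimal sketch of the dataset yaml file that `train.py` consumes. The paths and class names are placeholders, not the actual FathomNet configuration; the real yaml files are produced by the data preparation scripts linked above.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# A minimal sketch of a YOLOv5 dataset yaml. All paths and class names below\n",
"# are placeholders; adapt them to where your prepared data actually lives.\n",
"example_yaml = \"\"\"\\\n",
"train: ../data/example/yolov5/images/train  # training images\n",
"val: ../data/example/yolov5/images/val      # validation images\n",
"test: ../data/example/yolov5/images/test    # test images\n",
"\n",
"nc: 2                                       # number of classes\n",
"names: ['species_a', 'species_b']           # one name per label index\n",
"\"\"\"\n",
"\n",
"with open('example_dataset.yaml', 'w') as f:\n",
"    f.write(example_yaml)\n",
"print(example_yaml)\n"
]
},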
{
"cell_type": "markdown",
"metadata": {
"id": "X7yAi9hd-T4B"
},
"source": [
"# 3. Train Our Custom YOLOv5 Model\n",
"\n",
"We can pass a number of arguments to `train.py`. To see all the settable arguments:\n",
"\n",
"```bash\n",
"python train.py -h\n",
"```\n",
"\n",
"Here is what we used:\n",
"- **img:** define the input image size\n",
"- **batch:** determine the batch size\n",
"- **epochs:** define the number of training epochs. We use 300. (Note: values in the range 300-1000 are typical, and even more epochs are sometimes used.)\n",
"- **data:** our dataset location, saved in `data.location`\n",
"- **weights:** specify a path to weights to start transfer learning from. Here we choose the pretrained yolov5s model, the smallest and quickest one.\n",
"- **cache:** cache images for faster training\n",
"\n",
"In general, the training command has the form:\n",
"\n",
"```bash\n",
"python train.py --img 640 --batch 16 --epochs 300 --data {data.directory}/{domain}.yaml --weights yolov5s.pt --cache\n",
"```\n"
]
},
{
"cell_type": "markdown",
"source": [
"Below is an example of how to train YOLOv5 on data from years prior to 2012, using pretrained weights (`--weights yolov5s.pt`). The image size is set to 640 and we use a batch size of 16.\n"
],
"metadata": {
"id": "K2pVR3QcAGdb"
}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "LQttoa-D2mgn"
},
"outputs": [],
"source": [
"# Train using pre-2012 data\n",
"!python train.py --img 640 --batch 16 --epochs 300 --data ../data/pre_2012/yolov5/pre_2012.yaml --weights yolov5s.pt --cache"
]
},
{
"cell_type": "markdown",
"source": [
"To train on the other spatial/depth and temporal regions using the command above, you only need to change the `--data` parameter to point to the corresponding yaml file.\n",
"\n",
"In this notebook, we will continue using the pre-2012 data as the running example."
],
"metadata": {
"id": "ZIqMTTCGdoeM"
}
},
{
"cell_type": "markdown",
"metadata": {
"id": "AcIRLQOlA14A"
},
"source": [
"# Evaluate Custom YOLOv5 Detector Performance\n",
"All results are logged by default to `runs/train`, with a new experiment directory created for each new training run (`runs/train/exp2`, `runs/train/exp3`, ...).\n",
"\n",
"Training losses and performance metrics are saved to TensorBoard and also to a CSV logfile, `results.csv`.\n",
"\n",
"If you are new to these metrics, the one you want to focus on is `mAP_0.5` - learn more about mean average precision [here](https://blog.roboflow.com/mean-average-precision/)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "1jS9_BxdBBHL"
},
"outputs": [],
"source": [
"# Start TensorBoard\n",
"# Launch after you have started training\n",
"# logs are saved in the folder \"runs/train/exp*\"\n",
"%load_ext tensorboard\n",
"%tensorboard --logdir runs"
]
},
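{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick alternative to TensorBoard, you can also read the CSV logfile directly. The sketch below assumes the run was saved to `runs/train/exp`; adjust the path if you have several experiment directories.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Inspect the CSV training log directly (assumes results in runs/train/exp)\n",
"import pandas as pd\n",
"\n",
"results = pd.read_csv('runs/train/exp/results.csv')\n",
"results.columns = results.columns.str.strip()  # YOLOv5 pads column names with spaces\n",
"print(results[['epoch', 'metrics/mAP_0.5', 'metrics/mAP_0.5:0.95']].tail())\n"
]
},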
{
"cell_type": "markdown",
"source": [
"You can also validate the trained detection model on the test or out-of-domain datasets by using the `val.py` script in YOLOv5 and the corresponding yaml file.\n",
"\n",
"For the pre-2012 trained model, the following commands will validate it on the test and out-of-domain datasets, assuming the results were saved to `runs/train/exp`."
],
"metadata": {
"id": "ffwecYhRKF8Q"
}
},
{
"cell_type": "code",
"source": [
"# Validate model on test dataset\n",
"!python val.py --data ../data/pre_2012/yolov5/pre_2012.yaml --weights runs/train/exp/weights/best.pt --task test\n",
"\n",
"# Validate model on out-of-domain (post-2012) dataset\n",
"!python val.py --data ../data/post_2012/yolov5/post_2012_as_out_of_domain.yaml --weights runs/train/exp/weights/best.pt --task test\n",
"\n"
],
"metadata": {
"id": "o-_FjSzJKqlv"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "jtmS7_TXFsT3"
},
"source": [
"# Run Inference With Trained Weights on New Images\n",
"\n",
"Once the model is trained, you can run inference with the best trained checkpoint `best.pt` on the test (`images/test`) or out-of-domain (`all/images`) images.\n",
"\n",
"`--conf 0.65` keeps only detections with a confidence of 0.65 or greater.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "NkJrK89z_O0q"
},
"outputs": [],
"source": [
"# Run detection on pre-2012 images\n",
"!python detect.py --weights runs/train/exp/weights/best.pt --img 640 --conf 0.65 --source \"../data/pre_2012/yolov5/images/test\"\n",
"\n",
"# Run detection on out-of-domain (post-2012) images\n",
"!python detect.py --weights runs/train/exp/weights/best.pt --img 640 --conf 0.65 --source \"../data/post_2012/yolov5/all/images\"\n",
"\n"
]
},
{
"cell_type": "markdown",
"source": [
"There are many options for visualizing the detection results. One simple way, using IPython display, is presented below."
],
"metadata": {
"id": "htF5pmgWldBO"
}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "e9h22JUt_m56"
},
"outputs": [],
"source": [
"# Display inference on ALL resulting detection images\n",
"\n",
"import glob\n",
"from IPython.display import Image, display\n",
"\n",
"# Assume resulting images from detection are in runs/detect/exp\n",
"for imageName in glob.glob('/content/yolov5/runs/detect/exp/*.png'): # assuming PNG\n",
" display(Image(filename=imageName))\n",
" print(\"\\n\")"
]
},
{
"cell_type": "markdown",
"source": [
"# Export Trained Weights for Future Inference\n",
"You can now export the trained weights from our detector for inference on your device elsewhere.\n"
],
"metadata": {
"id": "KAXD9i3I91CV"
}
},
}, | ||
{ | ||
"cell_type": "code", | ||
"source": [ | ||
"#export your model's weights for future use\n", | ||
"from google.colab import files\n", | ||
"files.download('./runs/train/exp/weights/best.pt')" | ||
], | ||
"metadata": { | ||
"id": "yfgdDBYC-ztS" | ||
}, | ||
"execution_count": null, | ||
"outputs": [] | ||
} | ||
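{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a final sketch, here is one way the exported weights might be reused on another machine via `torch.hub`. This assumes the `best.pt` file is in the working directory; the image path is a placeholder, not part of this repository.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Reload the exported weights elsewhere via torch.hub (a sketch;\n",
"# 'my_image.jpg' is a placeholder path, not part of this repository)\n",
"import torch\n",
"\n",
"model = torch.hub.load('ultralytics/yolov5', 'custom', path='best.pt')\n",
"results = model('my_image.jpg')\n",
"results.print()  # summary of detections\n"
]
}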
],
"metadata": {
"accelerator": "GPU",
"colab": {
"provenance": [],
"include_colab_link": true
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
},
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
We should exclude IDE configuration from being checked in to the repo -- I recommend removing the `.idea` directory and its contents.