Showing 9 changed files with 1,209 additions and 1,391 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,260 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# First we need to create a config store to store our configurations\n", | ||
"from dacapo.store.create_store import create_config_store\n", | ||
"\n", | ||
"config_store = create_config_store()\n", | ||
"\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
" ## Datasplit\n", | ||
" Where can you find your data? What format is it in? Does it need to be normalized? What data do you want to use for validation?\n", | ||
" We'll assume your data is in a zarr file, and that you have a raw and a ground truth dataset, all stored in your `runs_base_dir` as `example_{type}.zarr`, where `{type}` is either `train` or `validate`.\n", | ||
" NOTE: If you re-run this cell after modifying a config, you may need to delete the old config store first. Config names must be unique, and storing a config under a name that already exists raises an error. For the `files` backend, you can delete the `runs_base_dir/configs` directory to remove all stored configs." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from dacapo.experiments.datasplits import DataSplitGenerator\n", | ||
"from funlib.geometry import Coordinate\n", | ||
"\n", | ||
"# Resolution of the raw input and of the predicted output, per axis\n", | ||
"input_resolution = Coordinate(8, 8, 8)\n", | ||
"output_resolution = Coordinate(4, 4, 4)\n", | ||
"datasplit_config = DataSplitGenerator.generate_from_csv(\n", | ||
"    \"/misc/public/dacapo_learnathon/datasplit_csvs/cosem_example.csv\",\n", | ||
"    input_resolution,\n", | ||
"    output_resolution,\n", | ||
").compute()\n", | ||
"\n", | ||
"# Instantiate the datasplit to inspect it in neuroglancer, then store its config\n", | ||
"datasplit = datasplit_config.datasplit_type(datasplit_config)\n", | ||
"viewer = datasplit._neuroglancer()\n", | ||
"config_store.store_datasplit_config(datasplit_config)\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
" ## Task\n", | ||
" What do you want to learn? An instance segmentation? If so, how? Affinities,\n", | ||
" Distance Transform, Foreground/Background, etc. Each of these tasks is commonly learned\n", | ||
" and evaluated with specific loss functions and evaluation metrics. Some tasks may\n", | ||
" also require specific non-linearities or output formats from your model." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from dacapo.experiments.tasks import DistanceTaskConfig\n", | ||
"\n", | ||
"task_config = DistanceTaskConfig(\n", | ||
"    name=\"cosem_distance_task_4nm\",\n", | ||
"    channels=[\"mito\"],\n", | ||
"    clip_distance=40.0,\n", | ||
"    tol_distance=40.0,\n", | ||
"    scale_factor=80.0,\n", | ||
")\n", | ||
"config_store.store_task_config(task_config)\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
" ## Architecture\n", | ||
"\n", | ||
" The setup of the network you will train. Biomedical image-to-image translation often uses a UNet, but even after choosing a UNet you still need to provide some additional parameters. How much do you want to downsample? How many convolutional layers do you want?" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from dacapo.experiments.architectures import CNNectomeUNetConfig\n", | ||
"\n", | ||
"architecture_config = CNNectomeUNetConfig(\n", | ||
"    name=\"upsample_unet\",\n", | ||
"    input_shape=Coordinate(216, 216, 216),\n", | ||
"    eval_shape_increase=Coordinate(72, 72, 72),\n", | ||
"    fmaps_in=1,\n", | ||
"    num_fmaps=12,\n", | ||
"    fmaps_out=72,\n", | ||
"    fmap_inc_factor=6,\n", | ||
"    downsample_factors=[(2, 2, 2), (3, 3, 3), (3, 3, 3)],\n", | ||
"    constant_upsample=True,\n", | ||
"    upsample_factors=[(2, 2, 2)],\n", | ||
")\n", | ||
"config_store.store_architecture_config(architecture_config)\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
" ## Trainer\n", | ||
"\n", | ||
" How do you want to train? This config defines the training loop and how the other three components work together: which augmentations to apply during training, which learning rate and optimizer to use, and which batch size to train with." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from dacapo.experiments.trainers import GunpowderTrainerConfig\n", | ||
"from dacapo.experiments.trainers.gp_augments import (\n", | ||
"    ElasticAugmentConfig,\n", | ||
"    GammaAugmentConfig,\n", | ||
"    IntensityAugmentConfig,\n", | ||
"    IntensityScaleShiftAugmentConfig,\n", | ||
")\n", | ||
"\n", | ||
"trainer_config = GunpowderTrainerConfig(\n", | ||
"    name=\"cosem\",\n", | ||
"    batch_size=1,\n", | ||
"    learning_rate=0.0001,\n", | ||
"    num_data_fetchers=20,\n", | ||
"    augments=[\n", | ||
"        ElasticAugmentConfig(\n", | ||
"            control_point_spacing=[100, 100, 100],\n", | ||
"            control_point_displacement_sigma=[10.0, 10.0, 10.0],\n", | ||
"            rotation_interval=(0.0, 1.5707963267948966),\n", | ||
"            subsample=8,\n", | ||
"            uniform_3d_rotation=True,\n", | ||
"        ),\n", | ||
"        IntensityAugmentConfig(scale=(0.25, 1.75), shift=(-0.5, 0.35), clip=True),\n", | ||
"        GammaAugmentConfig(gamma_range=(0.5, 2.0)),\n", | ||
"        IntensityScaleShiftAugmentConfig(scale=2.0, shift=-1.0),\n", | ||
"    ],\n", | ||
"    snapshot_interval=10000,\n", | ||
"    min_masked=0.05,\n", | ||
"    clip_raw=True,\n", | ||
")\n", | ||
"config_store.store_trainer_config(trainer_config)\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
" ## Run\n", | ||
" Now that we have our components configured, we just need to combine them into a run and start training. We can have multiple repetitions of a single set of configs in order to increase our chances of finding an optimum." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from dacapo.experiments import RunConfig\n", | ||
"from dacapo.experiments.run import Run\n", | ||
"\n", | ||
"start_config = None\n", | ||
"\n", | ||
"# Uncomment to start from a pretrained model\n", | ||
"# from dacapo.experiments.starts import StartConfig\n", | ||
"#\n", | ||
"# start_config = StartConfig(\n", | ||
"#     \"setup04\",\n", | ||
"#     \"best\",\n", | ||
"# )\n", | ||
"\n", | ||
"iterations = 2000\n", | ||
"validation_interval = iterations // 2\n", | ||
"repetitions = 1\n", | ||
"for i in range(repetitions):\n", | ||
"    run_config = RunConfig(\n", | ||
"        name=\"cosem_distance_run_4nm\",\n", | ||
"        # # NOTE: This is a template for the name of the run. You can customize it as you see fit.\n", | ||
"        # # Config names must be unique, so use a per-repetition name like this one when repetitions > 1.\n", | ||
"        # name=(\"_\").join(\n", | ||
"        #     [\n", | ||
"        #         \"example\",\n", | ||
"        #         \"scratch\" if start_config is None else \"finetuned\",\n", | ||
"        #         datasplit_config.name,\n", | ||
"        #         task_config.name,\n", | ||
"        #         architecture_config.name,\n", | ||
"        #         trainer_config.name,\n", | ||
"        #     ]\n", | ||
"        # )\n", | ||
"        # + f\"__{i}\",\n", | ||
"        datasplit_config=datasplit_config,\n", | ||
"        task_config=task_config,\n", | ||
"        architecture_config=architecture_config,\n", | ||
"        trainer_config=trainer_config,\n", | ||
"        num_iterations=iterations,\n", | ||
"        validation_interval=validation_interval,\n", | ||
"        repetition=i,\n", | ||
"        start_config=start_config,\n", | ||
"    )\n", | ||
"\n", | ||
"    print(run_config.name)\n", | ||
"    config_store.store_run_config(run_config)\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
" ## Train\n", | ||
" To train one of the runs, first create a **Run** directly from the run config, then pass it to `train_run`.\n", | ||
" NOTE: The run stats are stored in the `runs_base_dir/stats` directory. To re-run training from scratch, delete this directory to remove all stored stats. Otherwise, the stats are appended to the existing files and the run won't start from scratch, which may cause errors." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from dacapo.train import train_run\n", | ||
"from dacapo.experiments.run import Run\n", | ||
"from dacapo.store.create_store import create_config_store\n", | ||
"\n", | ||
"config_store = create_config_store()\n", | ||
"\n", | ||
"run = Run(config_store.retrieve_run_config(\"cosem_distance_run_4nm\"))\n", | ||
"train_run(run)\n" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3 (ipykernel)", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.9.16" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 4 | ||
} |
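The Run cell in this notebook carries a commented-out template that builds the run name from the component config names plus the repetition index. Since the Datasplit note says config names must be unique, a fixed name would presumably collide once `repetitions` is greater than 1. Below is a minimal standalone sketch of that template. The helper `make_run_name` and the datasplit name `cosem_example_datasplit` are hypothetical placeholders, not part of DaCapo; in the notebook the real value comes from `datasplit_config.name`.

```python
from typing import Optional


def make_run_name(
    datasplit_name: str,
    task_name: str,
    architecture_name: str,
    trainer_name: str,
    repetition: int,
    start_config: Optional[object] = None,
) -> str:
    """Hypothetical helper mirroring the notebook's commented-out name template."""
    parts = [
        "example",
        # "scratch" when training from random weights, "finetuned" otherwise
        "scratch" if start_config is None else "finetuned",
        datasplit_name,
        task_name,
        architecture_name,
        trainer_name,
    ]
    # The repetition suffix keeps names distinct across repetitions
    return "_".join(parts) + f"__{repetition}"


# "cosem_example_datasplit" is a placeholder for datasplit_config.name
names = [
    make_run_name(
        "cosem_example_datasplit",
        "cosem_distance_task_4nm",
        "upsample_unet",
        "cosem",
        repetition=i,
    )
    for i in range(3)
]
for name in names:
    print(name)
```

Each generated name differs only in its `__{i}` suffix, so every repetition can be stored as a separate run config and retrieved later by that name.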