
Merge pull request #14 from florencejt/refactor/params
Changing the inputs to be separate parameter arguments rather than a parameter dictionary
florencejt authored Jan 5, 2024
2 parents 27ac475 + c55714e commit 5f23273
Showing 137 changed files with 3,944 additions and 4,196 deletions.
Binary file modified docs/auto_examples/auto_examples_jupyter.zip
Binary file modified docs/auto_examples/auto_examples_python.zip
@@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n# How to customise the training in Fusilli\n\nThis tutorial will show you how to customise the training of your fusion model.\n\nWe will cover the following topics:\n\n* Early stopping\n* Batch size\n* Number of epochs\n* Checkpoint suffix modification\n\n## Early stopping\n\nEarly stopping is implemented in Fusilli using the PyTorch Lightning\n[EarlyStopping](https://lightning.ai/docs/pytorch/stable/api/lightning.pytorch.callbacks.EarlyStopping.html#lightning.pytorch.callbacks.EarlyStopping)\ncallback. This callback can be passed to the\n:func:`~fusilli.model_utils.train_and_save_models` function using the\n``early_stopping_callback`` argument. For example:\n\n```python\nfrom fusilli.data import get_data_module\nfrom fusilli.train import train_and_save_models\n\nfrom lightning.pytorch.callbacks import EarlyStopping\n\nmodified_early_stopping_callback = EarlyStopping(\n monitor=\"val_loss\",\n min_delta=0.00,\n patience=3,\n verbose=True,\n mode=\"min\",\n)\n\ndatamodule = get_data_module(\n fusion_model=example_model,\n params=params,\n own_early_stopping_callback=modified_early_stopping_callback,\n )\n\ntrained_model_list = train_and_save_models(\n data_module=datamodule,\n params=params,\n fusion_model=example_model,\n )\n```\nNote that you only need to pass the callback to the :func:`~.fusilli.data.get_data_module` and **not** to the :func:`~.fusilli.train.train_and_save_models` function. The new early stopping measure will be saved within the data module and accessed during training.\n\n\n-----\n\n## Batch size\n\nThe batch size can be set using the ``batch_size`` argument in the :func:`~.fusilli.data.get_data_module` function. 
By default, the batch size is 8.\n\n```python\nfrom fusilli.data import get_data_module\nfrom fusilli.train import train_and_save_models\n\ndatamodule = get_data_module(\n fusion_model=example_model,\n params=params,\n batch_size=32,\n )\n\ntrained_model_list = train_and_save_models(\n data_module=datamodule,\n params=params,\n fusion_model=example_model,\n batch_size=32,\n )\n```\n-----\n\n## Number of epochs\n\nYou can change the maximum number of epochs using the ``max_epochs`` argument in the :func:`~.fusilli.data.get_data_module` and :func:`~.fusilli.train.train_and_save_models` functions. By default, the maximum number of epochs is 1000.\n\nYou also pass it to the :func:`~.fusilli.data.get_data_module` function because some of the fusion models require pre-training.\n\nChanging the ``max_epochs`` parameter is especially useful when wanting to run a quick test of your model. For example, you can set ``max_epochs=5`` to run a quick test of your model.\n\n```python\nfrom fusilli.data import get_data_module\nfrom fusilli.train import train_and_save_models\n\ndatamodule = get_data_module(\n fusion_model=example_model,\n params=params,\n max_epochs=5,\n )\n\ntrained_model_list = train_and_save_models(\n data_module=datamodule,\n params=params,\n fusion_model=example_model,\n max_epochs=5,\n )\n```\nSetting ``max_epochs`` to -1 will train the model until early stopping is triggered.\n\n-----\n\n## Checkpoint suffix modification\n\nBy default, Fusilli saves the model checkpoints in the following format:\n\n ``{fusion_model.__name__}_epoch={epoch_n}.ckpt``\n\nIf the checkpoint is for a pre-trained model, then the following format is used:\n\n ``subspace_{fusion_model.__name__}_{pretrained_model.__name__}.ckpt``\n\nYou can add suffixes to the checkpoint names by passing a string to the ``extra_log_string_dict`` argument in the :func:`~.fusilli.data.get_data_module` and :func:`~.fusilli.train.train_and_save_models` functions. 
For example, I could add a suffix to denote that I've changed the batch size for this particular run:\n\n```python\nfrom fusilli.data import get_data_module\nfrom fusilli.train import train_and_save_models\n\nextra_suffix_dict = {\"batchsize\": 32}\n\ndatamodule = get_data_module(\n fusion_model=example_model,\n params=params,\n batch_size=32,\n extra_log_string_dict=extra_suffix_dict,\n )\n\ntrained_model_list = train_and_save_models(\n data_module=datamodule,\n params=params,\n fusion_model=example_model,\n batch_size=32,\n extra_log_string_dict=extra_suffix_dict,\n )\n```\nThe checkpoint name would then be (if the model trained for 100 epochs):\n\n ``ExampleModel_epoch=100_batchsize_32.ckpt``\n\n\n<div class=\"alert alert-info\"><h4>Note</h4><p>The ``extra_log_string_dict`` argument is also used to modify the logging behaviour of the model. For more information, see `wandb`.</p></div>\n"
"\n# How to customise the training in Fusilli\n\nThis tutorial will show you how to customise the training of your fusion model.\n\nWe will cover the following topics:\n\n* Early stopping\n* Batch size\n* Number of epochs\n* Checkpoint suffix modification\n\n## Early stopping\n\nEarly stopping is implemented in Fusilli using the PyTorch Lightning\n[EarlyStopping](https://lightning.ai/docs/pytorch/stable/api/lightning.pytorch.callbacks.EarlyStopping.html#lightning.pytorch.callbacks.EarlyStopping)\ncallback. This callback can be passed to the\n:func:`~fusilli.model_utils.train_and_save_models` function using the\n``early_stopping_callback`` argument. For example:\n\n```python\nfrom fusilli.data import prepare_fusion_data\nfrom fusilli.train import train_and_save_models\n\nfrom lightning.pytorch.callbacks import EarlyStopping\n\nmodified_early_stopping_callback = EarlyStopping(\n monitor=\"val_loss\",\n min_delta=0.00,\n patience=3,\n verbose=True,\n mode=\"min\",\n)\n\ndatamodule = prepare_fusion_data(\n prediction_task=\"binanry\",\n fusion_model=example_model,\n data_paths=data_paths,\n output_paths=output_path,\n own_early_stopping_callback=modified_early_stopping_callback,\n )\n\ntrained_model_list = train_and_save_models(\n data_module=datamodule,\n fusion_model=example_model,\n )\n```\nNote that you only need to pass the callback to the :func:`~.fusilli.data.prepare_fusion_data` and **not** to the :func:`~.fusilli.train.train_and_save_models` function. The new early stopping measure will be saved within the data module and accessed during training.\n\n\n-----\n\n## Batch size\n\nThe batch size can be set using the ``batch_size`` argument in the :func:`~.fusilli.data.prepare_fusion_data` function. 
By default, the batch size is 8.\n\n```python\nfrom fusilli.data import prepare_fusion_data\nfrom fusilli.train import train_and_save_models\n\ndatamodule = prepare_fusion_data(\n prediction_task=\"binary\",\n fusion_model=example_model,\n data_paths=data_paths,\n output_paths=output_path,\n batch_size=32\n )\n\ntrained_model_list = train_and_save_models(\n data_module=datamodule,\n fusion_model=example_model,\n batch_size=32,\n )\n```\n-----\n\n## Number of epochs\n\nYou can change the maximum number of epochs using the ``max_epochs`` argument in the :func:`~.fusilli.data.prepare_fusion_data` and :func:`~.fusilli.train.train_and_save_models` functions. By default, the maximum number of epochs is 1000.\n\nYou also pass it to the :func:`~.fusilli.data.prepare_fusion_data` function because some of the fusion models require pre-training.\n\nChanging the ``max_epochs`` parameter is especially useful when wanting to run a quick test of your model. For example, you can set ``max_epochs=5`` to run a quick test of your model.\n\n```python\nfrom fusilli.data import prepare_fusion_data\nfrom fusilli.train import train_and_save_models\n\ndatamodule = prepare_fusion_data(\n prediction_task=\"binary\",\n fusion_model=example_model,\n data_paths=data_paths,\n output_paths=output_path,\n max_epochs=5,\n )\n\ntrained_model_list = train_and_save_models(\n data_module=datamodule,\n fusion_model=example_model,\n max_epochs=5,\n )\n```\nSetting ``max_epochs`` to -1 will train the model until early stopping is triggered.\n\n-----\n\n## Checkpoint suffix modification\n\nBy default, Fusilli saves the model checkpoints in the following format:\n\n ``{fusion_model.__name__}_epoch={epoch_n}.ckpt``\n\nIf the checkpoint is for a pre-trained model, then the following format is used:\n\n ``subspace_{fusion_model.__name__}_{pretrained_model.__name__}.ckpt``\n\nYou can add suffixes to the checkpoint names by passing a string to the ``extra_log_string_dict`` argument in the 
:func:`~.fusilli.data.prepare_fusion_data` and :func:`~.fusilli.train.train_and_save_models` functions. For example, I could add a suffix to denote that I've changed the batch size for this particular run:\n\n```python\nfrom fusilli.data import prepare_fusion_data\nfrom fusilli.train import train_and_save_models\n\nextra_suffix_dict = {\"batchsize\": 32}\n\ndatamodule = prepare_fusion_data(\n prediction_task=\"binary\",\n fusion_model=example_model,\n data_paths=data_paths,\n output_paths=output_path,\n batch_size=32,\n extra_log_string_dict=extra_suffix_dict,\n )\n\ntrained_model_list = train_and_save_models(\n data_module=datamodule,\n fusion_model=example_model,\n batch_size=32,\n extra_log_string_dict=extra_suffix_dict,\n )\n```\nThe checkpoint name would then be (if the model trained for 100 epochs):\n\n ``ExampleModel_epoch=100_batchsize_32.ckpt``\n\n\n<div class=\"alert alert-info\"><h4>Note</h4><p>The ``extra_log_string_dict`` argument is also used to modify the logging behaviour of the model. For more information, see `wandb`.</p></div>\n"
]
},
{
@@ -35,7 +35,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.13"
"version": "3.9.16"
}
},
"nbformat": 4,
@@ -22,7 +22,7 @@
.. code-block:: python
from fusilli.data import get_data_module
from fusilli.data import prepare_fusion_data
from fusilli.train import train_and_save_models
from lightning.pytorch.callbacks import EarlyStopping
@@ -35,42 +35,44 @@
mode="min",
)
datamodule = get_data_module(
datamodule = prepare_fusion_data(
prediction_task="binanry",
fusion_model=example_model,
params=params,
data_paths=data_paths,
output_paths=output_path,
own_early_stopping_callback=modified_early_stopping_callback,
)
trained_model_list = train_and_save_models(
data_module=datamodule,
params=params,
fusion_model=example_model,
)
Note that you only need to pass the callback to the :func:`~.fusilli.data.get_data_module` and **not** to the :func:`~.fusilli.train.train_and_save_models` function. The new early stopping measure will be saved within the data module and accessed during training.
Note that you only need to pass the callback to the :func:`~.fusilli.data.prepare_fusion_data` and **not** to the :func:`~.fusilli.train.train_and_save_models` function. The new early stopping measure will be saved within the data module and accessed during training.
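To make the ``EarlyStopping`` arguments concrete, here is a rough sketch of the rule the callback applies with ``monitor="val_loss"`` and ``mode="min"``: stop once ``patience`` consecutive epochs pass without the monitored loss improving by more than ``min_delta``. This is an illustration of the callback's semantics only, not Lightning's or Fusilli's actual implementation.

```python
def stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the epoch index at which training would stop, or None if it never stops.

    Mirrors EarlyStopping(monitor="val_loss", mode="min") semantics:
    an epoch counts as an improvement only if the loss drops by more
    than min_delta below the best value seen so far.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None

# Loss plateaus at 0.79 from epoch 2; with patience=3, training stops at epoch 5.
print(stopping_epoch([1.0, 0.8, 0.79, 0.79, 0.79, 0.79]))  # → 5
```

Raising ``min_delta`` makes the callback stricter (small improvements no longer reset the patience counter); raising ``patience`` makes it more lenient.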
-----
Batch size
----------
The batch size can be set using the ``batch_size`` argument in the :func:`~.fusilli.data.get_data_module` function. By default, the batch size is 8.
The batch size can be set using the ``batch_size`` argument in the :func:`~.fusilli.data.prepare_fusion_data` function. By default, the batch size is 8.
.. code-block:: python
from fusilli.data import get_data_module
from fusilli.data import prepare_fusion_data
from fusilli.train import train_and_save_models
datamodule = get_data_module(
datamodule = prepare_fusion_data(
prediction_task="binary",
fusion_model=example_model,
params=params,
batch_size=32,
data_paths=data_paths,
output_paths=output_path,
batch_size=32
)
trained_model_list = train_and_save_models(
data_module=datamodule,
params=params,
fusion_model=example_model,
batch_size=32,
)
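As a quick back-of-the-envelope on what ``batch_size`` changes: it sets how many samples are processed per optimisation step, so it determines the number of batches that make up one epoch. A small hypothetical helper (not part of Fusilli's API):

```python
import math

def batches_per_epoch(n_samples, batch_size=8):
    """Number of batches per epoch for a dataset of n_samples.

    batch_size defaults to 8, matching the documented Fusilli default.
    The last batch may be smaller, hence the ceiling division.
    """
    return math.ceil(n_samples / batch_size)

print(batches_per_epoch(1000))      # default batch_size=8  → 125 batches
print(batches_per_epoch(1000, 32))  # batch_size=32         → 32 batches
```

Larger batches mean fewer optimisation steps per epoch (and typically higher memory use per step).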
@@ -81,26 +83,27 @@
Number of epochs
-------------------
You can change the maximum number of epochs using the ``max_epochs`` argument in the :func:`~.fusilli.data.get_data_module` and :func:`~.fusilli.train.train_and_save_models` functions. By default, the maximum number of epochs is 1000.
You can change the maximum number of epochs using the ``max_epochs`` argument in the :func:`~.fusilli.data.prepare_fusion_data` and :func:`~.fusilli.train.train_and_save_models` functions. By default, the maximum number of epochs is 1000.
You also pass it to the :func:`~.fusilli.data.get_data_module` function because some of the fusion models require pre-training.
You also pass it to the :func:`~.fusilli.data.prepare_fusion_data` function because some of the fusion models require pre-training.
Changing the ``max_epochs`` parameter is especially useful when wanting to run a quick test of your model. For example, you can set ``max_epochs=5`` to run a quick test of your model.
.. code-block:: python
from fusilli.data import get_data_module
from fusilli.data import prepare_fusion_data
from fusilli.train import train_and_save_models
datamodule = get_data_module(
datamodule = prepare_fusion_data(
prediction_task="binary",
fusion_model=example_model,
params=params,
data_paths=data_paths,
output_paths=output_path,
max_epochs=5,
)
trained_model_list = train_and_save_models(
data_module=datamodule,
params=params,
fusion_model=example_model,
max_epochs=5,
)
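The ``max_epochs`` cap interacts with early stopping: whichever fires first ends training, and (per the tutorial) setting ``max_epochs=-1`` removes the cap so training runs until early stopping triggers. A hypothetical helper sketching that interaction, purely for illustration:

```python
def epochs_run(max_epochs, early_stop_epoch=None):
    """Epoch at which training ends, given a max_epochs cap and an
    optional epoch at which early stopping would trigger.

    max_epochs == -1 means "no cap": train until early stopping fires.
    """
    if max_epochs == -1:
        return early_stop_epoch
    if early_stop_epoch is None:
        return max_epochs
    return min(max_epochs, early_stop_epoch)

print(epochs_run(5))          # quick test run: capped at 5 epochs
print(epochs_run(1000, 200))  # early stopping fires first, at epoch 200
```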
@@ -120,25 +123,26 @@
``subspace_{fusion_model.__name__}_{pretrained_model.__name__}.ckpt``
You can add suffixes to the checkpoint names by passing a string to the ``extra_log_string_dict`` argument in the :func:`~.fusilli.data.get_data_module` and :func:`~.fusilli.train.train_and_save_models` functions. For example, I could add a suffix to denote that I've changed the batch size for this particular run:
You can add suffixes to the checkpoint names by passing a string to the ``extra_log_string_dict`` argument in the :func:`~.fusilli.data.prepare_fusion_data` and :func:`~.fusilli.train.train_and_save_models` functions. For example, I could add a suffix to denote that I've changed the batch size for this particular run:
.. code-block:: python
from fusilli.data import get_data_module
from fusilli.data import prepare_fusion_data
from fusilli.train import train_and_save_models
extra_suffix_dict = {"batchsize": 32}
datamodule = get_data_module(
datamodule = prepare_fusion_data(
prediction_task="binary",
fusion_model=example_model,
params=params,
data_paths=data_paths,
output_paths=output_path,
batch_size=32,
extra_log_string_dict=extra_suffix_dict,
)
trained_model_list = train_and_save_models(
data_module=datamodule,
params=params,
fusion_model=example_model,
batch_size=32,
extra_log_string_dict=extra_suffix_dict,
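Putting the naming scheme together: the tutorial states the default checkpoint format is ``{fusion_model.__name__}_epoch={epoch_n}.ckpt``, with ``extra_log_string_dict`` entries appended as ``_key_value`` suffixes. A hypothetical sketch of that filename assembly (not Fusilli's actual code):

```python
def checkpoint_name(model_name, epoch, extra_log_string_dict=None):
    """Assemble a checkpoint filename in the documented format,
    appending each extra_log_string_dict entry as a _key_value suffix."""
    name = f"{model_name}_epoch={epoch}"
    for key, value in (extra_log_string_dict or {}).items():
        name += f"_{key}_{value}"
    return name + ".ckpt"

print(checkpoint_name("ExampleModel", 100, {"batchsize": 32}))
# → ExampleModel_epoch=100_batchsize_32.ckpt
```

This reproduces the example filename given in the tutorial for a model trained for 100 epochs with a ``{"batchsize": 32}`` suffix dictionary.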
