unfreezing 🥶 weights callback #297
sebffischer changed the title from "freezing 🥶 weights callback" to "unfreezing 🥶 weights callback" on Oct 18, 2024.
When finetuning a predefined image network on a downstream task, one often wants to freeze some weights for a given number of epochs/steps. As this is relatively common, we should offer a predefined callback (`"cb.freeze"`) to enable this. The callback should be able to iteratively unfreeze layers after a given number of epochs/batches.
Background:
Each torch module represents its parameters as a named `list()`. When we want to unfreeze a specific weight, we can refer to it via its name in this list.
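A minimal sketch, assuming the R `torch` package (module structure and layer names are illustrative):

```r
library(torch)

# A small module with two named submodules; torch collects their
# parameters into a named list keyed by "<submodule>.<parameter>".
net = nn_module(
  initialize = function() {
    self$layer1 = nn_linear(4, 8)
    self$layer2 = nn_linear(8, 2)
  },
  forward = function(x) self$layer2(self$layer1(x))
)()

# Named list of parameters, e.g. "layer1.weight", "layer1.bias", ...
names(net$parameters)
```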
Further, we can freeze a parameter in a network by setting its `$requires_grad` field to `FALSE`, and we can unfreeze it the same way.
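A minimal sketch of freezing and unfreezing a single parameter, assuming the R `torch` package (here via the in-place `requires_grad_()` setter):

```r
library(torch)

lin = nn_linear(4, 2)

# Freeze: gradients are no longer computed for this parameter.
lin$parameters$weight$requires_grad_(FALSE)
lin$parameters$weight$requires_grad  # FALSE

# Unfreeze the same way.
lin$parameters$weight$requires_grad_(TRUE)
lin$parameters$weight$requires_grad  # TRUE
```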
The callback needs to define when which weights are unfrozen. It should e.g. be possible to unfreeze `layer8` after the first epoch, `layer7` after the second, and the rest after the third epoch. I can e.g. imagine this callback to have the parameters:
- `start`: a `Selector` (see the `affect_columns` parameter in `mlr3pipelines`) that defines which weights will be trained from the start (maybe a better name exists for the parameter).
- `unfreeze`: a `data.table()` with a column `weights` (a `list()` column containing `Selector`s) and a column `epoch` OR `batch`.

If the `unfreeze` table, say, selected `module$parameters$some_layer` at epoch 1 and the remaining weights at epoch 2, `some_layer` would be unfrozen after the first epoch and the rest after the second epoch. If the column in the `data.table` is named `"batch"` instead of `"epoch"`, this should work just the same, but after `n` batches instead of epochs.
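A hedged sketch of what these parameter values could look like (the layer names and regex patterns are hypothetical; `selector_grep()` and `selector_all()` are existing `mlr3pipelines` selectors):

```r
library(data.table)
library(mlr3pipelines)

# Train only the output head from the start (hypothetical layer name).
start = selector_grep("^head\\.")

# Unfreeze layer8 after epoch 1, layer7 after epoch 2,
# and everything else after epoch 3.
unfreeze = data.table(
  weights = list(
    selector_grep("^layer8\\."),
    selector_grep("^layer7\\."),
    selector_all()
  ),
  epoch = c(1, 2, 3)
)
```

Using a `list()` column of `Selector`s keeps the schedule declarative: the callback only has to match each selector against `names(module$parameters)` when the given epoch (or batch) is reached.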