In decoupled_optimizer.py, one finds the code fragment:
```python
# Iterate through the named modules of the model.
for module_name, module in model.named_modules():
    # Check if the current module is an instance of any of the desired
    # types (LayerNorm or torch.nn.Embedding).
    for ndim in [LayerNorm, torch.nn.Embedding]:
        if isinstance(module, ndim):
            # If torch.nn.Embedding, append its name with a ".weight"
            # suffix to the no_decay list.
            if module_name == exclude_module:
                no_decay.append(f"{module_name}.weight")
            else:
                # If the module is an instance of LayerNorm
                no_decay.append(f"{module_name}.gamma")
            # Exit the inner loop since the desired module has been found.
            break
```
If `module_name != exclude_module`, this code appends a parameter named `gamma` to the `no_decay` list. In this case, the layer is a `torch.nn.LayerNorm`, which only has parameters named `weight` and `bias`. Thus, `.gamma` should be replaced by `.weight`.
Of course, I do not really know why bias is not included. But that is for another day.
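With the proposed fix applied, both branches append a `.weight` suffix. A minimal runnable sketch of the corrected loop (using a hypothetical toy model and `exclude_module` value, since the surrounding context of decoupled_optimizer.py is not shown here):

```python
import torch
from torch.nn import LayerNorm

# Hypothetical minimal model standing in for the user's real model.
model = torch.nn.Sequential(
    torch.nn.Embedding(10, 4),  # named "0" inside the Sequential
    LayerNorm(4),               # named "1" inside the Sequential
)
exclude_module = "0"  # assumed: the Embedding's module name

no_decay = []
for module_name, module in model.named_modules():
    for ndim in [LayerNorm, torch.nn.Embedding]:
        if isinstance(module, ndim):
            if module_name == exclude_module:
                # Embedding: exclude its weight from weight decay.
                no_decay.append(f"{module_name}.weight")
            else:
                # LayerNorm: its learnable scale is named "weight"
                # (not "gamma") in torch.nn.LayerNorm.
                no_decay.append(f"{module_name}.weight")
            break

# Every collected name now matches an actual model parameter,
# which is not the case when ".gamma" is used for LayerNorm.
param_names = {name for name, _ in model.named_parameters()}
assert all(name in param_names for name in no_decay)
```

With `.gamma`, the final assertion fails, since `1.gamma` never appears in `model.named_parameters()`.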