resume training fails with single learning rate #17

weinman · 2021-07-16T13:37:26Z

Quick report from the field on commit 61aca54 to branch update_21014 (which addressed #15 and #16).

Because decolle.utils.MultiOpt uses the semantically accurate but non-conforming method name load_state_dicts, train_lenet_decolle.py fails when resuming training from a single-learning-rate model. In that case opt is a torch.optim.Adamax object, which has only a load_state_dict method. The failure occurs in load_model_from_checkpoint (decolle/utils.py, line 166, per the trace below).

I'd perhaps suggest simply renaming the MultiOpt method to the singular load_state_dict to avoid messiness elsewhere about checking for which attribute or object class is present.
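As a rough sketch of what that rename could look like: the wrapper below exposes the same singular method name as a torch optimizer, so the caller needs no type or attribute checks. This is not the actual decolle.utils.MultiOpt code. The class name MultiOptSketch, its optimizers attribute, its state_dict method, and the StubOptimizer stand-in (used so the example runs without torch) are all assumptions made for illustration.

```python
class StubOptimizer:
    """Hypothetical stand-in for a torch.optim optimizer.

    Only mimics the state_dict / load_state_dict pair used here.
    """

    def __init__(self, lr):
        self.state = {"lr": lr}

    def state_dict(self):
        return dict(self.state)

    def load_state_dict(self, state_dict):
        self.state = dict(state_dict)


class MultiOptSketch:
    """Illustrative multi-optimizer wrapper (assumed layout, not decolle's)."""

    def __init__(self, *optimizers):
        self.optimizers = list(optimizers)

    def state_dict(self):
        # One saved state per wrapped optimizer, in order.
        return [opt.state_dict() for opt in self.optimizers]

    def load_state_dict(self, state_dicts):
        # Singular name, matching the torch.optim.Optimizer method name.
        if len(state_dicts) != len(self.optimizers):
            raise ValueError("expected one state dict per wrapped optimizer")
        for opt, saved in zip(self.optimizers, state_dicts):
            opt.load_state_dict(saved)

    # Optional alias so existing callers of the plural name keep working.
    load_state_dicts = load_state_dict
```

With this shape, checkpoint-loading code can call opt.load_state_dict(...) whether opt is a single optimizer or the wrapper. Keeping the plural alias is optional and only matters if other code already calls it.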

I'm happy to submit a PR along those lines if you like. It looks like you pull the public-facing version of this repo from elsewhere, so I'd understand if that complicates your git flow for so simple a change.

Here's the error trace:

Traceback (most recent call last):
  File "train_lenet_decolle.py", line 111, in <module>
    starting_epoch = load_model_from_checkpoint(checkpoint_dir, net, opt)
  File "[root]/conda/lib/python3.7/site-packages/decolle-0.1-py3.7.egg/decolle/utils.py", line 166, in load_model_from_checkpoint
AttributeError: 'Adamax' object has no attribute 'load_state_dicts'
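For comparison, here is a sketch of the call-site check that a rename would make unnecessary. It could serve as a stopgap for anyone hitting this error before a fix lands. The helper name restore_optimizer_state and the two stub classes are hypothetical and not part of decolle; the stubs only mimic the relevant method names so the example runs without torch.

```python
def restore_optimizer_state(opt, saved_state):
    """Load saved_state through whichever loader method opt exposes.

    Prefers the plural load_state_dicts (as on the MultiOpt wrapper) and
    falls back to the singular load_state_dict (as on torch optimizers).
    """
    loader = getattr(opt, "load_state_dicts", None)
    if loader is None:
        loader = opt.load_state_dict
    loader(saved_state)


class SingleStub:
    """Hypothetical stand-in for a plain optimizer such as Adamax."""

    def __init__(self):
        self.loaded = None

    def load_state_dict(self, state):
        self.loaded = state


class MultiStub:
    """Hypothetical stand-in for a wrapper exposing only the plural name."""

    def __init__(self):
        self.loaded = None

    def load_state_dicts(self, states):
        self.loaded = states


single, multi = SingleStub(), MultiStub()
restore_optimizer_state(single, {"lr": 0.1})
restore_optimizer_state(multi, [{"lr": 0.1}, {"lr": 0.2}])
```

This works, but every caller that restores optimizer state would need the same branch, which is the messiness the rename avoids.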
