Fixes for training models with bf16 + freshly initialized optimizer via load_module_only
#7940
| Job | Run time |
|---|---|
| | 4m 59s |
| | 4m 59s |