Removes unnecessary cloning #6761

swigls · 2024-11-19T08:05:41Z

In the `clone_tensors_for_torch_save()` function:

When `item.device` differs from the `device` argument,
`tensor.clone()` is not required, because `to()` already copies the original tensor onto the target device.

In addition, I observed memory bloat under the following conditions:

Training a Whisper model with the transformers framework under ZeRO-0 and ZeRO-1 configurations.
The memory bloat appears every time the model state_dict is cloned using `clone_tensors_for_torch_save()`.

After I removed the unnecessary `clone()`, the problem seems to be solved.
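The idea described above can be sketched roughly as follows. This is a minimal illustration for a single tensor, not the actual DeepSpeed implementation; the helper name `copy_tensor_for_save` is hypothetical and was chosen only for this example. It assumes that `to()` produces a new tensor whenever the target device really differs from the tensor's current device, so an explicit `clone()` is only needed in the same-device case.

```python
import torch


def copy_tensor_for_save(item: torch.Tensor, device="cpu") -> torch.Tensor:
    """Hypothetical helper: return a copy of `item` on `device` with a single copy operation."""
    target = torch.device(device)
    if item.device == target:
        # Same device: to() would hand back the same tensor, so clone explicitly.
        return item.detach().clone()
    # Different device: to() already allocates new storage on the target,
    # so an extra clone() beforehand would only create a temporary duplicate.
    return item.detach().to(target)


if __name__ == "__main__":
    src = torch.arange(4.0)
    same_device_copy = copy_tensor_for_save(src, "cpu")
    # The copy owns its own storage, so the pointers differ.
    print(same_device_copy.data_ptr() != src.data_ptr())
    if torch.cuda.is_available():
        moved = copy_tensor_for_save(src, "cuda:0")
        print(moved.device)
```

With the earlier `clone()`-then-`to()` pattern, the cross-device case would first allocate a full copy on the source device and then a second one on the target, which is consistent with the extra memory use reported here.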

About the `clone_tensors_for_torch_save()` function: if `item.device` is different from the `device` input, `clone()` is not required because the `to()` function does the same thing. With the unnecessary `clone()`, I observed memory bloat; after I removed it, the problem was solved.

tjruwase · 2024-11-20T19:34:41Z

@swigls, thanks!

swigls requested a review from tjruwase as a code owner November 19, 2024 08:05

Merge branch 'master' into patch-1

39ac81d

tjruwase approved these changes Nov 20, 2024

Merge branch 'master' into patch-1

d2a0f26

loadams added this pull request to the merge queue Nov 21, 2024

Merged via the queue into microsoft:master with commit f515104 Nov 21, 2024
10 checks passed

swigls deleted the patch-1 branch November 22, 2024 02:13
