Fix incorrect patch in zero.init #5921
Conversation
A better solution is proposed to handle this issue.

Thank you @VeryLazyBoy for the great catch! I think the issue is that we patch the superclass's `__init__` multiple times when a subclass does not define its own.
@tohtana Yes! Your approach is less intrusive and much better. Let's go ahead with this new method. Should I close this merge request?

@VeryLazyBoy Thank you for your response!
This PR fixes the issue reported in #5921. With this change, we only apply the patch for parameter partitioning to classes that define their own `__init__`, so that the patch is not applied multiple times. A class that does not define `__init__` now inherits its superclass's patched one. For that reason, this PR also applies the patch to the root class, `torch.nn.modules.module.Module`. Thanks @VeryLazyBoy for the report and initial solution.

---------

Co-authored-by: Logan Adams <[email protected]>
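As a rough sketch of the fix described above (illustrative only, not the actual DeepSpeed code; `_get_all_subclasses`, `_patch_init`, and `wrap_init` are hypothetical names):

```python
import torch

def _get_all_subclasses(cls):
    # Hypothetical helper: collect direct and indirect subclasses.
    seen, stack = set(), [cls]
    while stack:
        for sub in stack.pop().__subclasses__():
            if sub not in seen:
                seen.add(sub)
                stack.append(sub)
    return seen

def _patch_init(cls, wrap_init):
    cls._old_init = cls.__init__
    cls.__init__ = wrap_init(cls.__init__)

def _enable_all(wrap_init):
    # Patch only classes that define __init__ in their own __dict__;
    # a subclass without its own __init__ inherits the superclass's
    # patched one, so patching it again would double-wrap.
    for subclass in _get_all_subclasses(torch.nn.modules.module.Module):
        if '__init__' in subclass.__dict__:
            _patch_init(subclass, wrap_init)
    # Patch the root class too, so a module whose whole hierarchy
    # defines no __init__ still gets the partitioning behavior.
    _patch_init(torch.nn.modules.module.Module, wrap_init)
```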
The code below has a problem where `cls.__init__` in line 525 can be modified before assignment to `_old_init`. This could lead to an incorrect `__init__` being backed up:

DeepSpeed/deepspeed/runtime/zero/partition_parameters.py, lines 524 to 534 in ffe0af2
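Since the embedded snippet does not render here, the following is a rough reconstruction of the pattern being described (illustrative only, not the verbatim source; `partition_after` stands in for the real partitioning wrapper):

```python
import functools
import torch

def partition_after(init_fn):
    # Stand-in for DeepSpeed's partitioning wrapper (assumption).
    @functools.wraps(init_fn)
    def wrapper(module, *args, **kwargs):
        init_fn(module, *args, **kwargs)
        # ... parameter partitioning would run here ...
    return wrapper

def _enable_class(cls):
    # The problematic backup: if a superclass was patched first,
    # cls.__init__ (inherited) already points at the wrapper, so
    # _old_init saves the wrapped function instead of the original.
    cls._old_init = cls.__init__
    cls.__init__ = partition_after(cls.__init__)
```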
Test Case
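The original test case is not preserved on this page; below is a minimal sketch that reproduces the double patching, building on the `_enable_class`/`partition_after` stand-ins above:

```python
class Base(torch.nn.Module):
    def __init__(self):
        super().__init__()

class Child(Base):
    pass  # no __init__ of its own; inherits Base's

_enable_class(Base)   # Base.__init__ is now the wrapper
_enable_class(Child)  # backs up the *wrapper* as Child._old_init

# Child.__init__ is now wrapped twice, so the partitioning logic
# would run twice, and Child._old_init cannot restore the original.
assert Child._old_init is not Base._old_init
Child()
```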