[Fix] Fix cpu inference UT failure #4430
What is the runtime error that we can hit here?
The `while True:` loop iterates over local ranks and collects the global rank for each one until the local rank goes out of range. In older versions of PyTorch, this out-of-range lookup raises a ValueError. In PyTorch 2, it raises a RuntimeError. @Liangliang-Ma
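The loop described above can be sketched roughly as follows. This is an illustration only, not the code from this PR: `collect_global_ranks`, `make_lookup`, and the `lookup` callable are hypothetical names, and `lookup` stands in for whatever PyTorch call maps a local rank to a global rank, so that the sketch runs without PyTorch installed.

```python
from typing import Callable, List, Sequence, Type


def collect_global_ranks(lookup: Callable[[int], int]) -> List[int]:
    """Collect global ranks for local ranks 0, 1, 2, ... until the lookup fails.

    `lookup` is a hypothetical stand-in for the PyTorch rank lookup.
    """
    ranks: List[int] = []
    local_rank = 0
    try:
        while True:
            ranks.append(lookup(local_rank))
            local_rank += 1
    except (ValueError, RuntimeError):
        # Out-of-range local rank: ValueError on older PyTorch,
        # RuntimeError on PyTorch 2. Either one ends the loop.
        pass
    return ranks


def make_lookup(global_ranks: Sequence[int],
                exc_type: Type[Exception]) -> Callable[[int], int]:
    """Build a fake lookup that raises `exc_type` when out of range (test helper)."""
    def lookup(local_rank: int) -> int:
        if local_rank >= len(global_ranks):
            raise exc_type("local rank out of range")
        return global_ranks[local_rank]
    return lookup
```

Catching both exception types in one `except` clause keeps the loop independent of the installed PyTorch version, at the cost of also swallowing an unrelated ValueError or RuntimeError raised by the lookup.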
I think caching the model list is a good idea for the tests, but we need to save it to blob storage so that it persists. Additionally, the cache should carry a timestamp so that we refresh it every hour/day/week. See how we do this in MII:
https://github.com/microsoft/DeepSpeed-MII/blob/4472e4e206182ed56399f225848a7721565922fb/mii/utils.py#L39
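A timestamped cache along these lines might look like the sketch below. It is not the MII implementation linked above: `load_model_list`, `fetch`, `max_age_s`, and the JSON layout are all assumptions made for illustration, and a local file stands in for blob storage.

```python
import json
import time
from pathlib import Path
from typing import Callable, List, Optional


def load_model_list(cache_path: Path,
                    fetch: Callable[[], List[str]],
                    max_age_s: float = 24 * 3600.0,
                    now: Optional[float] = None) -> List[str]:
    """Return the cached model list, refetching when the cache is missing or stale.

    `fetch` is a hypothetical callable that queries the real model source.
    `now` can be injected so the expiry logic is testable.
    """
    now = time.time() if now is None else now
    if cache_path.exists():
        try:
            cached = json.loads(cache_path.read_text())
            if now - cached["timestamp"] < max_age_s:
                return cached["models"]
        except (ValueError, KeyError, TypeError):
            # Corrupt or unexpected cache contents: fall through and refetch.
            pass
    models = fetch()
    # Store the fetch time next to the data so later runs can judge staleness.
    cache_path.write_text(json.dumps({"timestamp": now, "models": models}))
    return models
```

Storing the timestamp inside the cached payload, rather than relying on file modification time, keeps the expiry check valid if the file is copied to or from remote storage.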