infV2 fix for OPT size variants #4694

Merged: mrwyattii merged 6 commits into master from mrwyattii/infv2-fix-OPT on Nov 17, 2023

mrwyattii
Contributor

The OPT model checkpoints use inconsistent layer names across size variants:
- 125m, 1.3b, 2.7b, 66b: model.decoder.* and lm_head.weights
- 6.7b, 13b, 30b: decoder.* and no lm_head.weights
- 350m: decoder.* and project_*
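
To make the three naming schemes concrete, here is a minimal sketch that classifies a checkpoint by its parameter names. The function name `classify_opt_layout` and the labels it returns are hypothetical and are not part of DeepSpeed; the sketch only assumes the prefixes listed above.

```python
from typing import Iterable


def classify_opt_layout(keys: Iterable[str]) -> str:
    """Guess which OPT naming scheme a checkpoint uses (illustrative only)."""
    keys = list(keys)

    # 350m is the odd one out: it carries extra project_* layers.
    if any(part.startswith("project_") for k in keys for part in k.split(".")):
        return "decoder+project"

    # 125m, 1.3b, 2.7b, 66b: names are nested under "model.".
    if any(k.startswith("model.decoder.") for k in keys):
        return "model.decoder"

    # 6.7b, 13b, 30b: names start directly at "decoder.".
    if any(k.startswith("decoder.") for k in keys):
        return "decoder"

    return "unknown"
```

A loader could branch on the returned label, though a wildcard-based mapping avoids most of that branching.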

This PR extends support to all OPT models except 350m. A future PR will handle the unique features of that model (its project_* layers).

As part of this PR, wildcards can now be used in the model container PARAM_MAPPINGS, for example *decoder.embed_tokens.weights, which matches the parameter name both with and without the model. prefix.
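
The sketch below illustrates how a wildcard pattern like this can resolve both naming schemes to one target. It uses Python's standard `fnmatch` module rather than DeepSpeed's own matching code, which may work differently; the `EXAMPLE_MAPPINGS` table, the target name `embed_tokens`, and the `resolve_param` helper are all assumptions made for illustration. The pattern is spelled as in the description above.

```python
from fnmatch import fnmatchcase
from typing import Dict, Optional

# Hypothetical mapping table: checkpoint-name pattern -> container attribute.
EXAMPLE_MAPPINGS: Dict[str, str] = {
    "*decoder.embed_tokens.weights": "embed_tokens",
}


def resolve_param(name: str, mappings: Dict[str, str]) -> Optional[str]:
    """Return the target for the first pattern that matches `name`, else None."""
    for pattern, target in mappings.items():
        # "*" matches any run of characters, including an empty one,
        # so the optional "model." prefix is covered by a single pattern.
        if fnmatchcase(name, pattern):
            return target
    return None


# Both naming schemes resolve to the same target.
print(resolve_param("model.decoder.embed_tokens.weights", EXAMPLE_MAPPINGS))
print(resolve_param("decoder.embed_tokens.weights", EXAMPLE_MAPPINGS))
```

With this approach, one mapping entry serves every size variant whose names differ only by a leading prefix.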

@mrwyattii mrwyattii requested a review from loadams as a code owner November 16, 2023 22:14
@mrwyattii mrwyattii merged commit a3926bb into master Nov 17, 2023
14 of 16 checks passed
@mrwyattii mrwyattii deleted the mrwyattii/infv2-fix-OPT branch November 17, 2023 00:17
mauryaavinash95 pushed a commit to mauryaavinash95/DeepSpeed that referenced this pull request Feb 17, 2024