
Fix llama.covert_onnx to make it runnable in CI #19372

Merged: 11 commits, Feb 4, 2024

Conversation

@mszhanyi (Contributor) commented Feb 1, 2024

Description

  1. Make parity_check use a local model so it does not need a Hugging Face token.
  2. `del model` did not work because it tried to delete an object defined outside the function scope, so the memory was never released and the A10 ran out of memory.
  3. In principle, 16 GB of GPU memory (one T4) is enough, but the conversion process was always killed on the T4 while it succeeded on the A10 (24 GB).
    Standard_NC4as_T4_v3 has 28 GB of CPU memory; Standard_NV36ads_A10_v5 has 440 GB.
    It appears that the model conversion needs a very large amount of CPU memory.
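On point 2, a minimal sketch (with hypothetical names, not the PR's actual code) of why `del` inside a function cannot free an object that the caller still references: `del` only unbinds the local name, so the object stays alive until the owning scope drops its reference too.

```python
import gc
import weakref

class Model:
    """Stand-in for a large model object held in memory."""
    pass

def convert(model):
    # ... conversion work would happen here ...
    del model     # removes only the *local* name binding;
    gc.collect()  # the caller's reference keeps the object alive

model = Model()
alive = weakref.ref(model)  # lets us observe when the object is freed

convert(model)
print(alive() is not None)  # True: still alive after the inner `del`

del model                   # deleting at the owning scope actually frees it
gc.collect()
print(alive() is None)      # True: the object is gone now
```

This is why the fix had to release the reference in the scope that owns the model, rather than inside the helper function.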

Motivation and Context

Last time, I ran into some issues in convert_to_onnx.py, so I used the ONNX model from https://github.com/microsoft/Llama-2-Onnx for testing.
Those issues have now been fixed, so I use the ONNX model generated by this repo, and the CI can cover the model conversion.

@mszhanyi mszhanyi requested a review from a team as a code owner February 1, 2024 15:54
@kunal-vaishnavi
Copy link
Contributor

Now that torch v2.2.0 has a stable release, can you also update the lines below to say torch>=2.2.0?

# Please manually install torch>=2.2.0.dev20230920 with CUDA enabled for the CUDA version installed in your system.
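A small sketch of the version gate implied by that comment. The helper name is hypothetical and the comparison is a simplified numeric one (it truncates at non-numeric parts such as "dev20230920", so dev builds of 2.2.0 pass), using only the standard library:

```python
def meets_minimum(version: str, minimum: str = "2.2.0") -> bool:
    """Return True if `version` is at least `minimum`, comparing only
    the leading numeric components (e.g. "2.2.0.dev20230920" -> (2, 2, 0))."""
    def key(v: str):
        parts = []
        for p in v.split("+")[0].split("."):
            if not p.isdigit():
                break  # stop at suffixes like "dev20230920"
            parts.append(int(p))
        return tuple(parts)
    return key(version) >= key(minimum)

print(meets_minimum("2.2.0"))              # True
print(meets_minimum("2.1.1"))              # False
print(meets_minimum("2.2.0.dev20230920"))  # True (dev build accepted here)
```

A full PEP 440 comparison (e.g. via `packaging.version`) would treat a dev build as older than the final release; this sketch deliberately keeps the looser behavior of the original `torch>=2.2.0.dev20230920` requirement.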

@snnn snnn requested a review from kunal-vaishnavi February 2, 2024 15:56
@mszhanyi mszhanyi merged commit 435e199 into main Feb 4, 2024
92 of 94 checks passed
@mszhanyi mszhanyi deleted the zhanyi/option1 branch February 4, 2024 23:26
3 participants