[On Hold] Upgrade nanoGPT #3974

Open
boris-drazic opened this issue Nov 23, 2023 · 11 comments

@boris-drazic
Contributor

No description provided.

@boris-drazic
Contributor Author

boris-drazic commented Nov 23, 2023

Update the nanoGPT model with the following:

  • store weights as TT tensors on Weka and load them from there (PR #4221)
  • if a TT OP is not available, record which OP it is and its shapes; use a fallback or torch OP until the TT OP is implemented (PR #4221)
  • implement the entire model and add a test that compares PCC against a CPU run for a single input (PR #4221)
  • set all OPs to produce tensors in tile layout (commit)
  • implement submodules and tests for them; use batch > 1 if it makes sense (commit)
  • add a test that loads real inputs from a data set, runs the model, and compares TT outputs to the valid/expected outputs from the data set; also report compile/inference times and throughput (commit)
  • store all intermediate tensors in L1 (commit)
  • store all weights in L1, if they fit (commit)
  • use bfp8 data format and tile layout for all tensors
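
The second task above (record missing OPs and fall back) could be sketched roughly as follows. This is only an illustration in plain Python: `missing_ops`, `shape_of`, and `run_op` are hypothetical names, not part of tt_lib or of PR #4221, and the TT and fallback implementations are passed in as ordinary callables.

```python
from collections import defaultdict
from typing import Any, Callable, Optional, Tuple

# Hypothetical log: OP name -> set of input-shape tuples seen when falling back.
missing_ops = defaultdict(set)


def shape_of(x: Any) -> Tuple[int, ...]:
    """Return the shape of a tensor-like object, or () if it has no shape."""
    return tuple(getattr(x, "shape", ()))


def run_op(
    name: str,
    tt_impl: Optional[Callable[..., Any]],
    fallback_impl: Callable[..., Any],
    *inputs: Any,
) -> Any:
    """Run tt_impl when one exists; otherwise record the OP and use the fallback."""
    if tt_impl is not None:
        return tt_impl(*inputs)
    # No TT implementation yet: remember the OP name and the shapes it was called with.
    missing_ops[name].add(tuple(shape_of(x) for x in inputs))
    return fallback_impl(*inputs)
```

After a model run, `missing_ops` would list every OP that still needs a TT implementation, together with the shapes it was called with.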

@boris-drazic
Contributor Author

This replaces the old issue #2193.

@punithsekar
Contributor

Here is the commit for loading weights from the Weka path for the nanoGPT model: Link

@punithsekar
Contributor

We have moved the weights to the Weka path and uplifted the model with the supported TT ops.
Commit Link

@Sudharsan-V
Contributor

Sudharsan-V commented Dec 4, 2023

The task "set all OPs to produce tensors in tile layout" is linked to #4091.

The task "make a test that will load real inputs from a data set, run model, and compare TT outputs to valid/expected outputs from data set; also report compile/inference times and throughput" is linked to #4092.
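
As a rough illustration of the timing part of that test, the sketch below separates the first call from the steady-state calls. It assumes that the first call includes compilation, which may not hold for every setup; `profile_model` and its result keys are hypothetical names, and the model is any Python callable.

```python
import time
from typing import Any, Callable, Dict, Sequence


def profile_model(
    model: Callable[[Any], Any], batches: Sequence[Any], batch_size: int
) -> Dict[str, float]:
    """Time the first call (assumed to include compilation) and the remaining calls."""
    if len(batches) < 2:
        raise ValueError("need at least two batches: one warm-up, one timed")

    start = time.perf_counter()
    model(batches[0])  # assumption: compilation happens during this first call
    first_run_s = time.perf_counter() - start

    start = time.perf_counter()
    for batch in batches[1:]:
        model(batch)
    steady_s = time.perf_counter() - start

    inference_s = steady_s / (len(batches) - 1)  # mean time per timed batch
    return {
        "first_run_s": first_run_s,
        "inference_s": inference_s,
        # Estimated compile time: first call minus one steady-state call.
        "compile_s_estimate": max(first_run_s - inference_s, 0.0),
        "throughput_samples_per_s": (
            batch_size / inference_s if inference_s > 0 else float("inf")
        ),
    }
```

The compile time here is only an estimate; a framework that exposes separate compile and run steps would allow a direct measurement.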

@punithsekar
Contributor

punithsekar commented Dec 7, 2023

This PR (#4221) contains the following:

  • Uplifted the nanoGPT model and recorded the ops that do not have TT ops.
  • Tilized weight tensors are stored in and loaded from the Weka path.
  • Tests for both the submodules and the whole model.

We see a drop in PCC (from 0.99 to 0.98) for the whole model when using tt_lib.tensor.softmax in the attention submodule.
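
For reference, PCC here is the Pearson correlation coefficient between the flattened TT output and the flattened CPU output. A minimal sketch in plain Python follows; `pcc` and `passes_pcc` are hypothetical helpers, and the default threshold of 0.99 is an assumption taken from the numbers above, not necessarily the value the repository's tests use.

```python
import math
from typing import Sequence


def pcc(expected: Sequence[float], actual: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length flat sequences."""
    if len(expected) != len(actual) or len(expected) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(expected)
    mean_e = sum(expected) / n
    mean_a = sum(actual) / n
    cov = sum((e - mean_e) * (a - mean_a) for e, a in zip(expected, actual))
    var_e = sum((e - mean_e) ** 2 for e in expected)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    if var_e == 0 or var_a == 0:
        raise ValueError("PCC is undefined for constant input")
    return cov / math.sqrt(var_e * var_a)


def passes_pcc(
    expected: Sequence[float], actual: Sequence[float], threshold: float = 0.99
) -> bool:
    """Return True when the PCC meets the (assumed) threshold."""
    return pcc(expected, actual) >= threshold
```

With a check of this kind and a 0.99 threshold, a whole-model PCC of 0.98 would fail, while the same model without the softmax OP swapped in would pass.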

@Sudharsan-V
Contributor

The tasks

 - store all intermediate tensors in L1
 - store all weights in L1 (if they fit)

are linked to #4342.

@boris-drazic
Contributor Author

Can you make a PR for this?

@punithsekar
Contributor

We were planning to create a PR for

 - store all intermediate tensors in L1
 - store all weights in L1 (if they fit)

after #4221 is merged, since this work depends on #4221.

Can we wait until #4221 is merged, or should we create a PR for it now?

@boris-drazic
Contributor Author

Yes, you can wait.

@saichandax
Contributor

PR #4221 is merged. We can proceed with creating PRs for the other commits.

@saichandax changed the title from "Upgrade nanoGPT" to "[On Hold] Upgrade nanoGPT" on Jan 30, 2024

5 participants