Rc4 cherry-pick #226

Merged
merged 13 commits into from
Mar 25, 2024

jchen351
Contributor

No description provided.

kunal-vaishnavi and others added 4 commits March 22, 2024 15:40
### Description

This PR adds `repeat_kv` to the model builder for models where
`num_attention_heads != num_key_value_heads`.

### Motivation and Context

By supporting `repeat_kv`, models where `num_attention_heads !=
num_key_value_heads` can now run on both CPU and GPU.
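
A minimal sketch of what `repeat_kv` typically does in grouped-query attention, not the model builder's actual implementation: each key/value head is repeated so that every attention head has a matching key/value head. The list-of-heads representation and the function signature below are assumptions made for illustration.

```python
def repeat_kv(kv_heads, num_attention_heads):
    """Repeat each key/value head so there is one per attention head.

    kv_heads is a list with one entry per key/value head (hypothetical
    representation; a real model would use a tensor with a head axis).
    """
    num_key_value_heads = len(kv_heads)
    if num_key_value_heads == 0 or num_attention_heads % num_key_value_heads != 0:
        raise ValueError(
            "num_attention_heads must be a positive multiple of num_key_value_heads"
        )
    # Number of attention heads that share a single key/value head.
    n_rep = num_attention_heads // num_key_value_heads
    # Repeat each head consecutively: [k0, k1] -> [k0, k0, ..., k1, k1, ...].
    return [head for head in kv_heads for _ in range(n_rep)]


# Example: 2 key/value heads shared across 8 attention heads.
expanded = repeat_kv(["k0", "k1"], num_attention_heads=8)
assert len(expanded) == 8
```

When `num_attention_heads == num_key_value_heads`, the repetition factor is 1 and the input is returned unchanged in content.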
jchen351 and others added 9 commits March 22, 2024 15:48
For safety. This will ensure the Model object's lifetime matches that of
any Generator using it.
Same as GeneratorParams and Tokenizer
Swap `p` and `k` to match the generate API functions.
`set_search_options` already supports this functionality, so the extra
functions confuse users by offering multiple ways to do the same thing.
`set_search_options` is also more flexible, as it supports all future
options without the need for extra APIs.
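
The flexibility argument can be illustrated with a hypothetical stand-in class. `SearchParams` and its attributes are invented for this sketch and are not the library's API; only the method name `set_search_options` comes from the commit message, and its keyword-argument signature here is an assumption.

```python
class SearchParams:
    """Hypothetical stand-in, not the real parameters class."""

    def __init__(self):
        self.options = {}

    def set_search_options(self, **options):
        # A single keyword-based entry point accepts any option name,
        # so adding a new option needs no new setter method.
        self.options.update(options)


params = SearchParams()
# One call covers what separate per-option setters would otherwise do.
params.set_search_options(top_k=50, top_p=0.9)
# A later option uses the same entry point (option name is illustrative).
params.set_search_options(temperature=0.7)
```

With dedicated setters, each of these options would need its own function, and callers would have two ways to set the same value.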
@jchen351 jchen351 merged commit f43f7b0 into rel-0.1.0 Mar 25, 2024
10 checks passed
@jchen351 jchen351 deleted the Cjian/rc4 branch March 25, 2024 05:25
@jchen351 jchen351 restored the Cjian/rc4 branch March 25, 2024 05:48
6 participants