Allow the maximum requested response size (tokens) to be specified in the command #18

Open
w0rp opened this issue Feb 15, 2023 · 1 comment
Labels: enhancement (New feature or request)

Comments

@w0rp (Member) commented Feb 15, 2023

We should allow the maximum number of response tokens to be set at will when text is requested, in addition to being configurable for all prompts, so you can request smaller or larger responses in different contexts.
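
A minimal sketch of how a per-request cap could layer over a global default (all names here are hypothetical, not the plugin's actual settings or API):

```python
# Hypothetical sketch: a per-request response-token cap that falls back
# to a globally configured default. All names are illustrative only.
from typing import Optional

DEFAULT_MAX_TOKENS = 1024  # assumed global setting applied to all prompts


def build_request(prompt: str, max_tokens: Optional[int] = None) -> dict:
    """Build an API payload, letting the caller override the token cap."""
    return {
        "prompt": prompt,
        "max_tokens": max_tokens if max_tokens is not None else DEFAULT_MAX_TOKENS,
    }


# A quick summary can ask for a small response...
print(build_request("Summarise this diff", max_tokens=128))
# ...while other contexts fall back to the configured default.
print(build_request("Write a longer explanation"))
```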

w0rp added the enhancement label on Feb 15, 2023
@Angelchev (Member) commented

I think that with #41 implemented we should be able to dynamically adjust the request for a model source so that it never requests more tokens than the maximum the model allows.

I think the design decision to go with, from a UX perspective, is that the user shouldn't need to worry about token length unless they are going over the limit.

I would personally rather give a model the freedom to respond with as many tokens as it can instead of artificially limiting its response. The downside to this is the monetary cost for API sources or the computational cost for local sources (Coming Soon™).

Side note: in the future, token limits might not be a concern thanks to sliding-window attention, but that's a different thing to contend with.

With all that said, this will need to be implemented anyway so the maximum token request can be adjusted dynamically.
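
A minimal sketch of that dynamic adjustment, assuming a known context limit and a stand-in token counter (a real implementation would use the model's own tokeniser):

```python
# Hypothetical sketch of clamping a requested response size so that
# prompt + response never exceeds the model's context window.
MODEL_CONTEXT_LIMIT = 4096  # assumed window for an example model


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokeniser; roughly 4 characters per token.
    return max(1, len(text) // 4)


def effective_max_tokens(prompt: str, requested: int) -> int:
    """Clamp the requested response tokens to what the model can still return."""
    available = MODEL_CONTEXT_LIMIT - count_tokens(prompt)
    if available <= 0:
        raise ValueError("Prompt already fills the model's context window")
    return min(requested, available)


print(effective_max_tokens("Explain sliding-window attention", requested=8000))
# -> 4088 (the request is silently clamped to fit the window)
```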
