
Increase the token limit for OpenAI-compatible LLMs #2383

Closed
2 of 3 tasks
myrulezzz opened this issue Feb 4, 2024 · 8 comments
Labels
💪 enhancement New feature or request

Comments

@myrulezzz

Self Checks

1. Is this request related to a challenge you're experiencing?

The token limit for OpenAI-compatible LLMs is 2000. If you use a tool with your custom LLM, there is a chance you will not get the full response. With Mixtral 8x7B, the response is not fully generated because I hit the maximum of 2000 tokens.

2. Describe the feature you'd like to see

Add a setting for OpenAI-compatible API LLMs that raises the token limit to 4000-10000.

3. How will this feature improve your workflow or experience?

I will get a full response when I use a custom tool with an open-source LLM like Mixtral 8x7B.

4. Additional context or comments

The properties of OpenAI-compatible LLMs should include an option to raise max_tokens to a larger value, somewhere in the 4000-10000 range I suppose.

5. Can you help us with this feature?

  • I am interested in contributing to this feature.
@myrulezzz myrulezzz added the 💪 enhancement New feature or request label Feb 4, 2024

dosubot bot commented Feb 4, 2024

🤖

Hey there, @myrulezzz! Great to see you diving into another interesting challenge with us. Hope you're doing awesome!

Thank you for your feature request. I understand that you would like to increase the token limit for OpenAI-compatible LLMs from 2000 to between 4000 and 10000 in the dify repository.

The token limit for the OpenAI-compatible LLMs is not directly set in the provided code. However, it's important to note that the token limit is often a constraint set by the model itself or the API you're using to access the model. For instance, OpenAI's GPT-3 has a maximum token limit of 4096.

If you're using a different model or API that allows for a higher token limit, you would need to adjust the request you're sending to that API to allow for more tokens. This would typically be done in the _generate method where the request to the API is constructed. However, without more information about the specific API or model you're using, it's not possible to provide a more specific solution.
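
(As a hedged illustration only, not dify's actual _generate code: a raw request to a generic OpenAI-compatible chat completions endpoint might raise the limit like this. The base URL, environment variable, and model name are all placeholders.)

import os

import requests

# Placeholder base URL for any OpenAI-compatible server.
API_BASE = "https://your-llm-host/v1"

response = requests.post(
    f"{API_BASE}/chat/completions",
    headers={"Authorization": f"Bearer {os.environ.get('LLM_API_KEY', '')}"},
    json={
        "model": "mixtral-8x7b",  # placeholder model name
        "messages": [{"role": "user", "content": "Summarize our design doc."}],
        "max_tokens": 8000,  # the value this issue asks to make configurable
    },
    timeout=120,
)
print(response.json()["choices"][0]["message"]["content"])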

If you're using a custom model with a higher token limit, you would need to ensure that your model is properly configured to handle the increased number of tokens. This could involve adjusting the model's architecture or training regimen to handle longer sequences of tokens.

In conclusion, increasing the token limit is not as simple as changing a value in the code. It involves considerations at the model level, the API level, and potentially the infrastructure level (as handling more tokens may require more computational resources).


The current method used in the dify codebase to set the token limit for OpenAI-compatible LLMs is not explicitly mentioned in the provided context. However, the method get_num_tokens() is used to retrieve the number of tokens in a given text. It takes the model, credentials, and texts as parameters and returns the token count.

Here is a test that exercises the get_num_tokens() method:

import os

# OpenAITextEmbeddingModel comes from dify's model runtime; the exact import
# path may differ between dify versions.
from core.model_runtime.model_providers.openai.text_embedding.text_embedding import (
    OpenAITextEmbeddingModel,
)


def test_get_num_tokens():
    model = OpenAITextEmbeddingModel()

    # Count tokens across two one-word inputs.
    num_tokens = model.get_num_tokens(
        model='text-embedding-ada-002',
        credentials={
            'openai_api_key': os.environ.get('OPENAI_API_KEY'),
            'openai_api_base': 'https://api.openai.com'
        },
        texts=[
            "hello",
            "world"
        ]
    )

    # "hello" and "world" tokenize to one token each.
    assert num_tokens == 2

Please note that this snippet is a test; the actual implementation of get_num_tokens() may differ and is not shown here.

The current maximum token limit set for OpenAI-compatible LLMs in the dify codebase is 1024 tokens. This is indicated by the "model_max_length" field in the tokenizer configuration JSON file.
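
(Hedged aside: "model_max_length" is a standard Hugging Face tokenizer setting, and a config of this kind can be inspected as below; the file path is a placeholder.)

import json

# Placeholder path; point at the tokenizer_config.json shipped with the model.
with open("tokenizer_config.json") as f:
    config = json.load(f)

# "model_max_length" caps how many tokens the tokenizer will process.
print(config.get("model_max_length"))  # e.g. 1024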

I hope this information is helpful. If you have any further questions or need more clarification, please don't hesitate to ask.



This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

@takatost
Collaborator

takatost commented Feb 5, 2024

@guchenhe pls check this out

@guchenhe guchenhe reopened this Feb 5, 2024
@guchenhe
Collaborator

guchenhe commented Feb 5, 2024

[screenshot: model configuration panel showing context length and max output token fields]

You're free to modify both the context length and the max output tokens in the model config to suit your model. I tried Mixtral on together.ai's endpoint and had it return 6000+ tokens, no problem.
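
(For anyone reproducing this outside dify, a minimal sketch with the openai Python client against together.ai's OpenAI-compatible endpoint could look like this; the model identifier and environment variable name are assumptions.)

import os

from openai import OpenAI

# together.ai exposes an OpenAI-compatible API at this base URL.
client = OpenAI(
    api_key=os.environ["TOGETHER_API_KEY"],  # assumed env var name
    base_url="https://api.together.xyz/v1",
)

completion = client.chat.completions.create(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",  # assumed model id
    messages=[{"role": "user", "content": "Write a long, detailed essay."}],
    max_tokens=6000,  # above the 2000 cap reported in this issue
)
print(completion.choices[0].message.content)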

@myrulezzz
Author

Hi, I deployed the model with vLLM on RunPod, so LocalAI accepts only 2000.

@takatost
Collaborator

takatost commented Feb 7, 2024

Hi, I deployed the model with vLLM on RunPod, so LocalAI accepts only 2000.

Are you using the LocalAI provider configuration?

@myrulezzz
Author

Yes, I do. There is no other option for a vLLM endpoint.

@takatost
Collaborator

takatost commented Feb 7, 2024

Yes, I do. There is no other option for a vLLM endpoint.

Can we try using the OpenAI Compatible API to integrate your model?
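
(For reference: a vLLM server started with python -m vllm.entrypoints.openai.api_server speaks the OpenAI API under /v1, so an integration along the lines sketched below should work. The host, port, and model name are placeholders for the RunPod deployment.)

from openai import OpenAI

# Placeholder address for a vLLM OpenAI-compatible server on RunPod.
client = OpenAI(
    api_key="EMPTY",  # vLLM ignores the key unless one was configured
    base_url="http://your-runpod-host:8000/v1",
)

completion = client.chat.completions.create(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",  # placeholder model name
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=8000,  # no 2000-token cap when hitting vLLM directly
)
print(completion.choices[0].message.content)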

Contributor

Closing since this is no longer active; if you have any questions, you can reopen it.
