Yi34B's 8k context capability on dify doesn't work #2360

Closed
2 tasks done
PhenixZhang opened this issue Feb 2, 2024 · 8 comments
Labels
🐞 bug Something isn't working 🤔 cant-reproduce This issue can not be reproduced

Comments

@PhenixZhang

Self Checks

Dify version

0.5.2

Cloud or Self Hosted

Self Hosted (Docker), Self Hosted (Source)

Steps to reproduce

The model I use is Yi-34B (8k context), exposed through an OpenAI-compatible API and wrapped by Dify. Through Dify it only accepts inputs of <=2k tokens, whereas calling the OpenAI-compatible API directly it accepts a 4k token length. I would like to ask what the specific reason for this is.

Here is the 2k output:
[screenshot]
Here is the 3k output:
[screenshot]
Here is the 6k (openai-api) output:
[screenshot]

I can confirm that the context-length parameter given to Dify is 8k.

✔️ Expected Behavior

Yi-34B's full 8k context capability can be used through Dify.

❌ Actual Behavior

Yi-34B's 8k context capability does not work through Dify; inputs longer than about 2k tokens fail.

@PhenixZhang PhenixZhang added the 🐞 bug Something isn't working label Feb 2, 2024

dosubot bot commented Feb 2, 2024

🤖

Hey there, @PhenixZhang! Great to see you diving into the depths of dify again. How's the coding journey treating you this time?

Based on the information you've provided, it seems like you're trying to utilize the full 8k context capability of the Yi34B model on dify, but are only able to receive a token length of <=2k. This could be due to the max_tokens parameter in the ernie-bot-8k.yaml configuration file. By default, this parameter is set to 1024, but it can be increased up to a maximum of 8000.

To utilize the full 8k context capability, you need to explicitly set the max_tokens parameter to a higher value when making the API call. Make sure you're not only setting the context_size to 8000 but also adjusting the max_tokens parameter accordingly in your request.
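For illustration, here is a minimal sketch of what such a model configuration could look like in Dify's model-runtime YAML format, using the `context_size` and `parameter_rules`/`max_tokens` fields mentioned above. The file name and concrete values are illustrative assumptions, not the configuration actually shipped with Dify:

```yaml
# Hypothetical yi-34b-chat.yaml -- values are illustrative assumptions
model: yi-34b-chat
label:
  en_US: Yi-34B-Chat
model_type: llm
model_properties:
  mode: chat
  context_size: 8192        # total context window the provider will accept
parameter_rules:
  - name: max_tokens
    use_template: max_tokens
    default: 1024            # if left at the default, completions are capped here
    min: 1
    max: 8192                # raise this so requests can use the full window
```

If `context_size` is left at a smaller value (for example 2048), Dify may truncate or reject longer prompts regardless of what `max_tokens` is set to, which could explain the ~2k cut-off observed here.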


As for the specific reason why the Yi34B model is limited to a token length of <=2k in dify, it's not explicitly mentioned in the provided context. However, such limitations are often due to memory constraints and the computational resources required for processing larger sequences in the model.

Lastly, according to the specifications of the Yi34B model, it can handle a maximum token length of 192,000 tokens. This information is found under the parameter_rules section for the max_tokens parameter in the provided YAML file.

I hope this helps! If you have any other questions or need further clarification, feel free to ask.

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

@takatost
Collaborator

takatost commented Feb 2, 2024

Have you tried the OpenAI compatible API model provider?

@PhenixZhang
Author

> Have you tried the OpenAI compatible API model provider?

Yes, this is the config screenshot:
[screenshot]

And calling the OpenAI-compatible server directly, without Dify, works with no problem.

@takatost
Collaborator

takatost commented Feb 4, 2024

Can you try changing the max_tokens parameter to 4K or larger and see if it works?

@PhenixZhang
Author

> Can you try changing the max_tokens parameter to 4K or larger and see if it works?

Of course. I set the maximum number of tokens to 8k from the beginning, in both of the following places:
[screenshot]
[screenshot]

@guchenhe
Collaborator

guchenhe commented Feb 5, 2024

Hi @PhenixZhang, it seems I'm not able to reproduce this issue on my end. Can you upload the backend logs? If you're OK with it and your endpoint is publicly accessible, you can also send your endpoint's information to [email protected] and I'll take a look.

Contributor

Closing since this is no longer active; if you have any questions, you can reopen it.

@soulteary
Contributor

Refs #2865: once that feature goes live, the API can be exposed in the interface and the online model can be used directly.

Or configure the API of the internal testing platform.
