Yi34B's 8k context capability on dify doesn't work #2360

Closed
2 tasks done
PhenixZhang opened this issue Feb 2, 2024 · 8 comments
Labels
🐞 bug Something isn't working 🤔 cant-reproduce This issue can not be reproduced

Comments

@PhenixZhang

Self Checks

Dify version

0.5.2

Cloud or Self Hosted

Self Hosted (Docker), Self Hosted (Source)

Steps to reproduce

The model I use is Yi-34B (8k context), exposed through an OpenAI-compatible API and wrapped by Dify. Through Dify it only accepts inputs of <=2k tokens, whereas calling the OpenAI-compatible API directly it accepts a 4k token length. I would like to ask what the specific reason for this is.

Here is the 2k output:
[screenshot]
Here is the 3k output:
[screenshot]
Here is the 6k (openai-api) output:
[screenshot]

I can confirm that the context-length parameter given to Dify is 8k.

✔️ Expected Behavior

Yi-34B's full 8k context capability can be used through Dify.

❌ Actual Behavior

Yi-34B's 8k context capability does not work through Dify; inputs longer than about 2k tokens fail.

@PhenixZhang PhenixZhang added the 🐞 bug Something isn't working label Feb 2, 2024

dosubot bot commented Feb 2, 2024

🤖

Hey there, @PhenixZhang! Great to see you diving into the depths of dify again. How's the coding journey treating you this time?

Based on the information you've provided, it seems like you're trying to utilize the full 8k context capability of the Yi34B model on dify, but are only able to receive a token length of <=2k. This could be due to the max_tokens parameter in the ernie-bot-8k.yaml configuration file. By default, this parameter is set to 1024, but it can be increased up to a maximum of 8000.

To utilize the full 8k context capability, you need to explicitly set the max_tokens parameter to a higher value when making the API call. Make sure you're not only setting the context_size to 8000 but also adjusting the max_tokens parameter accordingly in your request.
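For illustration, here is a minimal sketch of what such a model configuration could look like in Dify's model-runtime YAML format, using the `context_size` and `parameter_rules`/`max_tokens` fields mentioned above. The file name and concrete values are illustrative assumptions, not the configuration actually shipped with Dify:

```yaml
# Hypothetical yi-34b-chat.yaml -- values are illustrative assumptions
model: yi-34b-chat
label:
  en_US: Yi-34B-Chat
model_type: llm
model_properties:
  mode: chat
  context_size: 8192        # total context window the provider will accept
parameter_rules:
  - name: max_tokens
    use_template: max_tokens
    default: 1024            # if left at the default, completions are capped here
    min: 1
    max: 8192                # raise this so requests can use the full window
```

If `context_size` is left at a smaller value (for example 2048), Dify may truncate or reject longer prompts regardless of what `max_tokens` is set to, which could explain the ~2k cut-off observed here.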


As for the specific reason why the Yi34B model is limited to a token length of <=2k in dify, it's not explicitly mentioned in the provided context. However, such limitations are often due to memory constraints and the computational resources required for processing larger sequences in the model.

Lastly, according to the specifications of the Yi34B model, it can handle a maximum token length of 192,000 tokens. This information is found under the parameter_rules section for the max_tokens parameter in the provided YAML file.

I hope this helps! If you have any other questions or need further clarification, feel free to ask.

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

@takatost
Collaborator

takatost commented Feb 2, 2024

Have you tried the OpenAI compatible API model provider?

@PhenixZhang
Author

> Have you tried the OpenAI compatible API model provider?

Yes, this is the config screenshot:
[screenshot]

And calling the OpenAI-compatible server directly, without Dify, works with no problem.

@takatost
Collaborator

takatost commented Feb 4, 2024

Can you try changing the max_tokens parameter to 4K or larger and see if it works?

@PhenixZhang
Author

> Can you try changing the max_tokens parameter to 4K or larger and see if it works?

Of course. I set the maximum number of tokens to 8k from the beginning, in both of the following places:
[screenshot]
[screenshot]

@guchenhe
Collaborator

guchenhe commented Feb 5, 2024

Hi @PhenixZhang, it seems I'm not able to reproduce this issue on my end. Can you upload the backend logs? If you're OK with it and your endpoint is publicly accessible, you can also send your endpoint's information to [email protected] and I'll take a look.

Contributor

Closing since this is no longer active; if you have any questions, you can reopen it.

@soulteary
Contributor

Refs #2865: once that feature goes live, the API can be exposed in the interface and the online model can be used directly.

Or configure the API of the internal testing platform.
