[Bug] Max output tokens from an LLM should be configurable. #1131

Open · 2 of 6 tasks
mkbhanda opened this issue Nov 14, 2024 · 0 comments
Priority

P2-High

OS type

Ubuntu

Hardware type

Xeon-SPR

Installation method

  • Pull docker images from hub.docker.com
  • Build docker images from source

Deploy method

  • Docker compose
  • Docker
  • Kubernetes
  • Helm

Running nodes

Single Node

What's the version?

Development branch, post V1.0.

Description

The ChatQnA example appears to use the max_tokens parameter to control the number of LLM output tokens, but the value is not passed along when the re-ranker component is removed from the pipeline.
This may be a bug in the Mega service or in the parameter name being used: the OpenAI API uses max_completion_tokens, and we may have migrated to that name incompletely. We may need to also check GenAIComps.

This was noticed by @leslieluyu.
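
The following is a minimal sketch of the suspected mismatch, assuming an OpenAI-style payload is forwarded between microservices. The forwarding function and field handling here are illustrative only, not the actual Mega/GenAIComps code:

```python
# Illustrative only: shows how a max_tokens value can be silently dropped
# if the downstream LLM service only reads max_completion_tokens.

incoming_request = {
    "messages": [{"role": "user", "content": "Summarize this document."}],
    "max_tokens": 64,  # caller's intended cap on output length
}

def forward_to_llm(request: dict) -> dict:
    """Build the downstream LLM payload (hypothetical forwarding logic).

    If only "max_completion_tokens" is read downstream, a payload that
    still carries "max_tokens" loses the cap and the LLM falls back to
    its own default output length.
    """
    payload = {"messages": request["messages"]}
    # Hypothetical fix: accept either name and forward the newer one.
    limit = request.get("max_completion_tokens", request.get("max_tokens"))
    if limit is not None:
        payload["max_completion_tokens"] = limit
    return payload

print(forward_to_llm(incoming_request))
# -> payload now carries max_completion_tokens=64
```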

Reproduce steps

Run ChatQnA without the re-ranker and try to control the maximum number of output tokens by passing in a value; the limit is not applied.
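
A hypothetical reproduction script is below. The endpoint URL/port and the exact request schema are assumptions based on a typical ChatQnA deployment, not taken from this issue; adjust them to match your setup:

```python
import requests

resp = requests.post(
    "http://localhost:8888/v1/chatqna",  # assumed megaservice endpoint
    json={
        "messages": "What is OPEA?",
        "max_tokens": 16,  # a deliberately small cap on output tokens
    },
    timeout=120,
)
print(resp.status_code)
# With the re-ranker removed, the reply is expected to ignore the 16-token cap.
print(resp.text)
```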

Raw log

No response
