[Bug] min_p from request is not used #2682

Open

ErykCh opened this issue Oct 29, 2024 · 1 comment
ErykCh commented Oct 29, 2024

Checklist

1. I have searched related issues but cannot get the expected help.
2. The bug has not been fixed in the latest version.
3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.

Describe the bug

The min_p value sent in the request is not applied during inference:

(screenshot of server debug logs omitted)
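For context, min_p sampling filters out tokens whose probability is below min_p times the probability of the most likely token, so a min_p of 0.3 should visibly narrow the candidate set in the logs. A minimal NumPy sketch of that filtering step (the min_p_filter helper below is hypothetical and only illustrates the technique; it is not LMDeploy's implementation):

    import numpy as np

    def min_p_filter(logits, min_p=0.3):
        # Illustrative sketch of min_p sampling, not LMDeploy's code.
        # Softmax with max-subtraction for numerical stability.
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        # Keep tokens whose probability is at least min_p * p(top token).
        threshold = min_p * probs.max()
        filtered = np.where(probs >= threshold, probs, 0.0)
        # Renormalize over the surviving tokens.
        return filtered / filtered.sum()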

Reproduction

Create a request with min_p set and check the logs to see whether the parameter was used:

import requests

# Send a chat completion request with min_p set explicitly.
response = requests.post(
    "http://localhost:8000/v1/chat/completions",
    headers={
        "Content-Type": "application/json",
    },
    json={
        "model": "Qwen_2.5",
        "messages": [
            {"role": "system", "content": ""},
            {"role": "user", "content": "Hi"}
        ],
        "presence_penalty": 0.0,
        "frequency_penalty": 0.0,
        "repetition_penalty": 1.05,
        "temperature": 0.1,
        "top_k": 2,
        "top_p": 0.9,
        "min_p": 0.3,
        "max_tokens": 10480,
        "seed": 1234
    }
)
print(response.json())

Environment

docker run --runtime nvidia --gpus all --shm-size 64g -d \
    --name lmdeploy-QWEN_2.5_32B --restart unless-stopped \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    -p 8000:8000 openmmlab/lmdeploy:latest-cu12 \
    lmdeploy serve api_server \
        --server-port 8000 \
        --model-name Qwen_2.5 \
        --backend turbomind \
        --model-format awq \
        --enable-prefix-caching \
        --log-level DEBUG \
        Qwen/Qwen2.5-32B-Instruct-AWQ

Error traceback

No response

@AllentDan
Collaborator

You may try #2681.
