Checklist
1. I have searched related issues but cannot get the expected help.
2. The bug has not been fixed in the latest version.
3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
Describe the bug
The VL model is running in api_server mode as:
lmdeploy serve api_server ./InternVL2-Llama3-76B-AWQ --api-keys token-abc --model-name InternVL2-Llama3-76B --backend turbomind --server-port 8001 --model-format awq
When a user sends a request containing a system prompt and an image but no user text, the server throws the exception below. Ideally lmdeploy should support this case: vLLM handles it without error, and it is a reasonable user scenario.
INFO: 127.0.0.1:59488 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
ERROR: Exception in ASGI application
Traceback (most recent call last):
File "/root/miniconda3/lib/python3.12/site-packages/uvicorn/protocols/http/httptools_impl.py", line 401, in run_asgi
result = await app( # type: ignore[func-returns-value]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/lib/python3.12/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
return await self.app(scope, receive, send)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/lib/python3.12/site-packages/fastapi/applications.py", line 1054, in __call__
await super().__call__(scope, receive, send)
File "/root/miniconda3/lib/python3.12/site-packages/starlette/applications.py", line 113, in __call__
await self.middleware_stack(scope, receive, send)
File "/root/miniconda3/lib/python3.12/site-packages/starlette/middleware/errors.py", line 187, in __call__
raise exc
File "/root/miniconda3/lib/python3.12/site-packages/starlette/middleware/errors.py", line 165, in __call__
await self.app(scope, receive, _send)
File "/root/miniconda3/lib/python3.12/site-packages/starlette/middleware/cors.py", line 85, in __call__
await self.app(scope, receive, send)
File "/root/miniconda3/lib/python3.12/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
File "/root/miniconda3/lib/python3.12/site-packages/starlette/_exception_handler.py", line 62, in wrapped_app
raise exc
File "/root/miniconda3/lib/python3.12/site-packages/starlette/_exception_handler.py", line 51, in wrapped_app
await app(scope, receive, sender)
File "/root/miniconda3/lib/python3.12/site-packages/starlette/routing.py", line 715, in __call__
await self.middleware_stack(scope, receive, send)
File "/root/miniconda3/lib/python3.12/site-packages/starlette/routing.py", line 735, in app
await route.handle(scope, receive, send)
File "/root/miniconda3/lib/python3.12/site-packages/starlette/routing.py", line 288, in handle
await self.app(scope, receive, send)
File "/root/miniconda3/lib/python3.12/site-packages/starlette/routing.py", line 76, in app
await wrap_app_handling_exceptions(app, request)(scope, receive, send)
File "/root/miniconda3/lib/python3.12/site-packages/starlette/_exception_handler.py", line 62, in wrapped_app
raise exc
File "/root/miniconda3/lib/python3.12/site-packages/starlette/_exception_handler.py", line 51, in wrapped_app
await app(scope, receive, sender)
File "/root/miniconda3/lib/python3.12/site-packages/starlette/routing.py", line 73, in app
response = await f(request)
^^^^^^^^^^^^^^^^
File "/root/miniconda3/lib/python3.12/site-packages/fastapi/routing.py", line 301, in app
raw_response = await run_endpoint_function(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/lib/python3.12/site-packages/fastapi/routing.py", line 212, in run_endpoint_function
return await dependant.call(**values)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/lib/python3.12/site-packages/lmdeploy/serve/openai/api_server.py", line 475, in chat_completions_v1
async for res in result_generator:
File "/root/miniconda3/lib/python3.12/site-packages/lmdeploy/serve/async_engine.py", line 514, in generate
prompt_input = await self._get_prompt_input(prompt,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/lib/python3.12/site-packages/lmdeploy/serve/vl_async_engine.py", line 58, in _get_prompt_input
decorated = self.vl_prompt_template.messages2prompt(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/lib/python3.12/site-packages/lmdeploy/vl/templates.py", line 146, in messages2prompt
new_messages = self.convert_messages(messages, sequence_start)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/lib/python3.12/site-packages/lmdeploy/vl/templates.py", line 137, in convert_messages
prompt = self.append_image_token(prompt, num_images)
^^^^^^
UnboundLocalError: cannot access local variable 'prompt' where it is not associated with a value
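The UnboundLocalError suggests that, inside convert_messages, the prompt variable is only assigned when the message contains a text segment, and is then passed to append_image_token unconditionally. Below is a minimal standalone illustration of that failure pattern; it is my guess at the cause, not lmdeploy's actual code, and convert_messages_sketch is a hypothetical helper used only for illustration.

# Minimal illustration of the suspected failure pattern (NOT lmdeploy's code).
# 'prompt' is only bound when a text item exists, but is used unconditionally.
def convert_messages_sketch(content_items):
    for item in content_items:
        if item["type"] == "text":
            prompt = item["text"]      # assigned only for text items
    return prompt + " <IMAGE_TOKEN>"   # UnboundLocalError when no text item exists

# A user message carrying only an image reproduces the error:
convert_messages_sketch(
    [{"type": "image_url", "image_url": {"url": "https://example.com/sample.jpg"}}]
)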
Reproduction
lmdeploy serve api_server ./InternVL2-Llama3-76B-AWQ --api-keys token-abc --model-name InternVL2-Llama3-76B --backend turbomind --server-port 8001 --model-format awq
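For a concrete client-side reproduction, a request of roughly the following shape triggers the error. This is a hypothetical sketch using the openai Python client; the server address, API key, model name, system prompt, and image URL are placeholders that match the command above, not the exact payload from the original run.

# Hypothetical reproduction: system prompt + image, no user text part.
from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:8001/v1", api_key="token-abc")

response = client.chat.completions.create(
    model="InternVL2-Llama3-76B",
    messages=[
        {"role": "system", "content": "You are a helpful vision assistant."},
        {
            "role": "user",
            # Only an image part -- no "text" entry in the user message.
            "content": [
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/sample.jpg"}},
            ],
        },
    ],
)
print(response.choices[0].message.content)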