Feature Request: Adjustable response timeout per GenAI connector #166561
Pinging @elastic/response-ops (Team:ResponseOps)
I've tested the new changes in Kibana 8.11.0, and I'm running into fewer timeout errors now when using my local LLM. I'd still like to see this made configurable, if possible, so that users can adjust timeouts per OpenAI connector to account for different models and LLM settings.
For testing different local models and optimizing their performance, I would really appreciate this feature!
+1 |
@cnasikas will defer to others; timeouts are not really an issue for the Observability AI Assistant, because we use streaming.
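For context, here is a minimal TypeScript sketch (not Kibana's actual connector code) of why streaming avoids the problem: tokens arrive continuously, so only the gap between chunks needs to stay under any idle timeout, even when the full completion takes minutes. The endpoint URL and model name are placeholders for an OpenAI-compatible local server.

```ts
// Sketch only: illustrates streaming vs. a fixed response timeout.
// Assumes Node 18+, where fetch is global and the response body is an
// async-iterable ReadableStream.
async function streamCompletion(prompt: string): Promise<string> {
  const res = await fetch('http://localhost:8000/v1/chat/completions', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      model: 'local-model',
      stream: true, // tokens arrive as server-sent events while generating
      messages: [{ role: 'user', content: prompt }],
    }),
  });
  if (!res.ok || !res.body) throw new Error(`HTTP ${res.status}`);

  const decoder = new TextDecoder();
  let text = '';
  for await (const chunk of res.body as unknown as AsyncIterable<Uint8Array>) {
    // Naive SSE parsing for illustration; a real parser would buffer
    // partial lines that span chunk boundaries.
    for (const line of decoder.decode(chunk, { stream: true }).split('\n')) {
      if (!line.startsWith('data: ') || line.includes('[DONE]')) continue;
      const delta = JSON.parse(line.slice(6)).choices?.[0]?.delta?.content;
      if (delta) text += delta;
    }
  }
  return text;
}
```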
Describe the feature:
The default GenAI response timeout appears to be around 60 seconds. This timeout should be adjustable per connector to account for varying models and API responsiveness. Ideally this would be an additional field to set during the connector configuration workflow in Kibana.
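As a rough illustration of the request (this is not an existing Kibana setting), a per-connector timeout could be an optional field in the connector's config schema. The sketch below assumes Kibana's `@kbn/config-schema` package; the field name `responseTimeoutMs`, the default, and the bounds are made up for illustration.

```ts
import { schema } from '@kbn/config-schema';

// Hypothetical sketch: `responseTimeoutMs` does not exist today.
export const configSchema = schema.object({
  apiUrl: schema.string(),
  // Optional per-connector override of the response timeout.
  responseTimeoutMs: schema.number({
    defaultValue: 60_000, // match today's ~60-second behavior
    min: 1_000,
    max: 600_000, // headroom for slow self-hosted models
  }),
});
```

The connector's HTTP call could then pass this value through to the request (for example as axios's `timeout` option) instead of relying on a hard-coded constant.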
Describe a specific use case for the feature:
In this specific case, I am self-hosting a large language model with an OpenAI-compatible API for development purposes on a bare-metal server with 24 CPU cores and 96 GB RAM. The model typically sends a response within 2 minutes, which is well beyond the default timeout. As LLM adoption and the capabilities of the AI Assistant expand, this will help organizations with privacy concerns that are hosting their own LLMs on commodity hardware.
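To reproduce the failure mode outside Kibana, here is a hedged sketch: a fixed 60-second client-side timeout aborts a request that a slow self-hosted model would have answered in roughly 2 minutes. The URL, model name, and prompt are placeholders, and the 60-second value only mirrors the apparent default, not a confirmed constant in the connector code.

```ts
// Hypothetical reproduction of the timeout outside Kibana.
async function main() {
  try {
    const res = await fetch('http://localhost:8000/v1/chat/completions', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        model: 'local-model',
        messages: [{ role: 'user', content: 'Summarize this alert...' }],
      }),
      // On slow hardware this fires before the model finishes generating.
      signal: AbortSignal.timeout(60_000),
    });
    console.log(await res.json());
  } catch (err) {
    console.error('Aborted by client-side timeout:', err);
  }
}

main();
```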