[inference] Add non-stream versions of the chatComplete and output APIs. #198644

Closed
pgayvallet opened this issue Nov 1, 2024 · 1 comment · Fixed by #198646
Labels
Team:AI Infra AppEx AI Infrastructure Team

Comments

@pgayvallet
Contributor

At the moment, the chatComplete and output APIs always return an observable, in order to support LLM response streaming.

While that is definitely useful in some scenarios (especially assistant-related calls), in most "task execution" scenarios we only really need the final, full response from the LLM, and the observable-based API can be bothersome, as every call needs to be wrapped in the appropriate observable chaining just to retrieve the data of the last event.
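To illustrate the friction, here is a rough sketch (assuming the RxJS observable currently returned by the API; the connector id and message are placeholders) of what a task-execution caller has to do just to obtain the final result:

```ts
import { lastValueFrom } from 'rxjs';

// Current observable-based usage (sketch): the caller has to wait for the
// stream to complete and pick the last emitted event to get the full answer.
const events$ = chatComplete({
  connectorId: 'my-connector',
  messages: [{ role: MessageRole.User, content: 'Some question?' }],
});

// The last event is the one carrying the complete response.
const finalEvent = await lastValueFrom(events$);
```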

We should have a way to call those APIs in "non-stream" mode, so that they return a promise of the complete response instead of an observable. One possible option would be to add a stream parameter which switches the shape of the response.

pgayvallet added the llm-task-framework and Team:AI Infra (AppEx AI Infrastructure Team) labels on Nov 1, 2024
@elasticmachine
Contributor

Pinging @elastic/appex-ai-infra (Team:AI Infra)

pgayvallet added a commit to pgayvallet/kibana that referenced this issue Nov 6, 2024
## Summary

Fix elastic#198644

Add a `stream` parameter to the `chatComplete` and `output` APIs,
defaulting to `false`, to switch between "full content response as
promise" and "event observable" responses.

Note: at the moment, in non-stream mode, the implementation is simply
constructing the response from the observable. It should be possible
later to improve this by having the LLM adapters handle the
stream/no-stream logic, but this is out of scope of the current PR.
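Purely as an illustration of that approach (this is not the actual plugin code; the event and response shapes below are hypothetical), the non-stream path could be built along these lines:

```ts
import { Observable, lastValueFrom, scan } from 'rxjs';

// Hypothetical event and response shapes, for illustration only.
interface ChunkEvent {
  type: 'chunk';
  content: string;
}
interface CompletionMessageEvent {
  type: 'message';
  content: string;
  toolCalls: unknown[];
}
type ChatCompletionEvent = ChunkEvent | CompletionMessageEvent;

interface ChatCompleteResponse {
  content: string;
  toolCalls: unknown[];
}

// One way to turn the event observable into a promise of the full response:
// accumulate chunk content as it streams in, then take whatever the final
// event carries once the observable completes.
const toResponsePromise = (
  events$: Observable<ChatCompletionEvent>
): Promise<ChatCompleteResponse> =>
  lastValueFrom(
    events$.pipe(
      scan(
        (acc: ChatCompleteResponse, event: ChatCompletionEvent): ChatCompleteResponse =>
          event.type === 'chunk'
            ? { ...acc, content: acc.content + event.content }
            : { content: event.content, toolCalls: event.toolCalls },
        { content: '', toolCalls: [] }
      )
    )
  );
```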

### Normal mode
```ts
const response = await chatComplete({
  connectorId: 'my-connector',
  system: "You are a helpful assistant",
  messages: [
    { role: MessageRole.User, content: "Some question?" },
  ]
});

const { content, toolCalls } = response;
// do something
```

### Stream mode
```ts
const events$ = chatComplete({
  stream: true,
  connectorId: 'my-connector',
  system: "You are a helpful assistant",
  messages: [
    { role: MessageRole.User, content: "Some question?" },
  ]
});

events$.subscribe((event) => {
  // do something
});
```

---------

Co-authored-by: kibanamachine <[email protected]>
Co-authored-by: Elastic Machine <[email protected]>
(cherry picked from commit fe16822)
mgadewoll pushed a commit to mgadewoll/kibana that referenced this issue Nov 7, 2024