
[OpenAI] use usage stats for token counts #192962

Closed
pgayvallet opened this issue Sep 15, 2024 · 2 comments · Fixed by #200745
Assignees
Labels
Team:AI Infra AppEx AI Infrastructure Team

Comments

@pgayvallet
Contributor

pgayvallet commented Sep 15, 2024

OpenAI now exposes usage stats for the streaming completion APIs:

https://community.openai.com/t/usage-stats-now-available-when-using-streaming-with-the-chat-completions-api-or-completions-api/738156

We should leverage this, when possible, instead of manually counting tokens as we currently do.
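As a minimal sketch of the request-side change (the interface and helper below are illustrative, not existing Kibana code; the parameter names follow the OpenAI API docs):

```typescript
// Hypothetical sketch: opting a streamed chat completion request into
// native usage reporting via `stream_options.include_usage`.
interface ChatRequest {
  model: string;
  messages: Array<{ role: string; content: string }>;
  stream?: boolean;
  stream_options?: { include_usage: boolean };
}

function withUsageStats(request: ChatRequest): ChatRequest {
  // Only meaningful for streamed requests; non-streamed responses
  // already carry a `usage` object.
  if (!request.stream) return request;
  return { ...request, stream_options: { include_usage: true } };
}
```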

There are multiple places where we'll want to do that:

  • the openAI action connector

export async function getTokenCountFromOpenAIStream({
  responseStream,
  body,
  logger,
}: {

  • the inference plugin (openAI adapter)

(at the moment we're not even emitting completion events:)

map(
  (line) => JSON.parse(line) as OpenAI.ChatCompletionChunk | { error: { message: string } }
),

  • o11y assistant openAI adapter

completionTokenCount += sum(
  [
    firstChoice?.delta.content,
    firstChoice?.delta.function_call?.name,
    firstChoice?.delta.function_call?.arguments,
    ...(firstChoice?.delta.tool_calls?.flatMap((toolCall) => {

  • security assistant
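In all of these places, the manual counting above could be replaced by reading the `usage` entry from the stream. A minimal sketch, assuming the chunk shape from OpenAI's `ChatCompletionChunk` (the `ChunkLike` type and `extractUsage` helper are illustrative names, not existing code):

```typescript
// Hypothetical sketch: with `include_usage: true`, OpenAI emits one
// extra final chunk with an empty `choices` array and a populated
// `usage` object. Scan from the end to find it.
interface UsageStats {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}

interface ChunkLike {
  choices: unknown[];
  usage?: UsageStats | null;
}

function extractUsage(chunks: ChunkLike[]): UsageStats | undefined {
  for (let i = chunks.length - 1; i >= 0; i--) {
    const usage = chunks[i].usage;
    if (usage) return usage;
  }
  // No usage chunk observed (e.g. the provider ignored `stream_options`);
  // callers would fall back to manual token counting here.
  return undefined;
}
```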

Related:

@pgayvallet pgayvallet added the Team:AI Infra AppEx AI Infrastructure Team label Sep 15, 2024
@elasticmachine
Contributor

Pinging @elastic/appex-ai-infra (Team:AI Infra)

@pgayvallet
Contributor Author

cc @elastic/obs-ai-assistant @elastic/security-generative-ai

@pgayvallet pgayvallet changed the title [openAI] use usage stats for token counts [OpenAI] use usage stats for token counts Nov 5, 2024
pgayvallet added a commit that referenced this issue Nov 20, 2024
## Summary

Fix #192962

Add support for native openAI token count for streaming APIs.

This is done by adding the `stream_options: {"include_usage": true}`
parameter when `stream: true` is being used
([doc](https://platform.openai.com/docs/api-reference/chat/create#chat-create-stream_options)),
and then using the `usage` entry for the last emitted chunk.

**Note**: this was done only for the `OpenAI` and `AzureAI`
[providers](https://github.com/elastic/kibana/blob/83a701e837a7a84a86dcc8d359154f900f69676a/x-pack/plugins/stack_connectors/common/openai/constants.ts#L27-L31),
and **not** for the `Other` provider. The reasoning is that not all
openAI-"compatible" providers fully support all options, so I didn't
want to risk adding a parameter that could cause some models using an
openAI adapter to reject the requests. This is also why I did not change
the
[getTokenCountFromOpenAIStream](https://github.com/elastic/kibana/blob/8bffd618059aacc30d6190a0d143d8b0c7217faf/x-pack/plugins/actions/server/lib/get_token_count_from_openai_stream.ts#L15)
function, as we want it to work for all providers.
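The provider gating described above can be sketched as follows (the provider names mirror the stack connector constants linked above; the `maybeAddUsageOption` helper and the set name are illustrative assumptions, not the actual implementation):

```typescript
// Hypothetical sketch: only add `stream_options` for providers known to
// support it, leaving `Other` (generic openAI-compatible endpoints)
// untouched so their requests are never rejected.
type OpenAiProviderType = 'OpenAI' | 'Azure OpenAI' | 'Other';

const SUPPORTS_NATIVE_USAGE: ReadonlySet<OpenAiProviderType> = new Set<OpenAiProviderType>([
  'OpenAI',
  'Azure OpenAI',
]);

function maybeAddUsageOption(
  provider: OpenAiProviderType,
  body: Record<string, unknown>
): Record<string, unknown> {
  if (body.stream === true && SUPPORTS_NATIVE_USAGE.has(provider)) {
    return { ...body, stream_options: { include_usage: true } };
  }
  return body;
}
```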

---------

Co-authored-by: Elastic Machine <[email protected]>
kibanamachine pushed a commit to kibanamachine/kibana that referenced this issue Nov 20, 2024 (cherry picked from commit 67171e1)
TattdCodeMonkey pushed a commit to TattdCodeMonkey/kibana that referenced this issue Nov 21, 2024
paulinashakirova pushed a commit to paulinashakirova/kibana that referenced this issue Nov 26, 2024
CAWilson94 pushed a commit to CAWilson94/kibana that referenced this issue Dec 12, 2024