
[OpenAI] use usage stats for token counts #192962

Closed
pgayvallet opened this issue Sep 15, 2024 · 2 comments · Fixed by #200745
Assignees
Labels
Team:AI Infra AppEx AI Infrastructure Team

Comments

@pgayvallet
Contributor

pgayvallet commented Sep 15, 2024

OpenAI now exposes usage stats for the streaming completion APIs:

https://community.openai.com/t/usage-stats-now-available-when-using-streaming-with-the-chat-completions-api-or-completions-api/738156

We should leverage this, when possible, instead of manually counting tokens as we currently do.
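As a minimal sketch of the request-side change (the interface and helper below are illustrative, not existing Kibana code; the parameter names follow the OpenAI API docs):

```typescript
// Hypothetical sketch: opting a streamed chat completion request into
// native usage reporting via `stream_options.include_usage`.
interface ChatRequest {
  model: string;
  messages: Array<{ role: string; content: string }>;
  stream?: boolean;
  stream_options?: { include_usage: boolean };
}

function withUsageStats(request: ChatRequest): ChatRequest {
  // Only meaningful for streamed requests; non-streamed responses
  // already carry a `usage` object.
  if (!request.stream) return request;
  return { ...request, stream_options: { include_usage: true } };
}
```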

There are multiple places where we'll want to do that:

  • the openAI action connector

export async function getTokenCountFromOpenAIStream({
  responseStream,
  body,
  logger,
}: {

  • the inference plugin (openAI adapter)

(at the moment we're not even emitting completion events:)

map(
  (line) => JSON.parse(line) as OpenAI.ChatCompletionChunk | { error: { message: string } }
),

  • o11y assistant openAI adapter

completionTokenCount += sum(
  [
    firstChoice?.delta.content,
    firstChoice?.delta.function_call?.name,
    firstChoice?.delta.function_call?.arguments,
    ...(firstChoice?.delta.tool_calls?.flatMap((toolCall) => {

  • security assistant
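In all of these places, the manual counting above could be replaced by reading the `usage` entry from the stream. A minimal sketch, assuming the chunk shape from OpenAI's `ChatCompletionChunk` (the `ChunkLike` type and `extractUsage` helper are illustrative names, not existing code):

```typescript
// Hypothetical sketch: with `include_usage: true`, OpenAI emits one
// extra final chunk with an empty `choices` array and a populated
// `usage` object. Scan from the end to find it.
interface UsageStats {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}

interface ChunkLike {
  choices: unknown[];
  usage?: UsageStats | null;
}

function extractUsage(chunks: ChunkLike[]): UsageStats | undefined {
  for (let i = chunks.length - 1; i >= 0; i--) {
    const usage = chunks[i].usage;
    if (usage) return usage;
  }
  // No usage chunk observed (e.g. the provider ignored `stream_options`);
  // callers would fall back to manual token counting here.
  return undefined;
}
```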

Related:

@pgayvallet pgayvallet added the Team:AI Infra AppEx AI Infrastructure Team label Sep 15, 2024
@elasticmachine
Contributor

Pinging @elastic/appex-ai-infra (Team:AI Infra)

@pgayvallet
Contributor Author

cc @elastic/obs-ai-assistant @elastic/security-generative-ai

@pgayvallet pgayvallet changed the title [openAI] use usage stats for token counts [OpenAI] use usage stats for token counts Nov 5, 2024
pgayvallet added a commit that referenced this issue Nov 20, 2024
## Summary

Fix #192962

Add support for native openAI token count for streaming APIs.

This is done by adding the `stream_options: {"include_usage": true}`
parameter when `stream: true` is being used
([doc](https://platform.openai.com/docs/api-reference/chat/create#chat-create-stream_options)),
and then using the `usage` entry for the last emitted chunk.

**Note**: this was done only for the `OpenAI` and `AzureAI`
[providers](https://github.com/elastic/kibana/blob/83a701e837a7a84a86dcc8d359154f900f69676a/x-pack/plugins/stack_connectors/common/openai/constants.ts#L27-L31),
and **not** for the `Other` provider. The reasoning is that not all
openAI-"compatible" providers fully support all options, so I didn't
want to risk adding a parameter that could cause some models using an
openAI adapter to reject the requests. This is also why I did not change
the
[getTokenCountFromOpenAIStream](https://github.com/elastic/kibana/blob/8bffd618059aacc30d6190a0d143d8b0c7217faf/x-pack/plugins/actions/server/lib/get_token_count_from_openai_stream.ts#L15)
function, as we want it to work for all providers.
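The provider gating described above can be sketched as follows (the provider names mirror the stack connector constants linked above; the `maybeAddUsageOption` helper and the set name are illustrative assumptions, not the actual implementation):

```typescript
// Hypothetical sketch: only add `stream_options` for providers known to
// support it, leaving `Other` (generic openAI-compatible endpoints)
// untouched so their requests are never rejected.
type OpenAiProviderType = 'OpenAI' | 'Azure OpenAI' | 'Other';

const SUPPORTS_NATIVE_USAGE: ReadonlySet<OpenAiProviderType> = new Set<OpenAiProviderType>([
  'OpenAI',
  'Azure OpenAI',
]);

function maybeAddUsageOption(
  provider: OpenAiProviderType,
  body: Record<string, unknown>
): Record<string, unknown> {
  if (body.stream === true && SUPPORTS_NATIVE_USAGE.has(provider)) {
    return { ...body, stream_options: { include_usage: true } };
  }
  return body;
}
```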

---------

Co-authored-by: Elastic Machine <[email protected]>
kibanamachine pushed a commit to kibanamachine/kibana that referenced this issue Nov 20, 2024 (cherry picked from commit 67171e1)
TattdCodeMonkey pushed a commit to TattdCodeMonkey/kibana that referenced this issue Nov 21, 2024
paulinashakirova pushed a commit to paulinashakirova/kibana that referenced this issue Nov 26, 2024
CAWilson94 pushed a commit to CAWilson94/kibana that referenced this issue Dec 12, 2024