[OpenAI] use usage stats for token counts #192962
Labels: Team:AI Infra (AppEx AI Infrastructure Team)

Comments
Pinging @elastic/appex-ai-infra (Team:AI Infra)
cc @elastic/obs-ai-assistant @elastic/security-generative-ai
pgayvallet changed the title from [openAI] use usage stats for token counts to [OpenAI] use usage stats for token counts on Nov 5, 2024
pgayvallet added a commit that referenced this issue on Nov 20, 2024:
## Summary

Fix #192962

Add support for native OpenAI token counts for streaming APIs. This is done by adding the `stream_options: {"include_usage": true}` parameter when `stream: true` is used ([doc](https://platform.openai.com/docs/api-reference/chat/create#chat-create-stream_options)), and then reading the `usage` entry of the last emitted chunk.

**Note**: this was done only for the `OpenAI` and `AzureAI` [providers](https://github.com/elastic/kibana/blob/83a701e837a7a84a86dcc8d359154f900f69676a/x-pack/plugins/stack_connectors/common/openai/constants.ts#L27-L31), and **not** for the `Other` provider. The reasoning is that not all OpenAI-"compatible" providers fully support all options, so I didn't want to risk adding a parameter that could cause some models behind an OpenAI adapter to reject requests. This is also the reason why I did not change the [getTokenCountFromOpenAIStream](https://github.com/elastic/kibana/blob/8bffd618059aacc30d6190a0d143d8b0c7217faf/x-pack/plugins/actions/server/lib/get_token_count_from_openai_stream.ts#L15) function, as we want that to work for all providers.

Co-authored-by: Elastic Machine <[email protected]>
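For illustration, here is a minimal TypeScript sketch of the provider gating described above. The type and function names are assumptions made for the example, not Kibana's actual implementation (the real provider constants live in the linked constants file):

```typescript
// Illustrative sketch: gate `stream_options` on the provider type, since
// "Other" OpenAI-compatible providers may reject unknown request parameters.
type OpenAiProviderType = 'OpenAI' | 'Azure OpenAI' | 'Other'; // assumed values

interface ChatCompletionBody {
  model: string;
  messages: Array<{ role: 'system' | 'user' | 'assistant'; content: string }>;
  stream?: boolean;
  stream_options?: { include_usage: boolean };
}

function withNativeUsage(
  body: ChatCompletionBody,
  provider: OpenAiProviderType
): ChatCompletionBody {
  // Only request native usage reporting when streaming against a provider
  // known to support the parameter; leave `Other` untouched.
  if (body.stream && (provider === 'OpenAI' || provider === 'Azure OpenAI')) {
    return { ...body, stream_options: { include_usage: true } };
  }
  return body;
}
```

Gating on the provider keeps the `Other` path identical to today's requests, so OpenAI-compatible proxies that reject unknown parameters are unaffected.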
The fix was subsequently backported in #200745 (cherry picked from commit 67171e1); commits with the same summary were pushed to kibanamachine/kibana (Nov 20, 2024), TattdCodeMonkey/kibana (Nov 21, 2024), paulinashakirova/kibana (Nov 26, 2024), and CAWilson94/kibana (Dec 12, 2024), each referencing this issue.
OpenAI now exposes usage stats for the streaming completion APIs:
https://community.openai.com/t/usage-stats-now-available-when-using-streaming-with-the-chat-completions-api-or-completions-api/738156
We should leverage this, when possible, instead of manually counting the tokens like we're currently doing.
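For illustration, a minimal TypeScript sketch of what consuming the native usage stats could look like. The types here are simplified assumptions about the wire format, not Kibana code: when `stream_options: {"include_usage": true}` is set, the stream ends with one extra chunk whose `choices` array is empty and whose `usage` field is populated.

```typescript
// Illustrative sketch: read token counts from the final streamed chunk
// instead of re-tokenizing the prompt and response ourselves.
interface ChatCompletionChunk {
  choices: Array<{ delta?: { content?: string } }>;
  usage?: {
    prompt_tokens: number;
    completion_tokens: number;
    total_tokens: number;
  } | null; // null on all chunks except the last one
}

async function consumeStream(chunks: AsyncIterable<ChatCompletionChunk>) {
  let content = '';
  let usage: ChatCompletionChunk['usage'] = null;
  for await (const chunk of chunks) {
    content += chunk.choices[0]?.delta?.content ?? '';
    if (chunk.usage) {
      // Token counts come straight from the provider; no tokenizer needed.
      usage = chunk.usage;
    }
  }
  return { content, usage };
}
```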
There are multiple places where we'll want to do that:

- kibana/x-pack/plugins/actions/server/lib/get_token_count_from_openai_stream.ts (lines 15 to 19 in 8bffd61)
- kibana/x-pack/plugins/inference/server/chat_complete/adapters/openai/openai_adapter.ts (lines 60 to 62 in 8bffd61; at the moment we're not even emitting completion events there)
- kibana/x-pack/plugins/observability_solution/observability_ai_assistant/server/service/client/adapters/process_openai_stream.ts (lines 82 to 87 in 8bffd61)
- the security assistant

Related:
- choices list #192951
- choices list (bis) #192961