Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add support for new cohere command r models #118

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

yiyuan-he
Copy link
Contributor

@yiyuan-he yiyuan-he commented Nov 6, 2024

Description of changes:
Adding support for Cohere Command R models. The previous Cohere Command models are not yet fully deprecated (EOL April 2025) so we still include support for now.

Beginning 11/05/24 - Calls to old Cohere Command models now throw an exception for deprecation. I wasn't able to find any official announcement for this change, but I noticed it while testing during development in the Java SDK.

Interestingly, calls to the old model still return a response so the full gen ai attributes are still generated for the time being.

Screenshot 2024-11-05 at 5 01 11 PM

Test Plan:
Verified the attributes for the Command R model is being generated with sample app auto-instrumentation.
Screenshot 2024-11-05 at 4 52 36 PM

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@yiyuan-he yiyuan-he requested a review from a team as a code owner November 6, 2024 00:35
Comment on lines +258 to +263
if (requestBody.message !== undefined) {
// NOTE: We approximate the token count since this value is not directly available in the body
// According to Bedrock docs they use (total_chars / 6) to approximate token count for pricing.
// https://docs.aws.amazon.com/bedrock/latest/userguide/model-customization-prepare.html
spanAttributes[AwsSpanProcessingUtil.GEN_AI_USAGE_INPUT_TOKENS] = Math.ceil(requestBody.message.length / 6);
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the docs, this data should be available in the response body. However when logging the response body in the implementation it seems the data is not actually there. As a result, I decided to stay with this token approximation approach.

Screenshot 2024-11-06 at 9 38 29 AM

if (responseBody.prompt !== undefined) {
} else if (currentModelId.includes('cohere.command-r')) {
console.log('Response Body:', responseBody);
if (responseBody.text !== undefined) {
// NOTE: We approximate the token count since this value is not directly available in the body
// According to Bedrock docs they use (total_chars / 6) to approximate token count for pricing.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The prompt is only available in the JavaScript implementation because of special data model defined in an upstream Otel package. This makes it possible to approximate the input token usage from the response body.

However, this is not possible in the Java implementation as there is no special data model wrapping the inputs into the response body. As a result, I decided to move this approximation logic strictly to the request body to keep the implementation logic consistent between languages.

Screenshot 2024-11-06 at 9 42 25 AM

// According to Bedrock docs they use (total_chars / 6) to approximate token count for pricing.
// https://docs.aws.amazon.com/bedrock/latest/userguide/model-customization-prepare.html
spanAttributes[AwsSpanProcessingUtil.GEN_AI_USAGE_INPUT_TOKENS] = Math.ceil(requestBody.message.length / 6);
}
} else if (modelId.includes('cohere.command')) {
if (requestBody.max_tokens !== undefined) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if we should go ahead and remove support for the old Cohere Command model.

According to the docs, EOL should not be until 2025 but we are already getting 404s to this model.

@yiyuan-he yiyuan-he changed the title feat: add support for new cohere command r models [not ready] feat: add support for new cohere command r models Nov 6, 2024
@@ -265,7 +284,7 @@ export class BedrockRuntimeServiceExtension implements ServiceExtension {
if (requestBody.top_p !== undefined) {
spanAttributes[AwsSpanProcessingUtil.GEN_AI_REQUEST_TOP_P] = requestBody.top_p;
}
} else if (modelId.includes('mistral.mistral')) {
} else if (modelId.includes('mistral')) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We loosen this conditional because out of this list for mistral models, one of the model ids starts with mistral.mixtral instead of mistral.mistral. The request/response body syntax is still the same for both.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant