feat: add support for new cohere command r models #118
base: main
Conversation
if (requestBody.message !== undefined) {
  // NOTE: We approximate the token count since this value is not directly available in the body.
  // According to Bedrock docs, they use (total_chars / 6) to approximate token count for pricing.
  // https://docs.aws.amazon.com/bedrock/latest/userguide/model-customization-prepare.html
  spanAttributes[AwsSpanProcessingUtil.GEN_AI_USAGE_INPUT_TOKENS] = Math.ceil(requestBody.message.length / 6);
}
According to the docs, this data should be available in the response body. However, when logging the response body in the implementation, the data is not actually there. As a result, I decided to stay with this token approximation approach.
if (responseBody.prompt !== undefined) {
} else if (currentModelId.includes('cohere.command-r')) {
  console.log('Response Body:', responseBody);
  if (responseBody.text !== undefined) {
    // NOTE: We approximate the token count since this value is not directly available in the body.
    // According to Bedrock docs, they use (total_chars / 6) to approximate token count for pricing.
The prompt is only available in the JavaScript implementation because of a special data model defined in an upstream OTel package, which makes it possible to approximate the input token usage from the response body.
However, this is not possible in the Java implementation, as there is no special data model wrapping the inputs into the response body. As a result, I decided to move this approximation logic strictly to the request body to keep the implementation logic consistent between languages.
    // According to Bedrock docs, they use (total_chars / 6) to approximate token count for pricing.
    // https://docs.aws.amazon.com/bedrock/latest/userguide/model-customization-prepare.html
    spanAttributes[AwsSpanProcessingUtil.GEN_AI_USAGE_INPUT_TOKENS] = Math.ceil(requestBody.message.length / 6);
  }
} else if (modelId.includes('cohere.command')) {
  if (requestBody.max_tokens !== undefined) {
I'm not sure if we should go ahead and remove support for the old Cohere Command model.
According to the docs, EOL should not be until 2025, but we are already getting 404s from this model.
@@ -265,7 +284,7 @@ export class BedrockRuntimeServiceExtension implements ServiceExtension {
   if (requestBody.top_p !== undefined) {
     spanAttributes[AwsSpanProcessingUtil.GEN_AI_REQUEST_TOP_P] = requestBody.top_p;
   }
-} else if (modelId.includes('mistral.mistral')) {
+} else if (modelId.includes('mistral')) {
We loosened this conditional because, out of the list of Mistral models, one of the model IDs starts with mistral.mixtral instead of mistral.mistral. The request/response body syntax is still the same for both.
Description of changes:
Adding support for Cohere Command R models. The previous Cohere Command models are not yet fully deprecated (EOL April 2025), so we still include support for them for now.
Beginning 11/05/24, calls to the old Cohere Command models throw a deprecation exception. I wasn't able to find any official announcement for this change, but I noticed it while testing the Java SDK during development.
Interestingly, calls to the old model still return a response, so the full gen_ai attributes are still generated for the time being.
Test Plan:
Verified that the attributes for the Command R model are generated with sample-app auto-instrumentation.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.