Feat: AI proxy advanced #7817

Merged
merged 7 commits into from
Sep 11, 2024

@lena-larionova lena-larionova commented Aug 29, 2024

Description

AI Proxy Advanced plugin.

Schema (config reference) and basic example added via Kong/docs-plugin-toolkit#47.

https://konghq.atlassian.net/browse/DOCU-3884

Questions for @fffonion @AntoineJac:

  • How many of the currently existing AI Proxy docs, especially the LLM integration guides, do we want to reuse for this plugin? How many of them apply?
  • Is there somewhere that I can find info on the load balancing and semantic routing capabilities of this plugin?

Testing instructions

Preview link: https://deploy-preview-7817--kongdocs.netlify.app/hub/kong-inc/ai-proxy-advanced/unreleased/

@lena-larionova lena-larionova added the review:copyedit and review:sme labels Aug 29, 2024
@lena-larionova lena-larionova requested a review from a team as a code owner August 29, 2024 19:07
@lena-larionova lena-larionova added this to the Gateway 3.8 milestone Aug 29, 2024

netlify bot commented Aug 29, 2024

Deploy Preview for kongdocs ready!

Name Link
🔨 Latest commit ef4c7ca
🔍 Latest deploy log https://app.netlify.com/sites/kongdocs/deploys/66e12968f60d6f0008796bbd
😎 Deploy Preview https://deploy-preview-7817--kongdocs.netlify.app
Lighthouse
9 paths audited
Performance: 90 (🟢 up 1 from production)
Accessibility: 92 (no change from production)
Best Practices: 98 (🟢 up 8 from production)
SEO: 91 (no change from production)
PWA: -


## Semantic routing

Info about semantic routing?

This plugin supports distributing requests based on the similarity of each prompt to the given description of each model, allowing Kong to select the specific model that performs better in a given domain.
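
To make the idea concrete, here is a minimal sketch of similarity-based target selection. It is not the plugin's implementation: the vectors are hand-written toy embeddings, and `cosine_similarity`, `pick_target`, and the target names are hypothetical. A real deployment would obtain the vectors from an embeddings model.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def pick_target(prompt_vec, targets):
    """Pick the target whose description vector is most similar to the prompt."""
    return max(
        targets,
        key=lambda t: cosine_similarity(prompt_vec, t["description_vec"]),
    )


# Toy data (assumption): each target carries an embedding of its description.
targets = [
    {"name": "code-model", "description_vec": [0.9, 0.1, 0.0]},
    {"name": "legal-model", "description_vec": [0.1, 0.9, 0.2]},
]

chosen = pick_target([0.8, 0.2, 0.1], targets)
print(chosen["name"])
```

The selection rule is simply "highest similarity score wins", which is what makes it comparable to the other balancer algorithms discussed later in this thread.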

@fffonion

fffonion commented Sep 2, 2024

How many of the currently existing AI Proxy docs, especially the LLM integration guides, do we want to reuse for this plugin? How many of them apply?

All existing configurations, guides, etc. for ai-proxy can be reused for ai-proxy-advanced. The two plugins share the same codebase, so I think we can share the docs too.

Is there somewhere that I can find info on the load balancing and semantic routing capabilities of this plugin?
Is the plugin's lowest-usage algorithm analogous to the Gateway least-connections algorithm defined here: https://docs.konghq.com/gateway/latest/how-kong-works/load-balancing/#least-connections
Are the consistent-hashing and round-robin algorithms that we already use in the Gateway exactly the same as we already define on that same page?

Yes, the load-balancing algorithms are the same as the algorithms for Kong upstreams. Semantic routing is actually similar to lowest-usage or least-connections, except that the data point is the similarity score rather than usage volume or connection count.

lowest-usage is not the same as Kong's least-connections: it load-balances based on usage volume (prompt token counts, response token counts, etc.). We have least-connections in ai-proxy-advanced that matches Kong's least-connections.
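
The difference between the two can be sketched as follows. This is an illustration only, not the plugin's code: the field names are hypothetical, summing prompt and response tokens is an assumption about how usage volume might be measured, and target weights are ignored.

```python
def pick_lowest_usage(targets):
    """Pick the target with the smallest token volume (assumed: prompt + response)."""
    return min(targets, key=lambda t: t["prompt_tokens"] + t["response_tokens"])


def pick_least_connections(targets):
    """Pick the target with the fewest open connections (weights ignored)."""
    return min(targets, key=lambda t: t["open_connections"])


# Toy counters (assumption): "a" has served many tokens but is nearly idle,
# while "b" has served few tokens but holds more open connections.
targets = [
    {"name": "a", "prompt_tokens": 1200, "response_tokens": 3400, "open_connections": 1},
    {"name": "b", "prompt_tokens": 300, "response_tokens": 500, "open_connections": 4},
]

print(pick_lowest_usage(targets)["name"])
print(pick_least_connections(targets)["name"])
```

With these counters the two strategies choose different targets, which is why the docs should not describe lowest-usage by pointing at the Gateway's least-connections definition.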

@Guaris

Guaris commented Sep 5, 2024

@fffonion Hey! I added all of the docs from the AI proxy plugin to the AI proxy advanced plugin section and renamed them.
https://deploy-preview-7817--kongdocs.netlify.app/
Can you confirm that the following docs still apply with my changes?
https://deploy-preview-7817--kongdocs.netlify.app//hub/kong-inc/ai-proxy-advanced/unreleased/how-to/streaming/
https://deploy-preview-7817--kongdocs.netlify.app//hub/kong-inc/ai-proxy-advanced/unreleased/how-to/sdk-usage/
https://deploy-preview-7817--kongdocs.netlify.app//hub/kong-inc/ai-proxy-advanced/unreleased/how-to/cloud-provider-authentication/

Also, in the Mistral integration doc for AI Proxy, we reference this file: https://github.com/kong/kong/blob/master/spec/fixtures/ai-proxy/oas.yaml

Is there a corresponding file for AI Proxy Advanced?

@fffonion

@Guaris thanks! One fix is needed: in ai-proxy-advanced, we need to update the config path from `config.model` to `config.targets[].model`.
For example:

curl -X POST http://localhost:8001/routes/{routeName|Id}/plugins \
    --header "accept: application/json" \
    --header "Content-Type: application/json" \
    --data '
    {
      "name": "ai-proxy-advanced",
      "config": {
        "route_type": "llm/v1/chat",
        "auth": {
          "azure_use_managed_identity": true
        },
        "model": {
          "provider": "azure",
          "name": "gpt-35-turbo",
          "options": {
            "azure_instance": "my-openai-instance",
            "azure_deployment_id": "kong-gpt-3-5"
          }
        }
      }
    }
    '

would become

curl -X POST http://localhost:8001/routes/{routeName|Id}/plugins \
    --header "accept: application/json" \
    --header "Content-Type: application/json" \
    --data '
    {
      "name": "ai-proxy-advanced",
      "config": {
        "targets": [
          {
            "route_type": "llm/v1/chat",
            "auth": {
              "azure_use_managed_identity": true
            },
            "model": {
              "provider": "azure",
              "name": "gpt-35-turbo",
              "options": {
                "azure_instance": "my-openai-instance",
                "azure_deployment_id": "kong-gpt-3-5"
              }
            }
          }
        ]
      }
    }
    '
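
The same restructuring can be sketched as a small script, which may help when converting several existing examples. The helper `to_advanced` and the `TARGET_KEYS` tuple are hypothetical: they move only the three keys that move in the example above (`route_type`, `auth`, `model`). Whether any other ai-proxy fields also belong under `targets[]` is not covered here and should be checked against the plugin schema.

```python
import copy
import json

# Keys that move under targets[] in the example above (assumption: only these).
TARGET_KEYS = ("route_type", "auth", "model")


def to_advanced(plugin):
    """Return a copy of an ai-proxy style payload reshaped for ai-proxy-advanced."""
    config = copy.deepcopy(plugin["config"])
    # Pull the per-target keys out of the top-level config into a single target.
    target = {key: config.pop(key) for key in TARGET_KEYS if key in config}
    config["targets"] = [target]
    return {"name": "ai-proxy-advanced", "config": config}


legacy = {
    "name": "ai-proxy",
    "config": {
        "route_type": "llm/v1/chat",
        "auth": {"azure_use_managed_identity": True},
        "model": {
            "provider": "azure",
            "name": "gpt-35-turbo",
            "options": {
                "azure_instance": "my-openai-instance",
                "azure_deployment_id": "kong-gpt-3-5",
            },
        },
    },
}

advanced = to_advanced(legacy)
print(json.dumps(advanced, indent=2))
```

The printed JSON can be used as the `--data` body of the curl command shown above.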

@Guaris

Guaris commented Sep 11, 2024

@fffonion
Isn't that what appears on this page?
https://deploy-preview-7817--kongdocs.netlify.app/hub/kong-inc/ai-proxy-advanced/unreleased/how-to/basic-example/

curl -X POST http://localhost:8001/routes/{routeName|Id}/plugins \
   --header "accept: application/json" \
   --header "Content-Type: application/json" \
   --data '
   {
 "name": "ai-proxy-advanced",
 "config": {
   "targets": [
     {
       "route_type": "llm/v1/chat",
       "auth": {
         "header_name": "Authorization",
         "header_value": "Bearer token"
       },
       "model": {
         "provider": "openai"
       }
     },
     {
       "route_type": "llm/v1/chat",
       "model": {
         "provider": "llama2",
         "options": {
           "llama2_format": "openai",
           "upstream_url": "http://httpbin"
         }
       }
     }
   ]
 }
}
   '

@lena-larionova lena-larionova merged commit 870c301 into main Sep 11, 2024
15 checks passed
@lena-larionova lena-larionova deleted the feat/ai-proxy-advanced branch September 11, 2024 16:28
Labels
ci:manual-approve:link-validation review:copyedit Request for writer review. review:sme Request for SME review, external to the docs team.
3 participants