Feat: AI proxy advanced #7817
✅ Deploy Preview for kongdocs ready!
## Semantic routing

Info about semantic routing?
This plugin supports distributing requests based on the similarity of each prompt to the given description of each model, allowing Kong to select the specific model that performs better in a given domain.
All existing configurations, guides, etc. for ai-proxy can be reused for ai-proxy-advanced. They share the same codebase, so I think we can share the docs too.
Yes, the load-balancing algorithms are the same as the algorithms for Kong upstreams. Semantic routing is actually similar to lowest-usage or least-connections, except that the data point is the similarity score rather than usage volume or connection count.
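To make the analogy concrete, here is a minimal sketch of the selection step only. It is not the plugin's implementation: the target names, the two-dimensional vectors, the usage counts, and the choice of cosine similarity are all assumptions made for illustration. Semantic routing picks the target with the highest similarity score, while lowest-usage picks the target with the smallest usage count.

```python
import math

# Hypothetical targets: names, vectors, and usage counts are made up.
# In practice the vectors would come from an embeddings model applied to
# each model's description; here they are hard-coded toy values.
TARGETS = [
    {"name": "model-a", "description_vec": [1.0, 0.0], "usage": 120},
    {"name": "model-b", "description_vec": [0.0, 1.0], "usage": 30},
]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pick_by_similarity(prompt_vec, targets):
    """Semantic routing sketch: choose the target whose description is closest to the prompt."""
    best = max(targets, key=lambda t: cosine_similarity(prompt_vec, t["description_vec"]))
    return best["name"]


def pick_by_lowest_usage(targets):
    """Lowest-usage sketch: same shape of decision, but the data point is a usage count."""
    return min(targets, key=lambda t: t["usage"])["name"]
```

Both functions have the same shape: rank the targets by one data point and take the best one. Only the data point and the direction of the comparison differ.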
@fffonion Hey! I added all of the docs from the AI Proxy plugin to the AI Proxy Advanced plugin section and renamed them. Is there a file for AI Proxy Advanced that corresponds to the one we reference in the Mistral integration doc for AI Proxy? That file is: https://github.com/kong/kong/blob/master/spec/fixtures/ai-proxy/oas.yaml
@Guaris thanks! one fix is needed, in
would become
@fffonion
Description
AI Proxy Advanced plugin.
Schema (config reference) and basic example added via Kong/docs-plugin-toolkit#47.
https://konghq.atlassian.net/browse/DOCU-3884
Questions for @fffonion @AntoineJac:
- Is the `lowest-usage` algorithm analogous to the Gateway `least-connections` algorithm defined here: https://docs.konghq.com/gateway/latest/how-kong-works/load-balancing/#least-connections
- Are the `consistent-hashing` and `round-robin` algorithms that we already use in the Gateway exactly the same as we already define on that same page?

Testing instructions
Preview link: https://deploy-preview-7817--kongdocs.netlify.app/hub/kong-inc/ai-proxy-advanced/unreleased/
Checklist