Update: AI proxy advanced load balancing #7971

Merged
merged 6 commits into from
Oct 17, 2024

Conversation

lmilan
Contributor

@lmilan lmilan commented Sep 26, 2024

Description

  • Added link to AI proxy advanced load balancing in navigation under AI Gateway
  • Updated AI proxy advanced overview to add missing load balancing algorithms and a section about retry/fallback
  • Removed the semantic routing section and included the info in the load balancer algorithms section
  • Added config example for each load balancer type

DOCU-4076

Testing instructions

Preview link:

Checklist

@lmilan lmilan added the labels review:general (Review for general accuracy and presentation. Does the doc work? Does it output correctly?) and review:sme (Request for SME review, external to the docs team.) Sep 26, 2024
@lmilan lmilan self-assigned this Sep 26, 2024

netlify bot commented Sep 26, 2024

Deploy Preview for kongdocs ready!

Name Link
🔨 Latest commit bc07b74
🔍 Latest deploy log https://app.netlify.com/sites/kongdocs/deploys/670f8d09d6f0640008fe555e
😎 Deploy Preview https://deploy-preview-7971--kongdocs.netlify.app
Lighthouse
9 paths audited
Performance: 92 (🟢 up 2 from production)
Accessibility: 92 (no change from production)
Best Practices: 98 (🟢 up 8 from production)
SEO: 91 (no change from production)
PWA: -
View the detailed breakdown and full score reports

app/_hub/kong-inc/ai-proxy-advanced/overview/_index.md (outdated)

By analyzing the content of the request, the plugin can match it to the most appropriate model that is known to perform better in similar contexts. This feature enhances the flexibility and efficiency of model selection, especially when dealing with a diverse range of AI providers and models.
This plugin does not support fallback over targets with different formats. For example, you can have a load balancer containing targets with different OpenAI models, but you can't have one target with an OpenAI model and another with an Ollama model. However, you can use an OpenAI model alongside a Mistral model compatible with the OpenAI format.
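
The rule in the quoted text can be read as "fallback works only when every target speaks the same format". The following is a minimal Python sketch of that reading, not the plugin's implementation or configuration schema. The `Target` class, the `api_format` field, the `supports_fallback` helper, and the model names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """Hypothetical stand-in for one load balancer target."""
    provider: str
    model: str       # illustrative model name, not taken from the docs
    api_format: str  # assumed label for the request/response format the target uses


def supports_fallback(targets):
    """Return True when all targets share a single format."""
    return len({t.api_format for t in targets}) <= 1


# Different OpenAI models: same format, so fallback is possible.
same_provider = [
    Target("openai", "model-a", "openai"),
    Target("openai", "model-b", "openai"),
]

# OpenAI next to Ollama: different formats, so fallback is not supported.
mixed_format = [
    Target("openai", "model-a", "openai"),
    Target("ollama", "model-c", "ollama"),
]

# OpenAI next to a Mistral model that is compatible with the OpenAI format.
compatible = [
    Target("openai", "model-a", "openai"),
    Target("mistral", "model-d", "openai"),
]

for name, group in [("same_provider", same_provider),
                    ("mixed_format", mixed_format),
                    ("compatible", compatible)]:
    print(name, supports_fallback(group))
```

The three groups mirror the three cases named in the quoted paragraph, which may also be a natural way to split the text into bullet points.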
Contributor

I'm having trouble understanding this section. Is it possible to display this info in a table or as individual bullet points?

Contributor Author

I updated this part, let me know if it's clearer

@lmilan lmilan marked this pull request as ready for review October 1, 2024 11:41
@lmilan lmilan requested a review from a team as a code owner October 1, 2024 11:41
@Guaris Guaris merged commit b8e0c1e into main Oct 17, 2024
15 checks passed
@Guaris Guaris deleted the DOCU-4076 branch October 17, 2024 15:38
2 participants