Feat: Add Hugging Face as LLM provider (#8182)
* add hugging face provider

* Update _huggingface.md

* add prerequisite

* Update app/_hub/kong-inc/ai-proxy-advanced/how-to/llm-provider-integration-guides/_huggingface.md

Co-authored-by: lena-larionova <[email protected]>

* update tables

---------

Co-authored-by: lena-larionova <[email protected]>
lmilan and lena-larionova authored Dec 9, 2024
1 parent 0dd5afe commit 970435e
Showing 7 changed files with 191 additions and 18 deletions.
app/_hub/kong-inc/ai-proxy-advanced/how-to/llm-provider-integration-guides/_huggingface.md
@@ -0,0 +1,68 @@
---
nav_title: Hugging Face
title: Set up AI Proxy Advanced with Hugging Face
minimum_version: 3.9.x
---

This guide walks you through setting up the AI Proxy Advanced plugin with [Hugging Face](https://huggingface.co/).

{% include_cached /md/plugins-hub/ai-providers-prereqs-advanced.md snippet='intro' %}

## Prerequisites

{% include_cached /md/plugins-hub/ai-providers-prereqs-advanced.md snippet='service' provider='Hugging Face' %}
* Hugging Face access token with permissions to make calls to the Inference API (you can sanity-check the token with the optional request shown after this list)
* [Text-generation model](https://huggingface.co/models?pipeline_tag=text-generation&sort=trending) from Hugging Face
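
The guide below passes the token to Kong, but you can first sanity-check it by calling the Hugging Face Inference API directly. This step is optional, and the model name is only a placeholder; substitute any text-generation model your token can access:

```bash
# Optional sanity check (not part of the Kong setup): call the Hugging Face
# Inference API directly with your token. <huggingface_model> is a placeholder,
# for example mistralai/Mistral-7B-Instruct-v0.3.
curl -s https://api-inference.huggingface.co/models/<huggingface_model> \
  -H "Authorization: Bearer <huggingface_token>" \
  -H "Content-Type: application/json" \
  --data '{"inputs": "What is 1+1?"}'
```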

## Provider configuration

### Set up route and plugin

Create the route:

```bash
curl -X POST http://localhost:8001/services/ai-proxy-advanced/routes \
--data "name=huggingface-chat" \
--data "paths[]=~/huggingface-chat$"
```
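
If you want to confirm the route exists before continuing, an optional Admin API lookup is:

```bash
# Optional check: fetch the route that was just created.
curl -s http://localhost:8001/routes/huggingface-chat
```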

Enable and configure the AI Proxy Advanced plugin for Hugging Face, replacing `<huggingface_token>` with your own access token and `<huggingface_model>` with the name of the model to use.

<!--vale off-->
{% plugin_example %}
plugin: kong-inc/ai-proxy-advanced
name: ai-proxy-advanced
config:
  targets:
  - route_type: "llm/v1/chat"
    auth:
      header_name: Authorization
      header_value: "Bearer <huggingface_token>"
    model:
      provider: huggingface
      name: <huggingface_model>
      options:
        max_tokens: 512
        temperature: 1.0
        top_p: 0.5
        top_k: 256
targets:
  - route
formats:
  - curl
  - konnect
  - yaml
  - kubernetes
  - terraform
{% endplugin_example %}
<!--vale on-->
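
Before testing, you can optionally confirm the plugin is attached to the route through the Admin API (shown here for a self-managed gateway on the default port):

```bash
# Optional check: list plugins attached to the route and confirm ai-proxy-advanced is present.
curl -s http://localhost:8001/routes/huggingface-chat/plugins
```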

### Test the configuration

Make an `llm/v1/chat` type request to test your new endpoint:

```bash
curl -X POST http://localhost:8000/huggingface-chat \
-H 'Content-Type: application/json' \
--data-raw '{ "messages": [ { "role": "system", "content": "You are a mathematician" }, { "role": "user", "content": "What is 1+1?"} ] }'
```
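
The plugin returns responses in an OpenAI-compatible chat format, so a `jq` filter like the following should pull out just the assistant's reply. This is optional and assumes `jq` is installed locally:

```bash
# Optional: extract only the assistant's reply from the OpenAI-style response.
curl -s -X POST http://localhost:8000/huggingface-chat \
  -H 'Content-Type: application/json' \
  --data-raw '{ "messages": [ { "role": "user", "content": "What is 1+1?" } ] }' \
  | jq -r '.choices[0].message.content'
```
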
@@ -1,22 +1,22 @@
---
nav_title: OpenAI
title: Set up AI Proxy with OpenAI
title: Set up AI Proxy Advanced with OpenAI
---

This guide walks you through setting up the AI Proxy plugin with [OpenAI](https://openai.com/).
This guide walks you through setting up the AI Proxy Advanced plugin with [OpenAI](https://openai.com/).

{% include_cached /md/plugins-hub/ai-providers-prereqs.md snippet='intro' %}
{% include_cached /md/plugins-hub/ai-providers-prereqs-advanced.md snippet='intro' %}

## Prerequisites

{% include_cached /md/plugins-hub/ai-providers-prereqs.md snippet='service' provider='OpenAI' %}
{% include_cached /md/plugins-hub/ai-providers-prereqs-advanced.md snippet='service' provider='OpenAI' %}

## Provider configuration

### Set up route and plugin

After creating an OpenAI account, and purchasing a subscription, you can then create an
AI Proxy route and plugin configuration.
AI Proxy Advanced route and plugin configuration.

Create a route:

10 changes: 10 additions & 0 deletions app/_hub/kong-inc/ai-proxy-advanced/overview/_index.md
@@ -21,6 +21,9 @@ The following table describes which providers and requests the AI Proxy Advanced
| Llama3 (OLLAMA and OpenAI formats) ||||
| Amazon Bedrock ||||
| Gemini ||||
{% if_version gte:3.9.x %}
| Hugging Face ||||
{% endif_version %}

## How it works

@@ -94,6 +97,10 @@ The plugin's [`config.route_type`](/hub/kong-inc/ai-proxy-advanced/configuration
| Amazon Bedrock | Use the LLM `completions` upstream path | `llm/v1/completions` | [Use the model name for the specific LLM provider](https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html) |
| Gemini | `llm/v1/chat` | `llm/v1/chat` | `gemini-1.5-flash` or `gemini-1.5-pro` |
| Gemini | `llm/v1/completions` | `llm/v1/completions` | `gemini-1.5-flash` or `gemini-1.5-pro` |
{% if_version gte:3.9.x %}
| Hugging Face | `/models/{model_provider}/{model_name}` | `llm/v1/chat` | [Use the model name for the specific LLM provider](https://huggingface.co/models?inference=warm&pipeline_tag=text-generation&sort=trending) |
| Hugging Face | `/models/{model_provider}/{model_name}` | `llm/v1/completions` | [Use the model name for the specific LLM provider](https://huggingface.co/models?inference=warm&pipeline_tag=text-generation&sort=trending) |
{% endif_version %}


The following upstream URL patterns are used:
@@ -108,6 +115,9 @@ The following upstream URL patterns are used:
| Mistral | As defined in `config.model.options.upstream_url` |
| Amazon Bedrock | `https://bedrock-runtime.{region}.amazonaws.com` |
| Gemini | `https://generativelanguage.googleapis.com` |
{% if_version gte:3.9.x %}
| Hugging Face | `https://api-inference.huggingface.co` |
{% endif_version %}
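
To make the two tables above concrete: the Hugging Face upstream URL is the base URL joined with the `/models/{model_provider}/{model_name}` path. A small illustration, where the provider and model names are placeholders rather than recommendations:

```bash
# Illustration: how the Hugging Face upstream URL is assembled from the configured model.
MODEL_PROVIDER="mistralai"                # placeholder
MODEL_NAME="Mistral-7B-Instruct-v0.3"     # placeholder
echo "https://api-inference.huggingface.co/models/${MODEL_PROVIDER}/${MODEL_NAME}"
```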

{:.important}
> While only the **Llama2** and **Mistral** models are classed as self-hosted, the target URL can be overridden for any of the supported providers.
@@ -0,0 +1,87 @@
---
nav_title: Hugging Face
title: Set up AI Proxy with Hugging Face
minimum_version: 3.9.x
---

This guide walks you through setting up the AI Proxy plugin with [Hugging Face](https://huggingface.co/).

{% include_cached /md/plugins-hub/ai-providers-prereqs.md snippet='intro' %}

## Prerequisites

{% include_cached /md/plugins-hub/ai-providers-prereqs.md snippet='service' provider='Hugging Face' %}
* Hugging Face access token with permissions to make calls to the Inference API
* [Text-generation model](https://huggingface.co/models?pipeline_tag=text-generation&sort=trending) from Hugging Face

## Provider configuration

### Set up route and plugin

{% navtabs %}
{% navtab Kong Admin API %}

Create the route:

```bash
curl -X POST http://localhost:8001/services/ai-proxy/routes \
--data "name=huggingface-chat" \
--data "paths[]=~/huggingface-chat$"
```

Enable and configure the AI Proxy plugin for Hugging Face, replacing `<huggingface_token>` with your access token and `<huggingface_model>` with the name of the model to use:

```bash
curl -X POST http://localhost:8001/routes/huggingface-chat/plugins \
--data "name=ai-proxy" \
--data "config.route_type=llm/v1/chat" \
--data "config.auth.header_name=Authorization" \
--data "config.auth.header_value= Bearer <huggingface_token>" \
--data "config.model.provider=huggingface" \
--data "config.model.name=<huggingface_model>" \
--data "config.model.options.max_tokens=512" \
--data "config.model.options.temperature=1.0" \
--data "config.model.options.top_p=256" \
--data "config.model.options.top_k=0.5"
```
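
If you later want to adjust a single option without recreating the plugin, the Admin API lets you patch it. `{plugin-id}` below is a placeholder for the `id` returned in the plugin creation response:

```bash
# Optional: update one option on the existing plugin instance.
# Replace {plugin-id} with the id from the plugin creation response.
curl -X PATCH http://localhost:8001/routes/huggingface-chat/plugins/{plugin-id} \
  --data "config.model.options.temperature=0.7"
```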

{% endnavtab %}
{% navtab YAML %}
```yaml
routes:
- name: huggingface-chat
  service:
    name: ai-proxy
  paths:
  - "~/huggingface-chat$"
  methods:
  - POST
  plugins:
  - name: ai-proxy
    config:
      route_type: "llm/v1/chat"
      auth:
        header_name: "Authorization"
        header_value: "Bearer <huggingface_token>" # add your Hugging Face access token
      model:
        provider: "huggingface"
        name: "<huggingface_model>" # add the Hugging Face model to use
        options:
          max_tokens: 512
          temperature: 1.0
          top_p: 0.5
          top_k: 256
```
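
If you manage Kong declaratively, you can apply this configuration with decK, assuming the snippet is part of a complete declarative file (including `_format_version` and the `ai-proxy` service). The exact command depends on your decK version:

```bash
# Hypothetical usage: apply the declarative configuration above with decK.
# Assumes kong.yaml is a complete config containing _format_version and the ai-proxy service.
deck gateway sync kong.yaml   # decK 1.28 and later
# deck sync -s kong.yaml      # older decK releases
```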
{% endnavtab %}
{% endnavtabs %}

### Test the configuration

Make an `llm/v1/chat` type request to test your new endpoint:

```bash
curl -X POST http://localhost:8000/huggingface-chat \
-H 'Content-Type: application/json' \
--data-raw '{ "messages": [ { "role": "system", "content": "You are a mathematician" }, { "role": "user", "content": "What is 1+1?"} ] }'
```
10 changes: 10 additions & 0 deletions app/_hub/kong-inc/ai-proxy/overview/_index.md
@@ -33,6 +33,9 @@ The following table describes which providers and requests the AI Proxy plugin s
| Amazon Bedrock ||||
| Gemini ||||
{% endif_version %}
{% if_version gte:3.9.x %}
| Hugging Face ||||
{% endif_version %}

## How it works

@@ -87,6 +90,10 @@ The plugin's [`config.route_type`](/hub/kong-inc/ai-proxy/configuration/#config-
| Gemini | `llm/v1/chat` | `llm/v1/chat` | `gemini-1.5-flash` or `gemini-1.5-pro` |
| Gemini | `llm/v1/completions` | `llm/v1/completions` | `gemini-1.5-flash` or `gemini-1.5-pro` |
{% endif_version %}
{% if_version gte:3.9.x %}
| Hugging Face | `/models/{model_provider}/{model_name}` | `llm/v1/chat` | [Use the model name for the specific LLM provider](https://huggingface.co/models?inference=warm&pipeline_tag=text-generation&sort=trending) |
| Hugging Face | `/models/{model_provider}/{model_name}` | `llm/v1/completions` | [Use the model name for the specific LLM provider](https://huggingface.co/models?inference=warm&pipeline_tag=text-generation&sort=trending) |
{% endif_version %}

The following upstream URL patterns are used:

@@ -103,6 +110,9 @@ The following upstream URL patterns are used:
| Amazon Bedrock | `https://bedrock-runtime.{region}.amazonaws.com` |
| Gemini | `https://generativelanguage.googleapis.com` |
{% endif_version %}
{% if_version gte:3.9.x %}
| Hugging Face | `https://api-inference.huggingface.co` |
{% endif_version %}

{:.important}
> While only the **Llama2** and **Mistral** models are classed as self-hosted, the target URL can be overridden for any of the supported providers.
13 changes: 6 additions & 7 deletions app/_includes/md/plugins-hub/ai-providers-prereqs-advanced.md
@@ -18,11 +18,10 @@ it can point somewhere empty (for example, `http://localhost:32000`), because th
{% if include.provider %}
* {{include.provider}} account and subscription
* {% endif %}You need a service to contain the route for the LLM provider. Create a service **first**:

```bash
curl -X POST http://localhost:8001/services \
--data "name=ai-proxy-advanced" \
--data "url=http://localhost:32000"
```
Remember that the upstream URL can point anywhere empty, as it won't be used by the plugin.
```bash
curl -X POST http://localhost:8001/services \
--data "name=ai-proxy-advanced" \
--data "url=http://localhost:32000"
```
Remember that the upstream URL can point anywhere empty, as it won't be used by the plugin.
{% endif %}
11 changes: 5 additions & 6 deletions app/_includes/md/plugins-hub/ai-providers-prereqs.md
@@ -18,11 +18,10 @@ it can point somewhere empty (for example, `http://localhost:32000`), because th
{% if include.provider %}
* {{include.provider}} account and subscription
* {% endif %}You need a service to contain the route for the LLM provider. Create a service **first**:

```bash
curl -X POST http://localhost:8001/services \
--data "name=ai-proxy" \
--data "url=http://localhost:32000"
```
```bash
curl -X POST http://localhost:8001/services \
--data "name=ai-proxy" \
--data "url=http://localhost:32000"
```
Remember that the upstream URL can point anywhere empty, as it won't be used by the plugin.
{% endif %}
