Cloudflare worker.js that proxies Claude/Azure OpenAI to the OpenAI Chat Completions API.


lroolle/one-llm-api


ONE-LLM-API

This is a Cloudflare worker that acts as an adapter, proxying requests in the OpenAI Chat Completions format to other AI services such as Claude, Azure OpenAI, and Google PaLM.

This project draws inspiration from, and is based on, haibbo/cf-openai-azure-proxy. If you only need a proxy for a single service (such as Claude or Azure), consider using the original project. This project is more complex because it is designed to accommodate multiple services.

Quick Start

curl TODO

Overview

The worker lets you make requests in the OpenAI API format and seamlessly redirects them under the hood to other services. This provides a unified API interface and abstraction layer across multiple AI providers.

Key Features

  • Single endpoint for accessing multiple AI services
  • Chat with multiple models in a single request
  • Unified request/response format using the OpenAI API
  • Streaming response handling
  • Single API key for authentication
  • Support for OpenAI, Claude, Azure OpenAI, and Google PaLM
  • Support for multiple resource configurations for Azure OpenAI

Features that may be added in the future

  1. Support for multiple `ONE_API_KEY` values
  2. Per-key request logging, with pricing and token counts
  3. Throttling

Usage

To use the adapter, simply make requests to the worker endpoint with the OpenAI JSON request payload.

Behind the scenes the worker will:

  • Route requests to the appropriate backend based on the `model` specified
  • Transform request payload to the destination API format
  • Proxy the request and response
  • Convert responses back to OpenAI format
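
As an illustration of the routing and transformation steps above, here is a minimal Python sketch (the worker itself is JavaScript, and the names below are assumptions, not the worker's actual code). It splits a comma-delimited `model` list, picks a backend per model, and rewrites the payload into Claude's legacy Text Completions prompt format where needed:

```python
# Hypothetical sketch of model-based routing and payload transformation.
CLAUDE_MODELS = {"claude-2", "claude-instant-1"}


def route(model: str) -> str:
    """Pick a backend for a single model ID."""
    if model in CLAUDE_MODELS:
        return "claude"
    if model.startswith("gpt-"):
        return "openai"
    raise ValueError(f"unknown model: {model}")


def to_claude_prompt(messages: list) -> str:
    """Flatten OpenAI-style chat messages into Claude's legacy prompt format."""
    parts = []
    for m in messages:
        role = "Human" if m["role"] in ("user", "system") else "Assistant"
        parts.append(f"\n\n{role}: {m['content']}")
    return "".join(parts) + "\n\nAssistant:"


def transform(payload: dict) -> list:
    """Split a comma-delimited model list and build one backend request each."""
    requests = []
    for model in payload["model"].split(","):
        model = model.strip()
        backend = route(model)
        if backend == "claude":
            body = {
                "model": model,
                "prompt": to_claude_prompt(payload["messages"]),
                "max_tokens_to_sample": 1024,
            }
        else:
            body = {**payload, "model": model}
        requests.append((backend, body))
    return requests
```

The real worker also converts each backend's response (including streamed chunks) back into the OpenAI format before returning it.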

Request Example

For example, to use gpt-3.5-turbo:

{
	"model": "gpt-3.5-turbo",
	"stream": true,
	"messages": [
		{
			"role": "user",
			"content": "Hello there!"
		}
	]
}

To use claude-2:

{
	"model": "claude-2",
	"stream": true,
	"messages": [...]
}

You can specify multiple models (delimited by `,`) to query them in parallel:

{
	"model": "gpt-3.5-turbo,claude-2",
	"stream": true,
	"messages": [...]
}

The response will contain the concatenated output from both models streamed in the OpenAI API format.
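
For illustration, the merged stream might look like the following (a hypothetical excerpt; IDs and content are invented). Each chunk is an OpenAI-format server-sent event, and its `model` field identifies which backend produced it:

```
data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","model":"gpt-3.5-turbo","choices":[{"index":0,"delta":{"content":"Hello"}}]}

data: {"id":"chatcmpl-def","object":"chat.completion.chunk","model":"claude-2","choices":[{"index":0,"delta":{"content":"Hi"}}]}

data: [DONE]
```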

Other OpenAI parameters like `temperature`, `stream`, `stop` etc. can also be specified normally.

Python Example

import openai

openai.api_key = "<your specified API_KEY>"
openai.api_base = "<your worker endpoint>/v1"

# For example, the local wrangler development endpoint
# openai.api_key = 'sk-fakekey'
# openai.api_base = "http://127.0.0.1:8787/v1"

chat_completion = openai.ChatCompletion.create(
    model="gpt-4,claude-2",
    messages=[
        {
            "role": "user",
            "content": "A brief introduction about yourself and say hello!",
        }
    ],
    stream=True,
)


for chunk in chat_completion:
    if chunk["choices"]:
        print(chunk["model"], chunk["choices"][0]["delta"].get("content", ""))

The API Services supported [2/4]

[X] OpenAI

[X] Azure OpenAI

[ ] Claude

[ ] Google PaLM

The models supported

Here are the models currently supported by the adapter service:

To use a particular model, specify its ID in the `model` field of the request body.

OpenAI Models

All chat models available to your OPENAI_API_KEY.

Azure OpenAI Models

You will need to set the environment variable AZURE_OPENAI_API_KEY to the API key corresponding to your deployment name.

You can also set up multiple deployments with different API keys to access different models.
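
The exact configuration format for multiple deployments is not documented yet. As a purely hypothetical sketch, such a setup could map each model to its Azure resource, deployment name, and key (all field names here are assumptions; check the worker source for the real format):

```json
{
  "gpt-4": { "resource": "my-azure-resource", "deployment": "gpt4-deployment", "apiKey": "<key-1>" },
  "gpt-3.5-turbo": { "resource": "my-other-resource", "deployment": "gpt35-deployment", "apiKey": "<key-2>" }
}
```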

// TODO:

Claude Models

  • Status: TODO (as of 2023-09-04)
  • claude-instant-1 (claude-instant-1.2)
  • claude-2 (claude-2.0)

Google PaLM Models

  • Status: TODO (as of 2023-09-04)
  • text-bison-001
  • chat-bison-001

Deployment

Deploy to Cloudflare Workers

To deploy, you will need:

  • Cloudflare account
  • API keys for each service

Install wrangler

npm i wrangler -g

KV create

wrangler kv:namespace create ONELLM_KV

# if you need to test in the local wrangler dev
wrangler kv:namespace create ONELLM_KV --preview

Environment Variables

Configure the worker environment variables with your secret keys.

Skip a service's key if you do not have one or do not want to deploy that service.

wrangler secret put ONE_API_KEY
wrangler secret put OPENAI_API_KEY
wrangler secret put AZURE_OPENAI_API_KEYS
wrangler secret put ANTHROPIC_API_KEY
wrangler secret put PALM_API_KEY

Alternatively, you can add the keys after deployment through the Cloudflare dashboard:

Worker -> Settings -> Variables -> Environment Variables

Run publish/deploy

wrangler deploy

Development

Create a .dev.vars file with your API keys as environment variables, then run:

wrangler dev
curl -vvv http://127.0.0.1:8787/v1/chat/completions -H "Content-Type: application/json" -H "Authorization: Bearer sk-fakekey" -d '{
    "model": "gpt-3.5-turbo,claude-2", "stream": true,
    "messages": [{"role": "user", "content": "Say: Hello I am your helpful one Assistant."}]
  }'

Contributions

Contributions and improvements are welcome! Please open GitHub issues or PRs.

