diff --git a/examples/azure_fallback_loadbalance.ipynb b/examples/azure_fallback_loadbalance.ipynb
new file mode 100644
index 0000000..72f0ff9
--- /dev/null
+++ b/examples/azure_fallback_loadbalance.ipynb
@@ -0,0 +1,306 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Portkey | Building Resilient LLM Apps\n",
+    "\n",
+    "**Portkey** is a full-stack LLMOps platform for taking your Gen AI apps to production reliably and securely."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Key Features of Portkey\n",
+    "\n",
+    "1. **AI Gateway**:\n",
+    "   - **Automated Fallbacks & Retries**: Keep your application functional even if a primary service fails.\n",
+    "   - **Load Balancing**: Efficiently distribute incoming requests among multiple models.\n",
+    "   - **Semantic Caching**: Reduce costs and latency by intelligently caching results.\n",
+    "\n",
+    "2. **Observability**:\n",
+    "   - **Logging**: Keep track of all requests for monitoring and debugging.\n",
+    "   - **Request Tracing**: Understand the journey of each request for optimization.\n",
+    "   - **Custom Tags**: Segment and categorize requests for better insights.\n",
+    "\n",
+    "To harness these features, let's start with the setup:\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "5Z933R9wuZ4z"
+   },
+   "outputs": [],
+   "source": [
+    "# Install the Portkey AI Python SDK in editable mode (run this from the repository root).\n",
+    "# Alternatively, install the published package with: pip install portkey-ai\n",
+    "!pip install -e .\n",
+    "!portkey --version"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Import the necessary libraries and modules\n",
+    "import portkey as pk\n",
+    "from portkey import Config, LLMOptions\n",
+    "from getpass import getpass"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### **Step 1: Get your Portkey API key**\n",
+    "\n",
+    "Log in to [Portkey](https://app.portkey.ai/), click the profile icon on the top right, and select \"Copy API Key\". Since the examples below run on Azure OpenAI, also keep your Azure credentials (API key, resource name, API version, deployment ID) handy."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Enter the API key at the prompt.\n",
+    "API_KEY = getpass(\"Enter your PORTKEY_API_KEY \")\n",
+    "\n",
+    "# Set the API key\n",
+    "pk.api_key = API_KEY\n",
+    "\n",
+    "# NOTE: For a self-hosted deployment, uncomment this line and set your custom URL.\n",
+    "# pk.base_url = \"\"\n"
+   ]
+  },
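+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Optionally, collect the Azure credentials up front. The cell below is an illustrative sketch: the variable names are not used by the SDK, and you can just as well paste the values directly into the `LLMOptions` calls later in this notebook."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Optional: gather Azure OpenAI credentials (illustrative variable names).\n",
+    "AZURE_API_KEY = getpass(\"Enter your Azure OpenAI API key \")\n",
+    "AZURE_RESOURCE_NAME = input(\"Enter your Azure resource name \")\n",
+    "AZURE_API_VERSION = input(\"Enter your Azure API version (e.g. 2023-05-15) \")\n",
+    "AZURE_DEPLOYMENT_ID = input(\"Enter your Azure deployment id \")"
+   ]
+  },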
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### **Step 2: Configure Portkey Features**\n",
+    "\n",
+    "To harness the full potential of Portkey, configure the features illustrated above. Here's a guide to all Portkey features and their expected values:\n",
+    "\n",
+    "| Feature | Config Key | Value (Type) | Required |\n",
+    "|---------------------|-------------------------|--------------------------------------------------|-------------|\n",
+    "| API Key | `api_key` | `string` | ✅ Required (can be set externally) |\n",
+    "| Mode | `mode` | `fallback`, `ab_test`, `single` | ✅ Required |\n",
+    "| Cache Type | `cache_status` | `simple`, `semantic` | ❔ Optional |\n",
+    "| Force Cache Refresh | `cache_force_refresh` | `boolean` | ❔ Optional |\n",
+    "| Cache Age | `cache_age` | `integer` (in seconds) | ❔ Optional |\n",
+    "| Trace ID | `trace_id` | `string` | ❔ Optional |\n",
+    "| Retries | `retry` | `integer` [0,5] | ❔ Optional |\n",
+    "| Metadata | `metadata` | `json object` [More info](https://docs.portkey.ai/key-features/custom-metadata) | ❔ Optional |\n",
+    "| Base URL | `base_url` | `url` | ❔ Optional |\n",
+    "\n",
+    "A sketch of the optional features follows below; for set-ups of the different modes, refer to the IPython Notebook examples in the `examples/` directory.\n",
+    "\n",
+    "For more information and detailed documentation, please visit the [Portkey Documentation](https://docs.portkey.ai/)."
+   ]
+  },
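+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "As a quick illustration of the optional features, here's a minimal sketch of a `single`-mode config. It assumes the config keys from the table above are accepted as `LLMOptions` keyword arguments; all values are placeholders."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# A sketch of the optional features on a single Azure OpenAI target.\n",
+    "# Keyword names are assumed to match the config keys listed in the table.\n",
+    "pk.config = Config(\n",
+    "    mode=\"single\",\n",
+    "    llms=[\n",
+    "        LLMOptions(\n",
+    "            api_key=\"\", provider=\"azure-openai\",\n",
+    "            resource_name=\"\", api_version=\"\", deployment_id=\"\",\n",
+    "            cache_status=\"semantic\",   # cache semantically similar prompts\n",
+    "            cache_age=3600,            # cached entries expire after an hour\n",
+    "            trace_id=\"example-trace\",  # group related requests for tracing\n",
+    "            retry=3,                   # retry failed requests up to 3 times\n",
+    "            metadata={\"environment\": \"dev\"}  # custom tags for segmentation\n",
+    "        )\n",
+    "    ]\n",
+    ")"
+   ]
+  },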
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Example 1: Configuring Portkey for Fallback Mode with Azure models\n",
+    "\n",
+    "In this example, we'll configure Portkey for Fallback Mode using the SDK. Fallback Mode lets you define a backup strategy for when your primary service is unavailable.\n",
+    "\n",
+    "`Note`: The order of the `LLMOptions` entries matters for fallbacks. Requests go to the first LLM in the list, and later entries are tried, in order, only when the earlier ones fail, so define them in order of preference."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "pk.config = Config(\n",
+    "    mode=\"fallback\",\n",
+    "    llms=[\n",
+    "        LLMOptions(api_key=\"\", provider=\"azure-openai\", resource_name=\"\", api_version=\"\", deployment_id=\"\"),\n",
+    "        LLMOptions(api_key=\"\", provider=\"azure-openai\", resource_name=\"\", api_version=\"\", deployment_id=\"\")\n",
+    "    ]\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Example 1: Basic chat completion\n",
+    "\n",
+    "response = pk.ChatCompletions.create(\n",
+    "    messages=[{\"role\": \"user\", \"content\": \"Who are you?\"}]\n",
+    ")\n",
+    "\n",
+    "print(response.choices[0].message)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Example 2: Streaming results\n",
+    "\n",
+    "response2 = pk.ChatCompletions.create(\n",
+    "    messages=[{\"role\": \"user\", \"content\": \"Translate the following English text to French: 'Hello, how are you?'\"}],\n",
+    "    stream=True  # Stream back partial progress\n",
+    ")\n",
+    "\n",
+    "for event in response2:\n",
+    "    if len(event.choices) == 0:\n",
+    "        continue\n",
+    "    if event.choices[0].delta is None:\n",
+    "        break\n",
+    "    print(event.choices[0].delta.get(\"content\", \"\"), end=\"\", flush=True)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Example 2: Configuring Portkey for Load Balancing (A/B Test) Mode with Azure models\n",
+    "\n",
+    "To use Portkey's Load Balancing Mode, follow the steps below. Load Balancing Mode distributes incoming requests across multiple services for high availability and scalability; requests are split between the targets in proportion to their `weight` values.\n",
+    "\n",
+    "`Note`: Load balancing is also referred to as an A/B test, which is why the mode is named `ab_test`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "pk.config = Config(\n",
+    "    mode=\"ab_test\",\n",
+    "    llms=[\n",
+    "        LLMOptions(api_key=\"\", provider=\"azure-openai\", weight=0.4, resource_name=\"\", api_version=\"\", deployment_id=\"\"),\n",
+    "        LLMOptions(api_key=\"\", provider=\"azure-openai\", weight=0.6, resource_name=\"\", api_version=\"\", deployment_id=\"\")\n",
+    "    ]\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Example 1: Basic chat completion\n",
+    "\n",
+    "response = pk.ChatCompletions.create(\n",
+    "    messages=[{\"role\": \"user\", \"content\": \"Summarize the key points from the article titled 'The Impact of Climate Change on Global Biodiversity.'\"}]\n",
+    ")\n",
+    "\n",
+    "print(response.choices[0].message)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Example 2: Streaming results\n",
+    "\n",
+    "response2 = pk.ChatCompletions.create(\n",
+    "    messages=[{\"role\": \"user\", \"content\": \"Generate a creative short story about a detective solving a mysterious case.\"}],\n",
+    "    stream=True  # Stream back partial progress\n",
+    ")\n",
+    "\n",
+    "for event in response2:\n",
+    "    if len(event.choices) == 0:\n",
+    "        continue\n",
+    "    if event.choices[0].delta is None:\n",
+    "        break\n",
+    "    print(event.choices[0].delta.get(\"content\", \"\"), end=\"\", flush=True)"
+   ]
+  }
+ ],
+ "metadata": {
+  "colab": {
+   "provenance": []
+  },
+  "kernelspec": {
+   "display_name": "Python 3",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.5"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}