From 99a2e320824587a36e5604967c57c1551285c029 Mon Sep 17 00:00:00 2001 From: Scott Date: Wed, 20 Sep 2023 12:19:28 +0100 Subject: [PATCH 1/5] move custom columns, remove wandbaddons --- .../W&B_Prompts_with_Custom_Columns.ipynb | 618 ++++++++++++++++++ 1 file changed, 618 insertions(+) create mode 100644 colabs/prompts/W&B_Prompts_with_Custom_Columns.ipynb diff --git a/colabs/prompts/W&B_Prompts_with_Custom_Columns.ipynb b/colabs/prompts/W&B_Prompts_with_Custom_Columns.ipynb new file mode 100644 index 00000000..ebc811cf --- /dev/null +++ b/colabs/prompts/W&B_Prompts_with_Custom_Columns.ipynb @@ -0,0 +1,618 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "e-ZYaV5KGVmA" + }, + "source": [ + "\"Open\n", + "" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "gJSVEAGWGVmA" + }, + "source": [ + "\"Weights\n", + "" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "9f7yMKLwGVmA" + }, + "source": [ + "**[Weights & Biases Prompts](https://docs.wandb.ai/guides/prompts?utm_source=code&utm_medium=colab&utm_campaign=prompts)** is a suite of LLMOps tools built for the development of LLM-powered applications.\n", + "\n", + "Use W&B Prompts to visualize and inspect the execution flow of your LLMs, analyze the inputs and outputs of your LLMs, view the intermediate results and securely store and manage your prompts and LLM chain configurations.\n", + "\n", + "#### [🪄 View Prompts In Action](https://wandb.ai/timssweeney/prompts-demo/)\n", + "\n", + "**In this notebook we will demonstrate W&B Prompts:**\n", + "\n", + "- Using our 1-line LangChain integration\n", + "- Using our Trace class when building your own LLM Pipelines\n", + "\n", + "See here for the full [W&B Prompts documentation](https://docs.wandb.ai/guides/prompts)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "A4wI3b_8GVmB" + }, + "source": [ + "## Installation" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": { + "id": 
"nDoIqQ8_GVmB" + }, + "outputs": [], + "source": [ + "!pip install \"wandb>=0.15.4\" -qqq\n", + "!pip install \"langchain>=0.0.218\" openai -qqq" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": { + "id": "PcGiSWBSGVmB" + }, + "outputs": [], + "source": [ + "import langchain\n", + "assert langchain.__version__ >= \"0.0.218\", \"Please ensure you are using LangChain v0.0.218 or higher\"" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "pbmQIsjJGVmB" + }, + "source": [ + "## Setup\n", + "\n", + "This demo requires that you have an [OpenAI key](https://platform.openai.com)" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "ZH4g2B0lGVmB", + "outputId": "22295db6-5369-474d-a8ea-fb45c4c92085" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Paste your OpenAI key from: https://platform.openai.com/account/api-keys\n", + "··········\n", + "OpenAI API key configured\n" + ] + } + ], + "source": [ + "import os\n", + "from getpass import getpass\n", + "\n", + "if os.getenv(\"OPENAI_API_KEY\") is None:\n", + " os.environ[\"OPENAI_API_KEY\"] = getpass(\"Paste your OpenAI key from: https://platform.openai.com/account/api-keys\\n\")\n", + "assert os.getenv(\"OPENAI_API_KEY\", \"\").startswith(\"sk-\"), \"This doesn't look like a valid OpenAI API key\"\n", + "print(\"OpenAI API key configured\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "79KOB2EhGVmB" + }, + "source": [ + "# W&B Prompts\n", + "\n", + "W&B Prompts consists of three main components:\n", + "\n", + "**Trace table**: Overview of the inputs and outputs of a chain.\n", + "\n", + "**Trace timeline**: Displays the execution flow of the chain and is color-coded according to component types.\n", + "\n", + "**Model architecture**: View details about the structure of the chain and the parameters used to initialize each component of the 
chain.\n", + "\n", + "After running this section, you will see a new panel automatically created in your workspace, showing each execution, the trace, and the model architecture" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "5kxmdm3zGVmC" + }, + "source": [ + "\"Weights" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "9u97K5vVGVmC" + }, + "source": [ + "## Maths with LangChain" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "oneRFmv6GVmC" + }, + "source": [ + "Set the `LANGCHAIN_WANDB_TRACING` environment variable as well as any other relevant [W&B environment variables](https://docs.wandb.ai/guides/track/environment-variables). This can include a W&B project name, team name, and more. See [wandb.init](https://docs.wandb.ai/ref/python/init) for a full list of arguments." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": { + "id": "ACl-rMtAGVmC" + }, + "outputs": [], + "source": [ + "os.environ[\"LANGCHAIN_WANDB_TRACING\"] = \"true\"\n", + "os.environ[\"WANDB_PROJECT\"] = \"langchain-testing\"" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": { + "id": "csp3MXG4GVmC" + }, + "outputs": [], + "source": [ + "from langchain.chat_models import ChatOpenAI\n", + "from langchain.agents import load_tools, initialize_agent, AgentType" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "2hWU2GcAGVmC" + }, + "source": [ + "Create a standard math Agent using LangChain" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": { + "id": "l_JkVMlRGVmC" + }, + "outputs": [], + "source": [ + "llm = ChatOpenAI(temperature=0)\n", + "tools = load_tools([\"llm-math\"], llm=llm)\n", + "math_agent = initialize_agent(tools,\n", + " llm,\n", + " agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "9FFviwCPGVmC" + }, + "source": [ + "Use LangChain as normal by calling your Agent.\n", + "\n", + " You 
will see a Weights & Biases run start and you will be asked for your [Weights & Biases API key](https://wandb.ai/authorize). Once you enter your API key, the inputs and outputs of your Agent calls will start to be streamed to the Weights & Biases App." + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 178 + }, + "id": "y-RHjVN4GVmC", + "outputId": "5ccd5f32-6137-46c3-9abd-d458dbdbacca" + }, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "\u001b[34m\u001b[1mwandb\u001b[0m: Streaming LangChain activity to W&B at https://wandb.ai/carey/langchain-testing/runs/lcznj5lg\n", + "\u001b[34m\u001b[1mwandb\u001b[0m: `WandbTracer` is currently in beta.\n", + "\u001b[34m\u001b[1mwandb\u001b[0m: Please report any issues to https://github.com/wandb/wandb/issues with the tag `langchain`.\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "LLMMathChain._evaluate(\"\n", + "import math\n", + "math.sqrt(5.4)\n", + "\") raised error: invalid syntax (, line 1). 
Please try again with a valid numerical expression\n", + "0.005720801417544866\n", + "0.15096209512635608\n" + ] + } + ], + "source": [ + "# some sample maths questions\n", + "questions = [\n", + " \"Find the square root of 5.4.\",\n", + " \"What is 3 divided by 7.34 raised to the power of pi?\",\n", + " \"What is the sin of 0.47 radians, divided by the cube root of 27?\"\n", + "]\n", + "\n", + "for question in questions:\n", + " try:\n", + " # call your Agent as normal\n", + " answer = math_agent.run(question)\n", + " print(answer)\n", + " except Exception as e:\n", + " # any errors will also be logged to Weights & Biases\n", + " print(e)\n", + " pass" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "SNYFSaUrGVmC" + }, + "source": [ + "Once each Agent execution completes, all calls in your LangChain object will be logged to Weights & Biases" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "m0bL1xpkGVmC" + }, + "source": [ + "### LangChain Context Manager\n", + "Depending on your use case, you might instead prefer to use a context manager to manage your logging to W&B.\n", + "\n", + "**✨ New: Custom columns** can be logged directly to W&B to display in the same Trace Table with this snippet:\n", + "```python\n", + "import wandb\n", + "wandb.log(custom_metrics_dict, commit=False)\n", + "```\n", + "Use `commit=False` to make sure that metadata is logged to the same row of the Trace Table as the LangChain output."
+ ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 35 + }, + "id": "7i9Pj1NKGVmC", + "outputId": "b44f3ae7-fd49-437f-af7b-fb8f82056bd0" + }, + "outputs": [ + { + "data": { + "application/vnd.google.colaboratory.intrinsic+json": { + "type": "string" + }, + "text/plain": [ + "'1.0891804557407723'" + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "from langchain.callbacks import wandb_tracing_enabled\n", + "import wandb # To enable custom column logging with wandb.log()\n", + "\n", + "# unset the environment variable and use a context manager instead\n", + "if \"LANGCHAIN_WANDB_TRACING\" in os.environ:\n", + " del os.environ[\"LANGCHAIN_WANDB_TRACING\"]\n", + "\n", + "# enable tracing using a context manager\n", + "with wandb_tracing_enabled():\n", + " for i in range(10):\n", + " # Log any custom columns you'd like to add to the Trace Table\n", + " wandb.log({\"custom_column\": i}, commit=False)\n", + " try:\n", + " math_agent.run(f\"What is {i} raised to .123243 power?\") # this should be traced\n", + " except Exception:\n", + " pass\n", + "\n", + "math_agent.run(\"What is 2 raised to .123243 power?\") # this should not be traced" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "JDLzoorhGVmC" + }, + "source": [ + "# Non-LangChain Implementation\n", + "\n", + "\n", + "A W&B Trace is created by logging 1 or more \"spans\". A root span is expected, which can accept nested child spans, which can in turn accept their own child spans. 
A Span represents a unit of work. Spans can have type `AGENT`, `TOOL`, `LLM` or `CHAIN`.\n", + "\n", + "When logging with Trace, a single W&B run can have multiple calls to an LLM, Tool, Chain or Agent logged to it. There is no need to start a new W&B run after each generation from your model or pipeline; instead, each call is appended to the Trace Table.\n", + "\n", + "In this quickstart, we will show how to log a single call to an OpenAI model to W&B Trace as a single span. Then we will show how to log a more complex series of nested spans.\n", + "\n", + "## Logging with W&B Trace" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "7z98yfoqGVmD" + }, + "source": [ + "Call wandb.init to start a W&B run. Here you can pass a W&B project name as well as an entity name (if logging to a W&B Team), as well as a config and more. See wandb.init for the full list of arguments.\n", + "\n", + "You will see a Weights & Biases run start and be asked for your [Weights & Biases API key](https://wandb.ai/authorize). Once you enter your API key, the inputs and outputs of your Agent calls will start to be streamed to the Weights & Biases App.\n", + "\n", + "**Note:** A W&B run supports logging as many traces as you need to a single run, i.e. you can make multiple calls of `run.log` without the need to create a new run each time" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "ZcvgzZ55GVmD" + }, + "outputs": [], + "source": [ + "import wandb\n", + "\n", + "# start a wandb run to log to\n", + "wandb.init(project=\"trace-example\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "4_3Wrg2YGVmD" + }, + "source": [ + "You can also set the entity argument in wandb.init if logging to a W&B Team.\n", + "\n", + "### Logging a single Span\n", + "Now we will query OpenAI several times and log the results to a W&B Trace. 
We will log the inputs and outputs, start and end times, whether the OpenAI call was successful, the token usage, and additional metadata.\n", + "\n", + "You can see the full description of the arguments to the Trace class [here](https://soumik12345.github.io/wandb-addons/prompts/tracer/)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "q2pkMhpMGVmD" + }, + "outputs": [], + "source": [ + "import openai\n", + "import datetime\n", + "from wandb.sdk.data_types.trace_tree import Trace\n", + "\n", + "openai.api_key = os.environ[\"OPENAI_API_KEY\"]\n", + "\n", + "# define your config\n", + "model_name = \"gpt-3.5-turbo\"\n", + "temperature = 0.7\n", + "system_message = \"You are a helpful assistant that always replies in 3 concise bullet points using markdown.\"\n", + "\n", + "queries_ls = [\n", + " \"What is the capital of France?\",\n", + " \"How do I boil an egg?\" * 10000, # deliberately trigger an openai error\n", + " \"What to do if the aliens arrive?\"\n", + "]\n", + "\n", + "for query in queries_ls:\n", + " messages=[\n", + " {\"role\": \"system\", \"content\": system_message},\n", + " {\"role\": \"user\", \"content\": query}\n", + " ]\n", + "\n", + " start_time_ms = round(datetime.datetime.now().timestamp() * 1000) # logged in milliseconds\n", + " try:\n", + " response = openai.ChatCompletion.create(model=model_name,\n", + " messages=messages,\n", + " temperature=temperature\n", + " )\n", + "\n", + " end_time_ms = round(datetime.datetime.now().timestamp() * 1000) # logged in milliseconds\n", + " status=\"success\"\n", + " status_message=None\n", + " response_text = response[\"choices\"][0][\"message\"][\"content\"]\n", + " token_usage = response[\"usage\"].to_dict()\n", + "\n", + "\n", + " except Exception as e:\n", + " end_time_ms = round(datetime.datetime.now().timestamp() * 1000) # logged in milliseconds\n", + " status=\"error\"\n", + " status_message=str(e)\n", + " response_text = \"\"\n", + " token_usage = {}\n", + "\n", + " # create a span in 
wandb\n", + " root_span = Trace(\n", + " name=\"root_span\",\n", + " kind=\"llm\", # kind can be \"llm\", \"chain\", \"agent\" or \"tool\"\n", + " status_code=status,\n", + " status_message=status_message,\n", + " metadata={\"temperature\": temperature,\n", + " \"token_usage\": token_usage,\n", + " \"model_name\": model_name},\n", + " start_time_ms=start_time_ms,\n", + " end_time_ms=end_time_ms,\n", + " inputs={\"system_prompt\": system_message, \"query\": query},\n", + " outputs={\"response\": response_text},\n", + " )\n", + "\n", + " # log the span to wandb\n", + " root_span.log(name=\"openai_trace\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "XFcwFgaDGVmD" + }, + "source": [ + "### Logging a LLM pipeline using nested Spans\n", + "\n", + "In this example we will simulate an Agent being called, which then calls a LLM Chain, which calls an OpenAI LLM and then the Agent \"calls\" a Calculator tool.\n", + "\n", + "The inputs, outputs and metadata for each step in the execution of our \"Agent\" is logged in its own span. 
Spans can have child" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "ACMaGuYUGVmD" + }, + "outputs": [], + "source": [ + "import time\n", + "\n", + "openai.api_key = os.environ[\"OPENAI_API_KEY\"]\n", + "\n", + "# The query our agent has to answer\n", + "query = \"How many days until the next US election?\"\n", + "\n", + "# part 1 - an Agent is started...\n", + "start_time_ms = round(datetime.datetime.now().timestamp() * 1000)\n", + "\n", + "root_span = Trace(\n", + " name=\"MyAgent\",\n", + " kind=\"agent\",\n", + " start_time_ms=start_time_ms,\n", + " metadata={\"user\": \"optimus_12\"})\n", + "\n", + "\n", + "# part 2 - The Agent calls into a LLMChain..\n", + "chain_span = Trace(\n", + " name=\"LLMChain\",\n", + " kind=\"chain\",\n", + " start_time_ms=start_time_ms)\n", + "\n", + "# add the Chain span as a child of the root\n", + "root_span.add_child(chain_span)\n", + "\n", + "\n", + "# part 3 - the LLMChain calls an OpenAI LLM...\n", + "messages=[\n", + " {\"role\": \"system\", \"content\": system_message},\n", + " {\"role\": \"user\", \"content\": query}\n", + "]\n", + "\n", + "response = openai.ChatCompletion.create(model=model_name,\n", + " messages=messages,\n", + " temperature=temperature)\n", + "\n", + "llm_end_time_ms = round(datetime.datetime.now().timestamp() * 1000)\n", + "response_text = response[\"choices\"][0][\"message\"][\"content\"]\n", + "token_usage = response[\"usage\"].to_dict()\n", + "\n", + "llm_span = Trace(\n", + " name=\"OpenAI\",\n", + " kind=\"llm\",\n", + " status_code=\"success\",\n", + " metadata={\"temperature\":temperature,\n", + " \"token_usage\": token_usage,\n", + " \"model_name\":model_name},\n", + " start_time_ms=start_time_ms,\n", + " end_time_ms=llm_end_time_ms,\n", + " inputs={\"system_prompt\":system_message, \"query\":query},\n", + " outputs={\"response\": response_text},\n", + " )\n", + "\n", + "# add the LLM span as a child of the Chain span...\n", + 
"chain_span.add_child(llm_span)\n", + "\n", + "# update the Chain span's inputs and outputs\n", + "chain_span.add_inputs_and_outputs(\n", + " inputs={\"query\":query},\n", + " outputs={\"response\": response_text})\n", + "\n", + "# update the Chain span's end time\n", + "chain_span._span.end_time_ms = llm_end_time_ms\n", + "\n", + "\n", + "# part 4 - the Agent then calls a Tool...\n", + "time.sleep(3)\n", + "days_to_election = 117\n", + "tool_end_time_ms = round(datetime.datetime.now().timestamp() * 1000)\n", + "\n", + "# create a Tool span\n", + "tool_span = Trace(\n", + " name=\"Calculator\",\n", + " kind=\"tool\",\n", + " status_code=\"success\",\n", + " start_time_ms=llm_end_time_ms,\n", + " end_time_ms=tool_end_time_ms,\n", + " inputs={\"input\": response_text},\n", + " outputs={\"result\": days_to_election})\n", + "\n", + "# add the TOOL span as a child of the root\n", + "root_span.add_child(tool_span)\n", + "\n", + "\n", + "# part 5 - the final results from the tool are added\n", + "root_span.add_inputs_and_outputs(inputs={\"query\": query},\n", + " outputs={\"result\": days_to_election})\n", + "root_span._span.end_time_ms = tool_end_time_ms\n", + "\n", + "\n", + "# part 6 - log all spans to W&B by logging the root span\n", + "root_span.log(name=\"openai_trace\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "nBFVwawPGVmD" + }, + "source": [ + "Once each Agent execution completes, all calls in your LangChain object will be logged to Weights & Biases" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "include_colab_link": true, + "provenance": [] + }, + "kernelspec": { + "display_name": "Python 3", + "name": "python3" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} From 46e7ca7182155bf9c30107005deb9c61c19c1d41 Mon Sep 17 00:00:00 2001 From: Scott Date: Tue, 30 Jan 2024 13:12:53 +0000 Subject: [PATCH 2/5] add prompts_evaluation --- colabs/prompts/prompts_evaluation.ipynb | 506 ++++++++++++++++++++++++ 1 file changed, 506 
insertions(+) create mode 100644 colabs/prompts/prompts_evaluation.ipynb diff --git a/colabs/prompts/prompts_evaluation.ipynb b/colabs/prompts/prompts_evaluation.ipynb new file mode 100644 index 00000000..981fdb30 --- /dev/null +++ b/colabs/prompts/prompts_evaluation.ipynb @@ -0,0 +1,506 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "280912a4-7121-4185-addf-e5dc315f9b03", + "metadata": {}, + "source": [ + "\"Open\n", + "" + ] + }, + { + "cell_type": "markdown", + "id": "bafb5303-fa1d-4abc-89f3-de10e8d282c8", + "metadata": {}, + "source": [ + "\"Weights\n", + "" + ] + }, + { + "cell_type": "markdown", + "id": "b76f1564", + "metadata": {}, + "source": [ + "# Iterate and Evaluate LLM applications" + ] + }, + { + "cell_type": "markdown", + "id": "a3e61dd9", + "metadata": {}, + "source": [ + "AI application building is an experimental process where you likely don't know how a given system will perform on your task. To iterate on an application, we need a way to evaluate if it's improving. To do so, a common practice is to test it against the same dataset when there is a change.\n", + "\n", + "This tutorial will show you how to:\n", + "- track input prompts and pipeline settings with `wandb.config`\n", + "- track final evaluation metrics e.g. F1 score or scores from LLM judges, with `wandb.log`\n", + "- track individual model predictions and metadata in `W&B Tables`" + ] + }, + { + "cell_type": "markdown", + "id": "c9720603", + "metadata": {}, + "source": [ + "We'll track F1 score on extracting named entities from an example news headlines dataset from `explosion/prodigy-recipes` from the https://prodi.gy/ team." 
+ ] + }, + { + "cell_type": "markdown", + "id": "52c11c15", + "metadata": {}, + "source": [ + "# Setup\n", + "## Download Data" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b983c7b8", + "metadata": {}, + "outputs": [], + "source": [ + "!curl -O https://raw.githubusercontent.com/explosion/prodigy-recipes/master/example-datasets/annotated_news_headlines-ORG-PERSON-LOCATION-ner.jsonl" + ] + }, + { + "cell_type": "markdown", + "id": "28bba1c5", + "metadata": {}, + "source": [ + "## Installation" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4b99b883", + "metadata": {}, + "outputs": [], + "source": [ + "!pip install wandb openai" + ] + }, + { + "cell_type": "markdown", + "id": "58518d19", + "metadata": {}, + "source": [ + "## Create a W&B account and log in" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3a935c78", + "metadata": {}, + "outputs": [], + "source": [ + "import wandb\n", + "wandb.login()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3834be0a", + "metadata": {}, + "outputs": [], + "source": [ + "import json\n", + "from functools import partial\n", + "import timeit\n", + "import openai\n", + "from concurrent.futures import ThreadPoolExecutor\n", + "data = []\n", + "with open('annotated_news_headlines-ORG-PERSON-LOCATION-ner.jsonl') as f:\n", + " for line in f:\n", + " data.append(json.loads(line))" + ] + }, + { + "cell_type": "markdown", + "id": "ba4ad8a1", + "metadata": {}, + "source": [ + "# Format data" + ] + }, + { + "cell_type": "markdown", + "id": "2fd386fd", + "metadata": {}, + "source": [ + "Here we just remove data we're not using and format the examples for our task." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "34004089", + "metadata": {}, + "outputs": [], + "source": [ + "def clean_examples():\n", + " labelled_examples = []\n", + " for example in data:\n", + " entities = []\n", + " if 'spans' in example:\n", + " for span in example['spans']:\n", + " start = span['start']\n", + " end = span['end']\n", + " label = span['label']\n", + " # Extract the corresponding text from tokens\n", + " text = ''\n", + " for token in example['tokens']:\n", + " if token['start'] >= start and token['end'] <= end:\n", + " text += token['text'] + ' '\n", + " entities.append(text.rstrip())\n", + " labelled_examples.append({'text': example['text'], 'entities': entities})\n", + " return labelled_examples\n", + "\n", + "labelled_examples = clean_examples()" + ] + }, + { + "cell_type": "markdown", + "id": "3e898d60", + "metadata": {}, + "source": [ + "# Set up LLM boilerplate" + ] + }, + { + "cell_type": "markdown", + "id": "3ada324f", + "metadata": {}, + "source": [ + "We'll call `openai` (you'll need to add an OpenAI API key) with a given prompt to extract the entities and replace `` with our input. We'll also grab useful metadata from the openai response for logging." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c630e402", + "metadata": {}, + "outputs": [], + "source": [ + "def extract_entities_with_template(text, template_prompt, system_prompt, model, temperature):\n", + " start_time = timeit.default_timer()\n", + " prompt=template_prompt.replace('', text)\n", + " from openai import OpenAI\n", + " client = OpenAI()\n", + " response = client.chat.completions.create(\n", + " model=model,\n", + " messages=[\n", + " {\n", + " \"role\": \"system\",\n", + " \"content\": system_prompt\n", + " },\n", + " {\n", + " \"role\": \"user\",\n", + " \"content\": prompt\n", + " }\n", + " ],\n", + " temperature=temperature,\n", + " )\n", + " text = response.choices[0].message.content\n", + " entities = list(filter(None, text.split('\\n')))\n", + " usage = response.usage\n", + " prompt_tokens = usage.prompt_tokens\n", + " completion_tokens = usage.completion_tokens\n", + " total_tokens = usage.total_tokens\n", + " end_time = timeit.default_timer()\n", + " elapsed = end_time - start_time\n", + " return {\n", + " 'entities': entities,\n", + " 'model': model,\n", + " 'prompt': prompt,\n", + " 'elapsed': elapsed,\n", + " 'prompt_tokens': prompt_tokens,\n", + " 'completion_tokens': completion_tokens,\n", + " 'total_tokens': total_tokens\n", + " }" + ] + }, + { + "cell_type": "markdown", + "id": "b9ee3ea0", + "metadata": {}, + "source": [ + "# Calculate Metric" + ] + }, + { + "cell_type": "markdown", + "id": "2a2ebb75", + "metadata": {}, + "source": [ + "Here, we make an evaluation metric for our task. \n", + "Note: It's not shown here, but you could also use an LLM to evaluate your task if it's not as straightforward to evaluate as this task."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0ad120db", + "metadata": {}, + "outputs": [], + "source": [ + "def calculate_f1(extracted_entities, ground_truth_entities):\n", + " extracted_set = set(map(str.lower, extracted_entities))\n", + " ground_truth_set = set(map(str.lower, ground_truth_entities))\n", + " tp_examples = extracted_set & ground_truth_set\n", + " tp = len(tp_examples)\n", + " fp_examples = extracted_set - ground_truth_set\n", + " fp = len(fp_examples)\n", + " fn_examples = ground_truth_set - extracted_set\n", + " fn = len(fn_examples)\n", + " precision = tp / (tp + fp) if (tp + fp) else 0\n", + " recall = tp / (tp + fn) if (tp + fn) else 0\n", + " f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0\n", + " return f1, tp, fp, fn" + ] + }, + { + "cell_type": "markdown", + "id": "43a50281", + "metadata": {}, + "source": [ + "# Perform inference in parallel" + ] + }, + { + "cell_type": "markdown", + "id": "6cb082bd", + "metadata": {}, + "source": [ + "Running evaluations can be a bit slow. To speed it up, here is a bit of useful code to gather your examples in parallel. None of this is specific to W&B, but it's useful to have nonetheless." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7256eba2", + "metadata": {}, + "outputs": [], + "source": [ + "def inference(examples, system_prompt, template_prompt, model, temperature):\n", + " extracted = []\n", + " # making a new function to openai which has the template\n", + " # this is needed because executor.map wants a func with one arg\n", + " openai_func = partial(extract_entities_with_template, model=model, \n", + " system_prompt=system_prompt, template_prompt=template_prompt, \n", + " temperature=temperature)\n", + " # Run the model to extract the entities\n", + " start_time = timeit.default_timer()\n", + " with ThreadPoolExecutor(max_workers=8) as executor:\n", + " for i in executor.map(openai_func, [t['text'] for t in examples]):\n", + " extracted.append(i)\n", + " end_time = timeit.default_timer()\n", + " elapsed = end_time - start_time\n", + " return extracted, elapsed\n", + "\n", + "model = 'gpt-3.5-turbo'\n", + "temperature = 0.7\n", + "template = '''\n", + "text: \n", + "Return the entities as a list with a new line between each entity.\n", + "'''\n", + "system_prompt = 'You are an excellent entity extractor reading newspapers and extracting orgs, people and locations. Extract the entities from the follow sentence.'\n", + "extracted, elapsed = inference(labelled_examples[:1], system_prompt, template, model, temperature)\n", + "print(extracted[0]) \n", + "print(labelled_examples[0]['text'])" + ] + }, + { + "cell_type": "markdown", + "id": "7a615f80", + "metadata": {}, + "source": [ + "# Evaluate extracted entities, save in W&B Table for inspection later" + ] + }, + { + "cell_type": "markdown", + "id": "5f7622cd", + "metadata": {}, + "source": [ + "Here, we calculate our metric across all of our predictions and log them to a `wandb.Table` for later inspection."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "fb2da2ed", + "metadata": {}, + "outputs": [], + "source": [ + "def evaluate(extracted, labelled_examples):\n", + " total_tp, total_fp, total_fn = 0,0,0\n", + " eval_table = wandb.Table(columns=['pred', 'truth', 'f1', 'tp', 'fp', 'fn', \n", + " 'prompt_tokens', 'completion_tokens', 'total_tokens'])\n", + " for pred, gt in zip(extracted, labelled_examples):\n", + " f1, tp, fp, fn = calculate_f1(pred['entities'], gt['entities'])\n", + " total_tp += tp\n", + " total_fp += fp\n", + " total_fn += fn\n", + " eval_table.add_data(\n", + " pred['entities'], gt['entities'], f1, tp, fp, fn, \n", + " pred['prompt_tokens'], pred['completion_tokens'], pred['total_tokens']\n", + " )\n", + " wandb.log({'eval_table': eval_table})\n", + " overall_precision = total_tp / (total_tp + total_fp) if (total_tp + total_fp) else 0\n", + " overall_recall = total_tp / (total_tp + total_fn) if (total_tp + total_fn) else 0\n", + " overall_f1 = 2 * overall_precision * overall_recall / (overall_precision + overall_recall) if (overall_precision + overall_recall) else 0\n", + " return overall_precision, overall_recall, overall_f1" + ] + }, + { + "cell_type": "markdown", + "id": "763055df", + "metadata": {}, + "source": [ + "# Run our pipeline:" + ] + }, + { + "cell_type": "markdown", + "id": "6bc8d556", + "metadata": {}, + "source": [ + "To start logging to W&B, you can call `wandb.init` and pass in the config to track the configurations you're experimenting with currently.\n", + "\n", + "As you experiment, you can call `wandb.log` to track your work. This will log the metrics to W&B. Finally, we'll call `wandb.finish` to stop tracking. This will be tracked as one \"Run\" in W&B. \n", + "\n", + "You'll be given a link to W&B to see all of your logs."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "530c96fb", + "metadata": {}, + "outputs": [], + "source": [ + "NUM_EXAMPLES = 50\n", + "wandb.init(project='prompts_eval', config={\n", + " 'system_prompt': system_prompt,\n", + " 'template': template,\n", + " 'model': model,\n", + " 'temperature': temperature\n", + " })\n", + "extracted, elapsed = inference(labelled_examples[:NUM_EXAMPLES],\n", + " system_prompt, template, model, temperature)\n", + "overall_precision, overall_recall, overall_f1 = evaluate(extracted, \n", + " labelled_examples[:NUM_EXAMPLES])\n", + "total_tokens_sum = sum([pred['total_tokens'] for pred in extracted])\n", + "completion_tokens_sum = sum([pred['completion_tokens'] for pred in extracted])\n", + "prompt_tokens_sum = sum([pred['prompt_tokens'] for pred in extracted])\n", + "wandb.log({'precision': overall_precision,\n", + " 'recall': overall_recall,\n", + " 'f1': overall_f1,\n", + " 'time_elapsed_total': elapsed,\n", + " 'prompt_tokens': prompt_tokens_sum,\n", + " 'completion_tokens': completion_tokens_sum,\n", + " 'total_tokens': total_tokens_sum\n", + " })\n", + "wandb.finish()" + ] + }, + { + "cell_type": "markdown", + "id": "ad4af593", + "metadata": {}, + "source": [ + "# Set up experiments\n", + "\n", + "Start a W&B run per experiment with `wandb.init`, store experiment details in `config` arg. Log results with `wandb.log`. Call `wandb.finish` to end experiment. Loop over all options in grid search to find best configuration." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "149b0700", + "metadata": {}, + "outputs": [], + "source": [ + "system_prompts = ['Extract the entities from the follow sentence.', \n", + " 'You are an excellent entity extractor reading newspapers and extracting orgs, people and locations. 
Extract the entities from the follow sentence.']\n", + "for system_prompt in system_prompts:\n", + " for temperature in [0.2, 0.6, 0.9]:\n", + " for model in ['gpt-3.5-turbo', 'gpt-3.5-turbo-1106']:\n", + " wandb.init(project='prompts_eval', config={\n", + " 'system_prompt':system_prompt,\n", + " 'template': template,\n", + " 'model': model,\n", + " 'temperature': temperature\n", + " })\n", + " extracted, elapsed = inference(labelled_examples[:NUM_EXAMPLES],\n", + " system_prompt, template, model, temperature)\n", + " overall_precision, overall_recall, overall_f1 = evaluate(extracted, \n", + " labelled_examples[:NUM_EXAMPLES])\n", + " total_tokens_sum = sum([pred['total_tokens'] for pred in extracted])\n", + " completion_tokens_sum = sum([pred['completion_tokens'] for pred in extracted])\n", + " prompt_tokens_sum = sum([pred['prompt_tokens'] for pred in extracted])\n", + " wandb.log({'precision': overall_precision,\n", + " 'recall': overall_recall,\n", + " 'f1': overall_f1,\n", + " 'time_elapsed_total': elapsed,\n", + " 'prompt_tokens': prompt_tokens_sum,\n", + " 'completion_tokens': completion_tokens_sum,\n", + " 'total_tokens': total_tokens_sum\n", + " })\n", + " wandb.finish()" + ] + }, + { + "cell_type": "markdown", + "id": "5ba13a8c", + "metadata": {}, + "source": [ + "# Conclusion\n", + "\n", + "You've learned how to use W&B to track evaluations of your LLM applications. \n", + "You've used `wandb.init` to start tracking, `wandb.log` to log summary evaluation metrics and `wandb.Table` to track individual predictions & scores. \n", + "We've also shared some best practices to format your code to make it easier to run evaluations in parallel and track every iteration." 
+ ] + }, + { + "cell_type": "markdown", + "id": "f5cceefe", + "metadata": {}, + "source": [ + "# Trace your LLM application\n", + "\n", + "If you want to learn more and you're using complex pipelines of LLM calls, you can leverage W&B Prompts to view traces of your application and see inputs & outputs of each LLM or function call. \n", + "\n", + "Learn more about W&B Prompts in the documentation here: [https://docs.wandb.ai/guides/prompts](https://docs.wandb.ai/guides/prompts)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.12" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} From a94b774fef305950bd728e7ffd7bdc8a89b9e889 Mon Sep 17 00:00:00 2001 From: Thomas Capelle Date: Wed, 31 Jan 2024 13:15:28 +0100 Subject: [PATCH 3/5] clean metadata --- colabs/prompts/prompts_evaluation.ipynb | 12 ------------ 1 file changed, 12 deletions(-) diff --git a/colabs/prompts/prompts_evaluation.ipynb b/colabs/prompts/prompts_evaluation.ipynb index 981fdb30..b1ac7fc5 100644 --- a/colabs/prompts/prompts_evaluation.ipynb +++ b/colabs/prompts/prompts_evaluation.ipynb @@ -487,18 +487,6 @@ "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.12" } }, "nbformat": 4, From 8e1b071b8c139767097baaa6719ff03efe2cc79e Mon Sep 17 00:00:00 2001 From: Scott Date: Wed, 31 Jan 2024 12:48:15 +0000 Subject: [PATCH 4/5] undo deleting custom column tut --- .../W&B_Prompts_with_Custom_Columns.ipynb 
| 618 ++++++++++++++++++ 1 file changed, 618 insertions(+) diff --git a/colabs/prompts/W&B_Prompts_with_Custom_Columns.ipynb b/colabs/prompts/W&B_Prompts_with_Custom_Columns.ipynb index e69de29b..ebc811cf 100644 --- a/colabs/prompts/W&B_Prompts_with_Custom_Columns.ipynb +++ b/colabs/prompts/W&B_Prompts_with_Custom_Columns.ipynb @@ -0,0 +1,618 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "e-ZYaV5KGVmA" + }, + "source": [ + "\"Open\n", + "" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "gJSVEAGWGVmA" + }, + "source": [ + "\"Weights\n", + "" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "9f7yMKLwGVmA" + }, + "source": [ + "**[Weights & Biases Prompts](https://docs.wandb.ai/guides/prompts?utm_source=code&utm_medium=colab&utm_campaign=prompts)** is a suite of LLMOps tools built for the development of LLM-powered applications.\n", + "\n", + "Use W&B Prompts to visualize and inspect the execution flow of your LLMs, analyze the inputs and outputs of your LLMs, view the intermediate results and securely store and manage your prompts and LLM chain configurations.\n", + "\n", + "#### [🪄 View Prompts In Action](https://wandb.ai/timssweeney/prompts-demo/)\n", + "\n", + "**In this notebook we will demostrate W&B Prompts:**\n", + "\n", + "- Using our 1-line LangChain integration\n", + "- Using our Trace class when building your own LLM Pipelines\n", + "\n", + "See here for the full [W&B Prompts documentation](https://docs.wandb.ai/guides/prompts)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "A4wI3b_8GVmB" + }, + "source": [ + "## Installation" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": { + "id": "nDoIqQ8_GVmB" + }, + "outputs": [], + "source": [ + "!pip install \"wandb>=0.15.4\" -qqq\n", + "!pip install \"langchain>=0.0.218\" openai -qqq" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": { + "id": "PcGiSWBSGVmB" + }, + "outputs": [], + 
"source": [ + "import langchain\n", + "assert langchain.__version__ >= \"0.0.218\", \"Please ensure you are using LangChain v0.0.188 or higher\"" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "pbmQIsjJGVmB" + }, + "source": [ + "## Setup\n", + "\n", + "This demo requires that you have an [OpenAI key](https://platform.openai.com)" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "ZH4g2B0lGVmB", + "outputId": "22295db6-5369-474d-a8ea-fb45c4c92085" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Paste your OpenAI key from: https://platform.openai.com/account/api-keys\n", + "··········\n", + "OpenAI API key configured\n" + ] + } + ], + "source": [ + "import os\n", + "from getpass import getpass\n", + "\n", + "if os.getenv(\"OPENAI_API_KEY\") is None:\n", + " os.environ[\"OPENAI_API_KEY\"] = getpass(\"Paste your OpenAI key from: https://platform.openai.com/account/api-keys\\n\")\n", + "assert os.getenv(\"OPENAI_API_KEY\", \"\").startswith(\"sk-\"), \"This doesn't look like a valid OpenAI API key\"\n", + "print(\"OpenAI API key configured\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "79KOB2EhGVmB" + }, + "source": [ + "# W&B Prompts\n", + "\n", + "W&B Prompts consists of three main components:\n", + "\n", + "**Trace table**: Overview of the inputs and outputs of a chain.\n", + "\n", + "**Trace timeline**: Displays the execution flow of the chain and is color-coded according to component types.\n", + "\n", + "**Model architecture**: View details about the structure of the chain and the parameters used to initialize each component of the chain.\n", + "\n", + "After running this section, you will see a new panel automatically created in your workspace, showing each execution, the trace, and the model architecture" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "5kxmdm3zGVmC" + }, + 
"source": [ + "\"Weights" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "9u97K5vVGVmC" + }, + "source": [ + "## Maths with LangChain" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "oneRFmv6GVmC" + }, + "source": [ + "Set the `LANGCHAIN_WANDB_TRACING` environment variable as well as any other relevant [W&B environment variables](https://docs.wandb.ai/guides/track/environment-variables). This could includes a W&B project name, team name, and more. See [wandb.init](https://docs.wandb.ai/ref/python/init) for a full list of arguments." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": { + "id": "ACl-rMtAGVmC" + }, + "outputs": [], + "source": [ + "os.environ[\"LANGCHAIN_WANDB_TRACING\"] = \"true\"\n", + "os.environ[\"WANDB_PROJECT\"] = \"langchain-testing\"" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": { + "id": "csp3MXG4GVmC" + }, + "outputs": [], + "source": [ + "from langchain.chat_models import ChatOpenAI\n", + "from langchain.agents import load_tools, initialize_agent, AgentType" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "2hWU2GcAGVmC" + }, + "source": [ + "Create a standard math Agent using LangChain" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": { + "id": "l_JkVMlRGVmC" + }, + "outputs": [], + "source": [ + "llm = ChatOpenAI(temperature=0)\n", + "tools = load_tools([\"llm-math\"], llm=llm)\n", + "math_agent = initialize_agent(tools,\n", + " llm,\n", + " agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "9FFviwCPGVmC" + }, + "source": [ + "Use LangChain as normal by calling your Agent.\n", + "\n", + " You will see a Weights & Biases run start and you will be asked for your [Weights & Biases API key](wwww.wandb.ai/authorize). Once your enter your API key, the inputs and outputs of your Agent calls will start to be streamed to the Weights & Biases App." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 178 + }, + "id": "y-RHjVN4GVmC", + "outputId": "5ccd5f32-6137-46c3-9abd-d458dbdbacca" + }, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "\u001b[34m\u001b[1mwandb\u001b[0m: Streaming LangChain activity to W&B at https://wandb.ai/carey/langchain-testing/runs/lcznj5lg\n", + "\u001b[34m\u001b[1mwandb\u001b[0m: `WandbTracer` is currently in beta.\n", + "\u001b[34m\u001b[1mwandb\u001b[0m: Please report any issues to https://github.com/wandb/wandb/issues with the tag `langchain`.\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "LLMMathChain._evaluate(\"\n", + "import math\n", + "math.sqrt(5.4)\n", + "\") raised error: invalid syntax (, line 1). Please try again with a valid numerical expression\n", + "0.005720801417544866\n", + "0.15096209512635608\n" + ] + } + ], + "source": [ + "# some sample maths questions\n", + "questions = [\n", + " \"Find the square root of 5.4.\",\n", + " \"What is 3 divided by 7.34 raised to the power of pi?\",\n", + " \"What is the sin of 0.47 radians, divided by the cube root of 27?\"\n", + "]\n", + "\n", + "for question in questions:\n", + " try:\n", + " # call your Agent as normal\n", + " answer = math_agent.run(question)\n", + " print(answer)\n", + " except Exception as e:\n", + " # any errors will be also logged to Weights & Biases\n", + " print(e)\n", + " pass" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "SNYFSaUrGVmC" + }, + "source": [ + "Once each Agent execution completes, all calls in your LangChain object will be logged to Weights & Biases" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "m0bL1xpkGVmC" + }, + "source": [ + "### LangChain Context Manager\n", + "Depending on your use case, you might instead prefer to use a context manager to manage your logging to W&B.\n", + "\n", + "**✨ New: 
Custom columns** can be logged directly to W&B to display in the same Trace Table with this snippet:\n", + "```python\n", + "import wandb\n", + "wandb.log(custom_metrics_dict, commit=False})\n", + "```\n", + "Use `commit=False` to make sure that metadata is logged to the same row of the Trace Table as the LangChain output." + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 35 + }, + "id": "7i9Pj1NKGVmC", + "outputId": "b44f3ae7-fd49-437f-af7b-fb8f82056bd0" + }, + "outputs": [ + { + "data": { + "application/vnd.google.colaboratory.intrinsic+json": { + "type": "string" + }, + "text/plain": [ + "'1.0891804557407723'" + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "from langchain.callbacks import wandb_tracing_enabled\n", + "import wandb # To enable custom column logging with wandb.run.log()\n", + "\n", + "# unset the environment variable and use a context manager instead\n", + "if \"LANGCHAIN_WANDB_TRACING\" in os.environ:\n", + " del os.environ[\"LANGCHAIN_WANDB_TRACING\"]\n", + "\n", + "# enable tracing using a context manager\n", + "with wandb_tracing_enabled():\n", + " for i in range (10):\n", + " # Log any custom columns you'd like to add to the Trace Table\n", + " wandb.log({\"custom_column\": i}, commit=False)\n", + " try:\n", + " math_agent.run(f\"What is {i} raised to .123243 power?\") # this should be traced\n", + " except:\n", + " pass\n", + "\n", + "math_agent.run(\"What is 2 raised to .123243 power?\") # this should not be traced" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "JDLzoorhGVmC" + }, + "source": [ + "# Non-Lang Chain Implementation\n", + "\n", + "\n", + "A W&B Trace is created by logging 1 or more \"spans\". A root span is expected, which can accept nested child spans, which can in turn accept their own child spans. 
A Span represents a unit of work, Spans can have type `AGENT`, `TOOL`, `LLM` or `CHAIN`\n", + "\n", + "When logging with Trace, a single W&B run can have multiple calls to a LLM, Tool, Chain or Agent logged to it, there is no need to start a new W&B run after each generation from your model or pipeline, instead each call will be appended to the Trace Table.\n", + "\n", + "In this quickstart, we will how to log a single call to an OpenAI model to W&B Trace as a single span. Then we will show how to log a more complex series of nested spans.\n", + "\n", + "## Logging with W&B Trace" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "7z98yfoqGVmD" + }, + "source": [ + "Call wandb.init to start a W&B run. Here you can pass a W&B project name as well as an entity name (if logging to a W&B Team), as well as a config and more. See wandb.init for the full list of arguments.\n", + "\n", + "You will see a Weights & Biases run start and be asked for your [Weights & Biases API key](wwww.wandb.ai/authorize). Once your enter your API key, the inputs and outputs of your Agent calls will start to be streamed to the Weights & Biases App.\n", + "\n", + "**Note:** A W&B run supports logging as many traces you needed to a single run, i.e. you can make multiple calls of `run.log` without the need to create a new run each time" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "ZcvgzZ55GVmD" + }, + "outputs": [], + "source": [ + "import wandb\n", + "\n", + "# start a wandb run to log to\n", + "wandb.init(project=\"trace-example\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "4_3Wrg2YGVmD" + }, + "source": [ + "You can also set the entity argument in wandb.init if logging to a W&B Team.\n", + "\n", + "### Logging a single Span\n", + "Now we will query OpenAI times and log the results to a W&B Trace. 
We will log the inputs and outputs, start and end times, whether the OpenAI call was successful, the token usage, and additional metadata.\n", + "\n", + "You can see the full description of the arguments to the Trace class [here](https://soumik12345.github.io/wandb-addons/prompts/tracer/)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "q2pkMhpMGVmD" + }, + "outputs": [], + "source": [ + "import openai\n", + "import datetime\n", + "from wandb.sdk.data_types.trace_tree import Trace\n", + "\n", + "openai.api_key = os.environ[\"OPENAI_API_KEY\"]\n", + "\n", + "# define your conifg\n", + "model_name = \"gpt-3.5-turbo\"\n", + "temperature = 0.7\n", + "system_message = \"You are a helpful assistant that always replies in 3 concise bullet points using markdown.\"\n", + "\n", + "queries_ls = [\n", + " \"What is the capital of France?\",\n", + " \"How do I boil an egg?\" * 10000, # deliberately trigger an openai error\n", + " \"What to do if the aliens arrive?\"\n", + "]\n", + "\n", + "for query in queries_ls:\n", + " messages=[\n", + " {\"role\": \"system\", \"content\": system_message},\n", + " {\"role\": \"user\", \"content\": query}\n", + " ]\n", + "\n", + " start_time_ms = datetime.datetime.now().timestamp() * 1000\n", + " try:\n", + " response = openai.ChatCompletion.create(model=model_name,\n", + " messages=messages,\n", + " temperature=temperature\n", + " )\n", + "\n", + " end_time_ms = round(datetime.datetime.now().timestamp() * 1000) # logged in milliseconds\n", + " status=\"success\"\n", + " status_message=None,\n", + " response_text = response[\"choices\"][0][\"message\"][\"content\"]\n", + " token_usage = response[\"usage\"].to_dict()\n", + "\n", + "\n", + " except Exception as e:\n", + " end_time_ms = round(datetime.datetime.now().timestamp() * 1000) # logged in milliseconds\n", + " status=\"error\"\n", + " status_message=str(e)\n", + " response_text = \"\"\n", + " token_usage = {}\n", + "\n", + " # create a span in 
wandb\n", + " root_span = Trace(\n", + " name=\"root_span\",\n", + " kind=\"llm\", # kind can be \"llm\", \"chain\", \"agent\" or \"tool\"\n", + " status_code=status,\n", + " status_message=status_message,\n", + " metadata={\"temperature\": temperature,\n", + " \"token_usage\": token_usage,\n", + " \"model_name\": model_name},\n", + " start_time_ms=start_time_ms,\n", + " end_time_ms=end_time_ms,\n", + " inputs={\"system_prompt\": system_message, \"query\": query},\n", + " outputs={\"response\": response_text},\n", + " )\n", + "\n", + " # log the span to wandb\n", + " root_span.log(name=\"openai_trace\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "XFcwFgaDGVmD" + }, + "source": [ + "### Logging a LLM pipeline using nested Spans\n", + "\n", + "In this example we will simulate an Agent being called, which then calls a LLM Chain, which calls an OpenAI LLM and then the Agent \"calls\" a Calculator tool.\n", + "\n", + "The inputs, outputs and metadata for each step in the execution of our \"Agent\" is logged in its own span. 
Spans can have child" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "ACMaGuYUGVmD" + }, + "outputs": [], + "source": [ + "import time\n", + "\n", + "openai.api_key = os.environ[\"OPENAI_API_KEY\"]\n", + "\n", + "# The query our agent has to answer\n", + "query = \"How many days until the next US election?\"\n", + "\n", + "# part 1 - an Agent is started...\n", + "start_time_ms = round(datetime.datetime.now().timestamp() * 1000)\n", + "\n", + "root_span = Trace(\n", + " name=\"MyAgent\",\n", + " kind=\"agent\",\n", + " start_time_ms=start_time_ms,\n", + " metadata={\"user\": \"optimus_12\"})\n", + "\n", + "\n", + "# part 2 - The Agent calls into a LLMChain..\n", + "chain_span = Trace(\n", + " name=\"LLMChain\",\n", + " kind=\"chain\",\n", + " start_time_ms=start_time_ms)\n", + "\n", + "# add the Chain span as a child of the root\n", + "root_span.add_child(chain_span)\n", + "\n", + "\n", + "# part 3 - the LLMChain calls an OpenAI LLM...\n", + "messages=[\n", + " {\"role\": \"system\", \"content\": system_message},\n", + " {\"role\": \"user\", \"content\": query}\n", + "]\n", + "\n", + "response = openai.ChatCompletion.create(model=model_name,\n", + " messages=messages,\n", + " temperature=temperature)\n", + "\n", + "llm_end_time_ms = round(datetime.datetime.now().timestamp() * 1000)\n", + "response_text = response[\"choices\"][0][\"message\"][\"content\"]\n", + "token_usage = response[\"usage\"].to_dict()\n", + "\n", + "llm_span = Trace(\n", + " name=\"OpenAI\",\n", + " kind=\"llm\",\n", + " status_code=\"success\",\n", + " metadata={\"temperature\":temperature,\n", + " \"token_usage\": token_usage,\n", + " \"model_name\":model_name},\n", + " start_time_ms=start_time_ms,\n", + " end_time_ms=llm_end_time_ms,\n", + " inputs={\"system_prompt\":system_message, \"query\":query},\n", + " outputs={\"response\": response_text},\n", + " )\n", + "\n", + "# add the LLM span as a child of the Chain span...\n", + 
"chain_span.add_child(llm_span)\n", + "\n", + "# update the end time of the Chain span\n", + "chain_span.add_inputs_and_outputs(\n", + " inputs={\"query\":query},\n", + " outputs={\"response\": response_text})\n", + "\n", + "# update the Chain span's end time\n", + "chain_span._span.end_time_ms = llm_end_time_ms\n", + "\n", + "\n", + "# part 4 - the Agent then calls a Tool...\n", + "time.sleep(3)\n", + "days_to_election = 117\n", + "tool_end_time_ms = round(datetime.datetime.now().timestamp() * 1000)\n", + "\n", + "# create a Tool span\n", + "tool_span = Trace(\n", + " name=\"Calculator\",\n", + " kind=\"tool\",\n", + " status_code=\"success\",\n", + " start_time_ms=llm_end_time_ms,\n", + " end_time_ms=tool_end_time_ms,\n", + " inputs={\"input\": response_text},\n", + " outputs={\"result\": days_to_election})\n", + "\n", + "# add the TOOL span as a child of the root\n", + "root_span.add_child(tool_span)\n", + "\n", + "\n", + "# part 5 - the final results from the tool are added\n", + "root_span.add_inputs_and_outputs(inputs={\"query\": query},\n", + " outputs={\"result\": days_to_election})\n", + "root_span._span.end_time_ms = tool_end_time_ms\n", + "\n", + "\n", + "# part 6 - log all spans to W&B by logging the root span\n", + "root_span.log(name=\"openai_trace\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "nBFVwawPGVmD" + }, + "source": [ + "Once each Agent execution completes, all calls in your LangChain object will be logged to Weights & Biases" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "include_colab_link": true, + "provenance": [] + }, + "kernelspec": { + "display_name": "Python 3", + "name": "python3" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} From ab788d63fc61c2150fea4d7d82aec86562602d63 Mon Sep 17 00:00:00 2001 From: Thomas Capelle Date: Wed, 31 Jan 2024 15:30:54 +0100 Subject: [PATCH 5/5] clean up --- .../W&B_Prompts_with_Custom_Columns.ipynb | 1141 ++++++++--------- 1 file changed, 533 
insertions(+), 608 deletions(-) diff --git a/colabs/prompts/W&B_Prompts_with_Custom_Columns.ipynb b/colabs/prompts/W&B_Prompts_with_Custom_Columns.ipynb index ebc811cf..55708888 100644 --- a/colabs/prompts/W&B_Prompts_with_Custom_Columns.ipynb +++ b/colabs/prompts/W&B_Prompts_with_Custom_Columns.ipynb @@ -1,618 +1,543 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": { - "id": "e-ZYaV5KGVmA" - }, - "source": [ - "\"Open\n", - "" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "gJSVEAGWGVmA" - }, - "source": [ - "\"Weights\n", - "" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "9f7yMKLwGVmA" - }, - "source": [ - "**[Weights & Biases Prompts](https://docs.wandb.ai/guides/prompts?utm_source=code&utm_medium=colab&utm_campaign=prompts)** is a suite of LLMOps tools built for the development of LLM-powered applications.\n", - "\n", - "Use W&B Prompts to visualize and inspect the execution flow of your LLMs, analyze the inputs and outputs of your LLMs, view the intermediate results and securely store and manage your prompts and LLM chain configurations.\n", - "\n", - "#### [🪄 View Prompts In Action](https://wandb.ai/timssweeney/prompts-demo/)\n", - "\n", - "**In this notebook we will demostrate W&B Prompts:**\n", - "\n", - "- Using our 1-line LangChain integration\n", - "- Using our Trace class when building your own LLM Pipelines\n", - "\n", - "See here for the full [W&B Prompts documentation](https://docs.wandb.ai/guides/prompts)\n" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "A4wI3b_8GVmB" - }, - "source": [ - "## Installation" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": { - "id": "nDoIqQ8_GVmB" - }, - "outputs": [], - "source": [ - "!pip install \"wandb>=0.15.4\" -qqq\n", - "!pip install \"langchain>=0.0.218\" openai -qqq" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": { - "id": "PcGiSWBSGVmB" - }, - "outputs": [], - "source": [ - "import 
langchain\n", - "assert langchain.__version__ >= \"0.0.218\", \"Please ensure you are using LangChain v0.0.188 or higher\"" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "pbmQIsjJGVmB" - }, - "source": [ - "## Setup\n", - "\n", - "This demo requires that you have an [OpenAI key](https://platform.openai.com)" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/" - }, - "id": "ZH4g2B0lGVmB", - "outputId": "22295db6-5369-474d-a8ea-fb45c4c92085" - }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Paste your OpenAI key from: https://platform.openai.com/account/api-keys\n", - "··········\n", - "OpenAI API key configured\n" - ] - } - ], - "source": [ - "import os\n", - "from getpass import getpass\n", - "\n", - "if os.getenv(\"OPENAI_API_KEY\") is None:\n", - " os.environ[\"OPENAI_API_KEY\"] = getpass(\"Paste your OpenAI key from: https://platform.openai.com/account/api-keys\\n\")\n", - "assert os.getenv(\"OPENAI_API_KEY\", \"\").startswith(\"sk-\"), \"This doesn't look like a valid OpenAI API key\"\n", - "print(\"OpenAI API key configured\")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "79KOB2EhGVmB" - }, - "source": [ - "# W&B Prompts\n", - "\n", - "W&B Prompts consists of three main components:\n", - "\n", - "**Trace table**: Overview of the inputs and outputs of a chain.\n", - "\n", - "**Trace timeline**: Displays the execution flow of the chain and is color-coded according to component types.\n", - "\n", - "**Model architecture**: View details about the structure of the chain and the parameters used to initialize each component of the chain.\n", - "\n", - "After running this section, you will see a new panel automatically created in your workspace, showing each execution, the trace, and the model architecture" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "5kxmdm3zGVmC" - }, - "source": [ - "\"Weights" - 
] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "9u97K5vVGVmC" - }, - "source": [ - "## Maths with LangChain" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "oneRFmv6GVmC" - }, - "source": [ - "Set the `LANGCHAIN_WANDB_TRACING` environment variable as well as any other relevant [W&B environment variables](https://docs.wandb.ai/guides/track/environment-variables). This could includes a W&B project name, team name, and more. See [wandb.init](https://docs.wandb.ai/ref/python/init) for a full list of arguments." - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": { - "id": "ACl-rMtAGVmC" - }, - "outputs": [], - "source": [ - "os.environ[\"LANGCHAIN_WANDB_TRACING\"] = \"true\"\n", - "os.environ[\"WANDB_PROJECT\"] = \"langchain-testing\"" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": { - "id": "csp3MXG4GVmC" - }, - "outputs": [], - "source": [ - "from langchain.chat_models import ChatOpenAI\n", - "from langchain.agents import load_tools, initialize_agent, AgentType" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "2hWU2GcAGVmC" - }, - "source": [ - "Create a standard math Agent using LangChain" - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": { - "id": "l_JkVMlRGVmC" - }, - "outputs": [], - "source": [ - "llm = ChatOpenAI(temperature=0)\n", - "tools = load_tools([\"llm-math\"], llm=llm)\n", - "math_agent = initialize_agent(tools,\n", - " llm,\n", - " agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "9FFviwCPGVmC" - }, - "source": [ - "Use LangChain as normal by calling your Agent.\n", - "\n", - " You will see a Weights & Biases run start and you will be asked for your [Weights & Biases API key](wwww.wandb.ai/authorize). Once your enter your API key, the inputs and outputs of your Agent calls will start to be streamed to the Weights & Biases App." 
- ] - }, - { - "cell_type": "code", - "execution_count": 7, - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/", - "height": 178 - }, - "id": "y-RHjVN4GVmC", - "outputId": "5ccd5f32-6137-46c3-9abd-d458dbdbacca" - }, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "\u001b[34m\u001b[1mwandb\u001b[0m: Streaming LangChain activity to W&B at https://wandb.ai/carey/langchain-testing/runs/lcznj5lg\n", - "\u001b[34m\u001b[1mwandb\u001b[0m: `WandbTracer` is currently in beta.\n", - "\u001b[34m\u001b[1mwandb\u001b[0m: Please report any issues to https://github.com/wandb/wandb/issues with the tag `langchain`.\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "LLMMathChain._evaluate(\"\n", - "import math\n", - "math.sqrt(5.4)\n", - "\") raised error: invalid syntax (, line 1). Please try again with a valid numerical expression\n", - "0.005720801417544866\n", - "0.15096209512635608\n" - ] - } - ], - "source": [ - "# some sample maths questions\n", - "questions = [\n", - " \"Find the square root of 5.4.\",\n", - " \"What is 3 divided by 7.34 raised to the power of pi?\",\n", - " \"What is the sin of 0.47 radians, divided by the cube root of 27?\"\n", - "]\n", - "\n", - "for question in questions:\n", - " try:\n", - " # call your Agent as normal\n", - " answer = math_agent.run(question)\n", - " print(answer)\n", - " except Exception as e:\n", - " # any errors will be also logged to Weights & Biases\n", - " print(e)\n", - " pass" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "SNYFSaUrGVmC" - }, - "source": [ - "Once each Agent execution completes, all calls in your LangChain object will be logged to Weights & Biases" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "m0bL1xpkGVmC" - }, - "source": [ - "### LangChain Context Manager\n", - "Depending on your use case, you might instead prefer to use a context manager to manage your logging to W&B.\n", - "\n", - "**✨ New: 
Custom columns** can be logged directly to W&B to display in the same Trace Table with this snippet:\n", - "```python\n", - "import wandb\n", - "wandb.log(custom_metrics_dict, commit=False})\n", - "```\n", - "Use `commit=False` to make sure that metadata is logged to the same row of the Trace Table as the LangChain output." - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/", - "height": 35 - }, - "id": "7i9Pj1NKGVmC", - "outputId": "b44f3ae7-fd49-437f-af7b-fb8f82056bd0" - }, - "outputs": [ - { - "data": { - "application/vnd.google.colaboratory.intrinsic+json": { - "type": "string" - }, - "text/plain": [ - "'1.0891804557407723'" - ] - }, - "execution_count": 10, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "from langchain.callbacks import wandb_tracing_enabled\n", - "import wandb # To enable custom column logging with wandb.run.log()\n", - "\n", - "# unset the environment variable and use a context manager instead\n", - "if \"LANGCHAIN_WANDB_TRACING\" in os.environ:\n", - " del os.environ[\"LANGCHAIN_WANDB_TRACING\"]\n", - "\n", - "# enable tracing using a context manager\n", - "with wandb_tracing_enabled():\n", - " for i in range (10):\n", - " # Log any custom columns you'd like to add to the Trace Table\n", - " wandb.log({\"custom_column\": i}, commit=False)\n", - " try:\n", - " math_agent.run(f\"What is {i} raised to .123243 power?\") # this should be traced\n", - " except:\n", - " pass\n", - "\n", - "math_agent.run(\"What is 2 raised to .123243 power?\") # this should not be traced" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "JDLzoorhGVmC" - }, - "source": [ - "# Non-Lang Chain Implementation\n", - "\n", - "\n", - "A W&B Trace is created by logging 1 or more \"spans\". A root span is expected, which can accept nested child spans, which can in turn accept their own child spans. 
A Span represents a unit of work, Spans can have type `AGENT`, `TOOL`, `LLM` or `CHAIN`\n", - "\n", - "When logging with Trace, a single W&B run can have multiple calls to a LLM, Tool, Chain or Agent logged to it, there is no need to start a new W&B run after each generation from your model or pipeline, instead each call will be appended to the Trace Table.\n", - "\n", - "In this quickstart, we will how to log a single call to an OpenAI model to W&B Trace as a single span. Then we will show how to log a more complex series of nested spans.\n", - "\n", - "## Logging with W&B Trace" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "7z98yfoqGVmD" - }, - "source": [ - "Call wandb.init to start a W&B run. Here you can pass a W&B project name as well as an entity name (if logging to a W&B Team), as well as a config and more. See wandb.init for the full list of arguments.\n", - "\n", - "You will see a Weights & Biases run start and be asked for your [Weights & Biases API key](wwww.wandb.ai/authorize). Once your enter your API key, the inputs and outputs of your Agent calls will start to be streamed to the Weights & Biases App.\n", - "\n", - "**Note:** A W&B run supports logging as many traces you needed to a single run, i.e. you can make multiple calls of `run.log` without the need to create a new run each time" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "ZcvgzZ55GVmD" - }, - "outputs": [], - "source": [ - "import wandb\n", - "\n", - "# start a wandb run to log to\n", - "wandb.init(project=\"trace-example\")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "4_3Wrg2YGVmD" - }, - "source": [ - "You can also set the entity argument in wandb.init if logging to a W&B Team.\n", - "\n", - "### Logging a single Span\n", - "Now we will query OpenAI times and log the results to a W&B Trace. 
We will log the inputs and outputs, start and end times, whether the OpenAI call was successful, the token usage, and additional metadata.\n", - "\n", - "You can see the full description of the arguments to the Trace class [here](https://soumik12345.github.io/wandb-addons/prompts/tracer/)." - ] - }, + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\"Open\n", + "" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\"Weights\n", + "" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**[Weights & Biases Prompts](https://docs.wandb.ai/guides/prompts?utm_source=code&utm_medium=colab&utm_campaign=prompts)** is a suite of LLMOps tools built for the development of LLM-powered applications.\n", + "\n", + "Use W&B Prompts to visualize and inspect the execution flow of your LLMs, analyze the inputs and outputs of your LLMs, view the intermediate results and securely store and manage your prompts and LLM chain configurations.\n", + "\n", + "#### [🪄 View Prompts In Action](https://wandb.ai/timssweeney/prompts-demo/)\n", + "\n", + "**In this notebook we will demonstrate W&B Prompts:**\n", + "\n", + "- Using our 1-line LangChain integration\n", + "- Using our Trace class when building your own LLM Pipelines\n", + "\n", + "See here for the full [W&B Prompts documentation](https://docs.wandb.ai/guides/prompts)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Installation" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!pip install \"wandb>=0.15.4\" -qqq\n", + "!pip install \"langchain>=0.0.218\" openai -qqq" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import langchain\n", + "assert langchain.__version__ >= \"0.0.218\", \"Please ensure you are using LangChain v0.0.218 or higher\"" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, 
+ "source": [ + "## Setup\n", + "\n", + "This demo requires that you have an [OpenAI key](https://platform.openai.com)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [ { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "q2pkMhpMGVmD" - }, - "outputs": [], - "source": [ - "import openai\n", - "import datetime\n", - "from wandb.sdk.data_types.trace_tree import Trace\n", - "\n", - "openai.api_key = os.environ[\"OPENAI_API_KEY\"]\n", - "\n", - "# define your conifg\n", - "model_name = \"gpt-3.5-turbo\"\n", - "temperature = 0.7\n", - "system_message = \"You are a helpful assistant that always replies in 3 concise bullet points using markdown.\"\n", - "\n", - "queries_ls = [\n", - " \"What is the capital of France?\",\n", - " \"How do I boil an egg?\" * 10000, # deliberately trigger an openai error\n", - " \"What to do if the aliens arrive?\"\n", - "]\n", - "\n", - "for query in queries_ls:\n", - " messages=[\n", - " {\"role\": \"system\", \"content\": system_message},\n", - " {\"role\": \"user\", \"content\": query}\n", - " ]\n", - "\n", - " start_time_ms = datetime.datetime.now().timestamp() * 1000\n", - " try:\n", - " response = openai.ChatCompletion.create(model=model_name,\n", - " messages=messages,\n", - " temperature=temperature\n", - " )\n", - "\n", - " end_time_ms = round(datetime.datetime.now().timestamp() * 1000) # logged in milliseconds\n", - " status=\"success\"\n", - " status_message=None,\n", - " response_text = response[\"choices\"][0][\"message\"][\"content\"]\n", - " token_usage = response[\"usage\"].to_dict()\n", - "\n", - "\n", - " except Exception as e:\n", - " end_time_ms = round(datetime.datetime.now().timestamp() * 1000) # logged in milliseconds\n", - " status=\"error\"\n", - " status_message=str(e)\n", - " response_text = \"\"\n", - " token_usage = {}\n", - "\n", - " # create a span in wandb\n", - " root_span = Trace(\n", - " name=\"root_span\",\n", - " kind=\"llm\", # kind 
can be \"llm\", \"chain\", \"agent\" or \"tool\"\n", - " status_code=status,\n", - " status_message=status_message,\n", - " metadata={\"temperature\": temperature,\n", - " \"token_usage\": token_usage,\n", - " \"model_name\": model_name},\n", - " start_time_ms=start_time_ms,\n", - " end_time_ms=end_time_ms,\n", - " inputs={\"system_prompt\": system_message, \"query\": query},\n", - " outputs={\"response\": response_text},\n", - " )\n", - "\n", - " # log the span to wandb\n", - " root_span.log(name=\"openai_trace\")" - ] - }, + "name": "stdout", + "output_type": "stream", + "text": [ + "Paste your OpenAI key from: https://platform.openai.com/account/api-keys\n", + "··········\n", + "OpenAI API key configured\n" + ] + } + ], + "source": [ + "import os\n", + "from getpass import getpass\n", + "\n", + "if os.getenv(\"OPENAI_API_KEY\") is None:\n", + " os.environ[\"OPENAI_API_KEY\"] = getpass(\"Paste your OpenAI key from: https://platform.openai.com/account/api-keys\\n\")\n", + "assert os.getenv(\"OPENAI_API_KEY\", \"\").startswith(\"sk-\"), \"This doesn't look like a valid OpenAI API key\"\n", + "print(\"OpenAI API key configured\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# W&B Prompts\n", + "\n", + "W&B Prompts consists of three main components:\n", + "\n", + "**Trace table**: Overview of the inputs and outputs of a chain.\n", + "\n", + "**Trace timeline**: Displays the execution flow of the chain and is color-coded according to component types.\n", + "\n", + "**Model architecture**: View details about the structure of the chain and the parameters used to initialize each component of the chain.\n", + "\n", + "After running this section, you will see a new panel automatically created in your workspace, showing each execution, the trace, and the model architecture" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\"Weights" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Maths 
with LangChain" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Set the `LANGCHAIN_WANDB_TRACING` environment variable as well as any other relevant [W&B environment variables](https://docs.wandb.ai/guides/track/environment-variables). This could include a W&B project name, team name, and more. See [wandb.init](https://docs.wandb.ai/ref/python/init) for a full list of arguments." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "os.environ[\"LANGCHAIN_WANDB_TRACING\"] = \"true\"\n", + "os.environ[\"WANDB_PROJECT\"] = \"langchain-testing\"" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from langchain.chat_models import ChatOpenAI\n", + "from langchain.agents import load_tools, initialize_agent, AgentType" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Create a standard math Agent using LangChain" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "llm = ChatOpenAI(temperature=0)\n", + "tools = load_tools([\"llm-math\"], llm=llm)\n", + "math_agent = initialize_agent(tools,\n", + " llm,\n", + " agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Use LangChain as normal by calling your Agent.\n", + "\n", + " You will see a Weights & Biases run start and you will be asked for your [Weights & Biases API key](https://wandb.ai/authorize). Once you enter your API key, the inputs and outputs of your Agent calls will start to be streamed to the Weights & Biases App." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [ { - "cell_type": "markdown", - "metadata": { - "id": "XFcwFgaDGVmD" - }, - "source": [ - "### Logging a LLM pipeline using nested Spans\n", - "\n", - "In this example we will simulate an Agent being called, which then calls a LLM Chain, which calls an OpenAI LLM and then the Agent \"calls\" a Calculator tool.\n", - "\n", - "The inputs, outputs and metadata for each step in the execution of our \"Agent\" is logged in its own span. Spans can have child" - ] + "name": "stderr", + "output_type": "stream", + "text": [ + "\u001b[34m\u001b[1mwandb\u001b[0m: Streaming LangChain activity to W&B at https://wandb.ai/carey/langchain-testing/runs/lcznj5lg\n", + "\u001b[34m\u001b[1mwandb\u001b[0m: `WandbTracer` is currently in beta.\n", + "\u001b[34m\u001b[1mwandb\u001b[0m: Please report any issues to https://github.com/wandb/wandb/issues with the tag `langchain`.\n" + ] }, { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "ACMaGuYUGVmD" - }, - "outputs": [], - "source": [ - "import time\n", - "\n", - "openai.api_key = os.environ[\"OPENAI_API_KEY\"]\n", - "\n", - "# The query our agent has to answer\n", - "query = \"How many days until the next US election?\"\n", - "\n", - "# part 1 - an Agent is started...\n", - "start_time_ms = round(datetime.datetime.now().timestamp() * 1000)\n", - "\n", - "root_span = Trace(\n", - " name=\"MyAgent\",\n", - " kind=\"agent\",\n", - " start_time_ms=start_time_ms,\n", - " metadata={\"user\": \"optimus_12\"})\n", - "\n", - "\n", - "# part 2 - The Agent calls into a LLMChain..\n", - "chain_span = Trace(\n", - " name=\"LLMChain\",\n", - " kind=\"chain\",\n", - " start_time_ms=start_time_ms)\n", - "\n", - "# add the Chain span as a child of the root\n", - "root_span.add_child(chain_span)\n", - "\n", - "\n", - "# part 3 - the LLMChain calls an OpenAI LLM...\n", - "messages=[\n", - " {\"role\": \"system\", \"content\": 
system_message},\n", - " {\"role\": \"user\", \"content\": query}\n", - "]\n", - "\n", - "response = openai.ChatCompletion.create(model=model_name,\n", - " messages=messages,\n", - " temperature=temperature)\n", - "\n", - "llm_end_time_ms = round(datetime.datetime.now().timestamp() * 1000)\n", - "response_text = response[\"choices\"][0][\"message\"][\"content\"]\n", - "token_usage = response[\"usage\"].to_dict()\n", - "\n", - "llm_span = Trace(\n", - " name=\"OpenAI\",\n", - " kind=\"llm\",\n", - " status_code=\"success\",\n", - " metadata={\"temperature\":temperature,\n", - " \"token_usage\": token_usage,\n", - " \"model_name\":model_name},\n", - " start_time_ms=start_time_ms,\n", - " end_time_ms=llm_end_time_ms,\n", - " inputs={\"system_prompt\":system_message, \"query\":query},\n", - " outputs={\"response\": response_text},\n", - " )\n", - "\n", - "# add the LLM span as a child of the Chain span...\n", - "chain_span.add_child(llm_span)\n", - "\n", - "# update the end time of the Chain span\n", - "chain_span.add_inputs_and_outputs(\n", - " inputs={\"query\":query},\n", - " outputs={\"response\": response_text})\n", - "\n", - "# update the Chain span's end time\n", - "chain_span._span.end_time_ms = llm_end_time_ms\n", - "\n", - "\n", - "# part 4 - the Agent then calls a Tool...\n", - "time.sleep(3)\n", - "days_to_election = 117\n", - "tool_end_time_ms = round(datetime.datetime.now().timestamp() * 1000)\n", - "\n", - "# create a Tool span\n", - "tool_span = Trace(\n", - " name=\"Calculator\",\n", - " kind=\"tool\",\n", - " status_code=\"success\",\n", - " start_time_ms=llm_end_time_ms,\n", - " end_time_ms=tool_end_time_ms,\n", - " inputs={\"input\": response_text},\n", - " outputs={\"result\": days_to_election})\n", - "\n", - "# add the TOOL span as a child of the root\n", - "root_span.add_child(tool_span)\n", - "\n", - "\n", - "# part 5 - the final results from the tool are added\n", - "root_span.add_inputs_and_outputs(inputs={\"query\": query},\n", - " 
outputs={\"result\": days_to_election})\n", - "root_span._span.end_time_ms = tool_end_time_ms\n", - "\n", - "\n", - "# part 6 - log all spans to W&B by logging the root span\n", - "root_span.log(name=\"openai_trace\")" - ] - }, + "name": "stdout", + "output_type": "stream", + "text": [ + "LLMMathChain._evaluate(\"\n", + "import math\n", + "math.sqrt(5.4)\n", + "\") raised error: invalid syntax (, line 1). Please try again with a valid numerical expression\n", + "0.005720801417544866\n", + "0.15096209512635608\n" + ] + } + ], + "source": [ + "# some sample maths questions\n", + "questions = [\n", + " \"Find the square root of 5.4.\",\n", + " \"What is 3 divided by 7.34 raised to the power of pi?\",\n", + " \"What is the sin of 0.47 radians, divided by the cube root of 27?\"\n", + "]\n", + "\n", + "for question in questions:\n", + " try:\n", + " # call your Agent as normal\n", + " answer = math_agent.run(question)\n", + " print(answer)\n", + " except Exception as e:\n", + " # any errors will also be logged to Weights & Biases\n", + " print(e)\n", + " pass" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Once each Agent execution completes, all calls in your LangChain object will be logged to Weights & Biases" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### LangChain Context Manager\n", + "Depending on your use case, you might instead prefer to use a context manager to manage your logging to W&B.\n", + "\n", + "**✨ New: Custom columns** can be logged directly to W&B to display in the same Trace Table with this snippet:\n", + "```python\n", + "import wandb\n", + "wandb.log(custom_metrics_dict, commit=False)\n", + "```\n", + "Use `commit=False` to make sure that metadata is logged to the same row of the Trace Table as the LangChain output." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [ { - "cell_type": "markdown", - "metadata": { - "id": "nBFVwawPGVmD" - }, - "source": [ - "Once each Agent execution completes, all calls in your LangChain object will be logged to Weights & Biases" + "data": { + "text/plain": [ + "'1.0891804557407723'" ] + }, + "execution_count": null, + "metadata": {}, + "output_type": "execute_result" } - ], - "metadata": { - "accelerator": "GPU", - "colab": { - "include_colab_link": true, - "provenance": [] - }, - "kernelspec": { - "display_name": "Python 3", - "name": "python3" - } + ], + "source": [ + "from langchain.callbacks import wandb_tracing_enabled\n", + "import wandb # To enable custom column logging with wandb.log()\n", + "\n", + "# unset the environment variable and use a context manager instead\n", + "if \"LANGCHAIN_WANDB_TRACING\" in os.environ:\n", + " del os.environ[\"LANGCHAIN_WANDB_TRACING\"]\n", + "\n", + "# enable tracing using a context manager\n", + "with wandb_tracing_enabled():\n", + " for i in range(10):\n", + " # Log any custom columns you'd like to add to the Trace Table\n", + " wandb.log({\"custom_column\": i}, commit=False)\n", + " try:\n", + " math_agent.run(f\"What is {i} raised to .123243 power?\") # this should be traced\n", + " except Exception:\n", + " pass\n", + "\n", + "math_agent.run(\"What is 2 raised to .123243 power?\") # this should not be traced" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Non-LangChain Implementation\n", + "\n", + "\n", + "A W&B Trace is created by logging 1 or more \"spans\". A root span is expected, which can accept nested child spans, which can in turn accept their own child spans. 
A Span represents a unit of work. Spans can have type `AGENT`, `TOOL`, `LLM` or `CHAIN`.\n", + "\n", + "When logging with Trace, a single W&B run can have multiple calls to an LLM, Tool, Chain or Agent logged to it. There is no need to start a new W&B run after each generation from your model or pipeline; instead, each call is appended to the Trace Table.\n", + "\n", + "In this quickstart, we will show how to log a single call to an OpenAI model to W&B Trace as a single span. Then we will show how to log a more complex series of nested spans.\n", + "\n", + "## Logging with W&B Trace" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Call `wandb.init` to start a W&B run. Here you can pass a W&B project name, an entity name (if logging to a W&B Team), a config and more. See `wandb.init` for the full list of arguments.\n", + "\n", + "You will see a Weights & Biases run start and be asked for your [Weights & Biases API key](https://wandb.ai/authorize). Once you enter your API key, the inputs and outputs of your Agent calls will start to be streamed to the Weights & Biases App.\n", + "\n", + "**Note:** A W&B run supports logging as many traces as you need to a single run, i.e. you can make multiple calls of `run.log` without the need to create a new run each time." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import wandb\n", + "\n", + "# start a wandb run to log to\n", + "wandb.init(project=\"trace-example\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You can also set the entity argument in `wandb.init` if logging to a W&B Team.\n", + "\n", + "### Logging a single Span\n", + "Now we will query OpenAI several times and log the results to a W&B Trace. 
We will log the inputs and outputs, start and end times, whether the OpenAI call was successful, the token usage, and additional metadata.\n", + "\n", + "You can see the full description of the arguments to the Trace class [here](https://soumik12345.github.io/wandb-addons/prompts/tracer/)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import openai\n", + "import datetime\n", + "from wandb.sdk.data_types.trace_tree import Trace\n", + "\n", + "openai.api_key = os.environ[\"OPENAI_API_KEY\"]\n", + "\n", + "# define your config\n", + "model_name = \"gpt-3.5-turbo\"\n", + "temperature = 0.7\n", + "system_message = \"You are a helpful assistant that always replies in 3 concise bullet points using markdown.\"\n", + "\n", + "queries_ls = [\n", + " \"What is the capital of France?\",\n", + " \"How do I boil an egg?\" * 10000, # deliberately trigger an openai error\n", + " \"What to do if the aliens arrive?\"\n", + "]\n", + "\n", + "for query in queries_ls:\n", + " messages=[\n", + " {\"role\": \"system\", \"content\": system_message},\n", + " {\"role\": \"user\", \"content\": query}\n", + " ]\n", + "\n", + " start_time_ms = round(datetime.datetime.now().timestamp() * 1000) # logged in milliseconds\n", + " try:\n", + " response = openai.ChatCompletion.create(model=model_name,\n", + " messages=messages,\n", + " temperature=temperature\n", + " )\n", + "\n", + " end_time_ms = round(datetime.datetime.now().timestamp() * 1000) # logged in milliseconds\n", + " status=\"success\"\n", + " status_message=None\n", + " response_text = response[\"choices\"][0][\"message\"][\"content\"]\n", + " token_usage = response[\"usage\"].to_dict()\n", + "\n", + "\n", + " except Exception as e:\n", + " end_time_ms = round(datetime.datetime.now().timestamp() * 1000) # logged in milliseconds\n", + " status=\"error\"\n", + " status_message=str(e)\n", + " response_text = \"\"\n", + " token_usage = {}\n", + "\n", + " # create a span in 
wandb\n", + " root_span = 
Trace(\n", + " name=\"root_span\",\n", + " kind=\"llm\", # kind can be \"llm\", \"chain\", \"agent\" or \"tool\"\n", + " status_code=status,\n", + " status_message=status_message,\n", + " metadata={\"temperature\": temperature,\n", + " \"token_usage\": token_usage,\n", + " \"model_name\": model_name},\n", + " start_time_ms=start_time_ms,\n", + " end_time_ms=end_time_ms,\n", + " inputs={\"system_prompt\": system_message, \"query\": query},\n", + " outputs={\"response\": response_text},\n", + " )\n", + "\n", + " # log the span to wandb\n", + " root_span.log(name=\"openai_trace\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Logging an LLM pipeline using nested Spans\n", + "\n", + "In this example we will simulate an Agent being called, which then calls an LLM Chain, which calls an OpenAI LLM, and then the Agent \"calls\" a Calculator tool.\n", + "\n", + "The inputs, outputs and metadata for each step in the execution of our \"Agent\" are logged in its own span. 
Spans can have child spans." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import time\n", + "\n", + "openai.api_key = os.environ[\"OPENAI_API_KEY\"]\n", + "\n", + "# The query our agent has to answer\n", + "query = \"How many days until the next US election?\"\n", + "\n", + "# part 1 - an Agent is started...\n", + "start_time_ms = round(datetime.datetime.now().timestamp() * 1000)\n", + "\n", + "root_span = Trace(\n", + " name=\"MyAgent\",\n", + " kind=\"agent\",\n", + " start_time_ms=start_time_ms,\n", + " metadata={\"user\": \"optimus_12\"})\n", + "\n", + "\n", + "# part 2 - the Agent calls into an LLMChain...\n", + "chain_span = Trace(\n", + " name=\"LLMChain\",\n", + " kind=\"chain\",\n", + " start_time_ms=start_time_ms)\n", + "\n", + "# add the Chain span as a child of the root\n", + "root_span.add_child(chain_span)\n", + "\n", + "\n", + "# part 3 - the LLMChain calls an OpenAI LLM...\n", + "messages=[\n", + " {\"role\": \"system\", \"content\": system_message},\n", + " {\"role\": \"user\", \"content\": query}\n", + "]\n", + "\n", + "response = openai.ChatCompletion.create(model=model_name,\n", + " messages=messages,\n", + " temperature=temperature)\n", + "\n", + "llm_end_time_ms = round(datetime.datetime.now().timestamp() * 1000)\n", + "response_text = response[\"choices\"][0][\"message\"][\"content\"]\n", + "token_usage = response[\"usage\"].to_dict()\n", + "\n", + "llm_span = Trace(\n", + " name=\"OpenAI\",\n", + " kind=\"llm\",\n", + " status_code=\"success\",\n", + " metadata={\"temperature\":temperature,\n", + " \"token_usage\": token_usage,\n", + " \"model_name\":model_name},\n", + " start_time_ms=start_time_ms,\n", + " end_time_ms=llm_end_time_ms,\n", + " inputs={\"system_prompt\":system_message, \"query\":query},\n", + " outputs={\"response\": response_text},\n", + " )\n", + "\n", + "# add the LLM span as a child of the Chain span...\n", + "chain_span.add_child(llm_span)\n", + "\n", + "# 
add the inputs and outputs to the Chain span\n", + "chain_span.add_inputs_and_outputs(\n", + " inputs={\"query\":query},\n", + " outputs={\"response\": response_text})\n", + "\n", + "# update the Chain span's end time\n", + "chain_span._span.end_time_ms = llm_end_time_ms\n", + "\n", + "\n", + "# part 4 - the Agent then calls a Tool...\n", + "time.sleep(3)\n", + "days_to_election = 117\n", + "tool_end_time_ms = round(datetime.datetime.now().timestamp() * 1000)\n", + "\n", + "# create a Tool span\n", + "tool_span = Trace(\n", + " name=\"Calculator\",\n", + " kind=\"tool\",\n", + " status_code=\"success\",\n", + " start_time_ms=llm_end_time_ms,\n", + " end_time_ms=tool_end_time_ms,\n", + " inputs={\"input\": response_text},\n", + " outputs={\"result\": days_to_election})\n", + "\n", + "# add the TOOL span as a child of the root\n", + "root_span.add_child(tool_span)\n", + "\n", + "\n", + "# part 5 - the final results from the tool are added\n", + "root_span.add_inputs_and_outputs(inputs={\"query\": query},\n", + " outputs={\"result\": days_to_election})\n", + "root_span._span.end_time_ms = tool_end_time_ms\n", + "\n", + "\n", + "# part 6 - log all spans to W&B by logging the root span\n", + "root_span.log(name=\"openai_trace\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Once each Agent execution completes, all calls in your LangChain object will be logged to Weights & Biases" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "include_colab_link": true, + "provenance": [] }, - "nbformat": 4, - "nbformat_minor": 0 + "kernelspec": { + "display_name": "Python 3", + "name": "python3" + } + }, + "nbformat": 4, + "nbformat_minor": 0 }