diff --git a/.github/vale/styles/Vocab/OpenSearch/Products/accept.txt b/.github/vale/styles/Vocab/OpenSearch/Products/accept.txt index 16e7562dcc..83e9aee603 100644 --- a/.github/vale/styles/Vocab/OpenSearch/Products/accept.txt +++ b/.github/vale/styles/Vocab/OpenSearch/Products/accept.txt @@ -8,6 +8,7 @@ Amazon SageMaker Ansible Auditbeat AWS Cloud +Cohere Command Cognito Dashboards Query Language Data Prepper diff --git a/_dashboards/dashboards-assistant/index.md b/_dashboards/dashboards-assistant/index.md index d44e6b58e8..d928f58659 100644 --- a/_dashboards/dashboards-assistant/index.md +++ b/_dashboards/dashboards-assistant/index.md @@ -122,3 +122,4 @@ The following screenshot shows a saved conversation, along with actions you can - [Getting started guide for OpenSearch Assistant in OpenSearch Dashboards](https://github.com/opensearch-project/dashboards-assistant/blob/main/GETTING_STARTED_GUIDE.md) - [OpenSearch Assistant configuration through the REST API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/opensearch-assistant/) +- [Build your own chatbot]({{site.url}}{{site.baseurl}}/ml-commons-plugin/tutorials/build-chatbot/) \ No newline at end of file diff --git a/_ml-commons-plugin/agents-tools/index.md b/_ml-commons-plugin/agents-tools/index.md index ba88edef2f..009906d4cf 100644 --- a/_ml-commons-plugin/agents-tools/index.md +++ b/_ml-commons-plugin/agents-tools/index.md @@ -18,7 +18,7 @@ An _agent_ is a coordinator that uses a large language model (LLM) to solve a pr - [_Flow agent_](#flow-agents): Runs tools sequentially, in the order specified in its configuration. The workflow of a flow agent is fixed. Useful for retrieval-augmented generation (RAG). - [_Conversational flow agent_](#conversational-flow-agents): Runs tools sequentially, in the order specified in its configuration. The workflow of a conversational flow agent is fixed. Stores conversation history so that users can ask follow-up questions. Useful for creating a chatbot. 
-- [_Conversational agent_](#conversational-agents): Reasons in order to provide a response based on the available knowledge, including the LLM knowledge base and a set of tools provided to the LLM. Stores conversation history so that users can ask follow-up questions. The workflow of a conversational agent is variable, based on follow-up questions. For specific questions, uses the Chain-of-Thought (CoT) process to select the best tool from the configured tools for providing a response to the question. Useful for creating a chatbot that employs RAG. +- [_Conversational agent_](#conversational-agents): Reasons in order to provide a response based on the available knowledge, including the LLM knowledge base and a set of tools provided to the LLM. The LLM reasons iteratively to decide what action to take until it obtains the final answer or reaches the iteration limit. Stores conversation history so that users can ask follow-up questions. The workflow of a conversational agent is variable, based on follow-up questions. For specific questions, uses the Chain-of-Thought (CoT) process to select the best tool from the configured tools for providing a response to the question. Useful for creating a chatbot that employs RAG. ### Flow agents diff --git a/_ml-commons-plugin/index.md b/_ml-commons-plugin/index.md index c936f36161..f0355b6be3 100644 --- a/_ml-commons-plugin/index.md +++ b/_ml-commons-plugin/index.md @@ -30,4 +30,8 @@ ML Commons supports various algorithms to help train ML models and make predicti ## ML Commons API -ML Commons provides its own set of REST APIs. For more information, see [ML Commons API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/index/). \ No newline at end of file +ML Commons provides its own set of REST APIs. For more information, see [ML Commons API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/index/). 
+ +## Tutorials + +Using the OpenSearch ML framework, you can build various applications, from implementing conversational search to building your own chatbot. For more information, see [Tutorials]({{site.url}}{{site.baseurl}}/ml-commons-plugin/tutorials/index/). \ No newline at end of file diff --git a/_ml-commons-plugin/tutorials/build-chatbot.md b/_ml-commons-plugin/tutorials/build-chatbot.md new file mode 100644 index 0000000000..1e51298106 --- /dev/null +++ b/_ml-commons-plugin/tutorials/build-chatbot.md @@ -0,0 +1,910 @@ +--- +layout: default +title: Build your own chatbot +parent: Tutorials +nav_order: 60 +--- + +# Build your own chatbot + +This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/ml-commons/issues/1161). +{: .warning} + +Sometimes a large language model (LLM) cannot answer a question right away. For example, an LLM can't tell you how many errors there are in your log index for last week because its knowledge base does not contain your proprietary data. In this case, you need to provide additional information to an LLM in a subsequent call. You can use an agent to solve such complex problems. The agent can run tools to obtain more information from configured data sources and send the additional information to the LLM as context. + +This tutorial describes how to build your own chatbot in OpenSearch using a `conversational` agent. For more information about agents, see [Agents and tools]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/index/). + +Replace the placeholders starting with the prefix `your_` with your own values. +{: .note} + +## Prerequisite + +Log in to the OpenSearch Dashboards home page, select **Add sample data**, and add the **Sample eCommerce orders** data. 
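If you prefer to verify the prerequisite through the REST API rather than through the UI, you can check that the sample index was created. The index name below is the one used by the sample data installer and referenced later in this tutorial:

```json
GET _cat/indices/opensearch_dashboards_sample_data_ecommerce?v
```
{% include copy-curl.html %}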
## Step 1: Configure a knowledge base

Meet the prerequisite and follow Step 1 of the [RAG with a conversational flow agent tutorial]({{site.url}}{{site.baseurl}}/ml-commons-plugin/tutorials/rag-conversational-agent/) to configure the `test_population_data` knowledge base index, which contains US city population data.

Create an ingest pipeline:

```json
PUT /_ingest/pipeline/test_stock_price_data_pipeline
{
  "description": "text embedding pipeline",
  "processors": [
    {
      "text_embedding": {
        "model_id": "your_text_embedding_model_id",
        "field_map": {
          "stock_price_history": "stock_price_history_embedding"
        }
      }
    }
  ]
}
```
{% include copy-curl.html %}

Create the `test_stock_price_data` index, which contains historical stock price data:

```json
PUT test_stock_price_data
{
  "mappings": {
    "properties": {
      "stock_price_history": {
        "type": "text"
      },
      "stock_price_history_embedding": {
        "type": "knn_vector",
        "dimension": 384
      }
    }
  },
  "settings": {
    "index": {
      "knn.space_type": "cosinesimil",
      "default_pipeline": "test_stock_price_data_pipeline",
      "knn": "true"
    }
  }
}
```
{% include copy-curl.html %}

Ingest data into the index:

```json
POST _bulk
{"index": {"_index": "test_stock_price_data"}}
{"stock_price_history": "This is the historical monthly stock price record for Amazon.com, Inc.
(AMZN) with CSV format.\nDate,Open,High,Low,Close,Adj Close,Volume\n2023-03-01,93.870003,103.489998,88.120003,103.290001,103.290001,1349240300\n2023-04-01,102.300003,110.860001,97.709999,105.449997,105.449997,1224083600\n2023-05-01,104.949997,122.919998,101.150002,120.580002,120.580002,1432891600\n2023-06-01,120.690002,131.490005,119.930000,130.360001,130.360001,1242648800\n2023-07-01,130.820007,136.649994,125.919998,133.679993,133.679993,1058754800\n2023-08-01,133.550003,143.630005,126.410004,138.009995,138.009995,1210426200\n2023-09-01,139.460007,145.860001,123.040001,127.120003,127.120003,1120271900\n2023-10-01,127.279999,134.479996,118.349998,133.089996,133.089996,1224564700\n2023-11-01,133.960007,149.259995,133.710007,146.089996,146.089996,1025986900\n2023-12-01,146.000000,155.630005,142.809998,151.940002,151.940002,931128600\n2024-01-01,151.539993,161.729996,144.050003,155.199997,155.199997,953344900\n2024-02-01,155.869995,175.000000,155.619995,174.449997,174.449997,437720800\n"}
{"index": {"_index": "test_stock_price_data"}}
{"stock_price_history": "This is the historical monthly stock price record for Apple Inc.
(AAPL) with CSV format.\nDate,Open,High,Low,Close,Adj Close,Volume\n2023-03-01,146.830002,165.000000,143.899994,164.899994,164.024475,1520266600\n2023-04-01,164.270004,169.850006,159.779999,169.679993,168.779099,969709700\n2023-05-01,169.279999,179.350006,164.309998,177.250000,176.308914,1275155500\n2023-06-01,177.699997,194.479996,176.929993,193.970001,193.207016,1297101100\n2023-07-01,193.779999,198.229996,186.600006,196.449997,195.677261,996066400\n2023-08-01,196.240005,196.729996,171.960007,187.869995,187.130997,1322439400\n2023-09-01,189.490005,189.979996,167.619995,171.210007,170.766846,1337586600\n2023-10-01,171.220001,182.339996,165.669998,170.770004,170.327972,1172719600\n2023-11-01,171.000000,192.929993,170.119995,189.949997,189.458313,1099586100\n2023-12-01,190.330002,199.619995,187.449997,192.529999,192.284637,1062774800\n2024-01-01,187.149994,196.380005,180.169998,184.399994,184.164993,1187219300\n2024-02-01,183.990005,191.050003,179.250000,188.850006,188.609329,420063900\n"}
{"index": {"_index": "test_stock_price_data"}}
{"stock_price_history": "This is the historical monthly stock price record for NVIDIA Corporation (NVDA) with CSV format.\nDate,Open,High,Low,Close,Adj
Close,Volume\n2023-03-01,231.919998,278.339996,222.970001,277.769989,277.646820,1126373100\n2023-04-01,275.089996,281.100006,262.200012,277.489990,277.414032,743592100\n2023-05-01,278.399994,419.380005,272.399994,378.339996,378.236420,1169636000\n2023-06-01,384.890015,439.899994,373.559998,423.019989,422.904175,1052209200\n2023-07-01,425.170013,480.880005,413.459991,467.290009,467.210449,870489500\n2023-08-01,464.600006,502.660004,403.109985,493.549988,493.465942,1363143600\n2023-09-01,497.619995,498.000000,409.799988,434.989990,434.915924,857510100\n2023-10-01,440.299988,476.089996,392.299988,407.799988,407.764130,1013917700\n2023-11-01,408.839996,505.480011,408.690002,467.700012,467.658905,914386300\n2023-12-01,465.250000,504.329987,450.100006,495.220001,495.176453,740951700\n2024-01-01,492.440002,634.929993,473.200012,615.270020,615.270020,970385300\n2024-02-01,621.000000,721.849976,616.500000,721.330017,721.330017,355346500\n"}
{"index": {"_index": "test_stock_price_data"}}
{"stock_price_history": "This is the historical monthly stock price record for Meta Platforms, Inc.
(META) with CSV format.\n\nDate,Open,High,Low,Close,Adj Close,Volume\n2023-03-01,174.589996,212.169998,171.429993,211.940002,211.940002,690053000\n2023-04-01,208.839996,241.690002,207.130005,240.320007,240.320007,446687900\n2023-05-01,238.619995,268.649994,229.850006,264.720001,264.720001,486968500\n2023-06-01,265.899994,289.790009,258.880005,286.980011,286.980011,480979900\n2023-07-01,286.700012,326.200012,284.850006,318.600006,318.600006,624605100\n2023-08-01,317.540009,324.140015,274.380005,295.890015,295.890015,423147800\n2023-09-01,299.369995,312.869995,286.790009,300.209991,300.209991,406686600\n2023-10-01,302.739990,330.540009,279.399994,301.269989,301.269989,511307900\n2023-11-01,301.850006,342.920013,301.850006,327.149994,327.149994,329270500\n2023-12-01,325.480011,361.899994,313.660004,353.959991,353.959991,332813800\n2024-01-01,351.320007,406.359985,340.010010,390.140015,390.140015,347020200\n2024-02-01,393.940002,485.959991,393.049988,473.279999,473.279999,294260900\n"}
{"index": {"_index": "test_stock_price_data"}}
{"stock_price_history": "This is the historical monthly stock price record for Microsoft Corporation (MSFT) with CSV format.\n\nDate,Open,High,Low,Close,Adj
Close,Volume\n2023-03-01,250.759995,289.269989,245.610001,288.299988,285.953064,747635000\n2023-04-01,286.519989,308.929993,275.369995,307.260010,304.758759,551497100\n2023-05-01,306.970001,335.940002,303.399994,328.390015,325.716766,600807200\n2023-06-01,325.929993,351.470001,322.500000,340.540009,338.506226,547588700\n2023-07-01,339.190002,366.779999,327.000000,335.920013,333.913818,666764400\n2023-08-01,335.190002,338.540009,311.549988,327.760010,325.802582,479456700\n2023-09-01,331.309998,340.859985,309.450012,315.750000,314.528809,416680700\n2023-10-01,316.279999,346.200012,311.209991,338.109985,336.802307,540907000\n2023-11-01,339.790009,384.299988,339.649994,378.910004,377.444519,563880300\n2023-12-01,376.760010,378.160004,362.899994,376.040009,375.345886,522003700\n2024-01-01,373.859985,415.320007,366.500000,397.579987,396.846130,528399000\n2024-02-01,401.829987,420.820007,401.799988,409.489990,408.734131,237639700\n"}
{"index": {"_index": "test_stock_price_data"}}
{"stock_price_history": "This is the historical monthly stock price record for Alphabet Inc.
(GOOG) with CSV format.\n\nDate,Open,High,Low,Close,Adj Close,Volume\n2023-03-01,90.160004,107.510002,89.769997,104.000000,104.000000,725477100\n2023-04-01,102.669998,109.629997,102.379997,108.220001,108.220001,461670700\n2023-05-01,107.720001,127.050003,104.500000,123.370003,123.370003,620317400\n2023-06-01,123.500000,129.550003,116.910004,120.970001,120.970001,521386300\n2023-07-01,120.320000,134.070007,115.830002,133.110001,133.110001,525456900\n2023-08-01,130.854996,138.399994,127.000000,137.350006,137.350006,463482000\n2023-09-01,138.429993,139.929993,128.190002,131.850006,131.850006,389593900\n2023-10-01,132.154999,142.380005,121.459999,125.300003,125.300003,514877100\n2023-11-01,125.339996,141.100006,124.925003,133.919998,133.919998,405635900\n2023-12-01,133.320007,143.945007,129.399994,140.929993,140.929993,482059400\n2024-01-01,139.600006,155.199997,136.850006,141.800003,141.800003,428771200\n2024-02-01,143.690002,150.695007,138.169998,147.139999,147.139999,231934100\n"} +``` +{% include copy-curl.html %} + +## Step 2: Prepare an LLM + +This tutorial uses the [Amazon Bedrock Claude model](https://aws.amazon.com/bedrock/claude/). You can also use other LLMs. For more information, see [Connecting to externally hosted models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/index/). 
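Before creating the connector, you can optionally confirm that the knowledge base indexes from Step 1 return relevant results. The following query is a quick sanity check against the stock price index; it assumes the text embedding model ID from Step 1:

```json
GET test_stock_price_data/_search
{
  "query": {
    "neural": {
      "stock_price_history_embedding": {
        "query_text": "Amazon stock price history",
        "model_id": "your_text_embedding_model_id",
        "k": 3
      }
    }
  },
  "_source": ["stock_price_history"]
}
```
{% include copy-curl.html %}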
Create a connector for the model:

```json
POST /_plugins/_ml/connectors/_create
{
  "name": "Bedrock Claude instant-v1 connector",
  "description": "The connector to the Bedrock service for the Claude model",
  "version": 1,
  "protocol": "aws_sigv4",
  "parameters": {
    "region": "us-east-1",
    "service_name": "bedrock",
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens_to_sample": 8000,
    "temperature": 0.0001,
    "response_filter": "$.completion",
    "stop_sequences": ["\n\nHuman:","\nObservation:","\n\tObservation:","\nObservation","\n\tObservation","\n\nQuestion"]
  },
  "credential": {
    "access_key": "your_aws_access_key",
    "secret_key": "your_aws_secret_key",
    "session_token": "your_aws_session_token"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "url": "https://bedrock-runtime.us-east-1.amazonaws.com/model/anthropic.claude-instant-v1/invoke",
      "headers": {
        "content-type": "application/json",
        "x-amz-content-sha256": "required"
      },
      "request_body": "{\"prompt\":\"${parameters.prompt}\", \"stop_sequences\": ${parameters.stop_sequences}, \"max_tokens_to_sample\":${parameters.max_tokens_to_sample}, \"temperature\":${parameters.temperature}, \"anthropic_version\":\"${parameters.anthropic_version}\" }"
    }
  ]
}
```
{% include copy-curl.html %}

Note the connector ID; you'll use it to register the model.

Register the model:

```json
POST /_plugins/_ml/models/_register
{
  "name": "Bedrock Claude Instant model",
  "function_name": "remote",
  "description": "Bedrock Claude instant-v1 model",
  "connector_id": "your_connector_id"
}
```
{% include copy-curl.html %}

Note the LLM model ID; you'll use it in the following steps.

Deploy the model:

```json
POST /_plugins/_ml/models/your_LLM_model_id/_deploy
```
{% include copy-curl.html %}

Test the model:

```json
POST /_plugins/_ml/models/your_LLM_model_id/_predict
{
  "parameters": {
    "prompt": "\n\nHuman: how are you?
\n\nAssistant:"
  }
}
```
{% include copy-curl.html %}

## Step 3: Create an agent with the default prompt

Next, create and test an agent.

### Create an agent

Create an agent of the `conversational` type.

The agent is configured with the following information:

- Meta information: `name`, `type`, `description`.
- LLM information: The agent uses an LLM to reason and select the next step, including choosing an appropriate tool and preparing the tool input.
- Tools: A tool is a function that can be executed by the agent. Each tool can define its own `name`, `description`, and `parameters`.
- Memory: Stores chat messages. Currently, OpenSearch only supports one memory type: `conversation_index`.

The agent contains the following parameters:

- `type`: Set this parameter to `conversational`. This agent type has a built-in prompt. To override it with your own prompt, see [Step 4](#step-4-optional-create-an-agent-with-a-custom-prompt).
- `app_type`: Specify this parameter for reference purposes in order to differentiate between multiple agents.
- `llm`: Defines the LLM configuration:
  - `"max_iteration": 5`: The agent runs the LLM a maximum of five times.
  - `"response_filter": "$.completion"`: Needed to retrieve the LLM answer from the Bedrock Claude model response.
  - `"message_history_limit": 5`: The agent retrieves a maximum of the five most recent historical messages and adds them to the LLM context. Set this parameter to `0` to omit message history from the context.
  - `disable_trace`: If `true`, then the agent does not store trace data in memory. Trace data is included in each message and provides a detailed recount of the steps performed while generating the message.
- `memory`: Defines how to store messages. Currently, OpenSearch only supports the `conversation_index` memory, which stores messages in a memory index.
- Tools:
  - An LLM will reason to decide which tool to run and will prepare the tool's input.
+ - To include the tool's output in the response, specify `"include_output_in_agent_response": true`. In this tutorial, you will include the `PPLTool` output in the response (see the example response in [Test the agent](#test-the-agent)). + - By default, the tool's `name` is the same as the tool's `type`, and each tool has a default description. You can override the tool's `name` and `description`. + - Each tool in the `tools` list must have a unique name. For example, the following demo agent defines two tools of the `VectorDBTool` type with different names (`population_data_knowledge_base` and `stock_price_data_knowledge_base`). Each tool has a custom description so that the LLM can easily understand what the tool does. + + For more information about tools, see [Tools]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/tools/index/). + +This example request configures several sample tools in an agent. You can configure other tools that are relevant to your use case as needed. 
{: .note}

Register the agent:

```json
POST _plugins/_ml/agents/_register
{
  "name": "Chat Agent with Claude",
  "type": "conversational",
  "description": "this is a test agent",
  "app_type": "os_chat",
  "llm": {
    "model_id": "your_llm_model_id_from_step2",
    "parameters": {
      "max_iteration": 5,
      "response_filter": "$.completion",
      "message_history_limit": 5,
      "disable_trace": false
    }
  },
  "memory": {
    "type": "conversation_index"
  },
  "tools": [
    {
      "type": "PPLTool",
      "parameters": {
        "model_id": "your_llm_model_id_from_step2",
        "model_type": "CLAUDE",
        "execute": true
      },
      "include_output_in_agent_response": true
    },
    {
      "type": "VisualizationTool",
      "parameters": {
        "index": ".kibana"
      },
      "include_output_in_agent_response": true
    },
    {
      "type": "VectorDBTool",
      "name": "population_data_knowledge_base",
      "description": "This tool provides population data of US cities.",
      "parameters": {
        "input": "${parameters.question}",
        "index": "test_population_data",
        "source_field": [
          "population_description"
        ],
        "model_id": "your_embedding_model_id_from_step1",
        "embedding_field": "population_description_embedding",
        "doc_size": 3
      }
    },
    {
      "type": "VectorDBTool",
      "name": "stock_price_data_knowledge_base",
      "description": "This tool provides stock price data.",
      "parameters": {
        "input": "${parameters.question}",
        "index": "test_stock_price_data",
        "source_field": [
          "stock_price_history"
        ],
        "model_id": "your_embedding_model_id_from_step1",
        "embedding_field": "stock_price_history_embedding",
        "doc_size": 3
      }
    },
    {
      "type": "CatIndexTool",
      "description": "Use this tool to get OpenSearch index information: (health, status, index, uuid, primary count, replica count, docs.count, docs.deleted, store.size, primary.store.size).
\nIt takes two optional arguments: `index`, a comma-delimited list of one or more indices to get information from (default is an empty list, meaning all indices), and `local`, which specifies whether to return information from the local node only instead of from the cluster manager node (default is false)."
    },
    {
      "type": "SearchAnomalyDetectorsTool"
    },
    {
      "type": "SearchAnomalyResultsTool"
    },
    {
      "type": "SearchMonitorsTool"
    },
    {
      "type": "SearchAlertsTool"
    }
  ]
}
```
{% include copy-curl.html %}

Note the agent ID; you'll use it in the next step.

### Test the agent

Note the following testing tips:

- You can view the detailed steps of an agent execution in one of the following ways:
  - Enable verbose mode: `"verbose": true`.
  - Call the Get Trace API: `GET _plugins/_ml/memory/message/your_message_id/traces`.

- An LLM may hallucinate. For example, it may choose the wrong tool to solve your problem, especially when many tools are configured. To avoid hallucinations, try the following options:
  - Avoid configuring many tools in a single agent.
  - Provide a detailed tool description clarifying what the tool can do.
  - Specify the tool to use in the LLM question, for example, `Can you use the PPLTool to query the opensearch_dashboards_sample_data_ecommerce index to calculate how many orders were placed last week?`.
  - Specify the tools to use when executing an agent. For example, specify that only the `PPLTool` and `CatIndexTool` should be used to process the current request.
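When testing repeatedly, you can also delete an old conversation so that stale history does not carry over into new answers. This call assumes a memory ID returned by a previous execution:

```json
DELETE _plugins/_ml/memory/your_memory_id
```
{% include copy-curl.html %}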
+ +Test the agent: + +```json +POST _plugins/_ml/agents/your_agent_id/_execute +{ + "parameters": { + "question": "Can you query with index opensearch_dashboards_sample_data_ecommerce to calculate how many orders in last week?", + "verbose": false, + "selected_tools": ["PPLTool", "CatIndexTool"] + } +} +``` +{% include copy-curl.html %} + + +#### Test the PPLTool + + +```json +POST _plugins/_ml/agents/your_agent_id/_execute +{ + "parameters": { + "question": "Can you query with index opensearch_dashboards_sample_data_ecommerce to calculate how many orders in last week?", + "verbose": false + } +} +``` +{% include copy-curl.html %} + +Because you specified `"include_output_in_agent_response": true` for the `PPLTool`, the response contains `PPLTool.output` in the `additional_info` object: + +```json +{ + "inference_results": [ + { + "output": [ + { + "name": "memory_id", + "result": "TkJwyI0Bn3OCesyvzuH9" + }, + { + "name": "parent_interaction_id", + "result": "T0JwyI0Bn3OCesyvz-EI" + }, + { + "name": "response", + "dataAsMap": { + "response": "The tool response from the PPLTool shows that there were 3812 orders in the opensearch_dashboards_sample_data_ecommerce index within the last week.", + "additional_info": { + "PPLTool.output": [ + """{"ppl":"source\u003dopensearch_dashboards_sample_data_ecommerce| where order_date \u003e DATE_SUB(NOW(), INTERVAL 1 WEEK) | stats COUNT() AS count","executionResult":"{\n \"schema\": [\n {\n \"name\": \"count\",\n \"type\": \"integer\"\n }\n ],\n \"datarows\": [\n [\n 3812\n ]\n ],\n \"total\": 1,\n \"size\": 1\n}"}""" + ] + } + } + } + ] + } + ] +} +``` +{% include copy-curl.html %} + +Obtain trace data: + +```json +GET _plugins/_ml/memory/message/T0JwyI0Bn3OCesyvz-EI/traces +``` +{% include copy-curl.html %} + + +#### Test the population_data_knowledge_base VectorDBTool + + +To view detailed steps, set `verbose` to `true` when executing the agent: + +```json +POST _plugins/_ml/agents/your_agent_id/_execute +{ + "parameters": { + 
"question": "What's the population increase of Seattle from 2021 to 2023?", + "verbose": true + } +} +``` +{% include copy-curl.html %} + +The response contains the execution steps: + +```json +{ + "inference_results": [ + { + "output": [ + { + "name": "memory_id", + "result": "LkJuyI0Bn3OCesyv3-Ef" + }, + { + "name": "parent_interaction_id", + "result": "L0JuyI0Bn3OCesyv3-Er" + }, + { + "name": "response", + "result": """{ + "thought": "Let me check the population data tool", + "action": "population_data_knowledge_base", + "action_input": "{'question': 'What is the population increase of Seattle from 2021 to 2023?', 'cities': ['Seattle']}" +}""" + }, + { + "name": "response", + "result": """{"_index":"test_population_data","_source":{"population_description":"Chart and table of population level and growth rate for the Seattle metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of Seattle in 2023 is 3,519,000, a 0.86% increase from 2022.\\nThe metro area population of Seattle in 2022 was 3,489,000, a 0.81% increase from 2021.\\nThe metro area population of Seattle in 2021 was 3,461,000, a 0.82% increase from 2020.\\nThe metro area population of Seattle in 2020 was 3,433,000, a 0.79% increase from 2019."},"_id":"9EJsyI0Bn3OCesyvU-B7","_score":0.75154537} +{"_index":"test_population_data","_source":{"population_description":"Chart and table of population level and growth rate for the Austin metro area from 1950 to 2023. 
United Nations population projections are also included through the year 2035.\\nThe current metro area population of Austin in 2023 is 2,228,000, a 2.39% increase from 2022.\\nThe metro area population of Austin in 2022 was 2,176,000, a 2.79% increase from 2021.\\nThe metro area population of Austin in 2021 was 2,117,000, a 3.12% increase from 2020.\\nThe metro area population of Austin in 2020 was 2,053,000, a 3.43% increase from 2019."},"_id":"80JsyI0Bn3OCesyvU-B7","_score":0.6689899} +{"_index":"test_population_data","_source":{"population_description":"Chart and table of population level and growth rate for the New York City metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of New York City in 2023 is 18,937,000, a 0.37% increase from 2022.\\nThe metro area population of New York City in 2022 was 18,867,000, a 0.23% increase from 2021.\\nThe metro area population of New York City in 2021 was 18,823,000, a 0.1% increase from 2020.\\nThe metro area population of New York City in 2020 was 18,804,000, a 0.01% decline from 2019."},"_id":"8EJsyI0Bn3OCesyvU-B7","_score":0.66782206} +""" + }, + { + "name": "response", + "result": "According to the population data tool, the population of Seattle increased by approximately 28,000 people from 2021 to 2023, which is a 0.82% increase from 2021 to 2022 and a 0.86% increase from 2022 to 2023." 
+ } + ] + } + ] +} +``` + +Obtain trace data: + +```json +GET _plugins/_ml/memory/message/L0JuyI0Bn3OCesyv3-Er/traces +``` +{% include copy-curl.html %} + +#### Test conversational memory + +To continue the same conversation, specify the conversation's `memory_id` when executing the agent: + +```json +POST _plugins/_ml/agents/your_agent_id/_execute +{ + "parameters": { + "question": "What's the population of Austin 2023, compare with Seattle", + "memory_id": "LkJuyI0Bn3OCesyv3-Ef", + "verbose": true + } +} +``` +{% include copy-curl.html %} + +In the response, note that the `population_data_knowledge_base` doesn't return the population of Seattle. Instead, the agent learns the population of Seattle by referencing historical messages: + +```json +{ + "inference_results": [ + { + "output": [ + { + "name": "memory_id", + "result": "LkJuyI0Bn3OCesyv3-Ef" + }, + { + "name": "parent_interaction_id", + "result": "00J6yI0Bn3OCesyvIuGZ" + }, + { + "name": "response", + "result": """{ + "thought": "Let me check the population data tool first", + "action": "population_data_knowledge_base", + "action_input": "{\"city\":\"Austin\",\"year\":2023}" +}""" + }, + { + "name": "response", + "result": """{"_index":"test_population_data","_source":{"population_description":"Chart and table of population level and growth rate for the Austin metro area from 1950 to 2023. 
United Nations population projections are also included through the year 2035.\\nThe current metro area population of Austin in 2023 is 2,228,000, a 2.39% increase from 2022.\\nThe metro area population of Austin in 2022 was 2,176,000, a 2.79% increase from 2021.\\nThe metro area population of Austin in 2021 was 2,117,000, a 3.12% increase from 2020.\\nThe metro area population of Austin in 2020 was 2,053,000, a 3.43% increase from 2019."},"_id":"BhF5vo0BubpYKX5ER0fT","_score":0.69129956} +{"_index":"test_population_data","_source":{"population_description":"Chart and table of population level and growth rate for the Austin metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of Austin in 2023 is 2,228,000, a 2.39% increase from 2022.\\nThe metro area population of Austin in 2022 was 2,176,000, a 2.79% increase from 2021.\\nThe metro area population of Austin in 2021 was 2,117,000, a 3.12% increase from 2020.\\nThe metro area population of Austin in 2020 was 2,053,000, a 3.43% increase from 2019."},"_id":"6zrZvo0BVR2NrurbRIAE","_score":0.69129956} +{"_index":"test_population_data","_source":{"population_description":"Chart and table of population level and growth rate for the New York City metro area from 1950 to 2023. 
United Nations population projections are also included through the year 2035.\\nThe current metro area population of New York City in 2023 is 18,937,000, a 0.37% increase from 2022.\\nThe metro area population of New York City in 2022 was 18,867,000, a 0.23% increase from 2021.\\nThe metro area population of New York City in 2021 was 18,823,000, a 0.1% increase from 2020.\\nThe metro area population of New York City in 2020 was 18,804,000, a 0.01% decline from 2019."},"_id":"AxF5vo0BubpYKX5ER0fT","_score":0.61015373} +""" + }, + { + "name": "response", + "result": "According to the population data tool, the population of Austin in 2023 is approximately 2,228,000 people, a 2.39% increase from 2022. This is lower than the population of Seattle in 2023 which is approximately 3,519,000 people, a 0.86% increase from 2022." + } + ] + } + ] +} +``` + +View all messages: + +```json +GET _plugins/_ml/memory/LkJuyI0Bn3OCesyv3-Ef/messages +``` +{% include copy-curl.html %} + +Obtain trace data: + +```json +GET _plugins/_ml/memory/message/00J6yI0Bn3OCesyvIuGZ/traces +``` +{% include copy-curl.html %} + +## Step 4 (Optional): Create an agent with a custom prompt + +All agents have the following default prompt: + +```json +"prompt": """ + +Human:${parameters.prompt.prefix} + +${parameters.prompt.suffix} + +Human: follow RESPONSE FORMAT INSTRUCTIONS + +Assistant:""" +``` + +The prompt consists of two parts: + +- `${parameters.prompt.prefix}`: A prompt prefix that describes what the AI assistant can do. You can change this parameter based on your use case, for example, `You are a professional data analyst. You will always answer questions based on the tool response first. If you don't know the answer, just say you don't know.` +- `${parameters.prompt.suffix}`: The main part of the prompt that defines the tools, chat history, prompt format instructions, a question, and a scratchpad. 
+ +The default `prompt.suffix` is the following: + +```json +"prompt.suffix": """Human:TOOLS +------ +Assistant can ask Human to use tools to look up information that may be helpful in answering the users original question. The tool response will be listed in "TOOL RESPONSE of {tool name}:". If TOOL RESPONSE is enough to answer human's question, Assistant should avoid rerun the same tool. +Assistant should NEVER suggest run a tool with same input if it's already in TOOL RESPONSE. +The tools the human can use are: + +${parameters.tool_descriptions} + +${parameters.chat_history} + +${parameters.prompt.format_instruction} + + +Human:USER'S INPUT +-------------------- +Here is the user's input : +${parameters.question} + +${parameters.scratchpad}""" +``` + +The `prompt.suffix` consists of the following placeholders: + +- `${parameters.tool_descriptions}`: This placeholder will be filled with the agent's tool information: the tool name and description. If you omit this placeholder, the agent will not use any tools. +- `${parameters.prompt.format_instruction}`: This placeholder defines the LLM response format. This placeholder is critical, and we do not recommend removing it. +- `${parameters.chat_history}`: This placeholder will be filled with the message history of the current memory. If you don't set the `memory_id` when you run the agent, or if there are no history messages, then this placeholder will be empty. If you don't need chat history, you can remove this placeholder. +- `${parameters.question}`: This placeholder will be filled with the user question. +- `${parameters.scratchpad}`: This placeholder will be filled with the detailed agent execution steps. These steps are the same as those you can view by specifying verbose mode or obtaining trace data (see an example in [Test the agent](#test-the-agent)). This placeholder is critical in order for the LLM to reason and select the next step based on the outcome of the previous steps. 
We do not recommend removing this placeholder. + +### Custom prompt examples + +The following examples demonstrate how to customize the prompt. + +#### Example 1: Customize `prompt.prefix` + +Register an agent with a custom `prompt.prefix`: + +```json +POST _plugins/_ml/agents/_register +{ + "name": "Chat Agent with Custom Prompt", + "type": "conversational", + "description": "this is a test agent", + "app_type": "os_chat", + "llm": { + "model_id": "P0L8xI0Bn3OCesyvPsif", + "parameters": { + "max_iteration": 3, + "response_filter": "$.completion", + "prompt.prefix": "Assistant is a professional data analyst. You will always answer question based on the tool response first. If you don't know the answer, just say don't know.\n" + } + }, + "memory": { + "type": "conversation_index" + }, + "tools": [ + { + "type": "VectorDBTool", + "name": "population_data_knowledge_base", + "description": "This tool provide population data of US cities.", + "parameters": { + "input": "${parameters.question}", + "index": "test_population_data", + "source_field": [ + "population_description" + ], + "model_id": "xkJLyI0Bn3OCesyvf94S", + "embedding_field": "population_description_embedding", + "doc_size": 3 + } + }, + { + "type": "VectorDBTool", + "name": "stock_price_data_knowledge_base", + "description": "This tool provide stock price data.", + "parameters": { + "input": "${parameters.question}", + "index": "test_stock_price_data", + "source_field": [ + "stock_price_history" + ], + "model_id": "xkJLyI0Bn3OCesyvf94S", + "embedding_field": "stock_price_history_embedding", + "doc_size": 3 + } + } + ] +} +``` +{% include copy-curl.html %} + +Test the agent: + +```json +POST _plugins/_ml/agents/o0LDyI0Bn3OCesyvr-Zq/_execute +{ + "parameters": { + "question": "What's the stock price increase of Amazon from May 2023 to Feb 2023?", + "verbose": true + } +} +``` +{% include copy-curl.html %} + +#### Example 2: OpenAI model with a custom prompt + +Create a connector for the OpenAI `gpt-3.5-turbo` 
model: + +```json +POST _plugins/_ml/connectors/_create +{ + "name": "My openai connector: gpt-3.5-turbo", + "description": "The connector to openai chat model", + "version": 1, + "protocol": "http", + "parameters": { + "model": "gpt-3.5-turbo", + "response_filter": "$.choices[0].message.content", + "stop": ["\n\nHuman:","\nObservation:","\n\tObservation:","\n\tObservation","\n\nQuestion"], + "system_instruction": "You are an Assistant which can answer kinds of questions." + }, + "credential": { + "openAI_key": "your_openAI_key" + }, + "actions": [ + { + "action_type": "predict", + "method": "POST", + "url": "https://api.openai.com/v1/chat/completions", + "headers": { + "Authorization": "Bearer ${credential.openAI_key}" + }, + "request_body": "{ \"model\": \"${parameters.model}\", \"messages\": [{\"role\":\"system\",\"content\":\"${parameters.system_instruction}\"},{\"role\":\"user\",\"content\":\"${parameters.prompt}\"}] }" + } + ] +} +``` +{% include copy-curl.html %} + +Create a model using the connector ID from the response: + +```json +POST /_plugins/_ml/models/_register?deploy=true +{ + "name": "My OpenAI model", + "function_name": "remote", + "description": "test model", + "connector_id": "your_connector_id" +} +``` +{% include copy-curl.html %} + +Note the model ID and test the model by calling the Predict API: + +```json +POST /_plugins/_ml/models/your_openai_model_id/_predict +{ + "parameters": { + "system_instruction": "You are an Assistant which can answer kinds of questions.", + "prompt": "hello" + } +} +``` +{% include copy-curl.html %} + +Create an agent with a custom `system_instruction` and `prompt`. 
The `prompt` customizes the `tool_descriptions`, `chat_history`, `format_instruction`, `question`, and `scratchpad` placeholders: + +```json +POST _plugins/_ml/agents/_register +{ + "name": "My Chat Agent with OpenAI GPT 3.5", + "type": "conversational", + "description": "this is a test agent", + "app_type": "os_chat", + "llm": { + "model_id": "your_openai_model_id", + "parameters": { + "max_iteration": 3, + "response_filter": "$.choices[0].message.content", + "system_instruction": "You are an assistant which is designed to be able to assist with a wide range of tasks, from answering simple questions to providing in-depth explanations and discussions on a wide range of topics.", + "prompt": "Assistant can ask Human to use tools to look up information that may be helpful in answering the users original question.\n${parameters.tool_descriptions}\n\n${parameters.chat_history}\n\n${parameters.prompt.format_instruction}\n\nHuman: ${parameters.question}\n\n${parameters.scratchpad}\n\nHuman: follow RESPONSE FORMAT INSTRUCTIONS\n\nAssistant:", + "disable_trace": true + } + }, + "memory": { + "type": "conversation_index" + }, + "tools": [ + { + "type": "VectorDBTool", + "name": "population_data_knowledge_base", + "description": "This tool provide population data of US cities.", + "parameters": { + "input": "${parameters.question}", + "index": "test_population_data", + "source_field": [ + "population_description" + ], + "model_id": "your_embedding_model_id_from_step1", + "embedding_field": "population_description_embedding", + "doc_size": 3 + } + }, + { + "type": "VectorDBTool", + "name": "stock_price_data_knowledge_base", + "description": "This tool provide stock price data.", + "parameters": { + "input": "${parameters.question}", + "index": "test_stock_price_data", + "source_field": [ + "stock_price_history" + ], + "model_id": "your_embedding_model_id_from_step1", + "embedding_field": "stock_price_history_embedding", + "doc_size": 3 + } + } + ] +} +``` +{% include 
copy-curl.html %}
+
+Note the agent ID from the response and test the agent by running it:
+
+```json
+POST _plugins/_ml/agents/your_agent_id/_execute
+{
+  "parameters": {
+    "question": "What's the stock price increase of Amazon from May 2023 to Feb 2023?",
+    "verbose": true
+  }
+}
+```
+{% include copy-curl.html %}
+
+Test the agent by asking a question that requires the agent to use both configured tools:
+
+```json
+POST _plugins/_ml/agents/your_agent_id/_execute
+{
+  "parameters": {
+    "question": "What's the population increase of Seattle from 2021 to 2023? Then check what's the stock price increase of Amazon from May 2023 to Feb 2023?",
+    "verbose": true
+  }
+}
+```
+{% include copy-curl.html %}
+
+The response shows that the agent runs both the `population_data_knowledge_base` and `stock_price_data_knowledge_base` tools to obtain the answer:
+
+```json
+{
+  "inference_results": [
+    {
+      "output": [
+        {
+          "name": "memory_id",
+          "result": "_0IByY0Bn3OCesyvJenb"
+        },
+        {
+          "name": "parent_interaction_id",
+          "result": "AEIByY0Bn3OCesyvJerm"
+        },
+        {
+          "name": "response",
+          "result": """{
+  "thought": "I need to use a tool to find the population increase of Seattle from 2021 to 2023",
+  "action": "population_data_knowledge_base",
+  "action_input": "{\"city\": \"Seattle\", \"start_year\": 2021, \"end_year\": 2023}"
+}"""
+        },
+        {
+          "name": "response",
+          "result": """{"_index":"test_population_data","_source":{"population_description":"Chart and table of population level and growth rate for the Seattle metro area from 1950 to 2023. 
United Nations population projections are also included through the year 2035.\\nThe current metro area population of Seattle in 2023 is 3,519,000, a 0.86% increase from 2022.\\nThe metro area population of Seattle in 2022 was 3,489,000, a 0.81% increase from 2021.\\nThe metro area population of Seattle in 2021 was 3,461,000, a 0.82% increase from 2020.\\nThe metro area population of Seattle in 2020 was 3,433,000, a 0.79% increase from 2019."},"_id":"9EJsyI0Bn3OCesyvU-B7","_score":0.6542084} +{"_index":"test_population_data","_source":{"population_description":"Chart and table of population level and growth rate for the New York City metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of New York City in 2023 is 18,937,000, a 0.37% increase from 2022.\\nThe metro area population of New York City in 2022 was 18,867,000, a 0.23% increase from 2021.\\nThe metro area population of New York City in 2021 was 18,823,000, a 0.1% increase from 2020.\\nThe metro area population of New York City in 2020 was 18,804,000, a 0.01% decline from 2019."},"_id":"8EJsyI0Bn3OCesyvU-B7","_score":0.5966786} +{"_index":"test_population_data","_source":{"population_description":"Chart and table of population level and growth rate for the Austin metro area from 1950 to 2023. 
United Nations population projections are also included through the year 2035.\\nThe current metro area population of Austin in 2023 is 2,228,000, a 2.39% increase from 2022.\\nThe metro area population of Austin in 2022 was 2,176,000, a 2.79% increase from 2021.\\nThe metro area population of Austin in 2021 was 2,117,000, a 3.12% increase from 2020.\\nThe metro area population of Austin in 2020 was 2,053,000, a 3.43% increase from 2019."},"_id":"80JsyI0Bn3OCesyvU-B7","_score":0.5883104} +""" + }, + { + "name": "response", + "result": """{ + "thought": "I need to use a tool to find the stock price increase of Amazon from May 2023 to Feb 2023", + "action": "stock_price_data_knowledge_base", + "action_input": "{\"company\": \"Amazon\", \"start_date\": \"May 2023\", \"end_date\": \"Feb 2023\"}" +}""" + }, + { + "name": "response", + "result": """{"_index":"test_stock_price_data","_source":{"stock_price_history":"This is the historical montly stock price record for Amazon.com, Inc. (AMZN) with CSV format.\nDate,Open,High,Low,Close,Adj 
Close,Volume\n2023-03-01,93.870003,103.489998,88.120003,103.290001,103.290001,1349240300\n2023-04-01,102.300003,110.860001,97.709999,105.449997,105.449997,1224083600\n2023-05-01,104.949997,122.919998,101.150002,120.580002,120.580002,1432891600\n2023-06-01,120.690002,131.490005,119.930000,130.360001,130.360001,1242648800\n2023-07-01,130.820007,136.649994,125.919998,133.679993,133.679993,1058754800\n2023-08-01,133.550003,143.630005,126.410004,138.009995,138.009995,1210426200\n2023-09-01,139.460007,145.860001,123.040001,127.120003,127.120003,1120271900\n2023-10-01,127.279999,134.479996,118.349998,133.089996,133.089996,1224564700\n2023-11-01,133.960007,149.259995,133.710007,146.089996,146.089996,1025986900\n2023-12-01,146.000000,155.630005,142.809998,151.940002,151.940002,931128600\n2024-01-01,151.539993,161.729996,144.050003,155.199997,155.199997,953344900\n2024-02-01,155.869995,175.000000,155.619995,174.449997,174.449997,437720800\n"},"_id":"BUJsyI0Bn3OCesyvveHo","_score":0.63949186} +{"_index":"test_stock_price_data","_source":{"stock_price_history":"This is the historical montly stock price record for Alphabet Inc. 
(GOOG) with CSV format.\n\nDate,Open,High,Low,Close,Adj Close,Volume\n2023-03-01,90.160004,107.510002,89.769997,104.000000,104.000000,725477100\n2023-04-01,102.669998,109.629997,102.379997,108.220001,108.220001,461670700\n2023-05-01,107.720001,127.050003,104.500000,123.370003,123.370003,620317400\n2023-06-01,123.500000,129.550003,116.910004,120.970001,120.970001,521386300\n2023-07-01,120.320000,134.070007,115.830002,133.110001,133.110001,525456900\n2023-08-01,130.854996,138.399994,127.000000,137.350006,137.350006,463482000\n2023-09-01,138.429993,139.929993,128.190002,131.850006,131.850006,389593900\n2023-10-01,132.154999,142.380005,121.459999,125.300003,125.300003,514877100\n2023-11-01,125.339996,141.100006,124.925003,133.919998,133.919998,405635900\n2023-12-01,133.320007,143.945007,129.399994,140.929993,140.929993,482059400\n2024-01-01,139.600006,155.199997,136.850006,141.800003,141.800003,428771200\n2024-02-01,143.690002,150.695007,138.169998,147.139999,147.139999,231934100\n"},"_id":"CkJsyI0Bn3OCesyvveHo","_score":0.6056718} +{"_index":"test_stock_price_data","_source":{"stock_price_history":"This is the historical montly stock price record for Apple Inc. 
(AAPL) with CSV format.\nDate,Open,High,Low,Close,Adj Close,Volume\n2023-03-01,146.830002,165.000000,143.899994,164.899994,164.024475,1520266600\n2023-04-01,164.270004,169.850006,159.779999,169.679993,168.779099,969709700\n2023-05-01,169.279999,179.350006,164.309998,177.250000,176.308914,1275155500\n2023-06-01,177.699997,194.479996,176.929993,193.970001,193.207016,1297101100\n2023-07-01,193.779999,198.229996,186.600006,196.449997,195.677261,996066400\n2023-08-01,196.240005,196.729996,171.960007,187.869995,187.130997,1322439400\n2023-09-01,189.490005,189.979996,167.619995,171.210007,170.766846,1337586600\n2023-10-01,171.220001,182.339996,165.669998,170.770004,170.327972,1172719600\n2023-11-01,171.000000,192.929993,170.119995,189.949997,189.458313,1099586100\n2023-12-01,190.330002,199.619995,187.449997,192.529999,192.284637,1062774800\n2024-01-01,187.149994,196.380005,180.169998,184.399994,184.164993,1187219300\n2024-02-01,183.990005,191.050003,179.250000,188.850006,188.609329,420063900\n"},"_id":"BkJsyI0Bn3OCesyvveHo","_score":0.5960163} +""" + }, + { + "name": "response", + "result": "The population increase of Seattle from 2021 to 2023 is 0.86%. The stock price increase of Amazon from May 2023 to Feb 2023 is from $120.58 to $174.45, which is a percentage increase." + } + ] + } + ] +} +``` + +## Step 5: Configure a root chatbot agent in OpenSearch Dashboards + +To use the [OpenSearch Assistant for OpenSearch Dashboards]({{site.url}}{{site.baseurl}}/dashboards/dashboards-assistant/index/), you need to configure a root chatbot agent. + +A root chatbot agent consists of the following parts: + +- A `conversational` agent: Within the `AgentTool`, you can use any `conversational` agent created in the previous steps. +- An `MLModelTool`: This tool is used for suggesting new questions based on your current question and the model response. 
+ +Configure a root agent: + +```json +POST /_plugins/_ml/agents/_register +{ + "name": "Chatbot agent", + "type": "flow", + "description": "this is a test chatbot agent", + "tools": [ + { + "type": "AgentTool", + "name": "LLMResponseGenerator", + "parameters": { + "agent_id": "your_conversational_agent_created_in_prevous_steps" + }, + "include_output_in_agent_response": true + }, + { + "type": "MLModelTool", + "name": "QuestionSuggestor", + "description": "A general tool to answer any question", + "parameters": { + "model_id": "your_llm_model_id_created_in_previous_steps", + "prompt": "Human: You are an AI that only speaks JSON. Do not write normal text. Output should follow example JSON format: \n\n {\"response\": [\"question1\", \"question2\"]}\n\n. \n\nHuman:You will be given a chat history between OpenSearch Assistant and a Human.\nUse the context provided to generate follow up questions the Human would ask to the Assistant.\nThe Assistant can answer general questions about logs, traces and metrics.\nAssistant can access a set of tools listed below to answer questions given by the Human:\nQuestion suggestions generator tool\nHere's the chat history between the human and the Assistant.\n${parameters.LLMResponseGenerator.output}\nUse the following steps to generate follow up questions Human may ask after the response of the Assistant:\nStep 1. Use the chat history to understand what human is trying to search and explore.\nStep 2. Understand what capabilities the assistant has with the set of tools it has access to.\nStep 3. Use the above context and generate follow up questions.Step4:You are an AI that only speaks JSON. Do not write normal text. 
Output should follow example JSON format: \n\n {\"response\": [\"question1\", \"question2\"]} \n \n----------------\n\nAssistant:"
+      },
+      "include_output_in_agent_response": true
+    }
+  ],
+  "memory": {
+    "type": "conversation_index"
+  }
+}
+```
+{% include copy-curl.html %}
+
+Note the root chatbot agent ID. Then log in to your OpenSearch server, go to the OpenSearch config folder (`$OS_HOME/config`), and run the following command:
+
+```bash
+ curl -k --cert ./kirk.pem --key ./kirk-key.pem -X PUT https://localhost:9200/.plugins-ml-config/_doc/os_chat -H 'Content-Type: application/json' -d'
+ {
+   "type":"os_chat_root_agent",
+   "configuration":{
+     "agent_id": "your_root_chatbot_agent_id"
+   }
+ }'
+```
+{% include copy.html %}
+
+Go to your OpenSearch Dashboards config folder (`$OSD_HOME/config`) and edit `opensearch_dashboards.yml` by adding the following line to the end of the file: `assistant.chat.enabled: true`.
+
+Restart OpenSearch Dashboards and then select the chat icon in the upper-right corner, shown in the following image.
+
+OpenSearch Assistant icon
+
+You can now chat in OpenSearch Dashboards, as shown in the following image.
+
+OpenSearch Assistant chat
\ No newline at end of file
diff --git a/_ml-commons-plugin/tutorials/conversational-search-cohere.md b/_ml-commons-plugin/tutorials/conversational-search-cohere.md
new file mode 100644
index 0000000000..e02f576b7c
--- /dev/null
+++ b/_ml-commons-plugin/tutorials/conversational-search-cohere.md
@@ -0,0 +1,228 @@
+---
+layout: default
+title: Conversational search with Cohere Command
+parent: Tutorials
+nav_order: 20
+---
+
+# Conversational search using the Cohere Command model
+
+This tutorial illustrates how to configure conversational search using the Cohere Command model. For more information, see [Conversational search]({{site.url}}{{site.baseurl}}/search-plugins/conversational-search/).
+
+Replace the placeholders beginning with the prefix `your_` with your own values.
+{: .note} + +Alternatively, you can build a RAG/conversational search using agents and tools. For more information, see [Retrieval-augmented generation chatbot]({{site.url}}{{site.baseurl}}/ml-commons-plugin/tutorials/rag-conversational-agent/). + +## Prerequisite + +Ingest test data: + +```json +POST _bulk +{"index": {"_index": "qa_demo", "_id": "1"}} +{"text": "Chart and table of population level and growth rate for the Ogden-Layton metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\nThe current metro area population of Ogden-Layton in 2023 is 750,000, a 1.63% increase from 2022.\nThe metro area population of Ogden-Layton in 2022 was 738,000, a 1.79% increase from 2021.\nThe metro area population of Ogden-Layton in 2021 was 725,000, a 1.97% increase from 2020.\nThe metro area population of Ogden-Layton in 2020 was 711,000, a 2.16% increase from 2019."} +{"index": {"_index": "qa_demo", "_id": "2"}} +{"text": "Chart and table of population level and growth rate for the New York City metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of New York City in 2023 is 18,937,000, a 0.37% increase from 2022.\\nThe metro area population of New York City in 2022 was 18,867,000, a 0.23% increase from 2021.\\nThe metro area population of New York City in 2021 was 18,823,000, a 0.1% increase from 2020.\\nThe metro area population of New York City in 2020 was 18,804,000, a 0.01% decline from 2019."} +{"index": {"_index": "qa_demo", "_id": "3"}} +{"text": "Chart and table of population level and growth rate for the Chicago metro area from 1950 to 2023. 
United Nations population projections are also included through the year 2035.\\nThe current metro area population of Chicago in 2023 is 8,937,000, a 0.4% increase from 2022.\\nThe metro area population of Chicago in 2022 was 8,901,000, a 0.27% increase from 2021.\\nThe metro area population of Chicago in 2021 was 8,877,000, a 0.14% increase from 2020.\\nThe metro area population of Chicago in 2020 was 8,865,000, a 0.03% increase from 2019."} +{"index": {"_index": "qa_demo", "_id": "4"}} +{"text": "Chart and table of population level and growth rate for the Miami metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of Miami in 2023 is 6,265,000, a 0.8% increase from 2022.\\nThe metro area population of Miami in 2022 was 6,215,000, a 0.78% increase from 2021.\\nThe metro area population of Miami in 2021 was 6,167,000, a 0.74% increase from 2020.\\nThe metro area population of Miami in 2020 was 6,122,000, a 0.71% increase from 2019."} +{"index": {"_index": "qa_demo", "_id": "5"}} +{"text": "Chart and table of population level and growth rate for the Austin metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of Austin in 2023 is 2,228,000, a 2.39% increase from 2022.\\nThe metro area population of Austin in 2022 was 2,176,000, a 2.79% increase from 2021.\\nThe metro area population of Austin in 2021 was 2,117,000, a 3.12% increase from 2020.\\nThe metro area population of Austin in 2020 was 2,053,000, a 3.43% increase from 2019."} +{"index": {"_index": "qa_demo", "_id": "6"}} +{"text": "Chart and table of population level and growth rate for the Seattle metro area from 1950 to 2023. 
United Nations population projections are also included through the year 2035.\\nThe current metro area population of Seattle in 2023 is 3,519,000, a 0.86% increase from 2022.\\nThe metro area population of Seattle in 2022 was 3,489,000, a 0.81% increase from 2021.\\nThe metro area population of Seattle in 2021 was 3,461,000, a 0.82% increase from 2020.\\nThe metro area population of Seattle in 2020 was 3,433,000, a 0.79% increase from 2019."}
+```
+{% include copy-curl.html %}
+
+## Step 1: Create a connector and register a model
+
+Conversational search only supports the [OpenAI](https://github.com/opensearch-project/ml-commons/blob/2.x/docs/remote_inference_blueprints/open_ai_connector_chat_blueprint.md)
+and [Amazon Bedrock Claude](https://github.com/opensearch-project/ml-commons/blob/2.x/docs/remote_inference_blueprints/bedrock_connector_anthropic_claude_blueprint.md) input/output styles.
+{: .important}
+
+This tutorial follows the Amazon Bedrock Claude model input/output style by:
+- Mapping the Cohere Command `message` input parameter to the `inputs` parameter in order to match the Claude model input style.
+- Using a post-processing function to convert the Cohere Command model output to the Claude model output style.
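The connector's `post_process_function` essentially rewraps Cohere's chat response so that the generated text appears under a Claude-style `completion` key. The following Python sketch shows the equivalent transformation on a hypothetical response body (the real Painless function also escapes quotation marks, newlines, and other special characters):

```python
import json

def to_claude_style(cohere_response: dict) -> dict:
    """Wrap a Cohere /v1/chat response so it looks like a Claude-style completion."""
    # Cohere returns the generated text in the "text" field;
    # the RAG processor expects it under "completion".
    return {
        "name": "response",
        "dataAsMap": {"completion": cohere_response["text"]},
    }

# Hypothetical Cohere response body
cohere_response = {"text": "Seattle has a mild, wet climate."}
print(json.dumps(to_claude_style(cohere_response)))
```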
+ +Create a connector for the Cohere Command model: + +```json +POST _plugins/_ml/connectors/_create +{ + "name": "Cohere Chat Model", + "description": "The connector to Cohere's public chat API", + "version": "1", + "protocol": "http", + "credential": { + "cohere_key": "your_cohere_api_key" + }, + "parameters": { + "model": "command" + }, + "actions": [ + { + "action_type": "predict", + "method": "POST", + "url": "https://api.cohere.ai/v1/chat", + "headers": { + "Authorization": "Bearer ${credential.cohere_key}", + "Request-Source": "unspecified:opensearch" + }, + "request_body": "{ \"message\": \"${parameters.inputs}\", \"model\": \"${parameters.model}\" }", + "post_process_function": "\n String escape(def input) { \n if (input.contains(\"\\\\\")) {\n input = input.replace(\"\\\\\", \"\\\\\\\\\");\n }\n if (input.contains(\"\\\"\")) {\n input = input.replace(\"\\\"\", \"\\\\\\\"\");\n }\n if (input.contains('\r')) {\n input = input = input.replace('\r', '\\\\r');\n }\n if (input.contains(\"\\\\t\")) {\n input = input.replace(\"\\\\t\", \"\\\\\\\\\\\\t\");\n }\n if (input.contains('\n')) {\n input = input.replace('\n', '\\\\n');\n }\n if (input.contains('\b')) {\n input = input.replace('\b', '\\\\b');\n }\n if (input.contains('\f')) {\n input = input.replace('\f', '\\\\f');\n }\n return input;\n }\n def name = 'response';\n def result = params.text;\n def json = '{ \"name\": \"' + name + '\",' +\n '\"dataAsMap\": { \"completion\": \"' + escape(result) +\n '\"}}';\n return json;\n \n " + } + ] +} +``` +{% include copy-curl.html %} + +Starting in OpenSearch 2.12, you can use the default `escape` function directly in the `post_process_function`: + +```json +"post_process_function": " \n def name = 'response';\n def result = params.text;\n def json = '{ \"name\": \"' + name + '\",' +\n '\"dataAsMap\": { \"completion\": \"' + escape(result) +\n '\"}}';\n return json;" +``` +{% include copy-curl.html %} + +Note the connector ID; you'll use it to register the model. 
+ +Register the Cohere Command model: + +```json +POST /_plugins/_ml/models/_register?deploy=true +{ + "name": "Cohere command model", + "function_name": "remote", + "description": "Cohere command model", + "connector_id": "your_connector_id" +} +``` +{% include copy-curl.html %} + +Note the model ID; you'll use it in the following steps. + +Test the model: + +```json +POST /_plugins/_ml/models/your_model_id/_predict +{ + "parameters": { + "inputs": "What is the weather like in Seattle?" + } +} +``` +{% include copy-curl.html %} + +The response contains the LLM completion: + +```json +{ + "inference_results": [ + { + "output": [ + { + "name": "response", + "dataAsMap": { + "completion": """It is difficult to provide a comprehensive answer without a specific location or time frame in mind. + +As an AI language model, I have no access to real-time data or the ability to provide live weather reports. Instead, I can offer some general information about Seattle's weather, which is known for its mild, wet climate. + +Located in the Pacific Northwest region of the United States, Seattle experiences a maritime climate with cool, dry summers and mild, wet winters. While it is best known for its rainy days, Seattle's annual rainfall is actually less than New York City and Boston. + +Would you like me to provide more details on Seattle's weather? 
Or, if you have a specific date or location in mind, I can try to retrieve real-time or historical weather information for you.""" + } + } + ], + "status_code": 200 + } + ] +} +``` + +## Step 2: Configure conversational search + +Create a search pipeline containing a RAG processor: + +```json +PUT /_search/pipeline/my-conversation-search-pipeline-cohere +{ + "response_processors": [ + { + "retrieval_augmented_generation": { + "tag": "Demo pipeline", + "description": "Demo pipeline Using Cohere", + "model_id": "your_model_id_created_in_step1", + "context_field_list": [ + "text" + ], + "system_prompt": "You are a helpful assistant", + "user_instructions": "Generate a concise and informative answer in less than 100 words for the given question" + } + } + ] +} +``` +{% include copy-curl.html %} + +To run a conversational search, specify its parameters in the `generative_qa_parameters` object: + +```json +GET /qa_demo/_search?search_pipeline=my-conversation-search-pipeline-cohere +{ + "query": { + "match": { + "text": "What's the population increase of New York City from 2021 to 2023?" + } + }, + "size": 1, + "_source": [ + "text" + ], + "ext": { + "generative_qa_parameters": { + "llm_model": "bedrock/claude", + "llm_question": "What's the population increase of New York City from 2021 to 2023?", + "context_size": 5, + "timeout": 15 + } + } +} +``` +{% include copy-curl.html %} + +The response contains the model's answer: + +```json +{ + "took": 1, + "timed_out": false, + "_shards": { + "total": 1, + "successful": 1, + "skipped": 0, + "failed": 0 + }, + "hits": { + "total": { + "value": 6, + "relation": "eq" + }, + "max_score": 9.042081, + "hits": [ + { + "_index": "qa_demo", + "_id": "2", + "_score": 9.042081, + "_source": { + "text": """Chart and table of population level and growth rate for the New York City metro area from 1950 to 2023. 
United Nations population projections are also included through the year 2035.\nThe current metro area population of New York City in 2023 is 18,937,000, a 0.37% increase from 2022.\nThe metro area population of New York City in 2022 was 18,867,000, a 0.23% increase from 2021.\nThe metro area population of New York City in 2021 was 18,823,000, a 0.1% increase from 2020.\nThe metro area population of New York City in 2020 was 18,804,000, a 0.01% decline from 2019.""" + } + } + ] + }, + "ext": { + "retrieval_augmented_generation": { + "answer": "The population of the New York City metro area increased by about 210,000 from 2021 to 2023. The 2021 population was 18,823,000, and in 2023 it was 18,937,000. The average growth rate is 0.23% yearly." + } + } +} +``` \ No newline at end of file diff --git a/_ml-commons-plugin/tutorials/generate-embeddings.md b/_ml-commons-plugin/tutorials/generate-embeddings.md new file mode 100644 index 0000000000..92b62b9fe8 --- /dev/null +++ b/_ml-commons-plugin/tutorials/generate-embeddings.md @@ -0,0 +1,336 @@ +--- +layout: default +title: Generating embeddings +parent: Tutorials +nav_order: 5 +--- + +# Generating embeddings for arrays of objects + +This tutorial illustrates how to generate embeddings for arrays of objects. + +Replace the placeholders beginning with the prefix `your_` with your own values. +{: .note} + +## Step 1: Register an embedding model + +For this tutorial, you will use the [Amazon Bedrock Titan Embedding model](https://docs.aws.amazon.com/bedrock/latest/userguide/titan-embedding-models.html). + +First, follow the [Amazon Bedrock Titan blueprint example](https://github.com/opensearch-project/ml-commons/blob/2.x/docs/remote_inference_blueprints/bedrock_connector_titan_embedding_blueprint.md) to register and deploy the model. 
+ +Test the model, providing the model ID: + +```json +POST /_plugins/_ml/models/your_embedding_model_id/_predict +{ + "parameters": { + "inputText": "hello world" + } +} +``` +{% include copy-curl.html %} + +The response contains inference results: + +```json +{ + "inference_results": [ + { + "output": [ + { + "name": "sentence_embedding", + "data_type": "FLOAT32", + "shape": [ 1536 ], + "data": [0.7265625, -0.0703125, 0.34765625, ...] + } + ], + "status_code": 200 + } + ] +} +``` + +## Step 2: Create an ingest pipeline + +Follow the next set of steps to create an ingest pipeline for generating embeddings. + +### Step 2.1: Create a k-NN index + +First, create a k-NN index: + +```json +PUT my_books +{ + "settings" : { + "index.knn" : "true", + "default_pipeline": "bedrock_embedding_foreach_pipeline" + }, + "mappings": { + "properties": { + "books": { + "type": "nested", + "properties": { + "title_embedding": { + "type": "knn_vector", + "dimension": 1536 + }, + "title": { + "type": "text" + }, + "description": { + "type": "text" + } + } + } + } + } +} +``` +{% include copy-curl.html %} + +### Step 2.2: Create an ingest pipeline + +Then create an inner ingest pipeline to generate an embedding for one array element. + +This pipeline contains three processors: + +- `set` processor: The `text_embedding` processor is unable to identify the `_ingest._value.title` field. You must copy `_ingest._value.title` to a non-existing temporary field so that the `text_embedding` processor can process it. +- `text_embedding` processor: Converts the value of the temporary field to an embedding. +- `remove` processor: Removes the temporary field. 
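The effect of these three processors on a single array element can be sketched in Python (with a stand-in `embed` function; the real embedding comes from the Amazon Bedrock Titan model):

```python
def embed(text: str) -> list[float]:
    # Stand-in for the Amazon Bedrock Titan embedding call.
    return [float(len(text))]

def process_element(value: dict) -> dict:
    """Mimics the set -> text_embedding -> remove processor chain for one element."""
    doc = dict(value)
    doc["title_tmp"] = doc["title"]                   # set: copy title to a temporary field
    doc["title_embedding"] = embed(doc["title_tmp"])  # text_embedding: embed the temporary field
    del doc["title_tmp"]                              # remove: drop the temporary field
    return doc

print(process_element({"title": "first book", "description": "This is first book"}))
```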
+
+To create such a pipeline, send the following request:
+
+```json
+PUT _ingest/pipeline/bedrock_embedding_pipeline
+{
+  "processors": [
+    {
+      "set": {
+        "field": "title_tmp",
+        "value": "{{_ingest._value.title}}"
+      }
+    },
+    {
+      "text_embedding": {
+        "model_id": "your_embedding_model_id",
+        "field_map": {
+          "title_tmp": "_ingest._value.title_embedding"
+        }
+      }
+    },
+    {
+      "remove": {
+        "field": "title_tmp"
+      }
+    }
+  ]
+}
+```
+{% include copy-curl.html %}
+
+Next, create an ingest pipeline containing a `foreach` processor that applies the `bedrock_embedding_pipeline` to each element of the `books` array:
+
+```json
+PUT _ingest/pipeline/bedrock_embedding_foreach_pipeline
+{
+  "description": "Test nested embeddings",
+  "processors": [
+    {
+      "foreach": {
+        "field": "books",
+        "processor": {
+          "pipeline": {
+            "name": "bedrock_embedding_pipeline"
+          }
+        },
+        "ignore_failure": true
+      }
+    }
+  ]
+}
+```
+{% include copy-curl.html %}
+
+### Step 2.3: Simulate the pipeline
+
+First, you'll test the pipeline on an array that contains two book objects, both with a `title` field:
+
+```json
+POST _ingest/pipeline/bedrock_embedding_foreach_pipeline/_simulate
+{
+  "docs": [
+    {
+      "_index": "my_books",
+      "_id": "1",
+      "_source": {
+        "books": [
+          {
+            "title": "first book",
+            "description": "This is first book"
+          },
+          {
+            "title": "second book",
+            "description": "This is second book"
+          }
+        ]
+      }
+    }
+  ]
+}
+```
+{% include copy-curl.html %}
+
+The response contains generated embeddings for both objects in their `title_embedding` fields:
+
+```json
+{
+  "docs": [
+    {
+      "doc": {
+        "_index": "my_books",
+        "_id": "1",
+        "_source": {
+          "books": [
+            {
+              "title": "first book",
+              "title_embedding": [-1.1015625, 0.65234375, 0.7578125, ...],
+              "description": "This is first book"
+            },
+            {
+              "title": "second book",
+              "title_embedding": [-0.65234375, 0.21679688, 0.7265625, ...],
+              "description": "This is second book"
+            }
+          ]
+        },
+        "_ingest": {
+          "_value": null,
+          "timestamp": 
"2024-05-28T16:16:50.538929413Z" + } + } + } + ] +} +``` + +Next, you'll test the pipeline on an array that contains two book objects, one with a `title` field and one without: + +```json +POST _ingest/pipeline/bedrock_embedding_foreach_pipeline/_simulate +{ + "docs": [ + { + "_index": "my_books", + "_id": "1", + "_source": { + "books": [ + { + "title": "first book", + "description": "This is first book" + }, + { + "description": "This is second book" + } + ] + } + } + ] +} +``` +{% include copy-curl.html %} + +The response contains generated embeddings for the object that contains the `title` field: + +```json +{ + "docs": [ + { + "doc": { + "_index": "my_books", + "_id": "1", + "_source": { + "books": [ + { + "title": "first book", + "title_embedding": [-1.1015625, 0.65234375, 0.7578125, ...], + "description": "This is first book" + }, + { + "description": "This is second book" + } + ] + }, + "_ingest": { + "_value": null, + "timestamp": "2024-05-28T16:19:03.942644042Z" + } + } + } + ] +} +``` +### Step 2.4: Test data ingestion + +Ingest one document: + +```json +PUT my_books/_doc/1 +{ + "books": [ + { + "title": "first book", + "description": "This is first book" + }, + { + "title": "second book", + "description": "This is second book" + } + ] +} +``` +{% include copy-curl.html %} + +Get the document: + +```json +GET my_books/_doc/1 +``` +{% include copy-curl.html %} + +The response contains the generated embeddings: + +```json +{ + "_index": "my_books", + "_id": "1", + "_version": 1, + "_seq_no": 0, + "_primary_term": 1, + "found": true, + "_source": { + "books": [ + { + "description": "This is first book", + "title": "first book", + "title_embedding": [-1.1015625, 0.65234375, 0.7578125, ...] + }, + { + "description": "This is second book", + "title": "second book", + "title_embedding": [-0.65234375, 0.21679688, 0.7265625, ...] 
+ } + ] + } +} +``` + +You can also ingest several documents in bulk and test the generated embeddings by calling the Get Document API: + +```json +POST _bulk +{ "index" : { "_index" : "my_books" } } +{ "books" : [{"title": "first book", "description": "This is first book"}, {"title": "second book", "description": "This is second book"}] } +{ "index" : { "_index" : "my_books" } } +{ "books" : [{"title": "third book", "description": "This is third book"}, {"description": "This is fourth book"}] } +``` +{% include copy-curl.html %} \ No newline at end of file diff --git a/_ml-commons-plugin/tutorials/index.md b/_ml-commons-plugin/tutorials/index.md new file mode 100644 index 0000000000..4479d0878f --- /dev/null +++ b/_ml-commons-plugin/tutorials/index.md @@ -0,0 +1,26 @@ +--- +layout: default +title: Tutorials +has_children: true +has_toc: false +nav_order: 140 +--- + +# Tutorials + +Using the OpenSearch machine learning (ML) framework, you can build various applications, from implementing conversational search to building your own chatbot. 
To learn more, explore the following ML tutorials:
+
+- **Semantic search**:
+  - [Generating embeddings for arrays of objects]({{site.url}}{{site.baseurl}}/ml-commons-plugin/tutorials/generate-embeddings/)
+  - [Semantic search using byte-quantized vectors]({{site.url}}{{site.baseurl}}/ml-commons-plugin/tutorials/semantic-search-byte-vectors/)
+
+- **Conversational search**:
+  - [Conversational search using the Cohere Command model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/tutorials/conversational-search-cohere/)
+
+- **Reranking search results**:
+  - [Reranking search results using the Cohere Rerank model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/tutorials/reranking-cohere/)
+
+- **Agents and tools**:
+  - [Retrieval-augmented generation (RAG) chatbot]({{site.url}}{{site.baseurl}}/ml-commons-plugin/tutorials/rag-chatbot/)
+  - [RAG with a conversational flow agent]({{site.url}}{{site.baseurl}}/ml-commons-plugin/tutorials/rag-conversational-agent/)
+  - [Build your own chatbot]({{site.url}}{{site.baseurl}}/ml-commons-plugin/tutorials/build-chatbot/)
\ No newline at end of file
diff --git a/_ml-commons-plugin/tutorials/rag-chatbot.md b/_ml-commons-plugin/tutorials/rag-chatbot.md
new file mode 100644
index 0000000000..5dddded23a
--- /dev/null
+++ b/_ml-commons-plugin/tutorials/rag-chatbot.md
@@ -0,0 +1,346 @@
+---
+layout: default
+title: RAG chatbot
+parent: Tutorials
+nav_order: 50
+---
+
+# RAG chatbot
+
+One of the known limitations of large language models (LLMs) is that their knowledge base contains only information available at the time they were trained. LLMs have no knowledge of recent events or of your internal data. You can augment the LLM knowledge base by using retrieval-augmented generation (RAG).
+
+This tutorial illustrates how to build your own chatbot using [agents and tools]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/index/) and RAG. 
RAG supplements the LLM knowledge base with information contained in OpenSearch indexes.
+
+Replace the placeholders beginning with the prefix `your_` with your own values.
+{: .note}
+
+## Prerequisite
+
+Meet the prerequisite and follow Step 1 of the [RAG with a conversational flow agent tutorial]({{site.url}}{{site.baseurl}}/ml-commons-plugin/tutorials/rag-conversational-agent/) to set up the `test_population_data` knowledge base index, which contains US city population data.
+
+Note the embedding model ID; you'll use it in the following steps.
+
+## Step 1: Set up a knowledge base
+
+First, create an ingest pipeline:
+
+```json
+PUT /_ingest/pipeline/test_tech_news_pipeline
+{
+  "description": "text embedding pipeline for tech news",
+  "processors": [
+    {
+      "text_embedding": {
+        "model_id": "your_text_embedding_model_id",
+        "field_map": {
+          "passage": "passage_embedding"
+        }
+      }
+    }
+  ]
+}
+```
+{% include copy-curl.html %}
+
+Next, create an index named `test_tech_news`, which contains recent tech news:
+
+```json
+PUT test_tech_news
+{
+  "mappings": {
+    "properties": {
+      "passage": {
+        "type": "text"
+      },
+      "passage_embedding": {
+        "type": "knn_vector",
+        "dimension": 384
+      }
+    }
+  },
+  "settings": {
+    "index": {
+      "knn.space_type": "cosinesimil",
+      "default_pipeline": "test_tech_news_pipeline",
+      "knn": "true"
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+Ingest data into the index:
+
+```json
+POST _bulk
+{"index":{"_index":"test_tech_news"}}
+{"passage":"Apple Vision Pro is a mixed-reality headset developed by Apple Inc. It was announced on June 5, 2023, at Apple's Worldwide Developers Conference, and pre-orders began on January 19, 2024. It became available for purchase on February 2, 2024, in the United States.[10] A worldwide launch has yet to be scheduled.
The Vision Pro is Apple's first new major product category since the release of the Apple Watch in 2015.[11]\n\nApple markets the Vision Pro as a \"spatial computer\" where digital media is integrated with the real world. Physical inputs—such as motion gestures, eye tracking, and speech recognition—can be used to interact with the system.[10] Apple has avoided marketing the device as a virtual reality headset, along with the use of the terms \"virtual reality\" and \"augmented reality\" when discussing the product in presentations and marketing.[12]\n\nThe device runs visionOS,[13] a mixed-reality operating system derived from iOS frameworks using a 3D user interface; it supports multitasking via windows that appear to float within the user's surroundings,[14] as seen by cameras built into the headset. A dial on the top of the headset can be used to mask the camera feed with a virtual environment to increase immersion. The OS supports avatars (officially called \"Personas\"), which are generated by scanning the user's face; a screen on the front of the headset displays a rendering of the avatar's eyes (\"EyeSight\"), which are used to indicate the user's level of immersion to bystanders, and assist in communication.[15]"} +{"index":{"_index":"test_tech_news"}} +{"passage":"LLaMA (Large Language Model Meta AI) is a family of autoregressive large language models (LLMs), released by Meta AI starting in February 2023.\n\nFor the first version of LLaMA, four model sizes were trained: 7, 13, 33, and 65 billion parameters. 
LLaMA's developers reported that the 13B parameter model's performance on most NLP benchmarks exceeded that of the much larger GPT-3 (with 175B parameters) and that the largest model was competitive with state of the art models such as PaLM and Chinchilla.[1] Whereas the most powerful LLMs have generally been accessible only through limited APIs (if at all), Meta released LLaMA's model weights to the research community under a noncommercial license.[2] Within a week of LLaMA's release, its weights were leaked to the public on 4chan via BitTorrent.[3]\n\nIn July 2023, Meta released several models as Llama 2, using 7, 13 and 70 billion parameters.\n\nLLaMA-2\n\nOn July 18, 2023, in partnership with Microsoft, Meta announced LLaMA-2, the next generation of LLaMA. Meta trained and released LLaMA-2 in three model sizes: 7, 13, and 70 billion parameters.[4] The model architecture remains largely unchanged from that of LLaMA-1 models, but 40% more data was used to train the foundational models.[5] The accompanying preprint[5] also mentions a model with 34B parameters that might be released in the future upon satisfying safety targets.\n\nLLaMA-2 includes both foundational models and models fine-tuned for dialog, called LLaMA-2 Chat. In further departure from LLaMA-1, all models are released with weights, and are free for many commercial use cases. However, due to some remaining restrictions, the description of LLaMA as open source has been disputed by the Open Source Initiative (known for maintaining the Open Source Definition).[6]\n\nIn November 2023, research conducted by Patronus AI, an artificial intelligence startup company, compared performance of LLaMA-2, OpenAI's GPT-4 and GPT-4-Turbo, and Anthropic's Claude2 on two versions of a 150-question test about information in SEC filings (e.g. 
Form 10-K, Form 10-Q, Form 8-K, earnings reports, earnings call transcripts) submitted by public companies to the agency where one version of the test required the generative AI models to use a retrieval system to locate the specific SEC filing to answer the questions while the other version provided the specific SEC filing to the models to answer the question (i.e. in a long context window). On the retrieval system version, GPT-4-Turbo and LLaMA-2 both failed to produce correct answers to 81% of the questions, while on the long context window version, GPT-4-Turbo and Claude-2 failed to produce correct answers to 21% and 24% of the questions respectively.[7][8]"} +{"index":{"_index":"test_tech_news"}} +{"passage":"Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon via a single API, along with a broad set of capabilities you need to build generative AI applications with security, privacy, and responsible AI. Using Amazon Bedrock, you can easily experiment with and evaluate top FMs for your use case, privately customize them with your data using techniques such as fine-tuning and Retrieval Augmented Generation (RAG), and build agents that execute tasks using your enterprise systems and data sources. Since Amazon Bedrock is serverless, you don't have to manage any infrastructure, and you can securely integrate and deploy generative AI capabilities into your applications using the AWS services you are already familiar with."} +``` +{% include copy-curl.html %} + +## Step 2: Prepare an LLM + +Follow [step 2 of the RAG with a conversational flow agent tutorial]({{site.url}}{{site.baseurl}}/ml-commons-plugin/tutorials/rag-conversational-agent/#step-2-prepare-an-llm) to configure the Amazon Bedrock Claude model. + +Note the model ID; you'll use it in the following steps. 
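In Step 3, the agent configuration sets `"response_filter": "$.completion"` so that the final answer is extracted from the Claude response body. The following sketch shows what such a simple dot-notation JSONPath filter does (the response body shown is a simplified stand-in, not an actual Amazon Bedrock payload):

```python
def apply_response_filter(body: dict, json_path: str):
    """Resolve a simple dot-notation JSONPath such as "$.completion"."""
    node = body
    for key in json_path.lstrip("$.").split("."):
        node = node[key]
    return node

# Simplified stand-in for an Amazon Bedrock Claude response body.
response_body = {"completion": " Vision Pro is a mixed-reality headset."}
print(apply_response_filter(response_body, "$.completion").strip())
```

Real JSONPath implementations also support wildcards and filters; ML Commons evaluates the expression for you, so this helper exists only to illustrate the idea.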
+ +## Step 3: Create an agent + +For this tutorial, you will create an agent of the `conversational` type. + +Both the `conversational_flow` and `conversational` agents support conversation history. + +The `conversational_flow` and `conversational` agents differ in the following ways: + +- A `conversational_flow` agent runs tools sequentially, in a predefined order. +- A `conversational` agent dynamically chooses which tool to run next. + +In this tutorial, the agent includes two tools: One provides recent population data, and the other contains tech news. + +The agent has the following parameters: + +- `"max_iteration": 5`: The agent runs the LLM a maximum of five times. +- `"response_filter": "$.completion"`: Needed to retrieve the LLM answer from the Amazon Bedrock Claude model response. +- `"doc_size": 3` (in `population_data_knowledge_base`): Specifies to return the top three documents. + +Create an agent with the preceding specifications: + +```json +POST _plugins/_ml/agents/_register +{ + "name": "Chat Agent with RAG", + "type": "conversational", + "description": "this is a test agent", + "llm": { + "model_id": "your_llm_model_id", + "parameters": { + "max_iteration": 5, + "response_filter": "$.completion" + } + }, + "memory": { + "type": "conversation_index" + }, + "tools": [ + { + "type": "VectorDBTool", + "name": "population_data_knowledge_base", + "description": "This tool provides population data of US cities.", + "parameters": { + "input": "${parameters.question}", + "index": "test_population_data", + "source_field": [ + "population_description" + ], + "model_id": "your_text_embedding_model_id", + "embedding_field": "population_description_embedding", + "doc_size": 3 + } + }, + { + "type": "VectorDBTool", + "name": "tech_news_knowledge_base", + "description": "This tool provides recent tech news.", + "parameters": { + "input": "${parameters.question}", + "index": "test_tech_news", + "source_field": [ + "passage" + ], + "model_id": 
"your_text_embedding_model_id", + "embedding_field": "passage_embedding", + "doc_size": 2 + } + } + ], + "app_type": "chat_with_rag" +} +``` +{% include copy-curl.html %} + +Note the agent ID; you'll use it in the next step. + +## Step 4: Test the agent + +The `conversational` agent supports a `verbose` option. You can set `verbose` to `true` to obtain detailed steps. + +Alternatively, you can call the [Get Message Traces API](ml-commons-plugin/api/memory-apis/get-message-traces/): + +```json +GET _plugins/_ml/memory/message/message_id/traces +``` +{% include copy-curl.html %} + +### Start a conversation + +Ask a question related to tech news: + +```json +POST _plugins/_ml/agents/your_agent_id/_execute +{ + "parameters": { + "question": "What's vision pro", + "verbose": true + } +} +``` +{% include copy-curl.html %} + +In the response, note that the agent runs the `tech_news_knowledge_base` tool to obtain the top two documents. The agent then passes these documents as context to the LLM. The LLM uses the context to produce the answer: + +```json +{ + "inference_results": [ + { + "output": [ + { + "name": "memory_id", + "result": "eLVSxI0B8vrNLhb9nxto" + }, + { + "name": "parent_interaction_id", + "result": "ebVSxI0B8vrNLhb9nxty" + }, + { + "name": "response", + "result": """{ + "thought": "I don't have enough context to answer the question directly. Let me check the tech_news_knowledge_base tool to see if it can provide more information.", + "action": "tech_news_knowledge_base", + "action_input": "{\"query\":\"What's vision pro\"}" +}""" + }, + { + "name": "response", + "result": """{"_index":"test_tech_news","_source":{"passage":"Apple Vision Pro is a mixed-reality headset developed by Apple Inc. It was announced on June 5, 2023, at Apple\u0027s Worldwide Developers Conference, and pre-orders began on January 19, 2024. It became available for purchase on February 2, 2024, in the United States.[10] A worldwide launch has yet to be scheduled. 
The Vision Pro is Apple\u0027s first new major product category since the release of the Apple Watch in 2015.[11]\n\nApple markets the Vision Pro as a \"spatial computer\" where digital media is integrated with the real world. Physical inputs—such as motion gestures, eye tracking, and speech recognition—can be used to interact with the system.[10] Apple has avoided marketing the device as a virtual reality headset, along with the use of the terms \"virtual reality\" and \"augmented reality\" when discussing the product in presentations and marketing.[12]\n\nThe device runs visionOS,[13] a mixed-reality operating system derived from iOS frameworks using a 3D user interface; it supports multitasking via windows that appear to float within the user\u0027s surroundings,[14] as seen by cameras built into the headset. A dial on the top of the headset can be used to mask the camera feed with a virtual environment to increase immersion. The OS supports avatars (officially called \"Personas\"), which are generated by scanning the user\u0027s face; a screen on the front of the headset displays a rendering of the avatar\u0027s eyes (\"EyeSight\"), which are used to indicate the user\u0027s level of immersion to bystanders, and assist in communication.[15]"},"_id":"lrU8xI0B8vrNLhb9yBpV","_score":0.6700683} +{"_index":"test_tech_news","_source":{"passage":"Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon via a single API, along with a broad set of capabilities you need to build generative AI applications with security, privacy, and responsible AI. 
Using Amazon Bedrock, you can easily experiment with and evaluate top FMs for your use case, privately customize them with your data using techniques such as fine-tuning and Retrieval Augmented Generation (RAG), and build agents that execute tasks using your enterprise systems and data sources. Since Amazon Bedrock is serverless, you don\u0027t have to manage any infrastructure, and you can securely integrate and deploy generative AI capabilities into your applications using the AWS services you are already familiar with."},"_id":"mLU8xI0B8vrNLhb9yBpV","_score":0.5604863} +""" + }, + { + "name": "response", + "result": "Vision Pro is a mixed-reality headset developed by Apple that was announced in 2023. It uses cameras and sensors to overlay digital objects and information on the real world. The device runs an operating system called visionOS that allows users to interact with windows and apps in a 3D environment using gestures, eye tracking, and voice commands." + } + ] + } + ] +} +``` + +You can trace the detailed steps by using the Get Traces API: + +``` +GET _plugins/_ml/memory/message/ebVSxI0B8vrNLhb9nxty/traces +``` +{% include copy-curl.html %} + +Ask a question related to the population data: + +```json +POST _plugins/_ml/agents/your_agent_id/_execute +{ + "parameters": { + "question": "What's the population of Seattle 2023", + "verbose": true + } +} +``` +{% include copy-curl.html %} + +In the response, note that the agent runs the `population_data_knowledge_base` tool to obtain the top three documents. The agent then passes these documents as context to the LLM. 
The LLM uses the context to produce the answer: + +```json +{ + "inference_results": [ + { + "output": [ + { + "name": "memory_id", + "result": "l7VUxI0B8vrNLhb9sRuQ" + }, + { + "name": "parent_interaction_id", + "result": "mLVUxI0B8vrNLhb9sRub" + }, + { + "name": "response", + "result": """{ + "thought": "Let me check the population data tool to find the most recent population estimate for Seattle", + "action": "population_data_knowledge_base", + "action_input": "{\"city\":\"Seattle\"}" +}""" + }, + { + "name": "response", + "result": """{"_index":"test_population_data","_source":{"population_description":"Chart and table of population level and growth rate for the Seattle metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of Seattle in 2023 is 3,519,000, a 0.86% increase from 2022.\\nThe metro area population of Seattle in 2022 was 3,489,000, a 0.81% increase from 2021.\\nThe metro area population of Seattle in 2021 was 3,461,000, a 0.82% increase from 2020.\\nThe metro area population of Seattle in 2020 was 3,433,000, a 0.79% increase from 2019."},"_id":"BxF5vo0BubpYKX5ER0fT","_score":0.65775126} +{"_index":"test_population_data","_source":{"population_description":"Chart and table of population level and growth rate for the Seattle metro area from 1950 to 2023. 
United Nations population projections are also included through the year 2035.\\nThe current metro area population of Seattle in 2023 is 3,519,000, a 0.86% increase from 2022.\\nThe metro area population of Seattle in 2022 was 3,489,000, a 0.81% increase from 2021.\\nThe metro area population of Seattle in 2021 was 3,461,000, a 0.82% increase from 2020.\\nThe metro area population of Seattle in 2020 was 3,433,000, a 0.79% increase from 2019."},"_id":"7DrZvo0BVR2NrurbRIAE","_score":0.65775126} +{"_index":"test_population_data","_source":{"population_description":"Chart and table of population level and growth rate for the New York City metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of New York City in 2023 is 18,937,000, a 0.37% increase from 2022.\\nThe metro area population of New York City in 2022 was 18,867,000, a 0.23% increase from 2021.\\nThe metro area population of New York City in 2021 was 18,823,000, a 0.1% increase from 2020.\\nThe metro area population of New York City in 2020 was 18,804,000, a 0.01% decline from 2019."},"_id":"AxF5vo0BubpYKX5ER0fT","_score":0.56461215} +""" + }, + { + "name": "response", + "result": "According to the population data tool, the population of Seattle in 2023 is approximately 3,519,000 people, a 0.86% increase from 2022." + } + ] + } + ] +} +``` + +### Continue a conversation + +To continue a previous conversation, provide its conversation ID in the `memory_id` parameter: + +```json +POST _plugins/_ml/agents/your_agent_id/_execute +{ + "parameters": { + "question": "What's the population of Austin 2023, compared with Seattle", + "memory_id": "l7VUxI0B8vrNLhb9sRuQ", + "verbose": true + } +} +``` +{% include copy-curl.html %} + +In the response, note that the `population_data_knowledge_base` doesn't return the population of Seattle. 
Instead, the agent learns the population of Seattle by referencing historical messages: + +```json +{ + "inference_results": [ + { + "output": [ + { + "name": "memory_id", + "result": "l7VUxI0B8vrNLhb9sRuQ" + }, + { + "name": "parent_interaction_id", + "result": "B7VkxI0B8vrNLhb9mxy0" + }, + { + "name": "response", + "result": """{ + "thought": "Let me check the population data tool first", + "action": "population_data_knowledge_base", + "action_input": "{\"city\":\"Austin\",\"year\":2023}" +}""" + }, + { + "name": "response", + "result": """{"_index":"test_population_data","_source":{"population_description":"Chart and table of population level and growth rate for the Austin metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of Austin in 2023 is 2,228,000, a 2.39% increase from 2022.\\nThe metro area population of Austin in 2022 was 2,176,000, a 2.79% increase from 2021.\\nThe metro area population of Austin in 2021 was 2,117,000, a 3.12% increase from 2020.\\nThe metro area population of Austin in 2020 was 2,053,000, a 3.43% increase from 2019."},"_id":"BhF5vo0BubpYKX5ER0fT","_score":0.69129956} +{"_index":"test_population_data","_source":{"population_description":"Chart and table of population level and growth rate for the Austin metro area from 1950 to 2023. 
United Nations population projections are also included through the year 2035.\\nThe current metro area population of Austin in 2023 is 2,228,000, a 2.39% increase from 2022.\\nThe metro area population of Austin in 2022 was 2,176,000, a 2.79% increase from 2021.\\nThe metro area population of Austin in 2021 was 2,117,000, a 3.12% increase from 2020.\\nThe metro area population of Austin in 2020 was 2,053,000, a 3.43% increase from 2019."},"_id":"6zrZvo0BVR2NrurbRIAE","_score":0.69129956} +{"_index":"test_population_data","_source":{"population_description":"Chart and table of population level and growth rate for the New York City metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of New York City in 2023 is 18,937,000, a 0.37% increase from 2022.\\nThe metro area population of New York City in 2022 was 18,867,000, a 0.23% increase from 2021.\\nThe metro area population of New York City in 2021 was 18,823,000, a 0.1% increase from 2020.\\nThe metro area population of New York City in 2020 was 18,804,000, a 0.01% decline from 2019."},"_id":"AxF5vo0BubpYKX5ER0fT","_score":0.61015373} +""" + }, + { + "name": "response", + "result": "According to the population data tool, the population of Austin in 2023 is approximately 2,228,000 people, a 2.39% increase from 2022. This is lower than the population of Seattle in 2023 which is approximately 3,519,000 people, a 0.86% increase from 2022." 
+ } + ] + } + ] +} +``` \ No newline at end of file diff --git a/_ml-commons-plugin/tutorials/rag-conversational-agent.md b/_ml-commons-plugin/tutorials/rag-conversational-agent.md new file mode 100644 index 0000000000..86fe38416a --- /dev/null +++ b/_ml-commons-plugin/tutorials/rag-conversational-agent.md @@ -0,0 +1,838 @@ +--- +layout: default +title: RAG chatbot with a conversational flow agent +parent: Tutorials +nav_order: 40 +--- + +# RAG chatbot with a conversational flow agent + +This tutorial explains how to use a conversational flow agent to build a retrieval-augmented generation (RAG) application with your OpenSearch data as a knowledge base. + +Replace the placeholders beginning with the prefix `your_` with your own values. +{: .note} + +An alternative way to build RAG conversational search is to use a RAG pipeline. For more information, see [Conversational search using the Cohere Command model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/tutorials/conversational-search-cohere/). + +## Prerequisite + +In this tutorial, you'll build a RAG application that provides an OpenSearch [k-NN index]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index/) as a knowledge base for a large language model (LLM). For data retrieval, you'll use [semantic search]({{site.url}}{{site.baseurl}}/search-plugins/semantic-search/). For a comprehensive semantic search tutorial, see [Neural search tutorial]({{site.url}}{{site.baseurl}}/search-plugins/neural-search-tutorial/). + +First, you'll need to update your cluster settings. If you don't have a dedicated machine learning (ML) node, set `"plugins.ml_commons.only_run_on_ml_node": false`. 
To avoid triggering a native memory circuit breaker, set `"plugins.ml_commons.native_memory_threshold"` to 100%: + +```json +PUT _cluster/settings +{ + "persistent": { + "plugins.ml_commons.only_run_on_ml_node": false, + "plugins.ml_commons.native_memory_threshold": 100, + "plugins.ml_commons.agent_framework_enabled": true + } +} +``` +{% include copy-curl.html %} + +## Step 1: Prepare the knowledge base + +Use the following steps to prepare the knowledge base that will supplement the LLM's knowledge. + +### Step 1.1: Register a text embedding model + +Register a text embedding model that will translate text into vector embeddings: + +```json +POST /_plugins/_ml/models/_register +{ + "name": "huggingface/sentence-transformers/all-MiniLM-L12-v2", + "version": "1.0.1", + "model_format": "TORCH_SCRIPT" +} +``` +{% include copy-curl.html %} + +Note the text embedding model ID; you'll use it in the following steps. + +As an alternative, you can get the model ID by calling the [Get Task API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/tasks-apis/get-task/): + +```json +GET /_plugins/_ml/tasks/your_task_id +``` +{% include copy-curl.html %} + +Deploy the model: + +```json +POST /_plugins/_ml/models/your_text_embedding_model_id/_deploy +``` +{% include copy-curl.html %} + +Test the model: + +```json +POST /_plugins/_ml/models/your_text_embedding_model_id/_predict +{ + "text_docs":[ "today is sunny"], + "return_number": true, + "target_response": ["sentence_embedding"] +} +``` +{% include copy-curl.html %} + +For more information about using models within your OpenSearch cluster, see [Pretrained models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/). 
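The k-NN index you will create in Step 1.3 uses the `cosinesimil` space type, so semantic retrieval ranks documents by the cosine similarity between the query embedding and each stored embedding. As a refresher, the metric can be computed as follows (toy 3-dimensional vectors; real embeddings from this model have 384 dimensions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot product divided by the product of vector norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]))  # identical direction -> 1.0
print(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # orthogonal -> 0.0
```

Because cosine similarity depends only on vector direction, documents whose embeddings point the same way as the query embedding score highest regardless of magnitude.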
+ +### Step 1.2: Create an ingest pipeline + +Create an ingest pipeline with a text embedding processor, which can invoke the model created in the previous step to generate embeddings from text fields: + +```json +PUT /_ingest/pipeline/test_population_data_pipeline +{ + "description": "text embedding pipeline", + "processors": [ + { + "text_embedding": { + "model_id": "your_text_embedding_model_id", + "field_map": { + "population_description": "population_description_embedding" + } + } + } + ] +} +``` +{% include copy-curl.html %} + +For more information about ingest pipelines, see [Ingest pipelines]({{site.url}}{{site.baseurl}}/ingest-pipelines/). + +### Step 1.3: Create a k-NN index + +Create a k-NN index specifying the ingest pipeline as a default pipeline: + +```json +PUT test_population_data +{ + "mappings": { + "properties": { + "population_description": { + "type": "text" + }, + "population_description_embedding": { + "type": "knn_vector", + "dimension": 384 + } + } + }, + "settings": { + "index": { + "knn.space_type": "cosinesimil", + "default_pipeline": "test_population_data_pipeline", + "knn": "true" + } + } +} +``` +{% include copy-curl.html %} + +For more information about k-NN indexes, see [k-NN index]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index/). + +### Step 1.4: Ingest data + +Ingest test data into the k-NN index: + +```json +POST _bulk +{"index": {"_index": "test_population_data"}} +{"population_description": "Chart and table of population level and growth rate for the Ogden-Layton metro area from 1950 to 2023. 
United Nations population projections are also included through the year 2035.\nThe current metro area population of Ogden-Layton in 2023 is 750,000, a 1.63% increase from 2022.\nThe metro area population of Ogden-Layton in 2022 was 738,000, a 1.79% increase from 2021.\nThe metro area population of Ogden-Layton in 2021 was 725,000, a 1.97% increase from 2020.\nThe metro area population of Ogden-Layton in 2020 was 711,000, a 2.16% increase from 2019."}
+{"index": {"_index": "test_population_data"}}
+{"population_description": "Chart and table of population level and growth rate for the New York City metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\nThe current metro area population of New York City in 2023 is 18,937,000, a 0.37% increase from 2022.\nThe metro area population of New York City in 2022 was 18,867,000, a 0.23% increase from 2021.\nThe metro area population of New York City in 2021 was 18,823,000, a 0.1% increase from 2020.\nThe metro area population of New York City in 2020 was 18,804,000, a 0.01% decline from 2019."}
+{"index": {"_index": "test_population_data"}}
+{"population_description": "Chart and table of population level and growth rate for the Chicago metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\nThe current metro area population of Chicago in 2023 is 8,937,000, a 0.4% increase from 2022.\nThe metro area population of Chicago in 2022 was 8,901,000, a 0.27% increase from 2021.\nThe metro area population of Chicago in 2021 was 8,877,000, a 0.14% increase from 2020.\nThe metro area population of Chicago in 2020 was 8,865,000, a 0.03% increase from 2019."}
+{"index": {"_index": "test_population_data"}}
+{"population_description": "Chart and table of population level and growth rate for the Miami metro area from 1950 to 2023. 
United Nations population projections are also included through the year 2035.\nThe current metro area population of Miami in 2023 is 6,265,000, a 0.8% increase from 2022.\nThe metro area population of Miami in 2022 was 6,215,000, a 0.78% increase from 2021.\nThe metro area population of Miami in 2021 was 6,167,000, a 0.74% increase from 2020.\nThe metro area population of Miami in 2020 was 6,122,000, a 0.71% increase from 2019."}
+{"index": {"_index": "test_population_data"}}
+{"population_description": "Chart and table of population level and growth rate for the Austin metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\nThe current metro area population of Austin in 2023 is 2,228,000, a 2.39% increase from 2022.\nThe metro area population of Austin in 2022 was 2,176,000, a 2.79% increase from 2021.\nThe metro area population of Austin in 2021 was 2,117,000, a 3.12% increase from 2020.\nThe metro area population of Austin in 2020 was 2,053,000, a 3.43% increase from 2019."}
+{"index": {"_index": "test_population_data"}}
+{"population_description": "Chart and table of population level and growth rate for the Seattle metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\nThe current metro area population of Seattle in 2023 is 3,519,000, a 0.86% increase from 2022.\nThe metro area population of Seattle in 2022 was 3,489,000, a 0.81% increase from 2021.\nThe metro area population of Seattle in 2021 was 3,461,000, a 0.82% increase from 2020.\nThe metro area population of Seattle in 2020 was 3,433,000, a 0.79% increase from 2019."}
+```
+{% include copy-curl.html %}
+
+## Step 2: Prepare an LLM
+
+This tutorial uses the [Amazon Bedrock Claude model](https://aws.amazon.com/bedrock/claude/) for conversational search. You can also use other LLMs. 
For more information about using externally hosted models, see [Connecting to externally hosted models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/index/). + +### Step 2.1: Create a connector + +Create a connector for the Claude model: + +```json +POST /_plugins/_ml/connectors/_create +{ + "name": "BedRock Claude instant-v1 Connector ", + "description": "The connector to BedRock service for claude model", + "version": 1, + "protocol": "aws_sigv4", + "parameters": { + "region": "us-east-1", + "service_name": "bedrock", + "anthropic_version": "bedrock-2023-05-31", + "max_tokens_to_sample": 8000, + "temperature": 0.0001, + "response_filter": "$.completion" + }, + "credential": { + "access_key": "your_aws_access_key", + "secret_key": "your_aws_secret_key", + "session_token": "your_aws_session_token" + }, + "actions": [ + { + "action_type": "predict", + "method": "POST", + "url": "https://bedrock-runtime.us-east-1.amazonaws.com/model/anthropic.claude-instant-v1/invoke", + "headers": { + "content-type": "application/json", + "x-amz-content-sha256": "required" + }, + "request_body": "{\"prompt\":\"${parameters.prompt}\", \"max_tokens_to_sample\":${parameters.max_tokens_to_sample}, \"temperature\":${parameters.temperature}, \"anthropic_version\":\"${parameters.anthropic_version}\" }" + } + ] +} +``` +{% include copy-curl.html %} + +Note the connector ID; you'll use it to register the model. + +### Step 2.2: Register the model + +Register the Claude model hosted on Amazon Bedrock: + +```json +POST /_plugins/_ml/models/_register +{ + "name": "Bedrock Claude Instant model", + "function_name": "remote", + "description": "Bedrock Claude instant-v1 model", + "connector_id": "your_LLM_connector_id" +} +``` +{% include copy-curl.html %} + +Note the LLM model ID; you'll use it in the following steps. 
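When the connector runs a `predict` action, values from `parameters` are substituted into the `request_body` template before the request is sent to the model endpoint. A simplified sketch of that substitution follows; the real engine also supports additional syntax such as default values (`${parameters.foo:-}`), and `fill_request_body` is an illustrative helper, not an ML Commons API:

```python
import re

def fill_request_body(template, parameters):
    """Replace ${parameters.<name>} placeholders with concrete values,
    as the connector does when building the model request (sketch)."""
    pattern = re.compile(r"\$\{parameters\.([A-Za-z0-9_]+)\}")
    return pattern.sub(lambda match: str(parameters[match.group(1)]), template)

# Fill a fragment of the request_body template from the connector above
body = fill_request_body(
    '{"prompt":"${parameters.prompt}", "temperature":${parameters.temperature}}',
    {"prompt": "Hello", "temperature": 0.0001},
)
# body is now a concrete JSON string with both values filled in
```

Note that string-valued parameters are wrapped in quotation marks inside the template itself, while numeric parameters are substituted bare.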
+ +### Step 2.3: Deploy the model + +Deploy the Claude model: + +```json +POST /_plugins/_ml/models/your_LLM_model_id/_deploy +``` +{% include copy-curl.html %} + +### Step 2.4: Test the model + +To test the model, send a Predict API request: + +```json +POST /_plugins/_ml/models/your_LLM_model_id/_predict +{ + "parameters": { + "prompt": "\n\nHuman: how are you? \n\nAssistant:" + } +} +``` +{% include copy-curl.html %} + +## Step 3: Register an agent + +OpenSearch provides the following agent types: `flow`, `conversational_flow`, and `conversational`. For more information about agents, see [Agents]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/index/#agents). + +You will use a `conversational_flow` agent in this tutorial. The agent consists of the following: + +- Meta info: `name`, `type`, and `description`. +- `app_type`: Differentiates between application types. +- `memory`: Stores user questions and LLM responses as a conversation so that an agent can retrieve conversation history from memory and continue the same conversation. +- `tools`: Defines a list of tools to use. The agent will run these tools sequentially. 
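Conceptually, a `conversational_flow` agent runs its tools in the configured order, and each tool's output becomes available to later tools (and to the prompt) as `${parameters.<tool_name>.output}`. That loop can be sketched as follows; the tool functions here are toy stand-ins, not real OpenSearch tools:

```python
def run_flow_agent(tools, parameters):
    """Run tools sequentially; each tool's output is exposed to later tools
    under the key '<tool_name>.output', mirroring ${parameters.<name>.output}."""
    params = dict(parameters)
    last_output = None
    for tool in tools:
        last_output = tool["run"](params)
        params[tool["name"] + ".output"] = last_output
    return last_output

# Toy stand-ins for VectorDBTool and MLModelTool
tools = [
    {"name": "kb", "run": lambda p: "docs about " + p["question"]},
    {"name": "llm", "run": lambda p: "answer using " + p["kb.output"]},
]
result = run_flow_agent(tools, {"question": "Seattle population"})
```

The second tool sees the first tool's output, which is exactly how the `MLModelTool` prompt below consumes `${parameters.population_knowledge_base.output:-}`.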
+
+To register an agent, send the following request:
+
+```json
+POST /_plugins/_ml/agents/_register
+{
+  "name": "population data analysis agent",
+  "type": "conversational_flow",
+  "description": "This is a demo agent for population data analysis",
+  "app_type": "rag",
+  "memory": {
+    "type": "conversation_index"
+  },
+  "tools": [
+    {
+      "type": "VectorDBTool",
+      "name": "population_knowledge_base",
+      "parameters": {
+        "model_id": "your_text_embedding_model_id",
+        "index": "test_population_data",
+        "embedding_field": "population_description_embedding",
+        "source_field": [
+          "population_description"
+        ],
+        "input": "${parameters.question}"
+      }
+    },
+    {
+      "type": "MLModelTool",
+      "name": "bedrock_claude_model",
+      "description": "A general tool to answer any question",
+      "parameters": {
+        "model_id": "your_LLM_model_id",
+        "prompt": "\n\nHuman:You are a professional data analyst. You will always answer questions based on the given context first. If the answer is not directly shown in the context, you will analyze the data and find the answer. If you don't know the answer, just say don't know. \n\nContext:\n${parameters.population_knowledge_base.output:-}\n\n${parameters.chat_history:-}\n\nHuman:${parameters.question}\n\nAssistant:"
+      }
+    }
+  ]
+}
+```
+{% include copy-curl.html %}
+
+OpenSearch responds with an agent ID:
+
+```json
+{
+  "agent_id": "fQ75lI0BHcHmo_czdqcJ"
+}
+```
+
+Note the agent ID; you'll use it in the next step.
+
+## Step 4: Run the agent
+
+You'll run the agent to analyze the increase in Seattle's population. When you run this agent, the agent will create a new conversation. Later, you can continue this conversation by asking other questions.
+
+### Step 4.1: Start a new conversation
+
+First, start a new conversation by asking the LLM a question:
+
+```json
+POST /_plugins/_ml/agents/your_agent_id/_execute
+{
+  "parameters": {
+    "question": "what's the population increase of Seattle from 2021 to 2023?"
+ } +} +``` +{% include copy-curl.html %} + +The response contains the answer generated by the LLM: + +```json +{ + "inference_results": [ + { + "output": [ + { + "name": "memory_id", + "result": "gQ75lI0BHcHmo_cz2acL" + }, + { + "name": "parent_message_id", + "result": "gg75lI0BHcHmo_cz2acZ" + }, + { + "name": "bedrock_claude_model", + "result": """ Based on the context given: +- The metro area population of Seattle in 2021 was 3,461,000 +- The current metro area population of Seattle in 2023 is 3,519,000 +- So the population increase of Seattle from 2021 to 2023 is 3,519,000 - 3,461,000 = 58,000""" + } + ] + } + ] +} +``` + +The response contains the following fields: + +- `memory_id` is the identifier for the memory (conversation) that groups all messages within a single conversation. Note this ID; you'll use it in the next step. +- `parent_message_id` is the identifier for the current message (one question/answer) between the human and the LLM. One memory can contain multiple messages. 
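When calling the agent programmatically, it is convenient to flatten the `inference_results` array into a name-to-result map so that `memory_id` and `parent_message_id` are easy to pick out. A small sketch based on the response shape shown above (`extract_outputs` is an illustrative helper):

```python
def extract_outputs(execute_response):
    """Flatten an agent execute response into a {name: result} dictionary."""
    outputs = {}
    for inference in execute_response["inference_results"]:
        for item in inference["output"]:
            outputs[item["name"]] = item["result"]
    return outputs

# Abbreviated version of the response shown above
response = {
    "inference_results": [{"output": [
        {"name": "memory_id", "result": "gQ75lI0BHcHmo_cz2acL"},
        {"name": "parent_message_id", "result": "gg75lI0BHcHmo_cz2acZ"},
        {"name": "bedrock_claude_model", "result": "...58,000"},
    ]}]
}
ids = extract_outputs(response)
# ids["memory_id"] holds the conversation ID to pass to follow-up requests
```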
+
+To obtain memory details, call the [Get Memory API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/memory-apis/get-memory/):
+
+```json
+GET /_plugins/_ml/memory/gQ75lI0BHcHmo_cz2acL
+```
+{% include copy-curl.html %}
+
+To obtain all messages within a memory, call the [Get Messages API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/memory-apis/get-message/):
+
+```json
+GET /_plugins/_ml/memory/gQ75lI0BHcHmo_cz2acL/messages
+```
+{% include copy-curl.html %}
+
+To obtain message details, call the [Get Message API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/memory-apis/get-message/):
+
+```json
+GET /_plugins/_ml/memory/message/gg75lI0BHcHmo_cz2acZ
+```
+{% include copy-curl.html %}
+
+For debugging purposes, you can obtain trace data for a message by calling the [Get Message Traces API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/memory-apis/get-message-traces/):
+
+```json
+GET /_plugins/_ml/memory/message/gg75lI0BHcHmo_cz2acZ/traces
+```
+{% include copy-curl.html %}
+
+### Step 4.2: Continue the conversation by asking new questions
+
+To continue the same conversation, provide the memory ID from the previous step.
+
+Additionally, you can provide the following parameters:
+
+- `message_history_limit`: Specify the number of historical messages to include in the new question/answer round for an agent.
+- `prompt`: Use this parameter to customize the LLM prompt. For example, the following request adds a new instruction, `always learn useful information from chat history`, and a new parameter, `next_action`:
+
+```json
+POST /_plugins/_ml/agents/your_agent_id/_execute
+{
+  "parameters": {
+    "question": "What's the population of New York City in 2023?",
+    "next_action": "then compare with Seattle population of 2023",
+    "memory_id": "gQ75lI0BHcHmo_cz2acL",
+    "message_history_limit": 5,
+    "prompt": "\n\nHuman:You are a professional data analyst. You will always answer questions based on the given context first. If the answer is not directly shown in the context, you will analyze the data and find the answer. If you don't know the answer, just say don't know. 
\n\nContext:\n${parameters.population_knowledge_base.output:-}\n\n${parameters.chat_history:-}\n\nHuman:always learn useful information from chat history\nHuman:${parameters.question}, ${parameters.next_action}\n\nAssistant:" + } +} +``` +{% include copy-curl.html %} + +The response contains the answer generated by the LLM: + +```json +{ + "inference_results": [ + { + "output": [ + { + "name": "memory_id", + "result": "gQ75lI0BHcHmo_cz2acL" + }, + { + "name": "parent_message_id", + "result": "wQ4JlY0BHcHmo_cz8Kc-" + }, + { + "name": "bedrock_claude_model", + "result": """ Based on the context given: +- The current metro area population of New York City in 2023 is 18,937,000 +- The current metro area population of Seattle in 2023 is 3,519,000 +- So the population of New York City in 2023 (18,937,000) is much higher than the population of Seattle in 2023 (3,519,000)""" + } + ] + } + ] +} +``` + +If you know which tool the agent should use to execute a particular Predict API request, you can specify the tool when executing the agent. For example, if you want to translate the preceding answer into Chinese, you don't need to retrieve any data from the knowledge base. To run only the Claude model, specify the `bedrock_claude_model` tool in the `selected_tools` parameter: + +```json +POST /_plugins/_ml/agents/your_agent_id/_execute +{ + "parameters": { + "question": "Translate last answer into Chinese?", + "selected_tools": ["bedrock_claude_model"] + } +} +``` +{% include copy-curl.html %} + +The agent will run the tools one by one in the new order defined in `selected_tools`. +{: .note} + +## Configuring multiple knowledge bases + +You can configure multiple knowledge bases for an agent. 
For example, if you have both product description and comment data, you can configure the agent with two knowledge base tools and an LLM tool:
+
+```json
+{
+  "name": "My product agent",
+  "type": "conversational_flow",
+  "description": "This is an agent with product description and comments knowledge bases.",
+  "memory": {
+    "type": "conversation_index"
+  },
+  "app_type": "rag",
+  "tools": [
+    {
+      "type": "VectorDBTool",
+      "name": "product_description_vectordb",
+      "parameters": {
+        "model_id": "your_embedding_model_id",
+        "index": "product_description_data",
+        "embedding_field": "product_description_embedding",
+        "source_field": [
+          "product_description"
+        ],
+        "input": "${parameters.question}"
+      }
+    },
+    {
+      "type": "VectorDBTool",
+      "name": "product_comments_vectordb",
+      "parameters": {
+        "model_id": "your_embedding_model_id",
+        "index": "product_comments_data",
+        "embedding_field": "product_comment_embedding",
+        "source_field": [
+          "product_comment"
+        ],
+        "input": "${parameters.question}"
+      }
+    },
+    {
+      "type": "MLModelTool",
+      "description": "A general tool to answer any question",
+      "parameters": {
+        "model_id": "your_llm_model_id",
+        "prompt": "\n\nHuman:You are a professional product recommendation engine. You will always recommend products based on the given context. If you don't have enough context, you will ask Human to provide more information. If you don't see any related product to recommend, just say we don't have such a product. \n\n Context:\n${parameters.product_description_vectordb.output}\n\n${parameters.product_comments_vectordb.output}\n\nHuman:${parameters.question}\n\nAssistant:"
+      }
+    }
+  ]
+}
+```
+{% include copy-curl.html %}
+
+When you run the agent, the agent will query product description and comment data and then send the query results and the question to the LLM.
+
+To query a specific knowledge base, specify it in `selected_tools`. 
For example, if the question relates only to product comments, you can retrieve information only from `product_comments_vectordb`:
+
+```json
+POST /_plugins/_ml/agents/your_agent_id/_execute
+{
+  "parameters": {
+    "question": "Which feature do people like most about the Amazon Echo Dot?",
+    "selected_tools": ["product_comments_vectordb", "MLModelTool"]
+  }
+}
+```
+{% include copy-curl.html %}
+
+## Running queries on an index
+
+Use the `SearchIndexTool` to run any OpenSearch query on any index.
+
+### Setup: Register an agent
+
+```json
+POST /_plugins/_ml/agents/_register
+{
+  "name": "Demo agent",
+  "type": "conversational_flow",
+  "description": "This agent supports running any search query",
+  "memory": {
+    "type": "conversation_index"
+  },
+  "app_type": "rag",
+  "tools": [
+    {
+      "type": "SearchIndexTool",
+      "parameters": {
+        "input": "{\"index\": \"${parameters.index}\", \"query\": ${parameters.query} }"
+      }
+    },
+    {
+      "type": "MLModelTool",
+      "description": "A general tool to answer any question",
+      "parameters": {
+        "model_id": "your_llm_model_id",
+        "prompt": "\n\nHuman:You are a professional data analyst. You will always answer questions based on the given context first. If the answer is not directly shown in the context, you will analyze the data and find the answer. If you don't know the answer, just say don't know. 
\n\n Context:\n${parameters.SearchIndexTool.output:-}\n\nHuman:${parameters.question}\n\nAssistant:"
+      }
+    }
+  ]
+}
+```
+{% include copy-curl.html %}
+
+### Run a BM25 query
+
+```json
+POST /_plugins/_ml/agents/your_agent_id/_execute
+{
+  "parameters": {
+    "question": "what's the population increase of Seattle from 2021 to 2023?",
+    "index": "test_population_data",
+    "query": {
+      "query": {
+        "match": {
+          "population_description": "${parameters.question}"
+        }
+      },
+      "size": 2,
+      "_source": "population_description"
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+### Exposing only the `question` parameter
+
+To expose only the `question` parameter, define the agent as follows:
+
+```json
+POST /_plugins/_ml/agents/_register
+{
+  "name": "Demo agent",
+  "type": "conversational_flow",
+  "description": "This is a test agent that supports running any search query",
+  "memory": {
+    "type": "conversation_index"
+  },
+  "app_type": "rag",
+  "tools": [
+    {
+      "type": "SearchIndexTool",
+      "parameters": {
+        "input": "{\"index\": \"${parameters.index}\", \"query\": ${parameters.query} }",
+        "index": "test_population_data",
+        "query": {
+          "query": {
+            "match": {
+              "population_description": "${parameters.question}"
+            }
+          },
+          "size": 2,
+          "_source": "population_description"
+        }
+      }
+    },
+    {
+      "type": "MLModelTool",
+      "description": "A general tool to answer any question",
+      "parameters": {
+        "model_id": "your_llm_model_id",
+        "prompt": "\n\nHuman:You are a professional data analyst. You will always answer questions based on the given context first. If the answer is not directly shown in the context, you will analyze the data and find the answer. If you don't know the answer, just say don't know. 
\n\n Context:\n${parameters.SearchIndexTool.output:-}\n\nHuman:${parameters.question}\n\nAssistant:"
+      }
+    }
+  ]
+}
+```
+{% include copy-curl.html %}
+
+Now you can run the agent specifying only the `question` parameter:
+
+```json
+POST /_plugins/_ml/agents/your_agent_id/_execute
+{
+  "parameters": {
+    "question": "what's the population increase of Seattle from 2021 to 2023?"
+  }
+}
+```
+{% include copy-curl.html %}
+
+### Run a neural search query
+
+```json
+POST /_plugins/_ml/agents/your_agent_id/_execute
+{
+  "parameters": {
+    "question": "what's the population increase of Seattle from 2021 to 2023?",
+    "index": "test_population_data",
+    "query": {
+      "query": {
+        "neural": {
+          "population_description_embedding": {
+            "query_text": "${parameters.question}",
+            "model_id": "your_embedding_model_id",
+            "k": 10
+          }
+        }
+      },
+      "size": 2,
+      "_source": ["population_description"]
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+To expose the `question` parameter, see [Exposing only the `question` parameter](#exposing-only-the-question-parameter).
+
+### Run a hybrid search query
+
+Hybrid search combines keyword and neural search to improve search relevance. For more information, see [Hybrid search]({{site.url}}{{site.baseurl}}/search-plugins/hybrid-search/). 
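Under the hood, the hybrid query's `normalization-processor` first min-max normalizes each subquery's scores and then combines them using a weighted arithmetic mean. The arithmetic can be sketched as follows; this is a simplified illustration that assumes every document is scored by both subqueries, which the actual processor does not require:

```python
def min_max_normalize(scores):
    """Scale a subquery's scores into [0, 1] (min-max technique)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def weighted_arithmetic_mean(normalized, weights):
    """Combine per-subquery normalized scores with one weight per subquery."""
    return [sum(w * s for w, s in zip(weights, doc_scores))
            for doc_scores in zip(*normalized)]

# Example: three documents scored by a BM25 subquery and a neural subquery
bm25_scores = min_max_normalize([12.0, 7.5, 3.0])
neural_scores = min_max_normalize([0.9, 0.2, 0.4])
combined = weighted_arithmetic_mean([bm25_scores, neural_scores], [0.3, 0.7])
```

The weights `[0.3, 0.7]` mirror the pipeline configuration below: the first weight applies to the first subquery in the `queries` array, the second to the second.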
+
+Configure a search pipeline:
+
+```json
+PUT /_search/pipeline/nlp-search-pipeline
+{
+  "description": "Post processor for hybrid search",
+  "phase_results_processors": [
+    {
+      "normalization-processor": {
+        "normalization": {
+          "technique": "min_max"
+        },
+        "combination": {
+          "technique": "arithmetic_mean",
+          "parameters": {
+            "weights": [
+              0.3,
+              0.7
+            ]
+          }
+        }
+      }
+    }
+  ]
+}
+```
+{% include copy-curl.html %}
+
+Run an agent with a hybrid query:
+
+```json
+POST /_plugins/_ml/agents/your_agent_id/_execute
+{
+  "parameters": {
+    "question": "what's the population increase of Seattle from 2021 to 2023?",
+    "index": "test_population_data",
+    "query": {
+      "_source": {
+        "exclude": [
+          "population_description_embedding"
+        ]
+      },
+      "size": 2,
+      "query": {
+        "hybrid": {
+          "queries": [
+            {
+              "match": {
+                "population_description": {
+                  "query": "${parameters.question}"
+                }
+              }
+            },
+            {
+              "neural": {
+                "population_description_embedding": {
+                  "query_text": "${parameters.question}",
+                  "model_id": "your_embedding_model_id",
+                  "k": 10
+                }
+              }
+            }
+          ]
+        }
+      }
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+To expose the `question` parameter, see [Exposing only the `question` parameter](#exposing-only-the-question-parameter).
+
+### Natural language query
+
+The `PPLTool` can translate a natural language query (NLQ) to [Piped Processing Language (PPL)]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/index/) and execute the generated PPL query.
+
+#### Setup
+
+Before you start, go to the OpenSearch Dashboards home page, select **Add sample data**, and then add **Sample eCommerce orders**.
+
+#### Step 1: Register an agent with the PPLTool
+
+The `PPLTool` has the following parameters:
+
+- `model_type` (Enum): `CLAUDE`, `OPENAI`, or `FINETUNE`.
+- `execute` (Boolean): If `true`, executes the generated PPL query.
+- `input` (String): You must provide the `index` and `question` as inputs. 
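For reference, the kind of PPL query the tool generates and runs against the supplied index looks like the following. This is a hand-written illustration of a simple aggregation over the sample index, not actual model output; the query the LLM generates depends on the model and the question:

```sql
source=opensearch_dashboards_sample_data_ecommerce | stats count() as total_orders
```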
+
+For this tutorial, you'll use Bedrock Claude, so set the `model_type` to `CLAUDE`:
+
+```json
+POST /_plugins/_ml/agents/_register
+{
+  "name": "Demo agent for NLQ",
+  "type": "conversational_flow",
+  "description": "This is a test flow agent for NLQ",
+  "memory": {
+    "type": "conversation_index"
+  },
+  "app_type": "rag",
+  "tools": [
+    {
+      "type": "PPLTool",
+      "parameters": {
+        "model_id": "your_ppl_model_id",
+        "model_type": "CLAUDE",
+        "execute": true,
+        "input": "{\"index\": \"${parameters.index}\", \"question\": ${parameters.question} }"
+      }
+    },
+    {
+      "type": "MLModelTool",
+      "description": "A general tool to answer any question",
+      "parameters": {
+        "model_id": "your_llm_model_id",
+        "prompt": "\n\nHuman:You are a professional data analyst. You will always answer questions based on the given context first. If the answer is not directly shown in the context, you will analyze the data and find the answer. If you don't know the answer, just say don't know. \n\n Context:\n${parameters.PPLTool.output:-}\n\nHuman:${parameters.question}\n\nAssistant:"
+      }
+    }
+  ]
+}
+```
+{% include copy-curl.html %}
+
+#### Step 2: Run the agent with an NLQ
+
+Run the agent:
+
+```json
+POST /_plugins/_ml/agents/your_agent_id/_execute
+{
+  "parameters": {
+    "question": "How many orders do I have in the last week?",
+    "index": "opensearch_dashboards_sample_data_ecommerce"
+  }
+}
+```
+{% include copy-curl.html %}
+
+The response contains the answer generated by the LLM:
+
+```json
+{
+  "inference_results": [
+    {
+      "output": [
+        {
+          "name": "memory_id",
+          "result": "sqIioI0BJhBwrVXYeYOM"
+        },
+        {
+          "name": "parent_message_id",
+          "result": "s6IioI0BJhBwrVXYeYOW"
+        },
+        {
+          "name": "MLModelTool",
+          "result": " Based on the given context, the number of orders in the last week is 3992. The data shows a query that counts the number of orders where the order date is greater than 1 week ago. The query result shows the count as 3992."
+ } + ] + } + ] +} +``` + +For more information, obtain trace data by calling the [Get Message Traces API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/memory-apis/get-message-traces/): + +```json +GET _plugins/_ml/memory/message/s6IioI0BJhBwrVXYeYOW/traces +``` +{% include copy-curl.html %} \ No newline at end of file diff --git a/_ml-commons-plugin/tutorials/reranking-cohere.md b/_ml-commons-plugin/tutorials/reranking-cohere.md new file mode 100644 index 0000000000..412180066f --- /dev/null +++ b/_ml-commons-plugin/tutorials/reranking-cohere.md @@ -0,0 +1,344 @@ +--- +layout: default +title: Reranking with Cohere Rerank +parent: Tutorials +nav_order: 30 +--- + +# Reranking search results using the Cohere Rerank model + +A [reranking pipeline]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/reranking-search-results/) can rerank search results, providing a relevance score for each document in the search results with respect to the search query. The relevance score is calculated by a cross-encoder model. + +This tutorial illustrates how to use the [Cohere Rerank](https://docs.cohere.com/reference/rerank-1) model in a reranking pipeline. + +Replace the placeholders beginning with the prefix `your_` with your own values. 
+{: .note}
+
+## Step 1: Register a Cohere Rerank model
+
+Create a connector for the Cohere Rerank model:
+
+```json
+POST /_plugins/_ml/connectors/_create
+{
+  "name": "cohere-rerank",
+  "description": "The connector to the Cohere Rerank model",
+  "version": "1",
+  "protocol": "http",
+  "credential": {
+    "cohere_key": "your_cohere_api_key"
+  },
+  "parameters": {
+    "model": "rerank-english-v2.0"
+  },
+  "actions": [
+    {
+      "action_type": "predict",
+      "method": "POST",
+      "url": "https://api.cohere.ai/v1/rerank",
+      "headers": {
+        "Authorization": "Bearer ${credential.cohere_key}"
+      },
+      "request_body": "{ \"documents\": ${parameters.documents}, \"query\": \"${parameters.query}\", \"model\": \"${parameters.model}\", \"top_n\": ${parameters.top_n} }",
+      "pre_process_function": "connector.pre_process.cohere.rerank",
+      "post_process_function": "connector.post_process.cohere.rerank"
+    }
+  ]
+}
+```
+{% include copy-curl.html %}
+
+Use the connector ID from the response to register a Cohere Rerank model:
+
+```json
+POST /_plugins/_ml/models/_register?deploy=true
+{
+  "name": "cohere rerank model",
+  "function_name": "remote",
+  "description": "test rerank model",
+  "connector_id": "your_connector_id"
+}
+```
+{% include copy-curl.html %}
+
+Note the model ID in the response; you'll use it in the following steps.
+
+Test the model by calling the Predict API:
+
+```json
+POST _plugins/_ml/models/your_model_id/_predict
+{
+  "parameters": {
+    "query": "What is the capital of the United States?",
+    "documents": [
+      "Carson City is the capital city of the American state of Nevada.",
+      "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan.",
+      "Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. 
It is a federal district.",
+      "Capital punishment (the death penalty) has existed in the United States since before the United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states."
+    ],
+    "top_n": 4
+  }
+}
+```
+{% include copy-curl.html %}
+
+To ensure compatibility with the rerank pipeline, the `top_n` value must be the same as the length of the `documents` list.
+{: .important}
+
+You can customize the number of top documents returned in the response by providing the `size` parameter. For more information, see [Step 2.3](#step-23-test-the-reranking).
+
+OpenSearch responds with the inference results:
+
+```json
+{
+  "inference_results": [
+    {
+      "output": [
+        {
+          "name": "similarity",
+          "data_type": "FLOAT32",
+          "shape": [
+            1
+          ],
+          "data": [
+            0.10194652
+          ]
+        },
+        {
+          "name": "similarity",
+          "data_type": "FLOAT32",
+          "shape": [
+            1
+          ],
+          "data": [
+            0.0721122
+          ]
+        },
+        {
+          "name": "similarity",
+          "data_type": "FLOAT32",
+          "shape": [
+            1
+          ],
+          "data": [
+            0.98005307
+          ]
+        },
+        {
+          "name": "similarity",
+          "data_type": "FLOAT32",
+          "shape": [
+            1
+          ],
+          "data": [
+            0.27904198
+          ]
+        }
+      ],
+      "status_code": 200
+    }
+  ]
+}
+```
+
+The response contains four `similarity` objects. For each `similarity` object, the `data` array contains a relevance score for each document with respect to the query. The `similarity` objects are provided in the order of the input documents; the first object pertains to the first document. This differs from the default output of the Cohere Rerank model, which orders documents by relevance score. The document order is changed in the `connector.post_process.cohere.rerank` post-processing function in order to make the output compatible with a reranking pipeline.
+
+## Step 2: Configure a reranking pipeline
+
+Follow these steps to configure a reranking pipeline. 
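Before moving on, note the reordering performed by the post-processing function described above: Cohere returns results sorted by relevance and tagged with each input document's index, and the function restores the original document order. That logic can be sketched as follows (`reorder_by_input_index` is an illustrative helper, not the actual connector code):

```python
def reorder_by_input_index(cohere_results):
    """Cohere returns results sorted by relevance, each tagged with the input
    document's index; restore input order so the pipeline can match scores
    to documents (a sketch of what the post-processing function does)."""
    ordered = sorted(cohere_results, key=lambda result: result["index"])
    return [result["relevance_score"] for result in ordered]

# Relevance-sorted results, using the scores from the example response above
results = [
    {"index": 2, "relevance_score": 0.98005307},
    {"index": 3, "relevance_score": 0.27904198},
    {"index": 0, "relevance_score": 0.10194652},
    {"index": 1, "relevance_score": 0.0721122},
]
scores_in_input_order = reorder_by_input_index(results)
```

After reordering, the first score belongs to the first input document (Carson City), matching the order of the `similarity` objects in the example response.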
+
+### Step 2.1: Ingest test data
+
+Send a bulk request to ingest test data:
+
+```json
+POST _bulk
+{ "index": { "_index": "my-test-data" } }
+{ "passage_text" : "Carson City is the capital city of the American state of Nevada." }
+{ "index": { "_index": "my-test-data" } }
+{ "passage_text" : "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan." }
+{ "index": { "_index": "my-test-data" } }
+{ "passage_text" : "Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district." }
+{ "index": { "_index": "my-test-data" } }
+{ "passage_text" : "Capital punishment (the death penalty) has existed in the United States since before the United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states." }
+```
+{% include copy-curl.html %}
+
+### Step 2.2: Create a reranking pipeline
+
+Create a reranking pipeline with the Cohere Rerank model:
+
+```json
+PUT /_search/pipeline/rerank_pipeline_cohere
+{
+  "description": "Pipeline for reranking with Cohere Rerank model",
+  "response_processors": [
+    {
+      "rerank": {
+        "ml_opensearch": {
+          "model_id": "your_model_id_created_in_step1"
+        },
+        "context": {
+          "document_fields": ["passage_text"]
+        }
+      }
+    }
+  ]
+}
+```
+{% include copy-curl.html %}
+
+### Step 2.3: Test the reranking
+
+You can limit the number of returned results by specifying the `size` parameter; for example, set `"size": 2` to return only the top two documents. The following request uses `"size": 4` in order to return all of the documents along with their reranked scores:
+
+```json
+GET my-test-data/_search?search_pipeline=rerank_pipeline_cohere
+{
+  "query": {
+    "match_all": {}
+  },
+  "size": 4,
+  "ext": {
+    "rerank": {
+      "query_context": {
+        "query_text": "What is the capital of the United States?"
+      }
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+The response contains all four documents, reordered by relevance:
+
+```json
+{
+  "took": 0,
+  "timed_out": false,
+  "_shards": {
+    "total": 1,
+    "successful": 1,
+    "skipped": 0,
+    "failed": 0
+  },
+  "hits": {
+    "total": {
+      "value": 4,
+      "relation": "eq"
+    },
+    "max_score": 0.98005307,
+    "hits": [
+      {
+        "_index": "my-test-data",
+        "_id": "zbUOw40B8vrNLhb9vBif",
+        "_score": 0.98005307,
+        "_source": {
+          "passage_text": "Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district."
+        }
+      },
+      {
+        "_index": "my-test-data",
+        "_id": "zrUOw40B8vrNLhb9vBif",
+        "_score": 0.27904198,
+        "_source": {
+          "passage_text": "Capital punishment (the death penalty) has existed in the United States since before the United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states."
+        }
+      },
+      {
+        "_index": "my-test-data",
+        "_id": "y7UOw40B8vrNLhb9vBif",
+        "_score": 0.10194652,
+        "_source": {
+          "passage_text": "Carson City is the capital city of the American state of Nevada."
+        }
+      },
+      {
+        "_index": "my-test-data",
+        "_id": "zLUOw40B8vrNLhb9vBif",
+        "_score": 0.0721122,
+        "_source": {
+          "passage_text": "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan."
+        }
+      }
+    ]
+  },
+  "profile": {
+    "shards": []
+  }
+}
+```
+
+To compare these results to results without reranking, run the search without a reranking pipeline:
+
+```json
+GET my-test-data/_search
+{
+  "query": {
+    "match_all": {}
+  },
+  "ext": {
+    "rerank": {
+      "query_context": {
+        "query_text": "What is the capital of the United States?"
+      }
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+The first document in the response pertains to Carson City, which is not the capital of the United States:
+
+```json
+{
+  "took": 0,
+  "timed_out": false,
+  "_shards": {
+    "total": 1,
+    "successful": 1,
+    "skipped": 0,
+    "failed": 0
+  },
+  "hits": {
+    "total": {
+      "value": 4,
+      "relation": "eq"
+    },
+    "max_score": 1,
+    "hits": [
+      {
+        "_index": "my-test-data",
+        "_id": "y7UOw40B8vrNLhb9vBif",
+        "_score": 1,
+        "_source": {
+          "passage_text": "Carson City is the capital city of the American state of Nevada."
+        }
+      },
+      {
+        "_index": "my-test-data",
+        "_id": "zLUOw40B8vrNLhb9vBif",
+        "_score": 1,
+        "_source": {
+          "passage_text": "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan."
+        }
+      },
+      {
+        "_index": "my-test-data",
+        "_id": "zbUOw40B8vrNLhb9vBif",
+        "_score": 1,
+        "_source": {
+          "passage_text": "Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district."
+        }
+      },
+      {
+        "_index": "my-test-data",
+        "_id": "zrUOw40B8vrNLhb9vBif",
+        "_score": 1,
+        "_source": {
+          "passage_text": "Capital punishment (the death penalty) has existed in the United States since before the United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states."
+        }
+      }
+    ]
+  }
+}
+```
\ No newline at end of file
diff --git a/_ml-commons-plugin/tutorials/semantic-search-byte-vectors.md b/_ml-commons-plugin/tutorials/semantic-search-byte-vectors.md
new file mode 100644
index 0000000000..7061d3cb5a
--- /dev/null
+++ b/_ml-commons-plugin/tutorials/semantic-search-byte-vectors.md
@@ -0,0 +1,314 @@
+---
+layout: default
+title: Semantic search using byte vectors
+parent: Tutorials
+nav_order: 10
+---
+
+# Semantic search using byte-quantized vectors
+
+This tutorial illustrates how to build semantic search using the [Cohere Embed model](https://docs.cohere.com/reference/embed) and byte-quantized vectors. For more information about using byte-quantized vectors, see [Lucene byte vector]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector/#lucene-byte-vector).
+
+The Cohere Embed v3 model supports several `embedding_types`. For this tutorial, you'll use the `int8` embedding type to encode byte-quantized vectors.
+
+The Cohere Embed v3 model supports several input types. This tutorial uses the following input types:
+
+- `search_document`: Use this input type when you have text (in the form of documents) that you want to store in a vector database.
+- `search_query`: Use this input type when structuring search queries to find the most relevant documents in your vector database.
+
+For more information about input types, see the [Cohere documentation](https://docs.cohere.com/docs/embed-api#the-input_type-parameter).
+
+In this tutorial, you will create two models:
+
+- A model used for ingestion, whose `input_type` is `search_document`
+- A model used for search, whose `input_type` is `search_query`
+
+Replace the placeholders beginning with the prefix `your_` with your own values.
+{: .note} + +## Step 1: Create an embedding model for ingestion + +Create a connector for the Cohere model, specifying the `search_document` input type: + +```json +POST /_plugins/_ml/connectors/_create +{ + "name": "Cohere embedding connector with int8 embedding type for ingestion", + "description": "Test connector for Cohere embedding model", + "version": 1, + "protocol": "http", + "credential": { + "cohere_key": "your_cohere_api_key" + }, + "parameters": { + "model": "embed-english-v3.0", + "embedding_types": ["int8"], + "input_type": "search_document" + }, + "actions": [ + { + "action_type": "predict", + "method": "POST", + "headers": { + "Authorization": "Bearer ${credential.cohere_key}", + "Request-Source": "unspecified:opensearch" + }, + "url": "https://api.cohere.ai/v1/embed", + "request_body": "{ \"model\": \"${parameters.model}\", \"texts\": ${parameters.texts}, \"input_type\":\"${parameters.input_type}\", \"embedding_types\": ${parameters.embedding_types} }", + "pre_process_function": "connector.pre_process.cohere.embedding", + "post_process_function": "\n def name = \"sentence_embedding\";\n def data_type = \"FLOAT32\";\n def result;\n if (params.embeddings.int8 != null) {\n data_type = \"INT8\";\n result = params.embeddings.int8;\n } else if (params.embeddings.uint8 != null) {\n data_type = \"UINT8\";\n result = params.embeddings.uint8;\n } else if (params.embeddings.float != null) {\n data_type = \"FLOAT32\";\n result = params.embeddings.float;\n }\n \n if (result == null) {\n return \"Invalid embedding result\";\n }\n \n def embedding_list = new StringBuilder(\"[\");\n \n for (int m=0; m