Add agent framework/throttling/hidden model/OS assistant and update conversational search documentation #6354
Conversation
Signed-off-by: Fanit Kolchina <[email protected]>
**Introduced 2.12**
{: .label .label-purple }

Retrieves message information for [conversational search]({{site.url}}{{site.baseurl}}/search-plugins/conversational-search/).

Suggested change:
Use this API to retrieve message information for [conversational search]({{site.url}}{{site.baseurl}}/search-plugins/conversational-search/).
**Introduced 2.12**
{: .label .label-purple }

Retrieves a conversational memory for [conversational search]({{site.url}}{{site.baseurl}}/search-plugins/conversational-search/). Use this command to search for memories.

Suggested change:
This API retrieves a conversational memory for [conversational search]({{site.url}}{{site.baseurl}}/search-plugins/conversational-search/). Use this command to search for memories.
- [Get model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/get-model/)
- [Deploy model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/deploy-model/)
- [Undeploy model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/undeploy-model/)
- [Delete model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/delete-model/)
- [Predict]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/train-predict/predict/): Invokes a model

Suggested change:
- [Predict]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/train-predict/predict/) (invokes a model)
| Parameter | Data type | Description |
| :--- | :--- | :--- |
| `deploy` | Boolean | Whether to deploy the model after registering by calling the [Deploy Model API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/deploy-model/). Default is `false`. |

Suggested change:
| `deploy` | Boolean | Whether to deploy the model after registering it. The deploy operation is performed by calling the [Deploy Model API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/deploy-model/). Default is `false`. |
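For context, a register-and-deploy call might look like the following sketch. Here `deploy` is passed as a query parameter, and the model name, description, and connector ID are placeholder values:

```json
POST /_plugins/_ml/models/_register?deploy=true
{
  "name": "my-remote-model",
  "function_name": "remote",
  "description": "An externally hosted model",
  "connector_id": "<connector_id>"
}
```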
Use this command to search for models you've already created.

The response will contain only those model versions to which you have access. For example, if you send a match all query, model versions for the following model group types will be returned:

Suggested change:
The response will contain only those model versions to which you have access. For example, if you send a `match_all` query, model versions for the following model group types will be returned:
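As an illustration, a `match_all` search across all models you can access might be sketched as follows:

```json
POST /_plugins/_ml/models/_search
{
  "query": {
    "match_all": {}
  },
  "size": 100
}
```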
#### Example request: Rate limiting inference calls for a model

The following request limits the number of times you can call the Predict API on the model to four Predict API calls per minute:

Suggested change:
The following request limits the number of times you can call the Predict API on the model to 4 Predict API calls per minute:
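A sketch of such a request, assuming the Update Model API accepts a `rate_limiter` object with `limit` and `unit` fields (verify the exact field names against your OpenSearch version; `<model_id>` is a placeholder):

```json
PUT /_plugins/_ml/models/<model_id>
{
  "rate_limiter": {
    "limit": "4",
    "unit": "MINUTES"
  }
}
```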
The OpenSearch Assistant Toolkit helps you create AI-powered assistants for OpenSearch Dashboards. The toolkit includes the following parts:

- [**Agents and tools**]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/index/): _Agents_ interface with a large language model (LLM) and execute high-level tasks, such as summarization or generating PPL from natural language. The agent's high-level tasks consist of low-level tasks called _tools_, which can be reused by multiple agents.

Suggested change:
- [**Agents and tools**]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/index/): _Agents_ interface with a large language model (LLM) and execute high-level tasks, such as summarization or generating Piped Processing Language (PPL) from natural language. The agent's high-level tasks consist of low-level tasks called _tools_, which can be reused by multiple agents.
- [**Agents and tools**]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/index/): _Agents_ interface with a large language model (LLM) and execute high-level tasks, such as summarization or generating PPL from natural language. The agent's high-level tasks consist of low-level tasks called _tools_, which can be reused by multiple agents.
- [**Workflow automation**]({{site.url}}{{site.baseurl}}/automating-workflows/index/): Uses templates to set up infrastructure for artificial intelligence and machine learning (AI/ML) applications. For example, you can automate configuring agents to be used for chat or generating PPL queries from natural language.
- **OpenSearch Assistant**: The UI for the AI-powered assistant. The assistant's workflow is configured with various agents and tools.

Suggested change:
- **OpenSearch Assistant**: This is the OpenSearch Dashboards UI for the AI-powered assistant. The assistant's workflow is configured with various agents and tools.
To enable OpenSearch Assistant, perform the following steps:

- Enable the agent framework and retrieval-augmented generation by configuring the following settings:

Suggested change:
- Enable the agent framework and retrieval-augmented generation (RAG) by configuring the following settings:
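A minimal sketch of enabling both features through the cluster settings API (setting names shown as used in OpenSearch 2.12; verify them against your version):

```json
PUT /_cluster/settings
{
  "persistent": {
    "plugins.ml_commons.agent_framework_enabled": true,
    "plugins.ml_commons.rag_pipeline_feature_enabled": true
  }
}
```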
:--- | :--- | :---
`llm_question` | Yes | The question the LLM must answer.
`llm_model` | No | Overrides the original model set in the connection in cases where you want to use a different model (for example, GPT 4 instead of GPT 3.5). This option is required if a default model is not set during pipeline creation.
`memory_id` | No | If you provide a `memory_id`, the pipeline retrieves the 10 most recent messages to the LLM prompt. If you don't specify a `memory_id`, the prior context is not added to the LLM prompt.

Suggested change:
`memory_id` | No | If you provide a `memory_id`, the pipeline retrieves the 10 most recent messages in the specified memory and adds them to the LLM prompt. If you don't specify a `memory_id`, the prior context is not added to the LLM prompt.
`llm_model` | No | Overrides the original model set in the connection in cases where you want to use a different model (for example, GPT 4 instead of GPT 3.5). This option is required if a default model is not set during pipeline creation.
`memory_id` | No | If you provide a `memory_id`, the pipeline retrieves the 10 most recent messages to the LLM prompt. If you don't specify a `memory_id`, the prior context is not added to the LLM prompt.
`context_size` | No | The number of search results sent to the LLM. This is typically needed in order to meet the token size limit, which can vary by model. Alternatively, you can use the `size` parameter in the Search API to control the number of search results sent to the LLM.
`message_size` | No | The number of messages sent to the LLM. Similarly to the number of search results, this affects the total number of tokens seen by the LLM. When not set, the pipeline uses the default message size of `10`.

Suggested change:
`message_size` | No | The number of messages sent to the LLM. Similarly to the number of search results, this affects the total number of tokens received by the LLM. When not set, the pipeline uses the default message size of `10`.
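For context, a conversational search query using these parameters might be sketched as follows. The index name, pipeline name, model name, query text, and memory ID are all placeholders:

```json
GET /my_rag_index/_search?search_pipeline=my_rag_pipeline
{
  "query": {
    "match": {
      "text": "What is the population of NYC?"
    }
  },
  "ext": {
    "generative_qa_parameters": {
      "llm_model": "gpt-3.5-turbo",
      "llm_question": "What is the population of NYC?",
      "memory_id": "<memory_id>",
      "context_size": 5,
      "message_size": 5
    }
  }
}
```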
Use the following options when setting up a RAG pipeline under the `retrieval_augmented_generation` argument.
To verify that both messages were added to the memory, provide the `memory_ID` to the Get Message API:

Suggested change:
To verify that both messages were added to the memory, provide the `memory_ID` to the Get Messages API:
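For reference, the verification call might take the following form (endpoint shape assumed from the ML Commons Memory APIs; `<memory_id>` is a placeholder):

```json
GET /_plugins/_ml/memory/<memory_id>/messages
```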
Co-authored-by: Nathan Bower <[email protected]> Signed-off-by: kolchfa-aws <[email protected]>
@kolchfa-aws Done with the additional content. Thanks!
## Parameters
## Register parameters

The following table lists all available parameters when registering the tool.

"The following table lists all parameters that are available when registering the tool."
The following table lists all available parameters.
The following table lists all available parameters when running the tool.

"The following table lists all parameters that are available when running the tool." Please apply this structure to all instances.
`enable_Content_Generation` | Boolean | Optional | If `true`, returns results generated by an LLM. If `false`, returns results directly, without LLM-assisted content generation. Default is `true`. |
Remove the comma after "directly".
`startIndex`| Integer | The paginated index of the alert to start from. Default is 0. |
Should "0" be in code font?
:--- | :--- | :--- | :--- | :---
`name`| String | Required | All | The agent name. |
`type` | String | Required | All | The agent type. Valid values are `flow` and `conversational`. For more information, see [Agents]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/index/). |
`description` | String | Optional| All | The agent description. |

"A description of the agent."
`app_type` | String | Optional | All | Specifies an optional agent category. You can then perform operations on all agents in the category. For example, you can delete all messages for RAG agents.
`memory.type` | String | Optional | `conversational_flow`, `conversational` | Specifies where to store the conversational memory. Currently, the only supported type is `conversation_index` (store the memory in a conversational system index).
`llm.model_id` | String | Required | `conversational` | The model ID of the LLM to which to send questions.
`llm.parameters.response_filter` | String | Required | `conversational` | The pattern for parsing the LLM response. For each LLM, you need to provide the field where the response is located. For example, for the Anthropic Claude model, the response is in the `completion` field so the pattern is `$.completion`. For OpenAI models, the pattern is `$.choices[0].message.content`.

"in which the response is located." "the response is located in the `completion` field, so" (add "located" and a comma after "field").
`memory.type` | String | Optional | `conversational_flow`, `conversational` | Specifies where to store the conversational memory. Currently, the only supported type is `conversation_index` (store the memory in a conversational system index).
`llm.model_id` | String | Required | `conversational` | The model ID of the LLM to which to send questions.
`llm.parameters.response_filter` | String | Required | `conversational` | The pattern for parsing the LLM response. For each LLM, you need to provide the field where the response is located. For example, for the Anthropic Claude model, the response is in the `completion` field so the pattern is `$.completion`. For OpenAI models, the pattern is `$.choices[0].message.content`.
`llm.parameters.max_iteration` | Integer | Optional | `conversational` | The maximum number of messages to send to the LLM. Default is 3.

Should "3" be in code font?
Model-level rate limiting applies to all users of the model. If you specify both a model-level rate limit and a user-level rate limit, the overall rate limit is set to the more restrictive of the two. For example, if the model-level limit is 2 requests per minute and the user-level limit is 4 requests per minute, the overall limit will be set to 2 requests per minute.

To set the rate limit, you must provide two inputs: the maximum number of requests and the time frame. OpenSearch uses these inputs to calculate the rate limit as the maximum number of requests divided by the time frame. For example, if you set the limit to be 4 requests per minute, the rate limit is `4 requests / 1 minute`, which is `1 request / 0.25 minutes`, or `1 request / 15 seconds`. OpenSearch processes predict requests sequentially, in a first-come-first-served manner, and will limit those requests to 1 request per 15 seconds. Imagine two users, Alice and Bob, calling the Predict API for the same model, which has a rate limit of 1 request per 15 seconds. If Alice calls the Predict API and immediately after that Bob calls the Predict API, OpenSearch processes Alice's predict request first and stores Bob's request in a queue. Once 15 seconds has passed since Alice's request, OpenSearch processes Bob's request.
Hi Fanit, sorry for the confusion, but rate limiter cannot store the request. Instead, it will refuse the request directly. So Bob can only make his request after 15 seconds in this scenario.
Thanks for the clarification; I've updated the docs.
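To make the corrected behavior concrete, here is a minimal, hypothetical sketch (not ML Commons code) of a limiter that refuses, rather than queues, a request arriving before the per-request interval has elapsed:

```python
class PredictRateLimiter:
    """Sketch of the corrected semantics: a request that arrives before the
    per-request interval has elapsed is refused outright, not queued."""

    def __init__(self, limit: int, unit_seconds: float):
        # A limit of 4 requests per 60 seconds yields 1 request per 15 seconds.
        self.interval = unit_seconds / limit
        self.last_accepted = None

    def try_acquire(self, now: float) -> bool:
        """Return True if the request at time `now` is accepted, False if refused."""
        if self.last_accepted is None or now - self.last_accepted >= self.interval:
            self.last_accepted = now
            return True
        return False  # Refused: the caller must retry after the interval elapses.


limiter = PredictRateLimiter(limit=4, unit_seconds=60)
print(limiter.try_acquire(0.0))   # Alice's request: accepted (True)
print(limiter.try_acquire(1.0))   # Bob's request 1 s later: refused (False)
print(limiter.try_acquire(15.0))  # Bob retries after 15 s: accepted (True)
```

In this sketch, Bob's second attempt succeeds only because he waits out the 15-second interval himself, matching the clarified documentation.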
LGTM
…onversational search documentation (opensearch-project#6354)

* Add agent framework documentation
* Add hidden model and API updates
* Vale error
* Updated field names
* Add updating credentials
* Added tools table
* Add OpenSearch forum thread for OS Assistant
* Add tech review for conv search
* Fix links
* Add tools
* Add links to tools
* More info about tools
* Tool parameters
* Update cat-index-tool.md
* Parameter clarification
* Tech review feedback
* Typo fix
* More tech review feedback: RAG tool
* Tech review feedback: memory APIs
* Update _ml-commons-plugin/agents-tools/index.md
* Update _ml-commons-plugin/agents-tools/tools/neural-sparse-tool.md
* Update _ml-commons-plugin/opensearch-assistant.md
* Update _ml-commons-plugin/agents-tools/tools/ppl-tool.md
* Apply suggestions from code review
* Separated search and get APIs and add conversational flow agent
* More parameters for PPL tool
* Added more parameters
* Tech review feedback: PPL tool
* Rename to automating configurations
* Editorial comments on the new text
* Add parameter to PPL tool
* Changed link to configurations
* Rate limiter feedback and added warning

Signed-off-by: Fanit Kolchina <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>
Co-authored-by: Melissa Vagi <[email protected]>
Co-authored-by: Nathan Bower <[email protected]>
Closes #5379
Closes #5839
Closes #6268
Closes #6292
Agent Framework and OS Assistant Toolkit are experimental.
Checklist
For more information on following Developer Certificate of Origin and signing off your commits, please check here.