Add agent framework/throttling/hidden model/OS assistant and update conversational search documentation #6354
Conversation
Signed-off-by: Fanit Kolchina <[email protected]>
**Introduced 2.12**
{: .label .label-purple }

Retrieves message information for [conversational search]({{site.url}}{{site.baseurl}}/search-plugins/conversational-search/).

Suggested change:
Use this API to retrieve message information for [conversational search]({{site.url}}{{site.baseurl}}/search-plugins/conversational-search/).
**Introduced 2.12**
{: .label .label-purple }

Retrieves a conversational memory for [conversational search]({{site.url}}{{site.baseurl}}/search-plugins/conversational-search/). Use this command to search for memories.

Suggested change:
This API retrieves a conversational memory for [conversational search]({{site.url}}{{site.baseurl}}/search-plugins/conversational-search/). Use this command to search for memories.
- [Get model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/get-model/)
- [Deploy model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/deploy-model/)
- [Undeploy model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/undeploy-model/)
- [Delete model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/delete-model/)
- [Predict]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/train-predict/predict/): Invokes a model

Suggested change:
- [Predict]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/train-predict/predict/) (invokes a model)
| Parameter | Data type | Description |
| :--- | :--- | :--- |
| `deploy` | Boolean | Whether to deploy the model after registering by calling the [Deploy Model API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/deploy-model/). Default is `false`. |

Suggested change:
| `deploy` | Boolean | Whether to deploy the model after registering it. The deploy operation is performed by calling the [Deploy Model API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/deploy-model/). Default is `false`. |
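For context, a register-and-deploy call might look like the following sketch. Here `deploy` is passed as a query parameter, and the model name, description, and connector ID are placeholder values:

```json
POST /_plugins/_ml/models/_register?deploy=true
{
  "name": "my-remote-model",
  "function_name": "remote",
  "description": "An externally hosted model",
  "connector_id": "<connector_id>"
}
```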
Use this command to search for models you've already created.

The response will contain only those model versions to which you have access. For example, if you send a match all query, model versions for the following model group types will be returned:

Suggested change:
The response will contain only those model versions to which you have access. For example, if you send a `match_all` query, model versions for the following model group types will be returned:
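As an illustration, a `match_all` search across all models you can access might be sketched as follows:

```json
POST /_plugins/_ml/models/_search
{
  "query": {
    "match_all": {}
  },
  "size": 100
}
```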
#### Example request: Rate limiting inference calls for a model

The following request limits the number of times you can call the Predict API on the model to four Predict API calls per minute:

Suggested change:
The following request limits the number of times you can call the Predict API on the model to 4 Predict API calls per minute:
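A sketch of such a request, assuming the Update Model API accepts a `rate_limiter` object with `limit` and `unit` fields (verify the exact field names against your OpenSearch version; `<model_id>` is a placeholder):

```json
PUT /_plugins/_ml/models/<model_id>
{
  "rate_limiter": {
    "limit": "4",
    "unit": "MINUTES"
  }
}
```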
The OpenSearch Assistant Toolkit helps you create AI-powered assistants for OpenSearch Dashboards. The toolkit includes the following parts:

- [**Agents and tools**]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/index/): _Agents_ interface with a large language model (LLM) and execute high-level tasks, such as summarization or generating PPL from natural language. The agent's high-level tasks consist of low-level tasks called _tools_, which can be reused by multiple agents.

Suggested change:
- [**Agents and tools**]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/index/): _Agents_ interface with a large language model (LLM) and execute high-level tasks, such as summarization or generating Piped Processing Language (PPL) from natural language. The agent's high-level tasks consist of low-level tasks called _tools_, which can be reused by multiple agents.
- [**Agents and tools**]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/index/): _Agents_ interface with a large language model (LLM) and execute high-level tasks, such as summarization or generating PPL from natural language. The agent's high-level tasks consist of low-level tasks called _tools_, which can be reused by multiple agents.
- [**Workflow automation**]({{site.url}}{{site.baseurl}}/automating-workflows/index/): Uses templates to set up infrastructure for artificial intelligence and machine learning (AI/ML) applications. For example, you can automate configuring agents to be used for chat or generating PPL queries from natural language.
- **OpenSearch Assistant**: The UI for the AI-powered assistant. The assistant's workflow is configured with various agents and tools.

Suggested change:
- **OpenSearch Assistant**: This is the OpenSearch Dashboards UI for the AI-powered assistant. The assistant's workflow is configured with various agents and tools.
To enable OpenSearch Assistant, perform the following steps:

- Enable the agent framework and retrieval-augmented generation by configuring the following settings:

Suggested change:
- Enable the agent framework and retrieval-augmented generation (RAG) by configuring the following settings:
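A minimal sketch of enabling both features through the cluster settings API (setting names shown as used in OpenSearch 2.12; verify them against your version):

```json
PUT /_cluster/settings
{
  "persistent": {
    "plugins.ml_commons.agent_framework_enabled": true,
    "plugins.ml_commons.rag_pipeline_feature_enabled": true
  }
}
```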
:--- | :--- | :---
`llm_question` | Yes | The question the LLM must answer.
`llm_model` | No | Overrides the original model set in the connection in cases where you want to use a different model (for example, GPT 4 instead of GPT 3.5). This option is required if a default model is not set during pipeline creation.
`memory_id` | No | If you provide a `memory_id`, the pipeline retrieves the 10 most recent messages to the LLM prompt. If you don't specify a `memory_id`, the prior context is not added to the LLM prompt.

Suggested change:
`memory_id` | No | If you provide a `memory_id`, the pipeline retrieves the 10 most recent messages in the specified memory and adds them to the LLM prompt. If you don't specify a `memory_id`, the prior context is not added to the LLM prompt.
`llm_model` | No | Overrides the original model set in the connection in cases where you want to use a different model (for example, GPT 4 instead of GPT 3.5). This option is required if a default model is not set during pipeline creation.
`memory_id` | No | If you provide a `memory_id`, the pipeline retrieves the 10 most recent messages to the LLM prompt. If you don't specify a `memory_id`, the prior context is not added to the LLM prompt.
`context_size` | No | The number of search results sent to the LLM. This is typically needed in order to meet the token size limit, which can vary by model. Alternatively, you can use the `size` parameter in the Search API to control the number of search results sent to the LLM.
`message_size` | No | The number of messages sent to the LLM. Similarly to the number of search results, this affects the total number of tokens seen by the LLM. When not set, the pipeline uses the default message size of `10`.

Suggested change:
`message_size` | No | The number of messages sent to the LLM. Similarly to the number of search results, this affects the total number of tokens received by the LLM. When not set, the pipeline uses the default message size of `10`.
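For context, a conversational search query using these parameters might be sketched as follows. The index name, pipeline name, model name, query text, and memory ID are all placeholders:

```json
GET /my_rag_index/_search?search_pipeline=my_rag_pipeline
{
  "query": {
    "match": {
      "text": "What is the population of NYC?"
    }
  },
  "ext": {
    "generative_qa_parameters": {
      "llm_model": "gpt-3.5-turbo",
      "llm_question": "What is the population of NYC?",
      "memory_id": "<memory_id>",
      "context_size": 5,
      "message_size": 5
    }
  }
}
```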
Use the following options when setting up a RAG pipeline under the `retrieval_augmented_generation` argument.
To verify that both messages were added to the memory, provide the `memory_ID` to the Get Message API:

Suggested change:
To verify that both messages were added to the memory, provide the `memory_ID` to the Get Messages API:
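For reference, the verification call might take the following form (endpoint shape assumed from the ML Commons Memory APIs; `<memory_id>` is a placeholder):

```json
GET /_plugins/_ml/memory/<memory_id>/messages
```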
Co-authored-by: Nathan Bower <[email protected]> Signed-off-by: kolchfa-aws <[email protected]>
@kolchfa-aws Done with the additional content. Thanks!
## Parameters
## Register parameters

The following table lists all available parameters when registering the tool.

"The following table lists all parameters that are available when registering the tool."
The following table lists all available parameters.
The following table lists all available parameters when running the tool.

"The following table lists all parameters that are available when running the tool." Please apply this structure to all instances.
`enable_Content_Generation` | Boolean | Optional | If `true`, returns results generated by an LLM. If `false`, returns results directly, without LLM-assisted content generation. Default is `true`. |
Remove the comma after "directly".
`startIndex`| Integer | The paginated index of the alert to start from. Default is 0. |
Should "0" be in code font?
:--- | :--- | :--- | :--- | :---
`name`| String | Required | All | The agent name. |
`type` | String | Required | All | The agent type. Valid values are `flow` and `conversational`. For more information, see [Agents]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/index/). |
`description` | String | Optional| All | The agent description. |

"A description of the agent."
`app_type` | String | Optional | All | Specifies an optional agent category. You can then perform operations on all agents in the category. For example, you can delete all messages for RAG agents.
`memory.type` | String | Optional | `conversational_flow`, `conversational` | Specifies where to store the conversational memory. Currently, the only supported type is `conversation_index` (store the memory in a conversational system index).
`llm.model_id` | String | Required | `conversational` | The model ID of the LLM to which to send questions.
`llm.parameters.response_filter` | String | Required | `conversational` | The pattern for parsing the LLM response. For each LLM, you need to provide the field where the response is located. For example, for the Anthropic Claude model, the response is in the `completion` field so the pattern is `$.completion`. For OpenAI models, the pattern is `$.choices[0].message.content`.

"in which the response is located." "the response is located in the `completion` field, so" (add "located" and a comma after "field").
`memory.type` | String | Optional | `conversational_flow`, `conversational` | Specifies where to store the conversational memory. Currently, the only supported type is `conversation_index` (store the memory in a conversational system index).
`llm.model_id` | String | Required | `conversational` | The model ID of the LLM to which to send questions.
`llm.parameters.response_filter` | String | Required | `conversational` | The pattern for parsing the LLM response. For each LLM, you need to provide the field where the response is located. For example, for the Anthropic Claude model, the response is in the `completion` field so the pattern is `$.completion`. For OpenAI models, the pattern is `$.choices[0].message.content`.
`llm.parameters.max_iteration` | Integer | Optional | `conversational` | The maximum number of messages to send to the LLM. Default is 3.

Should "3" be in code font?
Model-level rate limiting applies to all users of the model. If you specify both a model-level rate limit and a user-level rate limit, the overall rate limit is set to the more restrictive of the two. For example, if the model-level limit is 2 requests per minute and the user-level limit is 4 requests per minute, the overall limit will be set to 2 requests per minute.

To set the rate limit, you must provide two inputs: the maximum number of requests and the time frame. OpenSearch uses these inputs to calculate the rate limit as the maximum number of requests divided by the time frame. For example, if you set the limit to be 4 requests per minute, the rate limit is `4 requests / 1 minute`, which is `1 request / 0.25 minutes`, or `1 request / 15 seconds`. OpenSearch processes predict requests sequentially, in a first-come-first-served manner, and will limit those requests to 1 request per 15 seconds. Imagine two users, Alice and Bob, calling the Predict API for the same model, which has a rate limit of 1 request per 15 seconds. If Alice calls the Predict API and immediately after that Bob calls the Predict API, OpenSearch processes Alice's predict request first and stores Bob's request in a queue. Once 15 seconds has passed since Alice's request, OpenSearch processes Bob's request.
Hi Fanit, sorry for the confusion, but rate limiter cannot store the request. Instead, it will refuse the request directly. So Bob can only make his request after 15 seconds in this scenario.
Thanks for the clarification; I've updated the docs.
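To make the corrected behavior concrete, here is a minimal, hypothetical sketch (not ML Commons code) of a limiter that refuses, rather than queues, a request arriving before the per-request interval has elapsed:

```python
class PredictRateLimiter:
    """Sketch of the corrected semantics: a request that arrives before the
    per-request interval has elapsed is refused outright, not queued."""

    def __init__(self, limit: int, unit_seconds: float):
        # A limit of 4 requests per 60 seconds yields 1 request per 15 seconds.
        self.interval = unit_seconds / limit
        self.last_accepted = None

    def try_acquire(self, now: float) -> bool:
        """Return True if the request at time `now` is accepted, False if refused."""
        if self.last_accepted is None or now - self.last_accepted >= self.interval:
            self.last_accepted = now
            return True
        return False  # Refused: the caller must retry after the interval elapses.


limiter = PredictRateLimiter(limit=4, unit_seconds=60)
print(limiter.try_acquire(0.0))   # Alice's request: accepted (True)
print(limiter.try_acquire(1.0))   # Bob's request 1 s later: refused (False)
print(limiter.try_acquire(15.0))  # Bob retries after 15 s: accepted (True)
```

In this sketch, Bob's second attempt succeeds only because he waits out the 15-second interval himself, matching the clarified documentation.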
LGTM
…onversational search documentation (opensearch-project#6354)

* Add agent framework documentation
* Add hidden model and API updates
* Vale error
* Updated field names
* Add updating credentials
* Added tools table
* Add OpenSearch forum thread for OS Assistant
* Add tech review for conv search
* Fix links
* Add tools
* Add links to tools
* More info about tools
* Tool parameters
* Update cat-index-tool.md
* Parameter clarification
* Tech review feedback
* Typo fix
* More tech review feedback: RAG tool
* Tech review feedback: memory APIs
* Update _ml-commons-plugin/agents-tools/index.md
* Update _ml-commons-plugin/agents-tools/tools/neural-sparse-tool.md
* Update _ml-commons-plugin/opensearch-assistant.md
* Update _ml-commons-plugin/agents-tools/tools/ppl-tool.md
* Apply suggestions from code review
* Separated search and get APIs and add conversational flow agent
* More parameters for PPL tool
* Added more parameters
* Tech review feedback: PPL tool
* Rename to automating configurations
* Editorial comments on the new text
* Add parameter to PPL tool
* Changed link to configurations
* Rate limiter feedback and added warning

Signed-off-by: Fanit Kolchina <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>
Co-authored-by: Melissa Vagi <[email protected]>
Co-authored-by: Nathan Bower <[email protected]>
Closes #5379
Closes #5839
Closes #6268
Closes #6292
Agent Framework and OS Assistant Toolkit are experimental.
Checklist
For more information on following Developer Certificate of Origin and signing off your commits, please check here.