Merge branch 'main' into metrics-logs
vagimeli authored May 13, 2024 · 2 parents 44500d8 + a606405 · commit c761ef5
Showing 17 changed files with 837 additions and 21 deletions.
_data-prepper/index.md (2 changes: 1 addition & 1 deletion)

@@ -16,7 +16,7 @@ redirect_from:

Data Prepper is a server-side data collector capable of filtering, enriching, transforming, normalizing, and aggregating data for downstream analysis and visualization. Data Prepper is the preferred data ingestion tool for OpenSearch. It is recommended for most data ingestion use cases in OpenSearch and for processing large, complex datasets.

- With Data Prepper you can build custom pipelines to improve the operational view of applications. Two common use cases for Data Prepper are trace analytics and log analytics. [Trace analytics]({{site.url}}{{site.baseurl}}/observability-plugin/trace/index/) can help you visualize event flows and identify performance problems. [Log analytics]({{site.url}}{{site.baseurl}}/observability-plugin/log-analytics/) equips you with tools to enhance your search capabilities, conduct comprehensive analysis, and gain insights into your applications' performance and behavior.
+ With Data Prepper you can build custom pipelines to improve the operational view of applications. Two common use cases for Data Prepper are trace analytics and log analytics. [Trace analytics]({{site.url}}{{site.baseurl}}/data-prepper/common-use-cases/trace-analytics/) can help you visualize event flows and identify performance problems. [Log analytics]({{site.url}}{{site.baseurl}}/data-prepper/common-use-cases/log-analytics/) equips you with tools to enhance your search capabilities, conduct comprehensive analysis, and gain insights into your applications' performance and behavior.

## Concepts

_im-plugin/index-codecs.md (2 changes: 1 addition & 1 deletion)

@@ -50,7 +50,7 @@ The `index.codec.qatmode` setting controls the behavior of the hardware accelerator

For information about the `index.codec.qatmode` setting's effects on snapshots, see the [Snapshots](#snapshots) section.

- For more information about hardware acceleration on Intel, see the [Intel (R) QAT accelerator overview](https://www.intel.com/content/www/us/en/architecture-and-technology/intel-quick-assist-technology-overview.html).
+ For more information about hardware acceleration on Intel, see the [Intel (R) QAT accelerator overview](https://www.intel.com/content/www/us/en/developer/topic-technology/open/quick-assist-technology/overview.html).
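
For illustration, the following sketch creates an index that uses a QAT-backed codec. The codec value `qat_lz4` and the mode `auto` are assumptions based on recent OpenSearch releases, not values taken from this page; check the codec names documented for your version:

```json
PUT /my-index1
{
  "settings": {
    "index.codec": "qat_lz4",
    "index.codec.qatmode": "auto"
  }
}
```
{% include copy-curl.html %}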

## Choosing a codec

@@ -100,6 +100,8 @@ OpenSearch supports the following cluster-level routing and shard allocation settings:
- `REPLICA_FIRST` – Replica shards are relocated first, before primary shards. This prioritization may help prevent a cluster's health status from going red when carrying out shard relocation in a mixed-version, segment-replication-enabled OpenSearch cluster. In this situation, primary shards relocated to OpenSearch nodes of a newer version could try to copy segment files to replica shards on an older version of OpenSearch, which would result in shard failure. Relocating replica shards first may help to avoid this in multi-version clusters.
- `NO_PREFERENCE` – The default behavior, in which the order of shard relocation has no importance. A sample request that sets this strategy follows this list.
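
As an illustrative sketch, the following request opts in to replica-first movement. It assumes that these values belong to the dynamic `cluster.routing.allocation.shard_movement_strategy` setting, as in recent OpenSearch versions; verify the setting name against your version's documentation:

```json
PUT /_cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.shard_movement_strategy": "REPLICA_FIRST"
  }
}
```
{% include copy-curl.html %}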

- `cluster.allocator.gateway.batch_size` (Integer): Limits the number of shards per batch when fetching unassigned shard metadata from data nodes. Default is `2000`.
- `cluster.allocator.existing_shards_allocator.batch_enabled` (Boolean): Enables batch allocation of unassigned shards that already exist on disk, as opposed to allocating one shard at a time. This reduces memory and transport overhead by fetching any unassigned shard metadata in a single batch call. Default is `false`. A request that tunes both batch options is shown after this list.
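
For example, the following sketch adjusts both batch allocation options in one cluster settings request (the values shown are illustrative, not recommendations):

```json
PUT /_cluster/settings
{
  "persistent": {
    "cluster.allocator.gateway.batch_size": 4000,
    "cluster.allocator.existing_shards_allocator.batch_enabled": true
  }
}
```
{% include copy-curl.html %}
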
## Cluster-level shard, block, and task settings

OpenSearch supports the following cluster-level shard, block, and task settings:
@@ -168,6 +168,8 @@ OpenSearch supports the following dynamic index-level index settings:

- `index.search.idle.after` (Time unit): The amount of time a shard should wait for a search or get request until it goes idle. Default is `30s`.

- `index.search.default_pipeline` (String): The name of the search pipeline used when no pipeline is explicitly specified in a search request against the index. If a default pipeline is set but the pipeline does not exist, then requests against the index fail. Use the pipeline name `_none` to specify no default search pipeline. For more information, see [Default search pipeline]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/using-search-pipeline/#default-search-pipeline). A sample settings update follows this list.

- `index.refresh_interval` (Time unit): How often the index should refresh, which publishes its most recent changes and makes them available for searching. Can be set to `-1` to disable refreshing. Default is `1s`.

- `index.max_result_window` (Integer): The maximum value of `from` + `size` for searches of the index. `from` is the starting index to search from, and `size` is the number of results to return. Default is 10000.
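
As an illustration of updating these dynamic settings on an existing index, the following sketch sets a default search pipeline and a slower refresh interval. The index name `my-index` and pipeline name `my_pipeline` are placeholders:

```json
PUT /my-index/_settings
{
  "index.search.default_pipeline": "my_pipeline",
  "index.refresh_interval": "30s"
}
```
{% include copy-curl.html %}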
_install-and-configure/install-opensearch/index.md (3 changes: 1 addition & 2 deletions)

@@ -60,8 +60,7 @@ Port number | OpenSearch component
443 | OpenSearch Dashboards in AWS OpenSearch Service with encryption in transit (TLS)
5601 | OpenSearch Dashboards
9200 | OpenSearch REST API
- 9250 | Cross-cluster search
- 9300 | Node communication and transport
+ 9300 | Node communication and transport (internal), cross-cluster search
9600 | Performance Analyzer

## Important settings
_ml-commons-plugin/api/model-apis/register-model.md (112 changes: 101 additions & 11 deletions)

@@ -56,8 +56,7 @@ Field | Data type | Required/Optional | Description
`version` | String | Required | The model version. |
`model_format` | String | Required | The portable format of the model file. Valid values are `TORCH_SCRIPT` and `ONNX`. |
`description` | String | Optional| The model description. |
- `model_group_id` | String | Optional | The model group ID of the model group to register this model to.
- `is_enabled`| Boolean | Specifies whether the model is enabled. Disabling the model makes it unavailable for Predict API requests, regardless of the model's deployment status. Default is `true`.
+ `model_group_id` | String | Optional | The ID of the model group to which to register the model.

#### Example request: OpenSearch-provided text embedding model

@@ -89,8 +88,7 @@ Field | Data type | Required/Optional | Description
`model_content_hash_value` | String | Required | The model content hash generated using the SHA-256 hashing algorithm.
`url` | String | Required | The URL that contains the model. |
`description` | String | Optional| The model description. |
- `model_group_id` | String | Optional | The model group ID of the model group to register this model to.
- `is_enabled`| Boolean | Specifies whether the model is enabled. Disabling the model makes it unavailable for Predict API requests, regardless of the model's deployment status. Default is `true`.
+ `model_group_id` | String | Optional | The ID of the model group to which to register this model.

#### Example request: OpenSearch-provided sparse encoding model

@@ -124,7 +122,9 @@ Field | Data type | Required/Optional | Description
`url` | String | Required | The URL that contains the model. |
`description` | String | Optional| The model description. |
`model_group_id` | String | Optional | The model group ID of the model group to register this model to.
- `is_enabled`| Boolean | Specifies whether the model is enabled. Disabling the model makes it unavailable for Predict API requests, regardless of the model's deployment status. Default is `true`.
+ `is_enabled`| Boolean | Optional | Specifies whether the model is enabled. Disabling the model makes it unavailable for Predict API requests, regardless of the model's deployment status. Default is `true`.
+ `rate_limiter` | Object | Optional | Limits the number of times that any user can call the Predict API on the model. For more information, see [Rate limiting inference calls]({{site.url}}{{site.baseurl}}/ml-commons-plugin/integrating-ml-models/#rate-limiting-inference-calls).
+ `interface`| Object | Optional | The interface for the model. For more information, see [Interface](#the-interface-parameter).|

#### The `model_config` object

@@ -182,8 +182,10 @@ Field | Data type | Required/Optional | Description
`connector` | Object | Required | Contains specifications for a connector for a model hosted on a third-party platform. For more information, see [Creating a connector for a specific model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/connectors/#creating-a-connector-for-a-specific-model). You must provide either `connector_id` or `connector`.
`description` | String | Optional| The model description. |
`model_group_id` | String | Optional | The model group ID of the model group to register this model to.
- `is_enabled`| Boolean | Specifies whether the model is enabled. Disabling the model makes it unavailable for Predict API requests, regardless of the model's deployment status. Default is `true`.
+ `is_enabled`| Boolean | Optional | Specifies whether the model is enabled. Disabling the model makes it unavailable for Predict API requests, regardless of the model's deployment status. Default is `true`.
+ `rate_limiter` | Object | Optional | Limits the number of times that any user can call the Predict API on the model. For more information, see [Rate limiting inference calls]({{site.url}}{{site.baseurl}}/ml-commons-plugin/integrating-ml-models/#rate-limiting-inference-calls). For a sample request, see the example following this table.
`guardrails`| Object | Optional | The guardrails for the model input. For more information, see [Guardrails](#the-guardrails-parameter).|
+ `interface`| Object | Optional | The interface for the model. For more information, see [Interface](#the-interface-parameter).|
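
For example, the following sketch registers an externally hosted model and limits it to 4 Predict API calls per minute using the `rate_limiter` object described above (the name, description, and connector ID are placeholders):

```json
POST /_plugins/_ml/models/_register
{
  "name": "rate-limited-remote-model",
  "function_name": "remote",
  "description": "test model with a rate limit",
  "connector_id": "a1eMb4kBJ1eYAeTMAljY",
  "rate_limiter": {
    "limit": 4,
    "unit": "MINUTES"
  }
}
```
{% include copy-curl.html %}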

#### Example request: Externally hosted with a standalone connector

@@ -240,12 +242,13 @@ POST /_plugins/_ml/models/_register

#### Example response

- OpenSearch responds with the `task_id` and task `status`.
+ OpenSearch responds with the `task_id`, task `status`, and `model_id`:

```json
{
  "task_id" : "ew8I44MBhyWuIwnfvDIH",
-  "status" : "CREATED"
+  "status" : "CREATED",
+  "model_id": "t8qvDY4BChVAiNVEuo8q"
}
```
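
To track the registration, you can send the returned task ID to the ML Commons Tasks API. The following sketch reuses the task ID from the preceding response:

```json
GET /_plugins/_ml/tasks/ew8I44MBhyWuIwnfvDIH
```
{% include copy-curl.html %}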

@@ -304,12 +307,99 @@ For a complete example, see [Guardrails]({{site.url}}{{site.baseurl}}/ml-commons

#### Example response

- OpenSearch responds with the `task_id` and task `status`:
+ OpenSearch responds with the `task_id`, task `status`, and `model_id`:

```json
{
  "task_id": "tsqvDY4BChVAiNVEuo8F",
  "status": "CREATED",
  "model_id": "t8qvDY4BChVAiNVEuo8q"
}
```

### The `interface` parameter

The model interface provides a flexible way to add arbitrary metadata annotations to all local deep learning models and remote models using JSON schema syntax. These annotations trigger a validation check on the model's input and output fields when the model is invoked, ensuring that the fields are in the correct format both before and after the model performs a prediction.

To register a model with a model interface, provide the `interface` parameter, which supports the following fields.

Field | Data type | Description
:--- | :--- |:------------------------------------
`input`| Object | The JSON schema for the model input. |
`output`| Object | The JSON schema for the model output. |

The input and output fields are each evaluated against their own JSON schema. You do not need to provide both fields; you can specify either one or both.

To learn more about the JSON schema syntax, see [Understanding JSON Schema](https://json-schema.org/understanding-json-schema/).

#### Example request: Externally hosted model with an interface

```json
POST /_plugins/_ml/models/_register
{
  "name": "openAI-gpt-3.5-turbo",
  "function_name": "remote",
  "description": "test model",
  "connector_id": "A-j7K48BZzNMh1sWVdJu",
  "interface": {
    "input": {
      "properties": {
        "parameters": {
          "properties": {
            "messages": {
              "type": "string",
              "description": "This is a test description field"
            }
          }
        }
      }
    },
    "output": {
      "properties": {
        "inference_results": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "output": {
                "type": "array",
                "items": {
                  "properties": {
                    "name": {
                      "type": "string",
                      "description": "This is a test description field"
                    },
                    "dataAsMap": {
                      "type": "object",
                      "description": "This is a test description field"
                    }
                  }
                },
                "description": "This is a test description field"
              },
              "status_code": {
                "type": "integer",
                "description": "This is a test description field"
              }
            }
          },
          "description": "This is a test description field"
        }
      }
    }
  }
}
```
{% include copy-curl.html %}

#### Example response

OpenSearch responds with the `task_id`, task `status`, and `model_id`:

```json
{
-  "task_id" : "ew8I44MBhyWuIwnfvDIH",
-  "status" : "CREATED"
+  "task_id": "tsqvDY4BChVAiNVEuo8F",
+  "status": "CREATED",
+  "model_id": "t8qvDY4BChVAiNVEuo8q"
}
```

_ml-commons-plugin/api/model-apis/undeploy-model.md (32 changes: 32 additions & 0 deletions)

@@ -56,3 +56,35 @@ POST /_plugins/_ml/models/_undeploy
}
}
```

### Automatically undeploy a model based on TTL

Starting with OpenSearch 2.14, models can be automatically undeployed from memory based on a predefined time-to-live (TTL), measured from the time the model was last accessed or used. To define a TTL that automatically undeploys a model, include a `deploy_setting` object in your machine learning (ML) model, as shown in the examples that follow. Note that model TTLs are checked periodically by a `sync_up` cron job, so the maximum time that a model remains in memory is the TTL plus the `sync_up_job` interval. The default cron job interval is 10 seconds. To update the cron job interval, use the following cluster setting:

```json
PUT /_cluster/settings
{
  "persistent": {
    "plugins.ml_commons.sync_up_job_interval_in_seconds": 10
  }
}
```

#### Example request: Creating a model with a TTL

```json
POST /_plugins/_ml/models/_register
{
  "name": "Sample Model Name",
  "function_name": "remote",
  "description": "test model",
  "connector_id": "-g1nOo8BOaAC5MIJ3_4R",
  "deploy_setting": {
    "model_ttl_minutes": 100
  }
}
```

#### Example request: Updating the TTL of an undeployed model

```json
PUT /_plugins/_ml/models/COj7K48BZzNMh1sWedLK
{
  "deploy_setting": {
    "model_ttl_minutes": 100
  }
}
```
_ml-commons-plugin/api/model-apis/update-model.md (1 change: 1 addition & 0 deletions)

@@ -37,6 +37,7 @@ Field | Data type | Description
`rate_limiter.limit` | Integer | The maximum number of times any user can call the Predict API on the model per `unit` of time. By default, there is no limit on the number of Predict API calls. Once you set a limit, you cannot reset it to no limit. As an alternative, you can specify a high limit value and a small time unit, for example, 1 request per nanosecond.
`rate_limiter.unit` | String | The unit of time for the rate limiter. Valid values are `DAYS`, `HOURS`, `MICROSECONDS`, `MILLISECONDS`, `MINUTES`, `NANOSECONDS`, and `SECONDS`. For a sample request, see the example following this table.
`guardrails`| Object | The guardrails for the model.
`interface`| Object | The interface for the model.
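
As an illustrative sketch, the following request updates a model's rate limit to 4 Predict API calls per minute (the model ID in the path is a placeholder):

```json
PUT /_plugins/_ml/models/T_S-cY0BKCJ3ot9qr0aP
{
  "rate_limiter": {
    "limit": 4,
    "unit": "MINUTES"
  }
}
```
{% include copy-curl.html %}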

#### Example request: Disabling a model

Expand Down
