Merge branch 'main' into metrics-logs
vagimeli authored May 13, 2024 · 2 parents 44500d8 + a606405 · commit c761ef5
Showing 17 changed files with 837 additions and 21 deletions.
_data-prepper/index.md (2 changes: 1 addition & 1 deletion)

@@ -16,7 +16,7 @@ redirect_from:

Data Prepper is a server-side data collector capable of filtering, enriching, transforming, normalizing, and aggregating data for downstream analysis and visualization. Data Prepper is the preferred data ingestion tool for OpenSearch. It is recommended for most data ingestion use cases in OpenSearch and for processing large, complex datasets.

- With Data Prepper you can build custom pipelines to improve the operational view of applications. Two common use cases for Data Prepper are trace analytics and log analytics. [Trace analytics]({{site.url}}{{site.baseurl}}/observability-plugin/trace/index/) can help you visualize event flows and identify performance problems. [Log analytics]({{site.url}}{{site.baseurl}}/observability-plugin/log-analytics/) equips you with tools to enhance your search capabilities, conduct comprehensive analysis, and gain insights into your applications' performance and behavior.
+ With Data Prepper you can build custom pipelines to improve the operational view of applications. Two common use cases for Data Prepper are trace analytics and log analytics. [Trace analytics]({{site.url}}{{site.baseurl}}/data-prepper/common-use-cases/trace-analytics/) can help you visualize event flows and identify performance problems. [Log analytics]({{site.url}}{{site.baseurl}}/data-prepper/common-use-cases/log-analytics/) equips you with tools to enhance your search capabilities, conduct comprehensive analysis, and gain insights into your applications' performance and behavior.

## Concepts

_im-plugin/index-codecs.md (2 changes: 1 addition & 1 deletion)

@@ -50,7 +50,7 @@ The `index.codec.qatmode` setting controls the behavior of the hardware accelerator

For information about the `index.codec.qatmode` setting's effects on snapshots, see the [Snapshots](#snapshots) section.

- For more information about hardware acceleration on Intel, see the [Intel (R) QAT accelerator overview](https://www.intel.com/content/www/us/en/architecture-and-technology/intel-quick-assist-technology-overview.html).
+ For more information about hardware acceleration on Intel, see the [Intel (R) QAT accelerator overview](https://www.intel.com/content/www/us/en/developer/topic-technology/open/quick-assist-technology/overview.html).
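
For illustration, the following sketch creates an index that uses a QAT-backed codec. The codec value `qat_lz4` and the mode `auto` are assumptions based on recent OpenSearch releases, not values taken from this page; check the codec names documented for your version:

```json
PUT /my-index1
{
  "settings": {
    "index.codec": "qat_lz4",
    "index.codec.qatmode": "auto"
  }
}
```
{% include copy-curl.html %}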

## Choosing a codec

@@ -100,6 +100,8 @@ OpenSearch supports the following cluster-level routing and shard allocation settings:
- `REPLICA_FIRST` – Replica shards are relocated first, before primary shards. This prioritization may help prevent a cluster's health status from going red when carrying out shard relocation in a mixed-version, segment-replication-enabled OpenSearch cluster. In this situation, primary shards relocated to OpenSearch nodes of a newer version could try to copy segment files to replica shards on an older version of OpenSearch, which would result in shard failure. Relocating replica shards first may help to avoid this in multi-version clusters.
- `NO_PREFERENCE` – The default behavior, in which the order of shard relocation has no importance. A sample request that sets this strategy follows this list.
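
As an illustrative sketch, the following request opts in to replica-first movement. It assumes that these values belong to the dynamic `cluster.routing.allocation.shard_movement_strategy` setting, as in recent OpenSearch versions; verify the setting name against your version's documentation:

```json
PUT /_cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.shard_movement_strategy": "REPLICA_FIRST"
  }
}
```
{% include copy-curl.html %}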

- `cluster.allocator.gateway.batch_size` (Integer): Limits the number of shards per batch when fetching unassigned shard metadata from data nodes. Default is `2000`.
- `cluster.allocator.existing_shards_allocator.batch_enabled` (Boolean): Enables batch allocation of unassigned shards that already exist on disk, as opposed to allocating one shard at a time. This reduces memory and transport overhead by fetching any unassigned shard metadata in a single batch call. Default is `false`. A request that tunes both batch options is shown after this list.
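
For example, the following sketch adjusts both batch allocation options in one cluster settings request (the values shown are illustrative, not recommendations):

```json
PUT /_cluster/settings
{
  "persistent": {
    "cluster.allocator.gateway.batch_size": 4000,
    "cluster.allocator.existing_shards_allocator.batch_enabled": true
  }
}
```
{% include copy-curl.html %}
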
## Cluster-level shard, block, and task settings

OpenSearch supports the following cluster-level shard, block, and task settings:
@@ -168,6 +168,8 @@ OpenSearch supports the following dynamic index-level index settings:

- `index.search.idle.after` (Time unit): The amount of time a shard should wait for a search or get request until it goes idle. Default is `30s`.

- `index.search.default_pipeline` (String): The name of the search pipeline used when no pipeline is explicitly specified in a search request against the index. If a default pipeline is set but the pipeline does not exist, then requests against the index fail. Use the pipeline name `_none` to specify no default search pipeline. For more information, see [Default search pipeline]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/using-search-pipeline/#default-search-pipeline). A sample settings update follows this list.

- `index.refresh_interval` (Time unit): How often the index should refresh, which publishes its most recent changes and makes them available for searching. Can be set to `-1` to disable refreshing. Default is `1s`.

- `index.max_result_window` (Integer): The maximum value of `from` + `size` for searches of the index. `from` is the starting index to search from, and `size` is the number of results to return. Default is 10000.
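
As an illustration of updating these dynamic settings on an existing index, the following sketch sets a default search pipeline and a slower refresh interval. The index name `my-index` and pipeline name `my_pipeline` are placeholders:

```json
PUT /my-index/_settings
{
  "index.search.default_pipeline": "my_pipeline",
  "index.refresh_interval": "30s"
}
```
{% include copy-curl.html %}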
_install-and-configure/install-opensearch/index.md (3 changes: 1 addition & 2 deletions)

@@ -60,8 +60,7 @@ Port number | OpenSearch component
443 | OpenSearch Dashboards in AWS OpenSearch Service with encryption in transit (TLS)
5601 | OpenSearch Dashboards
9200 | OpenSearch REST API
- 9250 | Cross-cluster search
- 9300 | Node communication and transport
+ 9300 | Node communication and transport (internal), cross-cluster search
9600 | Performance Analyzer

## Important settings
_ml-commons-plugin/api/model-apis/register-model.md (112 changes: 101 additions & 11 deletions)

@@ -56,8 +56,7 @@ Field | Data type | Required/Optional | Description
`version` | String | Required | The model version. |
`model_format` | String | Required | The portable format of the model file. Valid values are `TORCH_SCRIPT` and `ONNX`. |
`description` | String | Optional| The model description. |
- `model_group_id` | String | Optional | The model group ID of the model group to register this model to.
- `is_enabled`| Boolean | Specifies whether the model is enabled. Disabling the model makes it unavailable for Predict API requests, regardless of the model's deployment status. Default is `true`.
+ `model_group_id` | String | Optional | The ID of the model group to which to register the model.

#### Example request: OpenSearch-provided text embedding model

@@ -89,8 +88,7 @@ Field | Data type | Required/Optional | Description
`model_content_hash_value` | String | Required | The model content hash generated using the SHA-256 hashing algorithm.
`url` | String | Required | The URL that contains the model. |
`description` | String | Optional| The model description. |
- `model_group_id` | String | Optional | The model group ID of the model group to register this model to.
- `is_enabled`| Boolean | Specifies whether the model is enabled. Disabling the model makes it unavailable for Predict API requests, regardless of the model's deployment status. Default is `true`.
+ `model_group_id` | String | Optional | The ID of the model group to which to register this model.

#### Example request: OpenSearch-provided sparse encoding model

@@ -124,7 +122,9 @@ Field | Data type | Required/Optional | Description
`url` | String | Required | The URL that contains the model. |
`description` | String | Optional| The model description. |
`model_group_id` | String | Optional | The model group ID of the model group to register this model to.
- `is_enabled`| Boolean | Specifies whether the model is enabled. Disabling the model makes it unavailable for Predict API requests, regardless of the model's deployment status. Default is `true`.
+ `is_enabled`| Boolean | Optional | Specifies whether the model is enabled. Disabling the model makes it unavailable for Predict API requests, regardless of the model's deployment status. Default is `true`.
+ `rate_limiter` | Object | Optional | Limits the number of times that any user can call the Predict API on the model. For more information, see [Rate limiting inference calls]({{site.url}}{{site.baseurl}}/ml-commons-plugin/integrating-ml-models/#rate-limiting-inference-calls).
+ `interface`| Object | Optional | The interface for the model. For more information, see [Interface](#the-interface-parameter).|

#### The `model_config` object

@@ -182,8 +182,10 @@ Field | Data type | Required/Optional | Description
`connector` | Object | Required | Contains specifications for a connector for a model hosted on a third-party platform. For more information, see [Creating a connector for a specific model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/connectors/#creating-a-connector-for-a-specific-model). You must provide either `connector_id` or `connector`.
`description` | String | Optional| The model description. |
`model_group_id` | String | Optional | The model group ID of the model group to register this model to.
- `is_enabled`| Boolean | Specifies whether the model is enabled. Disabling the model makes it unavailable for Predict API requests, regardless of the model's deployment status. Default is `true`.
+ `is_enabled`| Boolean | Optional | Specifies whether the model is enabled. Disabling the model makes it unavailable for Predict API requests, regardless of the model's deployment status. Default is `true`.
+ `rate_limiter` | Object | Optional | Limits the number of times that any user can call the Predict API on the model. For more information, see [Rate limiting inference calls]({{site.url}}{{site.baseurl}}/ml-commons-plugin/integrating-ml-models/#rate-limiting-inference-calls). For a sample request, see the example following this table.
`guardrails`| Object | Optional | The guardrails for the model input. For more information, see [Guardrails](#the-guardrails-parameter).|
+ `interface`| Object | Optional | The interface for the model. For more information, see [Interface](#the-interface-parameter).|
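
For example, the following sketch registers an externally hosted model and limits it to 4 Predict API calls per minute using the `rate_limiter` object described above (the name, description, and connector ID are placeholders):

```json
POST /_plugins/_ml/models/_register
{
  "name": "rate-limited-remote-model",
  "function_name": "remote",
  "description": "test model with a rate limit",
  "connector_id": "a1eMb4kBJ1eYAeTMAljY",
  "rate_limiter": {
    "limit": 4,
    "unit": "MINUTES"
  }
}
```
{% include copy-curl.html %}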

#### Example request: Externally hosted with a standalone connector

@@ -240,12 +242,13 @@ POST /_plugins/_ml/models/_register

#### Example response

- OpenSearch responds with the `task_id` and task `status`.
+ OpenSearch responds with the `task_id`, task `status`, and `model_id`:

```json
{
  "task_id" : "ew8I44MBhyWuIwnfvDIH",
-  "status" : "CREATED"
+  "status" : "CREATED",
+  "model_id": "t8qvDY4BChVAiNVEuo8q"
}
```
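
To track the registration, you can send the returned task ID to the ML Commons Tasks API. The following sketch reuses the task ID from the preceding response:

```json
GET /_plugins/_ml/tasks/ew8I44MBhyWuIwnfvDIH
```
{% include copy-curl.html %}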

@@ -304,12 +307,99 @@ For a complete example, see [Guardrails]({{site.url}}{{site.baseurl}}/ml-commons

#### Example response

- OpenSearch responds with the `task_id` and task `status`:
+ OpenSearch responds with the `task_id`, task `status`, and `model_id`:

```json
{
  "task_id": "tsqvDY4BChVAiNVEuo8F",
  "status": "CREATED",
  "model_id": "t8qvDY4BChVAiNVEuo8q"
}
```

### The `interface` parameter

The model interface provides a flexible way to add arbitrary metadata annotations to all local deep learning models and remote models using JSON schema syntax. These annotations trigger a validation check on the model's input and output fields when the model is invoked, ensuring that the fields are in the correct format both before and after the model performs a prediction.

To register a model with a model interface, provide the `interface` parameter, which supports the following fields.

Field | Data type | Description
:--- | :--- |:------------------------------------
`input`| Object | The JSON schema for the model input. |
`output`| Object | The JSON schema for the model output. |

The input and output fields are each evaluated against their own JSON schema. You do not need to provide both fields; you can specify either one or both.

To learn more about the JSON schema syntax, see [Understanding JSON Schema](https://json-schema.org/understanding-json-schema/).

#### Example request: Externally hosted model with an interface

```json
POST /_plugins/_ml/models/_register
{
  "name": "openAI-gpt-3.5-turbo",
  "function_name": "remote",
  "description": "test model",
  "connector_id": "A-j7K48BZzNMh1sWVdJu",
  "interface": {
    "input": {
      "properties": {
        "parameters": {
          "properties": {
            "messages": {
              "type": "string",
              "description": "This is a test description field"
            }
          }
        }
      }
    },
    "output": {
      "properties": {
        "inference_results": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "output": {
                "type": "array",
                "items": {
                  "properties": {
                    "name": {
                      "type": "string",
                      "description": "This is a test description field"
                    },
                    "dataAsMap": {
                      "type": "object",
                      "description": "This is a test description field"
                    }
                  }
                },
                "description": "This is a test description field"
              },
              "status_code": {
                "type": "integer",
                "description": "This is a test description field"
              }
            }
          },
          "description": "This is a test description field"
        }
      }
    }
  }
}
```
{% include copy-curl.html %}

#### Example response

OpenSearch responds with the `task_id`, task `status`, and `model_id`:

```json
{
-  "task_id" : "ew8I44MBhyWuIwnfvDIH",
-  "status" : "CREATED"
+  "task_id": "tsqvDY4BChVAiNVEuo8F",
+  "status": "CREATED",
+  "model_id": "t8qvDY4BChVAiNVEuo8q"
}
```

_ml-commons-plugin/api/model-apis/undeploy-model.md (32 changes: 32 additions & 0 deletions)

@@ -56,3 +56,35 @@ POST /_plugins/_ml/models/_undeploy
}
}
```

### Automatically undeploy a model based on TTL

Starting with OpenSearch 2.14, models can be automatically undeployed from memory based on a predefined time-to-live (TTL), measured from the time the model was last accessed or used. To define a TTL that automatically undeploys a model, include a `deploy_setting` object in your machine learning (ML) model, as shown in the examples that follow. Note that model TTLs are checked periodically by a `sync_up` cron job, so the maximum time that a model remains in memory is the TTL plus the `sync_up_job` interval. The default cron job interval is 10 seconds. To update the cron job interval, use the following cluster setting:

```json
PUT /_cluster/settings
{
  "persistent": {
    "plugins.ml_commons.sync_up_job_interval_in_seconds": 10
  }
}
```

#### Example request: Creating a model with a TTL

```json
POST /_plugins/_ml/models/_register
{
  "name": "Sample Model Name",
  "function_name": "remote",
  "description": "test model",
  "connector_id": "-g1nOo8BOaAC5MIJ3_4R",
  "deploy_setting": {
    "model_ttl_minutes": 100
  }
}
```

#### Example request: Updating the TTL of an undeployed model

```json
PUT /_plugins/_ml/models/COj7K48BZzNMh1sWedLK
{
  "deploy_setting": {
    "model_ttl_minutes": 100
  }
}
```
_ml-commons-plugin/api/model-apis/update-model.md (1 change: 1 addition & 0 deletions)

@@ -37,6 +37,7 @@ Field | Data type | Description
`rate_limiter.limit` | Integer | The maximum number of times any user can call the Predict API on the model per `unit` of time. By default, there is no limit on the number of Predict API calls. Once you set a limit, you cannot reset it to no limit. As an alternative, you can specify a high limit value and a small time unit, for example, 1 request per nanosecond.
`rate_limiter.unit` | String | The unit of time for the rate limiter. Valid values are `DAYS`, `HOURS`, `MICROSECONDS`, `MILLISECONDS`, `MINUTES`, `NANOSECONDS`, and `SECONDS`. For a sample request, see the example following this table.
`guardrails`| Object | The guardrails for the model.
`interface`| Object | The interface for the model.
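
As an illustrative sketch, the following request updates a model's rate limit to 4 Predict API calls per minute (the model ID in the path is a placeholder):

```json
PUT /_plugins/_ml/models/T_S-cY0BKCJ3ot9qr0aP
{
  "rate_limiter": {
    "limit": 4,
    "unit": "MINUTES"
  }
}
```
{% include copy-curl.html %}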

#### Example request: Disabling a model

Expand Down
