# ML Commons batch inference #7899
---
layout: default
title: Batch predict
parent: Model APIs
grand_parent: ML Commons APIs
nav_order: 65
---

# Batch predict

This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/ml-commons/issues/2488).
{: .warning}

ML Commons can perform inference on large datasets in an offline asynchronous mode using a model deployed on external model servers. To use the Batch Predict API, you must provide the `model_id` of an externally hosted model. Amazon SageMaker, Cohere, and OpenAI are currently the only verified external servers that support this API.

For information about user access for this API, see [Model access control considerations]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/index/#model-access-control-considerations).

For information about externally hosted models, see [Connecting to externally hosted models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/index/).

For instructions on how to set up batch inference and connector blueprints, see the following:

- [Amazon SageMaker batch predict connector blueprint](https://github.com/opensearch-project/ml-commons/blob/main/docs/remote_inference_blueprints/batch_inference_sagemaker_connector_blueprint.md)
- [OpenAI batch predict connector blueprint](https://github.com/opensearch-project/ml-commons/blob/main/docs/remote_inference_blueprints/batch_inference_openAI_connector_blueprint.md)

## Path and HTTP methods

```json
POST /_plugins/_ml/models/<model_id>/_batch_predict
```
## Prerequisites

Before using the Batch Predict API, you need to create a connector to the externally hosted model. For example, to create a connector to an OpenAI `text-embedding-ada-002` model, send the following request:

```json
POST /_plugins/_ml/connectors/_create
{
  "name": "OpenAI Embedding model",
  "description": "OpenAI embedding model for testing offline batch",
  "version": "1",
  "protocol": "http",
  "parameters": {
    "model": "text-embedding-ada-002",
    "input_file_id": "<your input file id in OpenAI>",
    "endpoint": "/v1/embeddings"
  },
  "credential": {
    "openAI_key": "<your openAI key>"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "url": "https://api.openai.com/v1/embeddings",
      "headers": {
        "Authorization": "Bearer ${credential.openAI_key}"
      },
      "request_body": "{ \"input\": ${parameters.input}, \"model\": \"${parameters.model}\" }",
      "pre_process_function": "connector.pre_process.openai.embedding",
      "post_process_function": "connector.post_process.openai.embedding"
    },
    {
      "action_type": "batch_predict",
      "method": "POST",
      "url": "https://api.openai.com/v1/batches",
      "headers": {
        "Authorization": "Bearer ${credential.openAI_key}"
      },
      "request_body": "{ \"input_file_id\": \"${parameters.input_file_id}\", \"endpoint\": \"${parameters.endpoint}\", \"completion_window\": \"24h\" }"
    }
  ]
}
```
{% include copy-curl.html %}

The response contains a connector ID that you'll use in the next steps:

```json
{
  "connector_id": "XU5UiokBpXT9icfOM0vt"
}
```

Next, register an externally hosted model and provide the connector ID of the created connector:

```json
POST /_plugins/_ml/models/_register?deploy=true
{
  "name": "OpenAI model for realtime embedding and offline batch inference",
  "function_name": "remote",
  "description": "OpenAI text embedding model",
  "connector_id": "XU5UiokBpXT9icfOM0vt"
}
```
{% include copy-curl.html %}
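The two prerequisite calls above can be chained from a client script. The following is a minimal sketch; `setup_batch_model` and the `post` helper are hypothetical names, and `post` is assumed to send a POST request to the cluster and return the parsed JSON response (wire it to your HTTP client of choice). The connector body is abbreviated here; use the full body shown above.

```python
def setup_batch_model(post, openai_key, input_file_id):
    """Sketch: create the connector, then register and deploy a remote model.

    `post(path, body)` is a hypothetical helper that POSTs to the cluster
    and returns the parsed JSON response.
    """
    connector = post("/_plugins/_ml/connectors/_create", {
        "name": "OpenAI Embedding model",
        "version": "1",
        "protocol": "http",
        "parameters": {
            "model": "text-embedding-ada-002",
            "input_file_id": input_file_id,
            "endpoint": "/v1/embeddings",
        },
        "credential": {"openAI_key": openai_key},
        # "actions" omitted for brevity; include the full "predict" and
        # "batch_predict" actions from the connector request above.
    })
    model = post("/_plugins/_ml/models/_register?deploy=true", {
        "name": "OpenAI model for realtime embedding and offline batch inference",
        "function_name": "remote",
        "description": "OpenAI text embedding model",
        "connector_id": connector["connector_id"],
    })
    # Return the task ID so the caller can poll the Tasks API.
    return model["task_id"]
```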
The response contains the task ID for the register operation:

```json
{
  "task_id": "rMormY8B8aiZvtEZIO_j",
  "status": "CREATED",
  "model_id": "lyjxwZABNrAVdFa9zrcZ"
}
```

To check the status of the operation, provide the task ID to the [Tasks API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/tasks-apis/get-task/). Once the registration is complete, the task `state` changes to `COMPLETED`.
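Because registration is asynchronous, a client typically polls the Tasks API until the task completes. The following is a sketch under stated assumptions: `wait_for_registration` and `fetch_task` are hypothetical names, and `fetch_task(task_id)` is assumed to GET `/_plugins/_ml/tasks/<task_id>` and return the parsed JSON body.

```python
import time

def wait_for_registration(task_id, fetch_task, interval=5, max_attempts=60):
    """Sketch: poll the Tasks API until the register task reaches COMPLETED.

    `fetch_task(task_id)` is a hypothetical callable that retrieves the task
    document from /_plugins/_ml/tasks/<task_id> as a dict.
    """
    for _ in range(max_attempts):
        task = fetch_task(task_id)
        state = task.get("state")
        if state == "COMPLETED":
            return task["model_id"]
        if state in ("FAILED", "CANCELLED"):
            raise RuntimeError(f"registration ended in state {state}")
        time.sleep(interval)
    raise TimeoutError(f"task {task_id} did not complete in time")
```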
#### Example request

Once you have completed the prerequisite steps, you can call the Batch Predict API. The parameters in the batch predict request override those defined in the connector:

```json
POST /_plugins/_ml/models/lyjxwZABNrAVdFa9zrcZ/_batch_predict
{
  "parameters": {
    "model": "text-embedding-3-large"
  }
}
```
{% include copy-curl.html %}
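From a client script, the same call can be sketched as follows. `batch_predict` and `post` are hypothetical names; `post(path, body)` is assumed to send the POST request and return the parsed JSON response.

```python
def batch_predict(post, model_id, overrides=None):
    """Sketch: invoke the Batch Predict API, optionally overriding
    parameters defined in the connector (such as "model")."""
    body = {"parameters": overrides} if overrides else {}
    return post(f"/_plugins/_ml/models/{model_id}/_batch_predict", body)
```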
#### Example response

```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "dataAsMap": {
            "id": "batch_<your file id>",
            "object": "batch",
            "endpoint": "/v1/embeddings",
            "errors": null,
            "input_file_id": "file-<your input file id>",
            "completion_window": "24h",
            "status": "validating",
            "output_file_id": null,
            "error_file_id": null,
            "created_at": 1722037257,
            "in_progress_at": null,
            "expires_at": 1722123657,
            "finalizing_at": null,
            "completed_at": null,
            "failed_at": null,
            "expired_at": null,
            "cancelling_at": null,
            "cancelled_at": null,
            "request_counts": {
              "total": 0,
              "completed": 0,
              "failed": 0
            },
            "metadata": null
          }
        }
      ],
      "status_code": 200
    }
  ]
}
```

For the definition of each field in the result, see [OpenAI Batch API](https://platform.openai.com/docs/guides/batch). Once the batch inference is complete, you can download the output by calling the [OpenAI Files API](https://platform.openai.com/docs/api-reference/files) and providing the file name specified in the `id` field of the response.
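The fields needed for that follow-up, such as the batch ID and, once populated, the output file ID, are nested inside `dataAsMap`. A small sketch of pulling them out of the response (the function name is hypothetical):

```python
def extract_batch_info(response):
    """Sketch: pull the OpenAI batch ID, status, and output file ID
    out of a Batch Predict response body."""
    data = response["inference_results"][0]["output"][0]["dataAsMap"]
    return {
        "batch_id": data["id"],
        "status": data["status"],
        # null until the batch completes on the OpenAI side
        "output_file_id": data.get("output_file_id"),
    }
```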
# Model APIs
ML Commons supports the following model-level CRUD APIs:

- [Register Model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/register-model/)
- [Deploy Model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/deploy-model/)
- [Get Model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/get-model/)
- [Search Model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/search-model/)
- [Update Model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/update-model/)
- [Undeploy Model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/undeploy-model/)
- [Delete Model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/delete-model/)

# Predict APIs

Predict APIs are used to invoke machine learning (ML) models. ML Commons supports the following Predict APIs:

> **Reviewer:** Is the second instance of "Predict" intentionally capitalized?
>
> **Author:** Yes, capitalized since it's the API name.

- [Predict]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/train-predict/predict/)
- [Batch Predict]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/batch-predict/) (experimental)

> **Reviewer:** "Batch Predict" is not capitalized in the title or H1 of the preceding file.
>
> **Author:** Right. Normally, we imply the operation in the H1 and left nav title, but this is the actual API name, so I capitalized. Alternatively, I can change everything to sentence case.
>
> **Reviewer:** Fine as is.
# Train API

The ML Commons Train API lets you train ML algorithms synchronously and asynchronously:

- [Train]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/train-predict/train/)

To train tasks through the API, three inputs are required:

- Algorithm name: Must be a [FunctionName](https://github.com/opensearch-project/ml-commons/blob/1.3/common/src/main/java/org/opensearch/ml/common/parameter/FunctionName.java). This determines what algorithm the ML model runs. To add a new function, see [How To Add a New Function](https://github.com/opensearch-project/ml-commons/blob/main/docs/how-to-add-new-function.md).
- Model hyperparameters: Adjust these parameters to improve model accuracy.
- Input data: The data that trains the ML model or applies it to predictions. You can input data in two ways: query against your index or use a data frame.
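As an illustration of how the three inputs fit together, a k-means train request body might pair hyperparameters with a query against an index. This is a sketch only; the index name, field names, and parameter values below are hypothetical, and the exact parameters accepted depend on the algorithm.

```python
# Hypothetical body for POST /_plugins/_ml/_train/kmeans:
# algorithm hyperparameters plus input data selected by an index query.
train_body = {
    "parameters": {              # model hyperparameters
        "centroids": 3,
        "iterations": 10,
        "distance_type": "COSINE",
    },
    "input_query": {             # input data: query against your index
        "_source": ["petal_length_in_cm", "petal_width_in_cm"],
        "size": 10000,
    },
    "input_index": ["iris_data"],  # hypothetical index name
}
```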
# Train and Predict API

The Train and Predict API lets you train and invoke the model using the same dataset:

- [Train and Predict]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/train-predict/train-and-predict/)
## Model access control considerations
*This file was deleted.*

> **Reviewer:** Does the information below need to be in the form of a bulleted list, or could it just be a sentence?
>
> **Author:** Not necessarily, but if we add more connectors, I think it's better if it's a list.