Add batch inference API #7853

Closed
wants to merge 4 commits into from

@Zhangxunmt (Contributor) commented Jul 29, 2024

Description

Adds documentation for Batch Inference as a new API under the ML Commons Model APIs section.

Issues Resolved

Closes #7848

Version

List the OpenSearch version to which this PR applies, e.g. 2.14, 2.12--2.14, or all.

Frontend features

If you're submitting documentation for an OpenSearch Dashboards feature, add a video that shows how a user will interact with the UI step by step. A voiceover is optional.

Checklist

  • By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and subject to the Developers Certificate of Origin.
    For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Thank you for submitting your PR. The PR states are In progress (or Draft) -> Tech review -> Doc review -> Editorial review -> Merged.

Before you submit your PR for doc review, make sure the content is technically accurate. If you need help finding a tech reviewer, tag a maintainer.

When you're ready for doc review, tag the assignee of this PR. The doc reviewer may push edits to the PR directly or leave comments and editorial suggestions for you to address (let us know in a comment if you have a preference). The doc reviewer will arrange for an editorial review.

@hdhalter added the "release-notes" and "v2.16.0" labels on Jul 29, 2024
@kolchfa-aws assigned kolchfa-aws and unassigned hdhalter on Jul 29, 2024
@hdhalter added the "4 - Doc review" label on Jul 29, 2024
@hdhalter changed the title from "add batch inference API" to "Add batch inference API" on Jul 29, 2024
For information about user access for this API, see [Model access control considerations]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/index/#model-access-control-considerations).


For information about connectors and remote models, see [Connecting to externally hosted models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/index/). For more details of the connector blurprints for batch predict, see [GitHub docs](https://github.com/opensearch-project/ml-commons/blob/main/docs/remote_inference_blueprints/batch_inference_openAI_connector_blueprint.md)
Contributor

I think the best action item would be to put this blueprint under a subfolder for batch_prediction and link to that subfolder. That way, if we add blueprints for SageMaker and Cohere later, customers will still find them in the subfolder.

Right now we say this works for SageMaker and Cohere, but there is no example for either. Also, why does the customer need to follow another link to get the same blueprint that is already here?

Collaborator

I agree with Dhrubo. This will avoid having to maintain a list of blueprints here in the documentation. @Zhangxunmt could you create a subfolder so we can link to that from the docs?

Contributor Author

It's fine to have a subfolder for offline actions. But please note that this is the API page showing how this API works. So let's keep this page directly to the point. Using OpenAI as an example to show this API is enough. Other details will be documented elsewhere in blueprints or tutorials.

Collaborator

I agree that this API page should be simple. Ideally, I would even remove the prerequisite steps from this API. However, our users have a very disjointed experience when going back and forth from the doc website to the ML repo. I didn't realize blueprints contained the workflow and not just the blueprint itself. I think we should port all ML blueprints and tutorials to the doc repo and have them on the doc site. I can take this on once this version is released. For now, it's fine to leave this API page with the current information.

Collaborator

@kolchfa-aws kolchfa-aws left a comment

Thank you, @Zhangxunmt! Please see my comments below.

---
layout: default
title: Batch inference
parent: Model APIs
Collaborator

We currently have the Predict API under the train-predict directory, not model-apis. Either we need to move this one to train-predict, or we can move the predict API into the model-apis section. What do you think?

Contributor Author

I think it makes more sense to move Predict to the model-apis section. The training part doesn't matter much because most cases involve remote or pretrained models, which can be used for prediction directly.

Collaborator

@Zhangxunmt Should we move train APIs to the model-apis section as well so the train-predict and model-apis sections are combined?


# Batch inference

ML Commons can predict large datasets in an offline asynchronous mode with your remote model deployed in external model servers. To use the Batch_Predict API, the `model_id` for a remote model is required. This new API is released as an experimental feature in the OpenSearch version 2.16, and only SageMaker, Cohere, and OpenAI are verified as the external servers that support this features.
Collaborator

Suggested change
ML Commons can predict large datasets in an offline asynchronous mode with your remote model deployed in external model servers. To use the Batch_Predict API, the `model_id` for a remote model is required. This new API is released as an experimental feature in the OpenSearch version 2.16, and only SageMaker, Cohere, and OpenAI are verified as the external servers that support this features.
ML Commons can perform inference on large datasets in an offline asynchronous mode using a model deployed on external model servers. To use the Batch Predict API, you must provide the `model_id` for an externally hosted model. This new API is released as experimental in OpenSearch version 2.16, and only Amazon SageMaker, Cohere, and OpenAI are verified as the external servers that support this feature.
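
A minimal Python sketch of how a client might assemble the Batch Predict call described above. The endpoint path and the `parameters` body mirror the example request shown in this PR; the helper name `build_batch_predict_request` is hypothetical, and the sketch only builds the request rather than sending it.

```python
import json


def build_batch_predict_request(model_id, parameters):
    """Return (method, path, body) for a Batch Predict call.

    Hypothetical helper: it only assembles the request and does not
    contact a cluster.
    """
    if not model_id:
        # The API description states that a model_id is required.
        raise ValueError("model_id is required")
    path = f"/_plugins/_ml/models/{model_id}/_batch_predict"
    body = json.dumps({"parameters": parameters})
    return "POST", path, body


if __name__ == "__main__":
    method, path, body = build_batch_predict_request(
        "lyjxwZABNrAVdFa9zrcZ", {"model": "text-embedding-ada-002"}
    )
    print(method, path)
    print(body)
```

The returned tuple can be passed to whichever HTTP client is already in use.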

grand_parent: ML Commons APIs
nav_order: 20
---

Collaborator

Please add an experimental header https://github.com/opensearch-project/documentation-website/blob/main/templates/EXPERIMENTAL_TEMPLATE.md and provide either a link to an issue where users can track the progress of the feature or a link to the OpenSearch forum.

For information about user access for this API, see [Model access control considerations]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/index/#model-access-control-considerations).


For information about connectors and remote models, see [Connecting to externally hosted models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/index/). For more details of the connector blurprints for batch predict, see [GitHub docs](https://github.com/opensearch-project/ml-commons/blob/main/docs/remote_inference_blueprints/batch_inference_openAI_connector_blueprint.md)
Collaborator

Suggested change
For information about connectors and remote models, see [Connecting to externally hosted models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/index/). For more details of the connector blurprints for batch predict, see [GitHub docs](https://github.com/opensearch-project/ml-commons/blob/main/docs/remote_inference_blueprints/batch_inference_openAI_connector_blueprint.md)
For information about externally hosted models, see [Connecting to externally hosted models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/index/). For the batch predict operation connector blueprints, see:
- [Amazon SageMaker batch predict connector blueprint](https://github.com/opensearch-project/ml-commons/blob/main/docs/remote_inference_blueprints/batch_inference_sagemaker_connector_blueprint.md).
- [OpenAI batch predict connector blueprint](https://github.com/opensearch-project/ml-commons/blob/main/docs/remote_inference_blueprints/batch_inference_openAI_connector_blueprint.md).

Collaborator

Is there a Cohere blueprint for batch predict?

"model_id": "lyjxwZABNrAVdFa9zrcZ"
}
```

Collaborator

Suggested change
To check the status of the operation, provide the task ID to the [Tasks API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/tasks-apis/get-task/). Once the registration is complete, the task `state` changes to `COMPLETED`.

```
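
The status check in the suggestion above could be scripted roughly as follows. This is a sketch under assumptions: `fetch_task` is a hypothetical callable that issues `GET /_plugins/_ml/tasks/<task_id>` and returns the parsed JSON, and treating `FAILED` as a terminal state is an assumption, since only `COMPLETED` is mentioned in this thread.

```python
import time

# COMPLETED comes from the suggested text; FAILED is an assumed terminal state.
TERMINAL_STATES = {"COMPLETED", "FAILED"}


def wait_for_task(fetch_task, task_id, interval_s=1.0, max_attempts=30):
    """Poll a task until it reaches a terminal state and return that state.

    fetch_task is a hypothetical callable: task_id -> dict with a "state" key.
    """
    for _ in range(max_attempts):
        state = fetch_task(task_id).get("state")
        if state in TERMINAL_STATES:
            return state
        time.sleep(interval_s)
    raise TimeoutError(f"task {task_id} did not finish after {max_attempts} checks")
```

Passing the fetcher in as an argument keeps the polling logic independent of any particular HTTP client.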

#### Example request

Collaborator

@kolchfa-aws kolchfa-aws Aug 1, 2024

Suggested change
Once you have completed the prerequisite steps, you can call the Batch Predict API. The parameters in the batch predict request override those defined in the connector:

POST /_plugins/_ml/models/lyjxwZABNrAVdFa9zrcZ/_batch_predict
{
"parameters": {
"model": "text-embedding-ada-002"
Collaborator

This parameter has the same value as the one in the connector. Can we show the users how to change this or any other parameters to a different value?

}
```
{% include copy-curl.html %}
The parameters in the batch_predict request will override those defined in the connector.
Collaborator

Suggested change
The parameters in the batch_predict request will override those defined in the connector.
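
The override behavior stated above can be illustrated with a small Python sketch. It assumes, as a simplification, that request parameters replace connector parameters key by key and that the merged values then fill the `${parameters.<name>}` placeholders in the connector's `request_body` template. The function names and sample values are hypothetical, and this is not the ML Commons implementation.

```python
import json
import re

# Matches placeholders such as ${parameters.input_file_id}.
PLACEHOLDER = re.compile(r"\$\{parameters\.(\w+)\}")


def resolve_parameters(connector_params, request_params):
    """Merge parameters; values from the request win over connector defaults."""
    return {**connector_params, **request_params}


def render_request_body(template, params):
    """Fill each ${parameters.<name>} placeholder from params.

    Raises KeyError if a placeholder has no matching parameter.
    """
    return PLACEHOLDER.sub(lambda m: str(params[m.group(1)]), template)


if __name__ == "__main__":
    template = (
        '{ "input_file_id": "${parameters.input_file_id}", '
        '"endpoint": "${parameters.endpoint}", "completion_window": "24h" }'
    )
    # Hypothetical values, for illustration only.
    connector_params = {"endpoint": "/v1/embeddings", "input_file_id": "file-default"}
    request_params = {"input_file_id": "file-abc123"}
    merged = resolve_parameters(connector_params, request_params)
    print(json.loads(render_request_body(template, merged)))
```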

{
"inference_results": [
{
"output": [
Collaborator

We normally need to provide the descriptions of all response fields in the API doc. Is this the format of all batch predict responses? And where is the actual predict result? Maybe this API page should just show the API itself, and we need to add a complete end-to-end example under the remote-models section?

Contributor Author

The actual prediction results are in the `output_file_id` in the response because this is offline, asynchronous prediction. I described these results in the OpenAI blueprint, which is linked from this API page. I think this page should show only the API itself, and we should keep it simple and straightforward. Should the end-to-end example and explanation go in a separate tutorial page?

Collaborator

I think we should have a complete example in a file under the remote-models directory.

"request_body": "{ \"input_file_id\": \"${parameters.input_file_id}\", \"endpoint\": \"${parameters.endpoint}\", \"completion_window\": \"24h\" }"
}
]
}
Collaborator

Suggested change
}
}
{% include copy-curl.html %}

"function_name": "remote",
"description": "OpenAI text embedding model",
"connector_id": "XU5UiokBpXT9icfOM0vt"
}
Collaborator

Suggested change
}
}
{% include copy-curl.html %}

Signed-off-by: Xun Zhang <[email protected]>
@Zhangxunmt mentioned this pull request on Aug 2, 2024 (1 task)
@Zhangxunmt closed this on Aug 2, 2024
@hdhalter added the "Closed - Duplicate or Cancelled" label and removed the "4 - Doc review", "release-notes", "v2.16.0", and "experimental" labels on Aug 5, 2024
Labels
Closed - Duplicate or Cancelled Issue: Nothing to be done
Projects
None yet
5 participants