Add Docs for Azure OpenAI Embeddings Inference (#107498)
* Update docs for Azure OpenAI Embeddings inference

* cleanups

* update link for dot_product similarity

* final cleanups
markjhoy authored Apr 16, 2024
1 parent 225edaf commit 624a5b1
Showing 14 changed files with 394 additions and 20 deletions.
5 changes: 5 additions & 0 deletions docs/changelog/107178.yaml
@@ -0,0 +1,5 @@
pr: 107178
summary: "Add support for Azure OpenAI embeddings to inference service"
area: Machine Learning
type: feature
issues: [ ]
79 changes: 75 additions & 4 deletions docs/reference/inference/put-inference.asciidoc
@@ -7,8 +7,8 @@ experimental[]
Creates an {infer} endpoint to perform an {infer} task.

IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in
{ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, or
Hugging Face. For built-in models and models uploaded through
{ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure
OpenAI, or Hugging Face. For built-in models and models uploaded through
Eland, the {infer} APIs offer an alternative way to use and manage trained
models. However, if you do not plan to use the {infer} APIs to use these models
or if you want to use non-NLP models, use the <<ml-df-trained-models-apis>>.
@@ -42,6 +42,7 @@ The following services are available through the {infer} API:
* ELSER
* Hugging Face
* OpenAI
* Azure OpenAI
* Elasticsearch (for built-in models and models uploaded through Eland)


@@ -78,6 +79,7 @@ Cohere service.
service.
* `openai`: specify the `completion` or `text_embedding` task type to use the
OpenAI service.
* `azureopenai`: specify the `text_embedding` task type to use the Azure OpenAI service.
* `elasticsearch`: specify the `text_embedding` task type to use the E5
built-in model or text embedding models uploaded by Eland.

@@ -187,6 +189,41 @@ https://platform.openai.com/account/organization[**Settings** > **Organizations*
(Optional, string)
The URL endpoint to use for the requests. Can be changed for testing purposes.
Defaults to `https://api.openai.com/v1/embeddings`.
=====
+
.`service_settings` for the `azureopenai` service
[%collapsible%closed]
=====
`api_key` or `entra_id`:::
(Required, string)
You must provide _either_ an API key or an Entra ID.
If you do not provide either, or provide both, you will receive an error when trying to create your model.
See the https://learn.microsoft.com/en-us/azure/ai-services/openai/reference#authentication[Azure OpenAI Authentication documentation] for more details on these authentication types.
IMPORTANT: You need to provide the API key or Entra ID only once, during the {infer} model creation.
The <<get-inference-api>> does not retrieve your authentication credentials.
After creating the {infer} model, you cannot change the associated API key or Entra ID.
If you want to use a different API key or Entra ID, delete the {infer} model and recreate it with the same name and the updated credentials.
`resource_name`:::
(Required, string)
The name of your Azure OpenAI resource.
You can find this from the https://portal.azure.com/#view/HubsExtension/BrowseAll[list of resources] in the Azure Portal for your subscription.
`deployment_id`:::
(Required, string)
The deployment name of your deployed models.
Your Azure OpenAI deployments can be found through the https://oai.azure.com/[Azure OpenAI Studio] portal that is linked to your subscription.
`api_version`:::
(Required, string)
The Azure API version ID to use.
We recommend using the https://learn.microsoft.com/en-us/azure/ai-services/openai/reference#embeddings[latest supported non-preview version].
=====
+
.`service_settings` for the `elasticsearch` service
@@ -266,8 +303,17 @@ maximum token length. Defaults to `END`. Valid values are:
`user`:::
(optional, string)
For `openai` service only. Specifies the user issuing the request, which can be
used for abuse detection.
For the `openai` and `azureopenai` services only. Specifies the user issuing the
request, which can be used for abuse detection.
=====
+
.`task_settings` for the `completion` task type
[%collapsible%closed]
=====
`user`:::
(optional, string)
For `openai` service only. Specifies the user issuing the request, which can be used for abuse detection.
=====


@@ -491,3 +537,28 @@ PUT _inference/completion/openai-completion
}
------------------------------------------------------------
// TEST[skip:TBD]

[discrete]
[[inference-example-azureopenai]]
===== Azure OpenAI service

The following example shows how to create an {infer} endpoint called
`azure_openai_embeddings` to perform a `text_embedding` task type.
Note that we do not specify a model here, as it is defined already via our Azure OpenAI deployment.

The list of embeddings models that you can choose from in your deployment can be found in the https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models#embeddings[Azure models documentation].

[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/azure_openai_embeddings
{
    "service": "azureopenai",
    "service_settings": {
        "api_key": "<api_key>",
        "resource_name": "<resource_name>",
        "deployment_id": "<deployment_id>",
        "api_version": "2024-02-01"
    }
}
------------------------------------------------------------
// TEST[skip:TBD]
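
The service also accepts Microsoft Entra ID authentication. The following sketch shows the same request with `entra_id` in place of `api_key`; the endpoint name `azure_openai_embeddings_entra` and the placeholder values are illustrative only.

[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/azure_openai_embeddings_entra
{
    "service": "azureopenai",
    "service_settings": {
        "entra_id": "<entra_id>",
        "resource_name": "<resource_name>",
        "deployment_id": "<deployment_id>",
        "api_version": "2024-02-01"
    }
}
------------------------------------------------------------
// TEST[skip:TBD]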
@@ -19,6 +19,12 @@
id="infer-api-ingest-openai">
OpenAI
</button>
<button role="tab"
aria-selected="false"
aria-controls="infer-api-ingest-azure-openai-tab"
id="infer-api-ingest-azure-openai">
Azure OpenAI
</button>
</div>
<div tabindex="0"
role="tabpanel"
@@ -50,7 +56,18 @@ include::infer-api-ingest-pipeline.asciidoc[tag=hugging-face]

include::infer-api-ingest-pipeline.asciidoc[tag=openai]

++++
</div>
<div tabindex="0"
role="tabpanel"
id="infer-api-ingest-azure-openai-tab"
aria-labelledby="infer-api-ingest-azure-openai"
hidden="">
++++

include::infer-api-ingest-pipeline.asciidoc[tag=azure-openai]

++++
</div>
</div>
++++
++++
@@ -85,4 +85,30 @@ PUT _ingest/pipeline/openai_embeddings
<2> Configuration object that defines the `input_field` for the {infer} process
and the `output_field` that will contain the {infer} results.

// end::openai[]
// end::openai[]

// tag::azure-openai[]

[source,console]
--------------------------------------------------
PUT _ingest/pipeline/azure_openai_embeddings
{
  "processors": [
    {
      "inference": {
        "model_id": "azure_openai_embeddings", <1>
        "input_output": { <2>
          "input_field": "content",
          "output_field": "content_embedding"
        }
      }
    }
  ]
}
--------------------------------------------------
<1> The name of the inference endpoint you created by using the
<<put-inference-api>>; it's referred to as the `inference_id` in that step.
<2> Configuration object that defines the `input_field` for the {infer} process
and the `output_field` that will contain the {infer} results.
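
Before reindexing any data, you can verify the pipeline with the simulate pipeline API. The sketch below assumes the `azure_openai_embeddings` pipeline created above and an illustrative `content` value; because the processor calls your Azure OpenAI deployment, valid credentials are required for an embedding to be returned.

[source,console]
--------------------------------------------------
POST _ingest/pipeline/azure_openai_embeddings/_simulate
{
  "docs": [
    {
      "_source": {
        "content": "What is Elasticsearch?"
      }
    }
  ]
}
--------------------------------------------------
// TEST[skip:TBD]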

// end::azure-openai[]
@@ -19,6 +19,12 @@
id="infer-api-mapping-openai">
OpenAI
</button>
<button role="tab"
aria-selected="false"
aria-controls="infer-api-mapping-azure-openai-tab"
id="infer-api-mapping-azure-openai">
Azure OpenAI
</button>
</div>
<div tabindex="0"
role="tabpanel"
@@ -49,7 +55,18 @@ include::infer-api-mapping.asciidoc[tag=hugging-face]

include::infer-api-mapping.asciidoc[tag=openai]

++++
</div>
<div tabindex="0"
role="tabpanel"
id="infer-api-mapping-azure-openai-tab"
aria-labelledby="infer-api-mapping-azure-openai"
hidden="">
++++

include::infer-api-mapping.asciidoc[tag=azure-openai]

++++
</div>
</div>
++++
++++
@@ -84,7 +84,7 @@ PUT openai-embeddings
}
}
--------------------------------------------------
<1> The name of the field to contain the generated tokens. It must be refrenced
<1> The name of the field to contain the generated tokens. It must be referenced
in the {infer} pipeline configuration in the next step.
<2> The field to contain the tokens is a `dense_vector` field.
<3> The output dimensions of the model. Find this value in the
@@ -99,4 +99,43 @@ In this example, the name of the field is `content`. It must be referenced in
the {infer} pipeline configuration in the next step.
<6> The field type which is text in this example.

// end::openai[]
// end::openai[]

// tag::azure-openai[]

[source,console]
--------------------------------------------------
PUT azure-openai-embeddings
{
  "mappings": {
    "properties": {
      "content_embedding": { <1>
        "type": "dense_vector", <2>
        "dims": 1536, <3>
        "element_type": "float",
        "similarity": "dot_product" <4>
      },
      "content": { <5>
        "type": "text" <6>
      }
    }
  }
}
--------------------------------------------------
<1> The name of the field to contain the generated tokens. It must be referenced
in the {infer} pipeline configuration in the next step.
<2> The field to contain the tokens is a `dense_vector` field.
<3> The output dimensions of the model. Find this value in the
https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models#embeddings-models[Azure OpenAI documentation]
of the model you use.
<4> For Azure OpenAI embeddings, the `dot_product` function should be used to
calculate similarity as Azure OpenAI embeddings are normalised to unit length.
See the
https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/understand-embeddings[Azure OpenAI embeddings]
documentation for more information on the model specifications.
<5> The name of the field from which to create the dense vector representation.
In this example, the name of the field is `content`. It must be referenced in
the {infer} pipeline configuration in the next step.
<6> The field type which is text in this example.
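
At query time, the same {infer} endpoint can embed the search text, and the `dot_product` similarity defined above is applied between the query vector and the stored vectors. A minimal kNN search sketch, assuming the `azure_openai_embeddings` endpoint created earlier and an illustrative query string:

[source,console]
--------------------------------------------------
GET azure-openai-embeddings/_search
{
  "knn": {
    "field": "content_embedding",
    "query_vector_builder": {
      "text_embedding": {
        "model_id": "azure_openai_embeddings",
        "model_text": "Calculate fuel cost"
      }
    },
    "k": 10,
    "num_candidates": 100
  },
  "_source": [
    "content"
  ]
}
--------------------------------------------------
// TEST[skip:TBD]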

// end::azure-openai[]
@@ -19,6 +19,12 @@
id="infer-api-reindex-openai">
OpenAI
</button>
<button role="tab"
aria-selected="false"
aria-controls="infer-api-reindex-azure-openai-tab"
id="infer-api-reindex-azure-openai">
Azure OpenAI
</button>
</div>
<div tabindex="0"
role="tabpanel"
@@ -50,7 +56,19 @@ include::infer-api-reindex.asciidoc[tag=hugging-face]

include::infer-api-reindex.asciidoc[tag=openai]

++++
</div>
<div tabindex="0"
role="tabpanel"
id="infer-api-reindex-azure-openai-tab"
aria-labelledby="infer-api-reindex-azure-openai"
hidden="">
++++

include::infer-api-reindex.asciidoc[tag=azure-openai]

++++
</div>
</div>
++++
++++
@@ -75,4 +75,32 @@ https://platform.openai.com/account/limits[rate limit of your OpenAI account]
may affect the throughput of the reindexing process. If this happens, change
`size` to `3` or a similar value in magnitude.

// end::openai[]
// end::openai[]

// tag::azure-openai[]

[source,console]
----
POST _reindex?wait_for_completion=false
{
  "source": {
    "index": "test-data",
    "size": 50 <1>
  },
  "dest": {
    "index": "azure-openai-embeddings",
    "pipeline": "azure_openai_embeddings"
  }
}
----
// TEST[skip:TBD]
<1> The default batch size for reindexing is 1000. Reducing `size` to a smaller
number makes the reindexing process update more frequently, which enables you to
follow its progress closely and detect errors early.

NOTE: The
https://learn.microsoft.com/en-us/azure/ai-services/openai/quotas-limits#quotas-and-limits-reference[rate limit of your Azure OpenAI account]
may affect the throughput of the reindexing process. If this happens, change
`size` to `3` or a similar value in magnitude.
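
Because the request above sets `wait_for_completion=false`, the reindex API returns a task ID immediately. A minimal sketch of checking on that task, assuming `<task_id>` is replaced with the value returned by the reindex call:

[source,console]
----
GET _tasks/<task_id>
----
// TEST[skip:TBD]

If the rate limit is being exhausted, you can cancel the run with `POST _tasks/<task_id>/_cancel`, lower `size`, and start the reindex again.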

// end::azure-openai[]
@@ -19,6 +19,12 @@
id="infer-api-requirements-openai">
OpenAI
</button>
<button role="tab"
aria-selected="false"
aria-controls="infer-api-requirements-azure-openai-tab"
id="infer-api-requirements-azure-openai">
Azure OpenAI
</button>
</div>
<div tabindex="0"
role="tabpanel"
@@ -50,7 +56,18 @@ include::infer-api-requirements.asciidoc[tag=hugging-face]

include::infer-api-requirements.asciidoc[tag=openai]

++++
</div>
<div tabindex="0"
role="tabpanel"
id="infer-api-requirements-azure-openai-tab"
aria-labelledby="infer-api-requirements-azure-openai"
hidden="">
++++

include::infer-api-requirements.asciidoc[tag=azure-openai]

++++
</div>
</div>
++++
++++
@@ -17,4 +17,12 @@ API with the HuggingFace service.
An https://openai.com/[OpenAI account] is required to use the {infer} API with
the OpenAI service.

// end::openai[]
// end::openai[]

// tag::azure-openai[]
* An https://azure.microsoft.com/free/cognitive-services?azure-portal=true[Azure subscription]
* Access granted to Azure OpenAI in the desired Azure subscription.
You can apply for access to Azure OpenAI by completing the form at https://aka.ms/oai/access.
* An embedding model deployed in https://oai.azure.com/[Azure OpenAI Studio].
// end::azure-openai[]