From 624a5b1fe52c7160f9a2cb4f892a947047ee6bdf Mon Sep 17 00:00:00 2001 From: "Mark J. Hoy" Date: Tue, 16 Apr 2024 10:52:27 -0400 Subject: [PATCH] Add Docs for Azure OpenAI Embeddings Inference (#107498) * Update docs for Azure OpenAI Embeddings inference * cleanups * update link for dot_product similarity * final cleanups --- docs/changelog/107178.yaml | 5 ++ .../inference/put-inference.asciidoc | 79 ++++++++++++++++++- .../infer-api-ingest-pipeline-widget.asciidoc | 19 ++++- .../infer-api-ingest-pipeline.asciidoc | 28 ++++++- .../infer-api-mapping-widget.asciidoc | 19 ++++- .../inference-api/infer-api-mapping.asciidoc | 43 +++++++++- .../infer-api-reindex-widget.asciidoc | 20 ++++- .../inference-api/infer-api-reindex.asciidoc | 30 ++++++- .../infer-api-requirements-widget.asciidoc | 19 ++++- .../infer-api-requirements.asciidoc | 10 ++- .../infer-api-search-widget.asciidoc | 19 ++++- .../inference-api/infer-api-search.asciidoc | 67 +++++++++++++++- .../infer-api-task-widget.asciidoc | 19 ++++- .../inference-api/infer-api-task.asciidoc | 37 ++++++++- 14 files changed, 394 insertions(+), 20 deletions(-) create mode 100644 docs/changelog/107178.yaml diff --git a/docs/changelog/107178.yaml b/docs/changelog/107178.yaml new file mode 100644 index 0000000000000..94a91357d38e6 --- /dev/null +++ b/docs/changelog/107178.yaml @@ -0,0 +1,5 @@ +pr: 107178 +summary: "Add support for Azure OpenAI embeddings to inference service" +area: Machine Learning +type: feature +issues: [ ] diff --git a/docs/reference/inference/put-inference.asciidoc b/docs/reference/inference/put-inference.asciidoc index 332752e52f068..1f73cd08401ee 100644 --- a/docs/reference/inference/put-inference.asciidoc +++ b/docs/reference/inference/put-inference.asciidoc @@ -7,8 +7,8 @@ experimental[] Creates an {infer} endpoint to perform an {infer} task. 
IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in -{ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, or -Hugging Face. For built-in models and models uploaded though +{ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure +OpenAI, or Hugging Face. For built-in models and models uploaded through Eland, the {infer} APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the {infer} APIs to use these models or if you want to use non-NLP models, use the <>. @@ -42,6 +42,7 @@ The following services are available through the {infer} API: * ELSER * Hugging Face * OpenAI +* Azure OpenAI * Elasticsearch (for built-in models and models uploaded through Eland) @@ -78,6 +79,7 @@ Cohere service. service. * `openai`: specify the `completion` or `text_embedding` task type to use the OpenAI service. +* `azureopenai`: specify the `text_embedding` task type to use the Azure OpenAI service. * `elasticsearch`: specify the `text_embedding` task type to use the E5 built-in model or text embedding models uploaded by Eland. @@ -187,6 +189,41 @@ https://platform.openai.com/account/organization[**Settings** > **Organizations* (Optional, string) The URL endpoint to use for the requests. Can be changed for testing purposes. Defaults to `https://api.openai.com/v1/embeddings`. + +===== ++ +.`service_settings` for the `azureopenai` service +[%collapsible%closed] +===== + +`api_key` or `entra_id`::: +(Required, string) +You must provide _either_ an API key or an Entra ID. +If you do not provide either, or provide both, you will receive an error when trying to create your model. +See the https://learn.microsoft.com/en-us/azure/ai-services/openai/reference#authentication[Azure OpenAI Authentication documentation] for more details on these authentication types. + +IMPORTANT: You need to provide the API key or Entra ID only once, during the {infer} model creation. 
+The <> does not retrieve your authentication credentials. +After creating the {infer} model, you cannot change the associated API key or Entra ID. +If you want to use a different API key or Entra ID, delete the {infer} model and recreate it with the same name and the updated API key. +You _must_ have either an `api_key` or an `entra_id` defined. +If neither is present, an error will occur. + +`resource_name`::: +(Required, string) +The name of your Azure OpenAI resource. +You can find this from the https://portal.azure.com/#view/HubsExtension/BrowseAll[list of resources] in the Azure Portal for your subscription. + +`deployment_id`::: +(Required, string) +The deployment name of your deployed model. +Your Azure OpenAI deployments can be found through the https://oai.azure.com/[Azure OpenAI Studio] portal that is linked to your subscription. + +`api_version`::: +(Required, string) +The Azure API version ID to use. +We recommend using the https://learn.microsoft.com/en-us/azure/ai-services/openai/reference#embeddings[latest supported non-preview version]. + ===== + .`service_settings` for the `elasticsearch` service @@ -266,8 +303,17 @@ maximum token length. Defaults to `END`. Valid values are: `user`::: (optional, string) -For `openai` service only. Specifies the user issuing the request, which can be -used for abuse detection. +For `openai` and `azureopenai` service only. Specifies the user issuing the +request, which can be used for abuse detection. + +===== ++ +.`task_settings` for the `completion` task type +[%collapsible%closed] +===== +`user`::: +(optional, string) +For `openai` service only. Specifies the user issuing the request, which can be used for abuse detection. 
===== @@ -491,3 +537,28 @@ PUT _inference/completion/openai-completion } ------------------------------------------------------------ // TEST[skip:TBD] + +[discrete] +[[inference-example-azureopenai]] +===== Azure OpenAI service + +The following example shows how to create an {infer} endpoint called +`azure_openai_embeddings` to perform a `text_embedding` task type. +Note that we do not specify a model here, as it is already defined in your Azure OpenAI deployment. + +The list of embeddings models that you can choose from in your deployment can be found in the https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models#embeddings[Azure models documentation]. + +[source,console] +------------------------------------------------------------ +PUT _inference/text_embedding/azure_openai_embeddings +{ + "service": "azureopenai", + "service_settings": { + "api_key": "", + "resource_name": "", + "deployment_id": "", + "api_version": "2024-02-01" + } +} +------------------------------------------------------------ +// TEST[skip:TBD] diff --git a/docs/reference/tab-widgets/inference-api/infer-api-ingest-pipeline-widget.asciidoc b/docs/reference/tab-widgets/inference-api/infer-api-ingest-pipeline-widget.asciidoc index 069dcb61f81b0..4baada19998e8 100644 --- a/docs/reference/tab-widgets/inference-api/infer-api-ingest-pipeline-widget.asciidoc +++ b/docs/reference/tab-widgets/inference-api/infer-api-ingest-pipeline-widget.asciidoc @@ -19,6 +19,12 @@ id="infer-api-ingest-openai"> OpenAI +
+
-++++ \ No newline at end of file +++++ diff --git a/docs/reference/tab-widgets/inference-api/infer-api-ingest-pipeline.asciidoc b/docs/reference/tab-widgets/inference-api/infer-api-ingest-pipeline.asciidoc index 869e41a4ca7d1..f50b866e8a5b1 100644 --- a/docs/reference/tab-widgets/inference-api/infer-api-ingest-pipeline.asciidoc +++ b/docs/reference/tab-widgets/inference-api/infer-api-ingest-pipeline.asciidoc @@ -85,4 +85,30 @@ PUT _ingest/pipeline/openai_embeddings <2> Configuration object that defines the `input_field` for the {infer} process and the `output_field` that will contain the {infer} results. -// end::openai[] \ No newline at end of file +// end::openai[] + +// tag::azure-openai[] + +[source,console] +-------------------------------------------------- +PUT _ingest/pipeline/azure_openai_embeddings +{ + "processors": [ + { + "inference": { + "model_id": "azure_openai_embeddings", <1> + "input_output": { <2> + "input_field": "content", + "output_field": "content_embedding" + } + } + } + ] +} +-------------------------------------------------- +<1> The name of the inference endpoint you created by using the +<>, it's referred to as `inference_id` in that step. +<2> Configuration object that defines the `input_field` for the {infer} process +and the `output_field` that will contain the {infer} results. + +// end::azure-openai[] diff --git a/docs/reference/tab-widgets/inference-api/infer-api-mapping-widget.asciidoc b/docs/reference/tab-widgets/inference-api/infer-api-mapping-widget.asciidoc index 9d94ce880988a..e35ee712b8f56 100644 --- a/docs/reference/tab-widgets/inference-api/infer-api-mapping-widget.asciidoc +++ b/docs/reference/tab-widgets/inference-api/infer-api-mapping-widget.asciidoc @@ -19,6 +19,12 @@ id="infer-api-mapping-openai"> OpenAI +
+
-++++ \ No newline at end of file +++++ diff --git a/docs/reference/tab-widgets/inference-api/infer-api-mapping.asciidoc b/docs/reference/tab-widgets/inference-api/infer-api-mapping.asciidoc index 6803b73c06879..037c5957b01ff 100644 --- a/docs/reference/tab-widgets/inference-api/infer-api-mapping.asciidoc +++ b/docs/reference/tab-widgets/inference-api/infer-api-mapping.asciidoc @@ -84,7 +84,7 @@ PUT openai-embeddings } } -------------------------------------------------- -<1> The name of the field to contain the generated tokens. It must be refrenced +<1> The name of the field to contain the generated tokens. It must be referenced in the {infer} pipeline configuration in the next step. <2> The field to contain the tokens is a `dense_vector` field. <3> The output dimensions of the model. Find this value in the @@ -99,4 +99,43 @@ In this example, the name of the field is `content`. It must be referenced in the {infer} pipeline configuration in the next step. <6> The field type which is text in this example. -// end::openai[] \ No newline at end of file +// end::openai[] + +// tag::azure-openai[] + +[source,console] +-------------------------------------------------- +PUT azure-openai-embeddings +{ + "mappings": { + "properties": { + "content_embedding": { <1> + "type": "dense_vector", <2> + "dims": 1536, <3> + "element_type": "float", + "similarity": "dot_product" <4> + }, + "content": { <5> + "type": "text" <6> + } + } + } +} +-------------------------------------------------- +<1> The name of the field to contain the generated tokens. It must be referenced +in the {infer} pipeline configuration in the next step. +<2> The field to contain the tokens is a `dense_vector` field. +<3> The output dimensions of the model. Find this value in the +https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models#embeddings-models[Azure OpenAI documentation] +of the model you use. 
+<4> For Azure OpenAI embeddings, the `dot_product` function should be used to +calculate similarity, as Azure OpenAI embeddings are normalized to unit length. +See the +https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/understand-embeddings[Azure OpenAI embeddings] +documentation for more information on the model specifications. +<5> The name of the field from which to create the dense vector representation. +In this example, the name of the field is `content`. It must be referenced in +the {infer} pipeline configuration in the next step. +<6> The field type which is text in this example. + +// end::azure-openai[] diff --git a/docs/reference/tab-widgets/inference-api/infer-api-reindex-widget.asciidoc b/docs/reference/tab-widgets/inference-api/infer-api-reindex-widget.asciidoc index 9a78868e44da1..58dac586ba234 100644 --- a/docs/reference/tab-widgets/inference-api/infer-api-reindex-widget.asciidoc +++ b/docs/reference/tab-widgets/inference-api/infer-api-reindex-widget.asciidoc @@ -19,6 +19,12 @@ id="infer-api-reindex-openai"> OpenAI +
+
+ -++++ \ No newline at end of file +++++ diff --git a/docs/reference/tab-widgets/inference-api/infer-api-reindex.asciidoc b/docs/reference/tab-widgets/inference-api/infer-api-reindex.asciidoc index 118f7f0460924..e97a7187415f1 100644 --- a/docs/reference/tab-widgets/inference-api/infer-api-reindex.asciidoc +++ b/docs/reference/tab-widgets/inference-api/infer-api-reindex.asciidoc @@ -75,4 +75,32 @@ https://platform.openai.com/account/limits[rate limit of your OpenAI account] may affect the throughput of the reindexing process. If this happens, change `size` to `3` or a similar value in magnitude. -// end::openai[] \ No newline at end of file +// end::openai[] + +// tag::azure-openai[] + +[source,console] +---- +POST _reindex?wait_for_completion=false +{ + "source": { + "index": "test-data", + "size": 50 <1> + }, + "dest": { + "index": "azure-openai-embeddings", + "pipeline": "azure_openai_embeddings" + } +} +---- +// TEST[skip:TBD] +<1> The default batch size for reindexing is 1000. Reducing `size` to a smaller +number makes the update of the reindexing process quicker which enables you to +follow the progress closely and detect errors early. + +NOTE: The +https://learn.microsoft.com/en-us/azure/ai-services/openai/quotas-limits#quotas-and-limits-reference[rate limit of your Azure OpenAI account] +may affect the throughput of the reindexing process. If this happens, change +`size` to `3` or a similar value in magnitude. + +// end::azure-openai[] diff --git a/docs/reference/tab-widgets/inference-api/infer-api-requirements-widget.asciidoc b/docs/reference/tab-widgets/inference-api/infer-api-requirements-widget.asciidoc index c55056cd1a3d2..781ddb43cb352 100644 --- a/docs/reference/tab-widgets/inference-api/infer-api-requirements-widget.asciidoc +++ b/docs/reference/tab-widgets/inference-api/infer-api-requirements-widget.asciidoc @@ -19,6 +19,12 @@ id="infer-api-requirements-openai"> OpenAI +
+
-++++ \ No newline at end of file +++++ diff --git a/docs/reference/tab-widgets/inference-api/infer-api-requirements.asciidoc b/docs/reference/tab-widgets/inference-api/infer-api-requirements.asciidoc index 21a9d2111ef74..e67a905e1e97d 100644 --- a/docs/reference/tab-widgets/inference-api/infer-api-requirements.asciidoc +++ b/docs/reference/tab-widgets/inference-api/infer-api-requirements.asciidoc @@ -17,4 +17,12 @@ API with the HuggingFace service. An https://openai.com/[OpenAI account] is required to use the {infer} API with the OpenAI service. -// end::openai[] \ No newline at end of file +// end::openai[] + +// tag::azure-openai[] +* An https://azure.microsoft.com/free/cognitive-services?azure-portal=true[Azure subscription] +* Access granted to Azure OpenAI in the desired Azure subscription. +You can apply for access to Azure OpenAI by completing the form at https://aka.ms/oai/access. +* An embedding model deployed in https://oai.azure.com/[Azure OpenAI Studio]. + +// end::azure-openai[] diff --git a/docs/reference/tab-widgets/inference-api/infer-api-search-widget.asciidoc b/docs/reference/tab-widgets/inference-api/infer-api-search-widget.asciidoc index e945146e22ca4..d3b7ba96bb199 100644 --- a/docs/reference/tab-widgets/inference-api/infer-api-search-widget.asciidoc +++ b/docs/reference/tab-widgets/inference-api/infer-api-search-widget.asciidoc @@ -19,6 +19,12 @@ id="infer-api-search-openai"> OpenAI +
+
-++++ \ No newline at end of file +++++ diff --git a/docs/reference/tab-widgets/inference-api/infer-api-search.asciidoc b/docs/reference/tab-widgets/inference-api/infer-api-search.asciidoc index 1aa3b6f2f2ae8..04515d0040eaf 100644 --- a/docs/reference/tab-widgets/inference-api/infer-api-search.asciidoc +++ b/docs/reference/tab-widgets/inference-api/infer-api-search.asciidoc @@ -209,4 +209,69 @@ query from the `openai-embeddings` index sorted by their proximity to the query: -------------------------------------------------- // NOTCONSOLE -// end::openai[] \ No newline at end of file +// end::openai[] + +// tag::azure-openai[] + +[source,console] +-------------------------------------------------- +GET azure-openai-embeddings/_search +{ + "knn": { + "field": "content_embedding", + "query_vector_builder": { + "text_embedding": { + "model_id": "azure_openai_embeddings", + "model_text": "Calculate fuel cost" + } + }, + "k": 10, + "num_candidates": 100 + }, + "_source": [ + "id", + "content" + ] +} +-------------------------------------------------- +// TEST[skip:TBD] + +As a result, you receive the top 10 documents that are closest in meaning to the +query from the `azure-openai-embeddings` index sorted by their proximity to the query: + +[source,console-result] +-------------------------------------------------- +"hits": [ + { + "_index": "azure-openai-embeddings", + "_id": "DDd5OowBHxQKHyc3TDSC", + "_score": 0.83704096, + "_source": { + "id": 862114, + "body": "How to calculate fuel cost for a road trip. By Tara Baukus Mello • Bankrate.com. 
Dear Driving for Dollars, My family is considering taking a long road trip to finish off the end of the summer, but I'm a little worried about gas prices and our overall fuel cost.It doesn't seem easy to calculate since we'll be traveling through many states and we are considering several routes.y family is considering taking a long road trip to finish off the end of the summer, but I'm a little worried about gas prices and our overall fuel cost. It doesn't seem easy to calculate since we'll be traveling through many states and we are considering several routes." + } + }, + { + "_index": "azure-openai-embeddings", + "_id": "ajd5OowBHxQKHyc3TDSC", + "_score": 0.8345704, + "_source": { + "id": 820622, + "body": "Home Heating Calculator. Typically, approximately 50% of the energy consumed in a home annually is for space heating. When deciding on a heating system, many factors will come into play: cost of fuel, installation cost, convenience and life style are all important.This calculator can help you estimate the cost of fuel for different heating appliances.hen deciding on a heating system, many factors will come into play: cost of fuel, installation cost, convenience and life style are all important. This calculator can help you estimate the cost of fuel for different heating appliances." + } + }, + { + "_index": "azure-openai-embeddings", + "_id": "Djd5OowBHxQKHyc3TDSC", + "_score": 0.8327426, + "_source": { + "id": 8202683, + "body": "Fuel is another important cost. This cost will depend on your boat, how far you travel, and how fast you travel. A 33-foot sailboat traveling at 7 knots should be able to travel 300 miles on 50 gallons of diesel fuel.If you are paying $4 per gallon, the trip would cost you $200.Most boats have much larger gas tanks than cars.uel is another important cost. This cost will depend on your boat, how far you travel, and how fast you travel. 
A 33-foot sailboat traveling at 7 knots should be able to travel 300 miles on 50 gallons of diesel fuel." + } + }, + (...) + ] +-------------------------------------------------- +// NOTCONSOLE + +// end::azure-openai[] diff --git a/docs/reference/tab-widgets/inference-api/infer-api-task-widget.asciidoc b/docs/reference/tab-widgets/inference-api/infer-api-task-widget.asciidoc index ebc8d093d01a0..aac26913f955e 100644 --- a/docs/reference/tab-widgets/inference-api/infer-api-task-widget.asciidoc +++ b/docs/reference/tab-widgets/inference-api/infer-api-task-widget.asciidoc @@ -19,6 +19,12 @@ id="infer-api-task-openai"> OpenAI +
+
-++++ \ No newline at end of file +++++ diff --git a/docs/reference/tab-widgets/inference-api/infer-api-task.asciidoc b/docs/reference/tab-widgets/inference-api/infer-api-task.asciidoc index efbf1f8f25f56..07d5177b60344 100644 --- a/docs/reference/tab-widgets/inference-api/infer-api-task.asciidoc +++ b/docs/reference/tab-widgets/inference-api/infer-api-task.asciidoc @@ -13,7 +13,7 @@ PUT _inference/text_embedding/cohere_embeddings <1> } ------------------------------------------------------------ // TEST[skip:TBD] -<1> The task type is `text_embedding` in the path and the `inference_id` which +<1> The task type is `text_embedding` in the path and the `inference_id` which is the unique identifier of the {infer} endpoint is `cohere_embeddings`. <2> The API key of your Cohere account. You can find your API keys in your Cohere dashboard under the @@ -54,7 +54,7 @@ PUT _inference/text_embedding/hugging_face_embeddings <1> } ------------------------------------------------------------ // TEST[skip:TBD] -<1> The task type is `text_embedding` in the path and the `inference_id` which +<1> The task type is `text_embedding` in the path and the `inference_id` which is the unique identifier of the {infer} endpoint is `hugging_face_embeddings`. <2> A valid HuggingFace access token. You can find on the https://huggingface.co/settings/tokens[settings page of your account]. @@ -77,7 +77,7 @@ PUT _inference/text_embedding/openai_embeddings <1> } ------------------------------------------------------------ // TEST[skip:TBD] -<1> The task type is `text_embedding` in the path and the `inference_id` which +<1> The task type is `text_embedding` in the path and the `inference_id` which is the unique identifier of the {infer} endpoint is `openai_embeddings`. <2> The API key of your OpenAI account. 
You can find your OpenAI API keys in your OpenAI account under the @@ -93,4 +93,33 @@ NOTE: When using this model the recommended similarity measure to use in the embeddings are normalized to unit length in which case the `dot_product` and the `cosine` measures are equivalent. -// end::openai[] \ No newline at end of file +// end::openai[] + +// tag::azure-openai[] + +[source,console] +------------------------------------------------------------ +PUT _inference/text_embedding/azure_openai_embeddings <1> +{ + "service": "azureopenai", + "service_settings": { + "api_key": "", <2> + "resource_name": "", <3> + "deployment_id": "", <4> + "api_version": "2024-02-01" + } +} +------------------------------------------------------------ +// TEST[skip:TBD] +<1> The task type is `text_embedding` in the path and the `inference_id` which is the unique identifier of the {infer} endpoint is `azure_openai_embeddings`. +<2> The API key for accessing your Azure OpenAI services. +Alternatively, you can provide an `entra_id` instead of an `api_key` here. +The <> does not return this information. +<3> The name of your Azure resource. +<4> The ID of your deployed model. + +NOTE: When using this model, the recommended similarity measure to use in the +`dense_vector` field mapping is `dot_product`. +In the case of Azure OpenAI models, the embeddings are normalized to unit length in which case the `dot_product` and the `cosine` measures are equivalent. + +// end::azure-openai[]
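The `service_settings` documented in this patch require exactly one of `api_key` or `entra_id`; the request bodies shown in the Console examples can also be built programmatically. The following is a minimal, hypothetical Python sketch (not part of the patch) that constructs the `service_settings` object for the `azureopenai` service and enforces that validation rule client-side; the helper name and example values are invented for illustration, while the field names match the documentation above.

```python
def build_azure_openai_service_settings(resource_name, deployment_id,
                                        api_version, api_key=None,
                                        entra_id=None):
    """Build `service_settings` for the `azureopenai` inference service.

    Mirrors the documented rule: exactly one of `api_key` or `entra_id`
    must be supplied, otherwise endpoint creation fails.
    """
    if (api_key is None) == (entra_id is None):
        raise ValueError("Provide exactly one of `api_key` or `entra_id`.")
    settings = {
        "resource_name": resource_name,
        "deployment_id": deployment_id,
        "api_version": api_version,
    }
    if api_key is not None:
        settings["api_key"] = api_key
    else:
        settings["entra_id"] = entra_id
    return settings


# Body for: PUT _inference/text_embedding/azure_openai_embeddings
# (hypothetical resource and deployment names)
body = {
    "service": "azureopenai",
    "service_settings": build_azure_openai_service_settings(
        resource_name="my-azure-resource",
        deployment_id="my-embedding-deployment",
        api_version="2024-02-01",
        api_key="my-api-key",
    ),
}
```

Sending this dict as the JSON body of the `PUT _inference/text_embedding/...` request would then create the endpoint, exactly as in the Console example above.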