From c530ddc0296a6394e686674298877d005d634486 Mon Sep 17 00:00:00 2001 From: Joan Martinez Date: Mon, 16 Dec 2024 17:37:26 +0100 Subject: [PATCH 1/7] add documentation for jinaai service --- .../inference/put-inference.asciidoc | 3 +- .../inference/service-jinaai.asciidoc | 253 ++++++++++++++++++ 2 files changed, 255 insertions(+), 1 deletion(-) create mode 100644 docs/reference/inference/service-jinaai.asciidoc diff --git a/docs/reference/inference/put-inference.asciidoc b/docs/reference/inference/put-inference.asciidoc index 4f82889f562d8..d6abe90e48956 100644 --- a/docs/reference/inference/put-inference.asciidoc +++ b/docs/reference/inference/put-inference.asciidoc @@ -72,6 +72,7 @@ Click the links to review the configuration details of the services: * <> (`text_embedding`) * <> (`completion`, `text_embedding`) * <> (`text_embedding`) +* <> (`text_embedding`, `rerank`) The {es} and ELSER services run on a {ml} node in your {es} cluster. The rest of the services connect to external providers. @@ -87,4 +88,4 @@ When adaptive allocations are enabled: - The number of allocations scales up automatically when the load increases. - Allocations scale down to a minimum of 0 when the load decreases, saving resources. -For more information about adaptive allocations and resources, refer to the {ml-docs}/ml-nlp-auto-scale.html[trained model autoscaling] documentation. \ No newline at end of file +For more information about adaptive allocations and resources, refer to the {ml-docs}/ml-nlp-auto-scale.html[trained model autoscaling] documentation. diff --git a/docs/reference/inference/service-jinaai.asciidoc b/docs/reference/inference/service-jinaai.asciidoc new file mode 100644 index 0000000000000..d569e589a73e1 --- /dev/null +++ b/docs/reference/inference/service-jinaai.asciidoc @@ -0,0 +1,253 @@ +[[infer-service-jinaai]] +=== JinaAI {infer} service + +Creates an {infer} endpoint to perform an {infer} task with the `jinaai` service. + + +[discrete] +[[infer-service-jinaai-api-request]] +==== {api-request-title} + +`PUT /_inference//` + +[discrete] +[[infer-service-jinaai-api-path-params]] +==== {api-path-parms-title} + +``:: +(Required, string) +include::inference-shared.asciidoc[tag=inference-id] + +``:: +(Required, string) +include::inference-shared.asciidoc[tag=task-type] ++ +-- +Available task types: + +* `text_embedding`, +* `rerank`. +-- + +[discrete] +[[infer-service-jinaai-api-request-body]] +==== {api-request-body-title} + +`chunking_settings`:: +(Optional, object) +include::inference-shared.asciidoc[tag=chunking-settings] + +`max_chunking_size`::: +(Optional, integer) +include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size] + +`overlap`::: +(Optional, integer) +include::inference-shared.asciidoc[tag=chunking-settings-overlap] + +`sentence_overlap`::: +(Optional, integer) +include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap] + +`strategy`::: +(Optional, string) +include::inference-shared.asciidoc[tag=chunking-settings-strategy] + +`service`:: +(Required, string) +The type of service supported for the specified task type. In this case, +`jinaai`. + +`service_settings`:: +(Required, object) +include::inference-shared.asciidoc[tag=service-settings] ++ +-- +These settings are specific to the `jinaai` service. +-- + +`api_key`::: +(Required, string) +A valid API key of your JinaAI account. +You can find in: +https://jina.ai/embeddings/. 
++ +-- +include::inference-shared.asciidoc[tag=api-key-admonition] +-- + +`rate_limit`::: +(Optional, object) +By default, the `jinaai` service sets the number of requests allowed per minute to `2000`. +This value is the same for all task types. +To modify this, set the `requests_per_minute` setting of this object in your service settings: ++ +-- +include::inference-shared.asciidoc[tag=request-per-minute-example] + +More information about JinaAI's rate limits can be found in https://jina.ai/contact-sales/#rate-limit. +-- ++ +.`service_settings` for the `rerank` task type +[%collapsible%closed] +===== +`model_id`:: +(Optional, string) +The name of the model to use for the {infer} task. +To review the available `rerank` models, refer to the +https://jina.ai/reranker. +===== ++ +.`service_settings` for the `text_embedding` task type +[%collapsible%closed] +===== +`model_id`::: +(Optional, string) +The name of the model to use for the {infer} task. +To review the available `text_embedding` models, refer to the +https://jina.ai/embeddings/. + +`similarity`::: +(Optional, string) +Similarity measure. One of `cosine`, `dot_product`, `l2_norm`. +Defaults based on the `embedding_type` (`float` -> `dot_product`, `int8/byte` -> `cosine`). +===== + + + +`task_settings`:: +(Optional, object) +include::inference-shared.asciidoc[tag=task-settings] ++ +.`task_settings` for the `rerank` task type +[%collapsible%closed] +===== +`return_documents`:: +(Optional, boolean) +Specify whether to return doc text within the results. + +`top_n`:: +(Optional, integer) +The number of most relevant documents to return, defaults to the number of the documents. +If this {infer} endpoint is used in a `text_similarity_reranker` retriever query and `top_n` is set, it must be greater than or equal to `rank_window_size` in the query. +===== ++ +.`task_settings` for the `text_embedding` task type +[%collapsible%closed] +===== +`task`::: +(Optional, string) +Specifies the task passed to the model. +Valid values are: +* `classification`: use it for embeddings passed through a text classifier. +* `clustering`: use it for the embeddings run through a clustering algorithm. +* `ingest`: use it for storing document embeddings in a vector database. +* `search`: use it for storing embeddings of search queries run against a vector database to find relevant documents. +===== + + +[discrete] +[[inference-example-jinaai]] +==== JinaAI service examples + +The following example shows how to create {infer} endpoints to get `text_embeddings` and `rerank` and to use them in a search application. + +First, we create the `embeddings` service: + +[source,console] +------------------------------------------------------------ +PUT _inference/text_embedding/jinaai-embeddings +{ + "service": "jinaai", + "service_settings": { + "model_id": "jina-embeddings-v3", + "api_key": "", + }, + "task_settings": {} +} +------------------------------------------------------------ + +Then, we create the `rerank` service: +[source,console] +------------------------------------------------------------ +PUT _inference/rerank/jinaai-rerank +{ + "service": "jinaai", + "service_settings": { + "api_key": "", + "model_id": "jina-reranker-v2-base-multilingual" + }, + "task_settings": { + "top_n": 10, + "return_documents": true + } +} +------------------------------------------------------------ + +Now we can create an index that will use `jinaai-embeddings` service to index the documents. 
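+
+Optionally, before creating the index, we can call both endpoints directly to confirm that the API key and the model IDs are accepted.
+The requests below are only a quick sanity check with arbitrary input text; the embedding values and relevance scores in the responses depend on the model:
+
+[source,console]
+------------------------------------------------------------
+POST _inference/text_embedding/jinaai-embeddings
+{
+  "input": "Coral reefs are among the most diverse ecosystems on Earth."
+}
+------------------------------------------------------------
+// TEST[skip:uses ML]
+
+[source,console]
+------------------------------------------------------------
+POST _inference/rerank/jinaai-rerank
+{
+  "query": "marine conservation",
+  "input": [
+    "Coral reefs are among the most diverse ecosystems on Earth.",
+    "The quarterly budget review is scheduled for Thursday."
+  ]
+}
+------------------------------------------------------------
+// TEST[skip:uses ML]
+
+With both endpoints responding, the index definition follows.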
+ +[source,console] +------------------------------------------------------------ +PUT jinaai-index +{ + "mappings": { + "properties": { + "content": { + "type": "semantic_text", + "inference_id": "jinaai-embeddings" + } + } + } +} +------------------------------------------------------------ + +[source,console] +------------------------------------------------------------ +PUT jinaai-index/_bulk +{ "index" : { "_index" : "jinaai-index", "_id" : "1" } } +{"content": "Sarah Johnson is a talented marine biologist working at the Oceanographic Institute. Her groundbreaking research on coral reef ecosystems has garnered international attention and numerous accolades."} +{ "index" : { "_index" : "jinaai-index", "_id" : "2" } } +{"content": "She spends months at a time diving in remote locations, meticulously documenting the intricate relationships between various marine species. "} +{ "index" : { "_index" : "jinaai-index", "_id" : "3" } } +{"content": "Her dedication to preserving these delicate underwater environments has inspired a new generation of conservationists."} +------------------------------------------------------------ + +Now, with the index created, we can search with and without the reranker service. + +[source,console] +------------------------------------------------------------ +GET jinaai-index/_search +{ + "query": { + "semantic": { + "field": "content", + "query": "who inspired taking care of the sea?" + } + } +} +------------------------------------------------------------ + +[source,console] +------------------------------------------------------------ +POST jinaai-index/_search +{ + "retriever": { + "text_similarity_reranker": { + "retriever": { + "standard": { + "query": { + "semantic": { + "field": "content", + "query": "who inspired taking care of the sea?" + } + } + } + }, + "field": "content", + "rank_window_size": 100, + "inference_id": "jinaai-rerank", + "inference_text": "who inspired taking care of the sea?" 
+ } + } +} +------------------------------------------------------------ From a1fa8d7ae0328a762d3be688186641a4b929652d Mon Sep 17 00:00:00 2001 From: Joan Martinez Date: Tue, 17 Dec 2024 17:39:55 +0100 Subject: [PATCH 2/7] add to reference --- docs/reference/inference/inference-apis.asciidoc | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/reference/inference/inference-apis.asciidoc b/docs/reference/inference/inference-apis.asciidoc index 8d5ee1b7d6ba5..97dac0f844487 100644 --- a/docs/reference/inference/inference-apis.asciidoc +++ b/docs/reference/inference/inference-apis.asciidoc @@ -143,6 +143,7 @@ include::service-elser.asciidoc[] include::service-google-ai-studio.asciidoc[] include::service-google-vertex-ai.asciidoc[] include::service-hugging-face.asciidoc[] +include::service-jinaai.asciidoc[] include::service-mistral.asciidoc[] include::service-openai.asciidoc[] include::service-watsonx-ai.asciidoc[] From 07cbaf131accd32ec6b7041f5399e81d361d8720 Mon Sep 17 00:00:00 2001 From: Joan Fontanals Date: Tue, 17 Dec 2024 18:30:15 +0100 Subject: [PATCH 3/7] Update docs/reference/inference/service-jinaai.asciidoc Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com> --- docs/reference/inference/service-jinaai.asciidoc | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/reference/inference/service-jinaai.asciidoc b/docs/reference/inference/service-jinaai.asciidoc index d569e589a73e1..6891bd21ef79a 100644 --- a/docs/reference/inference/service-jinaai.asciidoc +++ b/docs/reference/inference/service-jinaai.asciidoc @@ -166,6 +166,7 @@ PUT _inference/text_embedding/jinaai-embeddings "task_settings": {} } ------------------------------------------------------------ +// TEST[skip:uses ML] Then, we create the `rerank` service: [source,console] From 35fb01ec308ec024b2f82c6088c0b0cc566e28ab Mon Sep 17 00:00:00 2001 From: Joan Martinez Date: Tue, 17 Dec 2024 19:11:08 +0100 Subject: [PATCH 4/7] skip code snippets --- docs/reference/inference/service-jinaai.asciidoc | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/docs/reference/inference/service-jinaai.asciidoc b/docs/reference/inference/service-jinaai.asciidoc index 6891bd21ef79a..3f6e78204338a 100644 --- a/docs/reference/inference/service-jinaai.asciidoc +++ b/docs/reference/inference/service-jinaai.asciidoc @@ -184,6 +184,7 @@ PUT _inference/rerank/jinaai-rerank } } ------------------------------------------------------------ +// TEST[skip:uses ML] Now we can create an index that will use `jinaai-embeddings` service to index the documents. @@ -201,6 +202,7 @@ PUT jinaai-index } } ------------------------------------------------------------ +// TEST[skip:uses ML] [source,console] ------------------------------------------------------------ @@ -212,6 +214,7 @@ PUT jinaai-index/_bulk { "index" : { "_index" : "jinaai-index", "_id" : "3" } } {"content": "Her dedication to preserving these delicate underwater environments has inspired a new generation of conservationists."} ------------------------------------------------------------ +// TEST[skip:uses ML] Now, with the index created, we can search with and without the reranker service. 
@@ -227,6 +230,7 @@ GET jinaai-index/_search } } ------------------------------------------------------------ +// TEST[skip:uses ML] [source,console] ------------------------------------------------------------ @@ -252,3 +256,4 @@ POST jinaai-index/_search } } ------------------------------------------------------------ +// TEST[skip:uses ML] \ No newline at end of file From 6f81a8911b816ab27ccd98aa12ef6f3495cca34c Mon Sep 17 00:00:00 2001 From: Joan Fontanals Date: Wed, 18 Dec 2024 11:17:28 +0100 Subject: [PATCH 5/7] update docs/reference/inference/service-jinaai.asciidoc Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com> --- docs/reference/inference/service-jinaai.asciidoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/reference/inference/service-jinaai.asciidoc b/docs/reference/inference/service-jinaai.asciidoc index 3f6e78204338a..798d1cf127dc5 100644 --- a/docs/reference/inference/service-jinaai.asciidoc +++ b/docs/reference/inference/service-jinaai.asciidoc @@ -161,7 +161,7 @@ PUT _inference/text_embedding/jinaai-embeddings "service": "jinaai", "service_settings": { "model_id": "jina-embeddings-v3", - "api_key": "", + "api_key": "" }, "task_settings": {} } From 531b08a2f6c0f3c863d6c8ef6bd9e1608801e967 Mon Sep 17 00:00:00 2001 From: Joan Fontanals Date: Wed, 18 Dec 2024 11:47:48 +0100 Subject: [PATCH 6/7] apply suggestions from code review Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com> --- docs/reference/inference/service-jinaai.asciidoc | 15 ++++++--------- 1 file changed, 6 insertions(+), 9 deletions(-) diff --git a/docs/reference/inference/service-jinaai.asciidoc b/docs/reference/inference/service-jinaai.asciidoc index 798d1cf127dc5..f0eadc2767c37 100644 --- a/docs/reference/inference/service-jinaai.asciidoc +++ b/docs/reference/inference/service-jinaai.asciidoc @@ -68,9 +68,8 @@ These settings are specific to the `jinaai` service. `api_key`::: (Required, string) -A valid API key of your JinaAI account. -You can find in: -https://jina.ai/embeddings/. +A valid API key for your JinaAI account. +You can find it at https://jina.ai/embeddings/. + -- include::inference-shared.asciidoc[tag=api-key-admonition] @@ -78,9 +77,8 @@ include::inference-shared.asciidoc[tag=api-key-admonition] `rate_limit`::: (Optional, object) -By default, the `jinaai` service sets the number of requests allowed per minute to `2000`. -This value is the same for all task types. -To modify this, set the `requests_per_minute` setting of this object in your service settings: +The default rate limit for the `jinaai` service is 2000 requests per minute for all task types. +You can modify this using the `requests_per_minute` setting in your service settings: + -- include::inference-shared.asciidoc[tag=request-per-minute-example] @@ -94,8 +92,7 @@ More information about JinaAI's rate limits can be found in https://jina.ai/cont `model_id`:: (Optional, string) The name of the model to use for the {infer} task. -To review the available `rerank` models, refer to the -https://jina.ai/reranker. +To review the available `rerank` compatible models, refer to https://jina.ai/reranker. ===== + .`service_settings` for the `text_embedding` task type @@ -150,7 +147,7 @@ Valid values are: [[inference-example-jinaai]] ==== JinaAI service examples -The following example shows how to create {infer} endpoints to get `text_embeddings` and `rerank` and to use them in a search application. 
+The following examples demonstrate how to create {infer} endpoints for `text_embeddings` and `rerank` tasks using the JinaAI service and use them in search requests. First, we create the `embeddings` service: From fbf0941318a7bcae4d958cbfe2e31d05c4715c62 Mon Sep 17 00:00:00 2001 From: Joan Martinez Date: Wed, 18 Dec 2024 11:51:58 +0100 Subject: [PATCH 7/7] apply more suggestions --- docs/reference/inference/service-jinaai.asciidoc | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/docs/reference/inference/service-jinaai.asciidoc b/docs/reference/inference/service-jinaai.asciidoc index f0eadc2767c37..7c5aebe5bcf8e 100644 --- a/docs/reference/inference/service-jinaai.asciidoc +++ b/docs/reference/inference/service-jinaai.asciidoc @@ -90,7 +90,7 @@ More information about JinaAI's rate limits can be found in https://jina.ai/cont [%collapsible%closed] ===== `model_id`:: -(Optional, string) +(Required, string) The name of the model to use for the {infer} task. To review the available `rerank` compatible models, refer to https://jina.ai/reranker. ===== @@ -159,8 +159,7 @@ PUT _inference/text_embedding/jinaai-embeddings "service_settings": { "model_id": "jina-embeddings-v3", "api_key": "" - }, - "task_settings": {} + } } ------------------------------------------------------------ // TEST[skip:uses ML] @@ -172,7 +171,7 @@ PUT _inference/rerank/jinaai-rerank { "service": "jinaai", "service_settings": { - "api_key": "", + "api_key": "", "model_id": "jina-reranker-v2-base-multilingual" }, "task_settings": {
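
The examples above leave the `task` setting under `task_settings` for the `text_embedding` task type unset, so a single endpoint serves both indexing and search through `semantic_text`.
As an illustrative sketch only (the `jinaai-embeddings-ingest` inference ID is made up for this example), an endpoint used purely for embedding documents that you store and query yourself could set `task` to `ingest`:

[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/jinaai-embeddings-ingest
{
  "service": "jinaai",
  "service_settings": {
    "model_id": "jina-embeddings-v3",
    "api_key": "<api_key>"
  },
  "task_settings": {
    "task": "ingest"
  }
}
------------------------------------------------------------
// TEST[skip:uses ML]

A query-side endpoint would use `search` instead, in line with the task descriptions listed under `task_settings` above.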