From c530ddc0296a6394e686674298877d005d634486 Mon Sep 17 00:00:00 2001 From: Joan Martinez Date: Mon, 16 Dec 2024 17:37:26 +0100 Subject: [PATCH 1/7] add documentation for jinaai service --- .../inference/put-inference.asciidoc | 3 +- .../inference/service-jinaai.asciidoc | 253 ++++++++++++++++++ 2 files changed, 255 insertions(+), 1 deletion(-) create mode 100644 docs/reference/inference/service-jinaai.asciidoc diff --git a/docs/reference/inference/put-inference.asciidoc b/docs/reference/inference/put-inference.asciidoc index 4f82889f562d8..d6abe90e48956 100644 --- a/docs/reference/inference/put-inference.asciidoc +++ b/docs/reference/inference/put-inference.asciidoc @@ -72,6 +72,7 @@ Click the links to review the configuration details of the services: * <> (`text_embedding`) * <> (`completion`, `text_embedding`) * <> (`text_embedding`) +* <> (`text_embedding`, `rerank`) The {es} and ELSER services run on a {ml} node in your {es} cluster. The rest of the services connect to external providers. @@ -87,4 +88,4 @@ When adaptive allocations are enabled: - The number of allocations scales up automatically when the load increases. - Allocations scale down to a minimum of 0 when the load decreases, saving resources. -For more information about adaptive allocations and resources, refer to the {ml-docs}/ml-nlp-auto-scale.html[trained model autoscaling] documentation. \ No newline at end of file +For more information about adaptive allocations and resources, refer to the {ml-docs}/ml-nlp-auto-scale.html[trained model autoscaling] documentation. diff --git a/docs/reference/inference/service-jinaai.asciidoc b/docs/reference/inference/service-jinaai.asciidoc new file mode 100644 index 0000000000000..d569e589a73e1 --- /dev/null +++ b/docs/reference/inference/service-jinaai.asciidoc @@ -0,0 +1,253 @@ +[[infer-service-jinaai]] +=== JinaAI {infer} service + +Creates an {infer} endpoint to perform an {infer} task with the `jinaai` service. + + +[discrete] +[[infer-service-jinaai-api-request]] +==== {api-request-title} + +`PUT /_inference//` + +[discrete] +[[infer-service-jinaai-api-path-params]] +==== {api-path-parms-title} + +``:: +(Required, string) +include::inference-shared.asciidoc[tag=inference-id] + +``:: +(Required, string) +include::inference-shared.asciidoc[tag=task-type] ++ +-- +Available task types: + +* `text_embedding`, +* `rerank`. +-- + +[discrete] +[[infer-service-jinaai-api-request-body]] +==== {api-request-body-title} + +`chunking_settings`:: +(Optional, object) +include::inference-shared.asciidoc[tag=chunking-settings] + +`max_chunking_size`::: +(Optional, integer) +include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size] + +`overlap`::: +(Optional, integer) +include::inference-shared.asciidoc[tag=chunking-settings-overlap] + +`sentence_overlap`::: +(Optional, integer) +include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap] + +`strategy`::: +(Optional, string) +include::inference-shared.asciidoc[tag=chunking-settings-strategy] + +`service`:: +(Required, string) +The type of service supported for the specified task type. In this case, +`jinaai`. + +`service_settings`:: +(Required, object) +include::inference-shared.asciidoc[tag=service-settings] ++ +-- +These settings are specific to the `jinaai` service. +-- + +`api_key`::: +(Required, string) +A valid API key of your JinaAI account. +You can find in: +https://jina.ai/embeddings/. 
++ +-- +include::inference-shared.asciidoc[tag=api-key-admonition] +-- + +`rate_limit`::: +(Optional, object) +By default, the `jinaai` service sets the number of requests allowed per minute to `2000`. +This value is the same for all task types. +To modify this, set the `requests_per_minute` setting of this object in your service settings: ++ +-- +include::inference-shared.asciidoc[tag=request-per-minute-example] + +More information about JinaAI's rate limits can be found in https://jina.ai/contact-sales/#rate-limit. +-- ++ +.`service_settings` for the `rerank` task type +[%collapsible%closed] +===== +`model_id`:: +(Optional, string) +The name of the model to use for the {infer} task. +To review the available `rerank` models, refer to the +https://jina.ai/reranker. +===== ++ +.`service_settings` for the `text_embedding` task type +[%collapsible%closed] +===== +`model_id`::: +(Optional, string) +The name of the model to use for the {infer} task. +To review the available `text_embedding` models, refer to the +https://jina.ai/embeddings/. + +`similarity`::: +(Optional, string) +Similarity measure. One of `cosine`, `dot_product`, `l2_norm`. +Defaults based on the `embedding_type` (`float` -> `dot_product`, `int8/byte` -> `cosine`). +===== + + + +`task_settings`:: +(Optional, object) +include::inference-shared.asciidoc[tag=task-settings] ++ +.`task_settings` for the `rerank` task type +[%collapsible%closed] +===== +`return_documents`:: +(Optional, boolean) +Specify whether to return doc text within the results. + +`top_n`:: +(Optional, integer) +The number of most relevant documents to return, defaults to the number of the documents. +If this {infer} endpoint is used in a `text_similarity_reranker` retriever query and `top_n` is set, it must be greater than or equal to `rank_window_size` in the query. +===== ++ +.`task_settings` for the `text_embedding` task type +[%collapsible%closed] +===== +`task`::: +(Optional, string) +Specifies the task passed to the model. +Valid values are: +* `classification`: use it for embeddings passed through a text classifier. +* `clustering`: use it for the embeddings run through a clustering algorithm. +* `ingest`: use it for storing document embeddings in a vector database. +* `search`: use it for storing embeddings of search queries run against a vector database to find relevant documents. +===== + + +[discrete] +[[inference-example-jinaai]] +==== JinaAI service examples + +The following example shows how to create {infer} endpoints to get `text_embeddings` and `rerank` and to use them in a search application. + +First, we create the `embeddings` service: + +[source,console] +------------------------------------------------------------ +PUT _inference/text_embedding/jinaai-embeddings +{ + "service": "jinaai", + "service_settings": { + "model_id": "jina-embeddings-v3", + "api_key": "", + }, + "task_settings": {} +} +------------------------------------------------------------ + +Then, we create the `rerank` service: +[source,console] +------------------------------------------------------------ +PUT _inference/rerank/jinaai-rerank +{ + "service": "jinaai", + "service_settings": { + "api_key": "", + "model_id": "jina-reranker-v2-base-multilingual" + }, + "task_settings": { + "top_n": 10, + "return_documents": true + } +} +------------------------------------------------------------ + +Now we can create an index that will use `jinaai-embeddings` service to index the documents. 
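+
+Optionally, before creating the index, we can call both endpoints directly to confirm that the API key and the model IDs are accepted.
+The requests below are only a quick sanity check with arbitrary input text; the embedding values and relevance scores in the responses depend on the model:
+
+[source,console]
+------------------------------------------------------------
+POST _inference/text_embedding/jinaai-embeddings
+{
+  "input": "Coral reefs are among the most diverse ecosystems on Earth."
+}
+------------------------------------------------------------
+// TEST[skip:uses ML]
+
+[source,console]
+------------------------------------------------------------
+POST _inference/rerank/jinaai-rerank
+{
+  "query": "marine conservation",
+  "input": [
+    "Coral reefs are among the most diverse ecosystems on Earth.",
+    "The quarterly budget review is scheduled for Thursday."
+  ]
+}
+------------------------------------------------------------
+// TEST[skip:uses ML]
+
+With both endpoints responding, the index definition follows.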
+ +[source,console] +------------------------------------------------------------ +PUT jinaai-index +{ + "mappings": { + "properties": { + "content": { + "type": "semantic_text", + "inference_id": "jinaai-embeddings" + } + } + } +} +------------------------------------------------------------ + +[source,console] +------------------------------------------------------------ +PUT jinaai-index/_bulk +{ "index" : { "_index" : "jinaai-index", "_id" : "1" } } +{"content": "Sarah Johnson is a talented marine biologist working at the Oceanographic Institute. Her groundbreaking research on coral reef ecosystems has garnered international attention and numerous accolades."} +{ "index" : { "_index" : "jinaai-index", "_id" : "2" } } +{"content": "She spends months at a time diving in remote locations, meticulously documenting the intricate relationships between various marine species. "} +{ "index" : { "_index" : "jinaai-index", "_id" : "3" } } +{"content": "Her dedication to preserving these delicate underwater environments has inspired a new generation of conservationists."} +------------------------------------------------------------ + +Now, with the index created, we can search with and without the reranker service. + +[source,console] +------------------------------------------------------------ +GET jinaai-index/_search +{ + "query": { + "semantic": { + "field": "content", + "query": "who inspired taking care of the sea?" + } + } +} +------------------------------------------------------------ + +[source,console] +------------------------------------------------------------ +POST jinaai-index/_search +{ + "retriever": { + "text_similarity_reranker": { + "retriever": { + "standard": { + "query": { + "semantic": { + "field": "content", + "query": "who inspired taking care of the sea?" + } + } + } + }, + "field": "content", + "rank_window_size": 100, + "inference_id": "jinaai-rerank", + "inference_text": "who inspired taking care of the sea?" 
+ } + } +} +------------------------------------------------------------ From a1fa8d7ae0328a762d3be688186641a4b929652d Mon Sep 17 00:00:00 2001 From: Joan Martinez Date: Tue, 17 Dec 2024 17:39:55 +0100 Subject: [PATCH 2/7] add to reference --- docs/reference/inference/inference-apis.asciidoc | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/reference/inference/inference-apis.asciidoc b/docs/reference/inference/inference-apis.asciidoc index 8d5ee1b7d6ba5..97dac0f844487 100644 --- a/docs/reference/inference/inference-apis.asciidoc +++ b/docs/reference/inference/inference-apis.asciidoc @@ -143,6 +143,7 @@ include::service-elser.asciidoc[] include::service-google-ai-studio.asciidoc[] include::service-google-vertex-ai.asciidoc[] include::service-hugging-face.asciidoc[] +include::service-jinaai.asciidoc[] include::service-mistral.asciidoc[] include::service-openai.asciidoc[] include::service-watsonx-ai.asciidoc[] From 07cbaf131accd32ec6b7041f5399e81d361d8720 Mon Sep 17 00:00:00 2001 From: Joan Fontanals Date: Tue, 17 Dec 2024 18:30:15 +0100 Subject: [PATCH 3/7] Update docs/reference/inference/service-jinaai.asciidoc Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com> --- docs/reference/inference/service-jinaai.asciidoc | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/reference/inference/service-jinaai.asciidoc b/docs/reference/inference/service-jinaai.asciidoc index d569e589a73e1..6891bd21ef79a 100644 --- a/docs/reference/inference/service-jinaai.asciidoc +++ b/docs/reference/inference/service-jinaai.asciidoc @@ -166,6 +166,7 @@ PUT _inference/text_embedding/jinaai-embeddings "task_settings": {} } ------------------------------------------------------------ +// TEST[skip:uses ML] Then, we create the `rerank` service: [source,console] From 35fb01ec308ec024b2f82c6088c0b0cc566e28ab Mon Sep 17 00:00:00 2001 From: Joan Martinez Date: Tue, 17 Dec 2024 19:11:08 +0100 Subject: [PATCH 4/7] skip code snippets --- docs/reference/inference/service-jinaai.asciidoc | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/docs/reference/inference/service-jinaai.asciidoc b/docs/reference/inference/service-jinaai.asciidoc index 6891bd21ef79a..3f6e78204338a 100644 --- a/docs/reference/inference/service-jinaai.asciidoc +++ b/docs/reference/inference/service-jinaai.asciidoc @@ -184,6 +184,7 @@ PUT _inference/rerank/jinaai-rerank } } ------------------------------------------------------------ +// TEST[skip:uses ML] Now we can create an index that will use `jinaai-embeddings` service to index the documents. @@ -201,6 +202,7 @@ PUT jinaai-index } } ------------------------------------------------------------ +// TEST[skip:uses ML] [source,console] ------------------------------------------------------------ @@ -212,6 +214,7 @@ PUT jinaai-index/_bulk { "index" : { "_index" : "jinaai-index", "_id" : "3" } } {"content": "Her dedication to preserving these delicate underwater environments has inspired a new generation of conservationists."} ------------------------------------------------------------ +// TEST[skip:uses ML] Now, with the index created, we can search with and without the reranker service. 
@@ -227,6 +230,7 @@ GET jinaai-index/_search } } ------------------------------------------------------------ +// TEST[skip:uses ML] [source,console] ------------------------------------------------------------ @@ -252,3 +256,4 @@ POST jinaai-index/_search } } ------------------------------------------------------------ +// TEST[skip:uses ML] \ No newline at end of file From 6f81a8911b816ab27ccd98aa12ef6f3495cca34c Mon Sep 17 00:00:00 2001 From: Joan Fontanals Date: Wed, 18 Dec 2024 11:17:28 +0100 Subject: [PATCH 5/7] update docs/reference/inference/service-jinaai.asciidoc Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com> --- docs/reference/inference/service-jinaai.asciidoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/reference/inference/service-jinaai.asciidoc b/docs/reference/inference/service-jinaai.asciidoc index 3f6e78204338a..798d1cf127dc5 100644 --- a/docs/reference/inference/service-jinaai.asciidoc +++ b/docs/reference/inference/service-jinaai.asciidoc @@ -161,7 +161,7 @@ PUT _inference/text_embedding/jinaai-embeddings "service": "jinaai", "service_settings": { "model_id": "jina-embeddings-v3", - "api_key": "", + "api_key": "" }, "task_settings": {} } From 531b08a2f6c0f3c863d6c8ef6bd9e1608801e967 Mon Sep 17 00:00:00 2001 From: Joan Fontanals Date: Wed, 18 Dec 2024 11:47:48 +0100 Subject: [PATCH 6/7] apply suggestions from code review Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com> --- docs/reference/inference/service-jinaai.asciidoc | 15 ++++++--------- 1 file changed, 6 insertions(+), 9 deletions(-) diff --git a/docs/reference/inference/service-jinaai.asciidoc b/docs/reference/inference/service-jinaai.asciidoc index 798d1cf127dc5..f0eadc2767c37 100644 --- a/docs/reference/inference/service-jinaai.asciidoc +++ b/docs/reference/inference/service-jinaai.asciidoc @@ -68,9 +68,8 @@ These settings are specific to the `jinaai` service. `api_key`::: (Required, string) -A valid API key of your JinaAI account. -You can find in: -https://jina.ai/embeddings/. +A valid API key for your JinaAI account. +You can find it at https://jina.ai/embeddings/. + -- include::inference-shared.asciidoc[tag=api-key-admonition] @@ -78,9 +77,8 @@ include::inference-shared.asciidoc[tag=api-key-admonition] `rate_limit`::: (Optional, object) -By default, the `jinaai` service sets the number of requests allowed per minute to `2000`. -This value is the same for all task types. -To modify this, set the `requests_per_minute` setting of this object in your service settings: +The default rate limit for the `jinaai` service is 2000 requests per minute for all task types. +You can modify this using the `requests_per_minute` setting in your service settings: + -- include::inference-shared.asciidoc[tag=request-per-minute-example] @@ -94,8 +92,7 @@ More information about JinaAI's rate limits can be found in https://jina.ai/cont `model_id`:: (Optional, string) The name of the model to use for the {infer} task. -To review the available `rerank` models, refer to the -https://jina.ai/reranker. +To review the available `rerank` compatible models, refer to https://jina.ai/reranker. ===== + .`service_settings` for the `text_embedding` task type @@ -150,7 +147,7 @@ Valid values are: [[inference-example-jinaai]] ==== JinaAI service examples -The following example shows how to create {infer} endpoints to get `text_embeddings` and `rerank` and to use them in a search application. 
+The following examples demonstrate how to create {infer} endpoints for `text_embeddings` and `rerank` tasks using the JinaAI service and use them in search requests. First, we create the `embeddings` service: From fbf0941318a7bcae4d958cbfe2e31d05c4715c62 Mon Sep 17 00:00:00 2001 From: Joan Martinez Date: Wed, 18 Dec 2024 11:51:58 +0100 Subject: [PATCH 7/7] apply more suggestions --- docs/reference/inference/service-jinaai.asciidoc | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/docs/reference/inference/service-jinaai.asciidoc b/docs/reference/inference/service-jinaai.asciidoc index f0eadc2767c37..7c5aebe5bcf8e 100644 --- a/docs/reference/inference/service-jinaai.asciidoc +++ b/docs/reference/inference/service-jinaai.asciidoc @@ -90,7 +90,7 @@ More information about JinaAI's rate limits can be found in https://jina.ai/cont [%collapsible%closed] ===== `model_id`:: -(Optional, string) +(Required, string) The name of the model to use for the {infer} task. To review the available `rerank` compatible models, refer to https://jina.ai/reranker. ===== @@ -159,8 +159,7 @@ PUT _inference/text_embedding/jinaai-embeddings "service_settings": { "model_id": "jina-embeddings-v3", "api_key": "" - }, - "task_settings": {} + } } ------------------------------------------------------------ // TEST[skip:uses ML] @@ -172,7 +171,7 @@ PUT _inference/rerank/jinaai-rerank { "service": "jinaai", "service_settings": { - "api_key": "", + "api_key": "", "model_id": "jina-reranker-v2-base-multilingual" }, "task_settings": {
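
The examples above leave the `task` setting under `task_settings` for the `text_embedding` task type unset, so a single endpoint serves both indexing and search through `semantic_text`.
As an illustrative sketch only (the `jinaai-embeddings-ingest` inference ID is made up for this example), an endpoint used purely for embedding documents that you store and query yourself could set `task` to `ingest`:

[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/jinaai-embeddings-ingest
{
  "service": "jinaai",
  "service_settings": {
    "model_id": "jina-embeddings-v3",
    "api_key": "<api_key>"
  },
  "task_settings": {
    "task": "ingest"
  }
}
------------------------------------------------------------
// TEST[skip:uses ML]

A query-side endpoint would use `search` instead, in line with the task descriptions listed under `task_settings` above.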