diff --git a/docs/reference/inference/inference-apis.asciidoc b/docs/reference/inference/inference-apis.asciidoc
index c7b779a994a05..871e42f2c09a6 100644
--- a/docs/reference/inference/inference-apis.asciidoc
+++ b/docs/reference/inference/inference-apis.asciidoc
@@ -48,21 +48,21 @@ When adaptive allocations are enabled:
 For more information about adaptive allocations and resources, refer to the {ml-docs}/ml-nlp-auto-scale.html[trained model autoscaling] documentation.

-//[discrete]
-//[[default-enpoints]]
-//=== Default {infer} endpoints
+[discrete]
+[[default-enpoints]]
+=== Default {infer} endpoints

-//Your {es} deployment contains some preconfigured {infer} endpoints that makes it easier for you to use them when defining `semantic_text` fields or {infer} processors.
-//The following list contains the default {infer} endpoints listed by `inference_id`:
+Your {es} deployment contains some preconfigured {infer} endpoints that makes it easier for you to use them when defining `semantic_text` fields or {infer} processors.
+The following list contains the default {infer} endpoints listed by `inference_id`:

-//* `.elser-2-elasticsearch`: uses the {ml-docs}/ml-nlp-elser.html[ELSER] built-in trained model for `sparse_embedding` tasks (recommended for English language texts)
-//* `.multilingual-e5-small-elasticsearch`: uses the {ml-docs}/ml-nlp-e5.html[E5] built-in trained model for `text_embedding` tasks (recommended for non-English language texts)
+* `.elser-2-elasticsearch`: uses the {ml-docs}/ml-nlp-elser.html[ELSER] built-in trained model for `sparse_embedding` tasks (recommended for English language texts)
+* `.multilingual-e5-small-elasticsearch`: uses the {ml-docs}/ml-nlp-e5.html[E5] built-in trained model for `text_embedding` tasks (recommended for non-English language texts)

-//Use the `inference_id` of the endpoint in a <> field definition or when creating an <>.
-//The API call will automatically download and deploy the model which might take a couple of minutes.
-//Default {infer} enpoints have {ml-docs}/ml-nlp-auto-scale.html#nlp-model-adaptive-allocations[adaptive allocations] enabled.
-//For these models, the minimum number of allocations is `0`.
-//If there is no {infer} activity that uses the endpoint, the number of allocations will scale down to `0` automatically after 15 minutes.
+Use the `inference_id` of the endpoint in a <> field definition or when creating an <>.
+The API call will automatically download and deploy the model which might take a couple of minutes.
+Default {infer} enpoints have {ml-docs}/ml-nlp-auto-scale.html#nlp-model-adaptive-allocations[adaptive allocations] enabled.
+For these models, the minimum number of allocations is `0`.
+If there is no {infer} activity that uses the endpoint, the number of allocations will scale down to `0` automatically after 15 minutes.

 [discrete]
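
The re-enabled section says to use an endpoint's `inference_id` in a `semantic_text` field definition. A minimal sketch of what such a mapping body might look like is given here in Python. The two `inference_id` values come from the list in the diff; the field name `content`, the `DEFAULT_ENDPOINTS` table, and the `semantic_text_mapping` helper are hypothetical names introduced only for illustration. The sketch only builds the request body and does not contact a cluster.

```python
import json

# inference_id values as listed in the documentation change above,
# keyed by the task type each default endpoint serves.
DEFAULT_ENDPOINTS = {
    "sparse_embedding": ".elser-2-elasticsearch",
    "text_embedding": ".multilingual-e5-small-elasticsearch",
}


def semantic_text_mapping(field_name: str, task_type: str) -> dict:
    """Build an index mapping body (hypothetical helper) whose
    semantic_text field points at one of the default endpoints."""
    inference_id = DEFAULT_ENDPOINTS[task_type]  # KeyError on unknown task type
    return {
        "mappings": {
            "properties": {
                field_name: {
                    "type": "semantic_text",
                    "inference_id": inference_id,
                }
            }
        }
    }


if __name__ == "__main__":
    # "content" is an assumed field name, not taken from the docs.
    body = semantic_text_mapping("content", "sparse_embedding")
    print(json.dumps(body, indent=2))
```

A body like this would presumably be sent when creating the index; per the added text, the first call that uses the endpoint triggers the model download and deployment, which might take a couple of minutes.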