Replace 'ent-search-generic' with 'search-default' pipeline (elastic#118899)

* Replace 'ent-search-generic' with 'search-default' pipeline
* missed one
* [CI] Auto commit changes from spotless

---------

Co-authored-by: elasticsearchmachine <[email protected]>
rjernst · Dec 18, 2024 · b4529bb
1 parent b8076c2

Showing 13 changed files with 37 additions and 182 deletions.
diff --git a/.java-version b/.java-version
@@ -0,0 +1 @@
+21
diff --git a/docs/reference/connector/docs/connectors-content-extraction.asciidoc b/docs/reference/connector/docs/connectors-content-extraction.asciidoc
@@ -8,7 +8,7 @@ The logic for content extraction is defined in {connectors-python}/connectors/ut
 While intended primarily for PDF and Microsoft Office formats, you can use any of the <<es-connectors-content-extraction-supported-file-types, supported formats>>.
 
 Enterprise Search uses an {ref}/ingest.html[Elasticsearch ingest pipeline^] to power the web crawler's binary content extraction.
-The default pipeline, `ent-search-generic-ingestion`, is automatically created when Enterprise Search first starts.
+The default pipeline, `search-default-ingestion`, is automatically created when Enterprise Search first starts.
 
 You can {ref}/ingest.html#create-manage-ingest-pipelines[view^] this pipeline in Kibana.
 Customizing your pipeline usage is also an option.

diff --git a/docs/reference/connector/docs/connectors-filter-extract-transform.asciidoc b/docs/reference/connector/docs/connectors-filter-extract-transform.asciidoc
@@ -13,7 +13,7 @@ The following diagram provides an overview of how content extraction, sync rules
 [.screenshot]
 image::images/pipelines-extraction-sync-rules.png[Architecture diagram of data pipeline with content extraction, sync rules, and ingest pipelines]
 
-By default, only the connector specific logic (2) and the default `ent-search-generic-ingestion` pipeline (6) extract and transform your data, as configured in your deployment.
+By default, only the connector specific logic (2) and the default `search-default-ingestion` pipeline (6) extract and transform your data, as configured in your deployment.
 
 The following tools are available for more advanced use cases:
 
@@ -50,4 +50,4 @@ Use ingest pipelines for data enrichment, normalization, and more.
 
 Elastic connectors use a default ingest pipeline, which you can copy and customize to meet your needs.
 
-Refer to {ref}/ingest-pipeline-search.html[ingest pipelines in Search] in the {es} documentation.
+Refer to {ref}/ingest-pipeline-search.html[ingest pipelines in Search] in the {es} documentation.
diff --git a/docs/reference/ingest/search-inference-processing.asciidoc b/docs/reference/ingest/search-inference-processing.asciidoc
@@ -88,7 +88,7 @@ The `monitor_ml` <<security-privileges, Elasticsearch cluster privilege>> is req
 
 To create the index-specific ML inference pipeline, go to *Search -> Content -> Indices -> <your index> -> Pipelines* in the Kibana UI.
 
-If you only see the `ent-search-generic-ingestion` pipeline, you will need to click *Copy and customize* to create index-specific pipelines.
+If you only see the `search-default-ingestion` pipeline, you will need to click *Copy and customize* to create index-specific pipelines.
 This will create the `{index_name}@ml-inference` pipeline.
 
 Once your index-specific ML inference pipeline is ready, you can add inference processors that use your ML trained models.

diff --git a/docs/reference/ingest/search-ingest-pipelines.asciidoc b/docs/reference/ingest/search-ingest-pipelines.asciidoc
@@ -40,7 +40,7 @@ Considerations such as error handling, conditional execution, sequencing, versio
 To this end, when you create indices for search use cases, (including {enterprise-search-ref}/crawler.html[Elastic web crawler], <<es-connectors,connectors>>.
 , and API indices), each index already has a pipeline set up with several processors that optimize your content for search.
 
-This pipeline is called `ent-search-generic-ingestion`.
+This pipeline is called `search-default-ingestion`.
 While it is a "managed" pipeline (meaning it should not be tampered with), you can view its details via the Kibana UI or the Elasticsearch API.
 You can also <<ingest-pipeline-search-details-generic-reference,read more about its contents below>>.
 
@@ -56,14 +56,14 @@ This will not effect existing indices.
 
 Each index also provides the capability to easily create index-specific ingest pipelines with customizable processing.
 If you need that extra flexibility, you can create a custom pipeline by going to your pipeline settings and choosing to "copy and customize".
-This will replace the index's use of `ent-search-generic-ingestion` with 3 newly generated pipelines:
+This will replace the index's use of `search-default-ingestion` with 3 newly generated pipelines:
 
 1. `<index-name>`
 2. `<index-name>@custom`
 3. `<index-name>@ml-inference`
 
-Like `ent-search-generic-ingestion`, the first of these is "managed", but the other two can and should be modified to fit your needs.
-You can view these pipelines using the platform tools (Kibana UI, Elasticsearch API), and can also 
+Like `search-default-ingestion`, the first of these is "managed", but the other two can and should be modified to fit your needs.
+You can view these pipelines using the platform tools (Kibana UI, Elasticsearch API), and can also
 <<ingest-pipeline-search-details-specific,read more about their content below>>.
 
 [discrete#ingest-pipeline-search-pipeline-settings]
@@ -123,7 +123,7 @@ If the pipeline is not specified, the underscore-prefixed fields will actually b
 === Details
 
 [discrete#ingest-pipeline-search-details-generic-reference]
-==== `ent-search-generic-ingestion` Reference
+==== `search-default-ingestion` Reference
 
 You can access this pipeline with the <<get-pipeline-api, Elasticsearch Ingest Pipelines API>> or via Kibana's <<create-manage-ingest-pipelines,Stack Management > Ingest Pipelines>> UI.
 
@@ -149,7 +149,7 @@ If you want to make customizations, we recommend you utilize index-specific pipe
 [discrete#ingest-pipeline-search-details-generic-reference-params]
 ===== Control flow parameters
 
-The `ent-search-generic-ingestion` pipeline does not always run all processors.
+The `search-default-ingestion` pipeline does not always run all processors.
 It utilizes a feature of ingest pipelines to <<conditionally-run-processor,conditionally run processors>> based on the contents of each individual document.
 
 * `_extract_binary_content` - if this field is present and has a value of `true` on a source document, the pipeline will attempt to run the `attachment`, `set_body`, and `remove_replacement_chars` processors.
@@ -167,8 +167,8 @@ See <<ingest-pipeline-search-pipeline-settings>>.
 ==== Index-specific ingest pipelines
 
 In the Kibana UI for your index, by clicking on the Pipelines tab, then *Settings > Copy and customize*, you can quickly generate 3 pipelines which are specific to your index.
-These 3 pipelines replace `ent-search-generic-ingestion` for the index.
-There is nothing lost in this action, as the `<index-name>` pipeline is a superset of functionality over the `ent-search-generic-ingestion` pipeline.
+These 3 pipelines replace `search-default-ingestion` for the index.
+There is nothing lost in this action, as the `<index-name>` pipeline is a superset of functionality over the `search-default-ingestion` pipeline.
 
 [IMPORTANT]
 ====
@@ -179,7 +179,7 @@ Refer to the Elastic subscriptions pages for https://www.elastic.co/subscription
 [discrete#ingest-pipeline-search-details-specific-reference]
 ===== `<index-name>` Reference
 
-This pipeline looks and behaves a lot like the <<ingest-pipeline-search-details-generic-reference,`ent-search-generic-ingestion` pipeline>>, but with <<ingest-pipeline-search-details-specific-reference-processors,two additional processors>>.
+This pipeline looks and behaves a lot like the <<ingest-pipeline-search-details-generic-reference,`search-default-ingestion` pipeline>>, but with <<ingest-pipeline-search-details-specific-reference-processors,two additional processors>>.
 
 [WARNING]
 =========================
@@ -197,7 +197,7 @@ If you want to make customizations, we recommend you utilize <<ingest-pipeline-s
 [discrete#ingest-pipeline-search-details-specific-reference-processors]
 ====== Processors
 
-In addition to the processors inherited from the <<ingest-pipeline-search-details-generic-reference,`ent-search-generic-ingestion` pipeline>>, the index-specific pipeline also defines:
+In addition to the processors inherited from the <<ingest-pipeline-search-details-generic-reference,`search-default-ingestion` pipeline>>, the index-specific pipeline also defines:
 
 * `index_ml_inference_pipeline` - this uses the <<pipeline-processor, Pipeline>> processor to run the `<index-name>@ml-inference` pipeline.
   This processor will only be run if the source document includes a `_run_ml_inference` field with the value `true`.
@@ -206,7 +206,7 @@ In addition to the processors inherited from the <<ingest-pipeline-search-detail
 [discrete#ingest-pipeline-search-details-specific-reference-params]
 ====== Control flow parameters
 
-Like the `ent-search-generic-ingestion` pipeline, the `<index-name>` pipeline does not always run all processors.
+Like the `search-default-ingestion` pipeline, the `<index-name>` pipeline does not always run all processors.
 In addition to the `_extract_binary_content` and `_reduce_whitespace` control flow parameters, the `<index-name>` pipeline also supports:
 
 * `_run_ml_inference` - if this field is present and has a value of `true` on a source document, the pipeline will attempt to run the `index_ml_inference_pipeline` processor.
@@ -220,7 +220,7 @@ See <<ingest-pipeline-search-pipeline-settings>>.
 ===== `<index-name>@ml-inference` Reference
 
 This pipeline is empty to start (no processors), but can be added to via the Kibana UI either through the Pipelines tab of your index, or from the *Stack Management > Ingest Pipelines* page.
-Unlike the `ent-search-generic-ingestion` pipeline and the `<index-name>` pipeline, this pipeline is NOT "managed".
+Unlike the `search-default-ingestion` pipeline and the `<index-name>` pipeline, this pipeline is NOT "managed".
 
 It's possible to add one or more ML inference pipelines to an index in the *Content* UI.
 This pipeline will serve as a container for all of the ML inference pipelines configured for the index.
@@ -241,7 +241,7 @@ The `monitor_ml` Elasticsearch cluster permission is required in order to manage
 
 This pipeline is empty to start (no processors), but can be added to via the Kibana UI either through the Pipelines
 tab of your index, or from the *Stack Management > Ingest Pipelines* page.
-Unlike the `ent-search-generic-ingestion` pipeline and the `<index-name>` pipeline, this pipeline is NOT "managed".
+Unlike the `search-default-ingestion` pipeline and the `<index-name>` pipeline, this pipeline is NOT "managed".
 
 You are encouraged to make additions and edits to this pipeline, provided its name remains the same.
 This provides a convenient hook from which to add custom processing and transformations for your data.
@@ -272,9 +272,12 @@ extraction.
 These changes should be re-applied to each index's `<index-name>@custom` pipeline in order to ensure a consistent data processing experience.
   In 8.5+, the <<ingest-pipeline-search-pipeline-settings, index setting to enable binary content>> is required *in addition* to the configurations mentioned in the {enterprise-search-ref}/crawler-managing.html#crawler-managing-binary-content[Elastic web crawler Guide].
 
-* `ent-search-generic-ingestion` - Since 8.5, Native Connectors, Connector Clients, and new (>8.4) Elastic web crawler indices will all make use of this pipeline by default.
+* `ent-search-generic-ingestion` - Since 8.5, Native Connectors, Connector Clients, and new (>8.4) Elastic web crawler indices all made use of this pipeline by default.
+  This pipeline evolved into the `search-default-ingestion` pipeline.
+
+* `search-default-ingestion` - Since 9.0, Connectors have made use of this pipeline by default.
   You can <<ingest-pipeline-search-details-generic-reference, read more about this pipeline>> above.
-  As this pipeline is "managed", any modifications that were made to `app_search_crawler` and/or `ent_search_crawler` should NOT be made to `ent-search-generic-ingestion`.
+  As this pipeline is "managed", any modifications that were made to `app_search_crawler` and/or `ent_search_crawler` should NOT be made to `search-default-ingestion`.
   Instead, if such customizations are desired, you should utilize <<ingest-pipeline-search-details-specific>>, placing all modifications in the `<index-name>@custom` pipeline(s).
 =============
 

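Reviewer note: the documentation hunks above describe two behaviors that a small example may make easier to follow: the three pipeline names generated by "copy and customize", and the control flow parameters that gate individual processors. The following is a minimal sketch in plain Java, not Elasticsearch code. The class and method names (`PipelineSketch`, `indexSpecificPipelines`, `conditionalProcessors`) are hypothetical, and the sketch assumes a flag counts only when it holds the boolean value `true`. The `_reduce_whitespace` parameter is left out because the hunks shown here do not list the processors it controls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the documented behavior; not part of this commit.
public class PipelineSketch {

    /** Names of the three pipelines that "copy and customize" generates for an index. */
    static List<String> indexSpecificPipelines(String indexName) {
        return List.of(indexName, indexName + "@custom", indexName + "@ml-inference");
    }

    /** Assumption: a control flow flag is set only if present with boolean true. */
    static boolean flag(Map<String, Object> doc, String field) {
        return Boolean.TRUE.equals(doc.get(field));
    }

    /**
     * Lists the conditional processors the docs say would be attempted for a document.
     * The order of the returned names is illustrative, not the real processor order.
     *
     * @param indexSpecific true for the "<index-name>" pipeline,
     *                      false for "search-default-ingestion"
     */
    static List<String> conditionalProcessors(Map<String, Object> doc, boolean indexSpecific) {
        List<String> processors = new ArrayList<>();
        if (flag(doc, "_extract_binary_content")) {
            processors.addAll(List.of("attachment", "set_body", "remove_replacement_chars"));
        }
        // Only the index-specific pipeline defines the ML inference processor.
        if (indexSpecific && flag(doc, "_run_ml_inference")) {
            processors.add("index_ml_inference_pipeline");
        }
        return processors;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = Map.of(
            "_extract_binary_content", true,
            "_run_ml_inference", true
        );
        System.out.println(indexSpecificPipelines("search-photo-comments"));
        System.out.println(conditionalProcessors(doc, false));
        System.out.println(conditionalProcessors(doc, true));
    }
}
```

In this sketch, the same document triggers `index_ml_inference_pipeline` only when it passes through the index-specific pipeline, which mirrors the docs' statement that `<index-name>` adds processors on top of those in `search-default-ingestion`.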
diff --git a/docs/reference/ingest/search-nlp-tutorial.asciidoc b/docs/reference/ingest/search-nlp-tutorial.asciidoc
@@ -164,8 +164,8 @@ Now it's time to create an inference pipeline.
 
 1. From the overview page for your `search-photo-comments` index in "Search", click the *Pipelines* tab.
 By default, Elasticsearch does not create any index-specific ingest pipelines.
-2. Because we want to customize these pipelines, we need to *Copy and customize* the `ent-search-generic-ingestion` ingest pipeline.
-Find this option above the settings for the `ent-search-generic-ingestion` ingest pipeline.
+2. Because we want to customize these pipelines, we need to *Copy and customize* the `search-default-ingestion` ingest pipeline.
+Find this option above the settings for the `search-default-ingestion` ingest pipeline.
 This will create two new index-specific ingest pipelines.
 
 Next, we'll add an inference pipeline.

diff --git a/...emplate-resources/src/main/resources/entsearch/connector/elastic-connectors-mappings.json b/...emplate-resources/src/main/resources/entsearch/connector/elastic-connectors-mappings.json
@@ -7,7 +7,7 @@
       "dynamic": "false",
       "_meta": {
         "pipeline": {
-          "default_name": "ent-search-generic-ingestion",
+          "default_name": "search-default-ingestion",
           "default_extract_binary_content": true,
           "default_run_ml_inference": true,
           "default_reduce_whitespace": true

diff --git a/...ugin/core/template-resources/src/main/resources/entsearch/generic_ingestion_pipeline.json b/...ugin/core/template-resources/src/main/resources/entsearch/generic_ingestion_pipeline.json
diff --git a/...rc/main/java/org/elasticsearch/xpack/application/connector/ConnectorTemplateRegistry.java b/...rc/main/java/org/elasticsearch/xpack/application/connector/ConnectorTemplateRegistry.java
@@ -48,10 +48,6 @@ public class ConnectorTemplateRegistry extends IndexTemplateRegistry {
     public static final String MANAGED_CONNECTOR_INDEX_PREFIX = "content-";
 
     // Pipeline constants
-
-    public static final String ENT_SEARCH_GENERIC_PIPELINE_NAME = "ent-search-generic-ingestion";
-    public static final String ENT_SEARCH_GENERIC_PIPELINE_FILE = "generic_ingestion_pipeline";
-
     public static final String SEARCH_DEFAULT_PIPELINE_NAME = "search-default-ingestion";
     public static final String SEARCH_DEFAULT_PIPELINE_FILE = "search_default_pipeline";
 
@@ -111,12 +107,6 @@ public class ConnectorTemplateRegistry extends IndexTemplateRegistry {
     @Override
     protected List<IngestPipelineConfig> getIngestPipelines() {
         return List.of(
-            new JsonIngestPipelineConfig(
-                ENT_SEARCH_GENERIC_PIPELINE_NAME,
-                ROOT_RESOURCE_PATH + ENT_SEARCH_GENERIC_PIPELINE_FILE + ".json",
-                REGISTRY_VERSION,
-                TEMPLATE_VERSION_VARIABLE
-            ),
             new JsonIngestPipelineConfig(
                 SEARCH_DEFAULT_PIPELINE_NAME,
                 ROOT_RESOURCE_PATH + SEARCH_DEFAULT_PIPELINE_FILE + ".json",

diff --git a/...test/java/org/elasticsearch/xpack/application/connector/ConnectorIngestPipelineTests.java b/...test/java/org/elasticsearch/xpack/application/connector/ConnectorIngestPipelineTests.java
@@ -50,7 +50,7 @@ public void testToXContent() throws IOException {
         String content = XContentHelper.stripWhitespace("""
                 {
                     "extract_binary_content": true,
-                    "name": "ent-search-generic-ingestion",
+                    "name": "search-default-ingestion",
                     "reduce_whitespace": true,
                     "run_ml_inference": false
                 }