New indexes created for datastreams after update to 8.15.0 are without lifecycle policies #13898

Closed
lahsivjar opened this issue Aug 20, 2024 · 11 comments · Fixed by elastic/elasticsearch#112028 or elastic/elasticsearch#112759

Comments

@lahsivjar
Contributor

lahsivjar commented Aug 20, 2024

APM Server version (apm-server version): 8.15.0

Description of the problem including expected versus actual behavior: In 8.15.0 we migrated from ILM to DSL. In clusters upgraded to 8.15.0, newly created backing indices have no lifecycle attached, because an existing data stream needs to be updated explicitly: https://www.elastic.co/guide/en/elasticsearch/reference/current/tutorial-manage-existing-data-stream.html.

Steps to reproduce:

  1. Create a cluster with version < 8.15.0
  2. Send data to the cluster (for example: traces) (and continue sending it throughout the steps)
  3. Perform a rollover on the datastream getting data
  4. Upgrade the cluster to 8.15.0
  5. Observe that the latest indices are Unmanaged (for example: by using GET /_data_stream/traces-apm-default for traces datastream)
  6. Perform a rollover on the datastream getting data
  7. Observe that the newly created index is also Unmanaged (for example: by using GET /_data_stream/traces-apm-default for traces datastream)
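
The check in steps 5 and 7 can be scripted. The following is a minimal sketch, assuming the `GET /_data_stream/...` response has already been parsed into a Python dict; the function name `unmanaged_indices` is hypothetical and only reads the `managed_by` field shown in the responses in this thread.

```python
from typing import Any


def unmanaged_indices(response: dict[str, Any]) -> dict[str, list[str]]:
    """Map each data stream name to its backing indices reported as Unmanaged."""
    result: dict[str, list[str]] = {}
    for stream in response.get("data_streams", []):
        names = [
            index["index_name"]
            for index in stream.get("indices", [])
            if index.get("managed_by") == "Unmanaged"
        ]
        # Only report streams that actually have unmanaged backing indices.
        if names:
            result[stream["name"]] = names
    return result
```

An empty result means every backing index in the response is managed by either ILM or the data stream lifecycle.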

A temporary mitigation is to set the lifecycle explicitly using the PUT API. For example, the operations below set data retention based on the APM defaults for all APM data streams:

PUT _data_stream/traces-apm-*/_lifecycle
{
  "data_retention": "10d" 
}

PUT _data_stream/traces-apm.rum*/_lifecycle
{
  "data_retention": "90d" 
}

PUT _data_stream/traces-apm.sampled*/_lifecycle
{
  "data_retention": "1h" 
}

PUT _data_stream/metrics-apm.*.1m-*/_lifecycle
{
  "data_retention": "90d" 
}

PUT _data_stream/metrics-apm.*.10m-*/_lifecycle
{
  "data_retention": "180d" 
}

PUT _data_stream/metrics-apm.*.60m-*/_lifecycle
{
  "data_retention": "390d" 
}

PUT _data_stream/metrics-apm.internal-*/_lifecycle
{
  "data_retention": "90d" 
}

PUT _data_stream/metrics-apm.app.*/_lifecycle
{
  "data_retention": "90d" 
}

PUT _data_stream/logs-apm.*/_lifecycle
{
  "data_retention": "10d" 
}
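
The same requests can be generated from a single table, which may be easier to review than nine separate calls. This is only a sketch: the patterns and retention values are copied from the requests above, while the names `APM_DEFAULT_RETENTION` and `lifecycle_requests` are hypothetical, and the code only builds the requests without sending them.

```python
import json

# Patterns and retention periods copied from the PUT requests above.
APM_DEFAULT_RETENTION = {
    "traces-apm-*": "10d",
    "traces-apm.rum*": "90d",
    "traces-apm.sampled*": "1h",
    "metrics-apm.*.1m-*": "90d",
    "metrics-apm.*.10m-*": "180d",
    "metrics-apm.*.60m-*": "390d",
    "metrics-apm.internal-*": "90d",
    "metrics-apm.app.*": "90d",
    "logs-apm.*": "10d",
}


def lifecycle_requests(retention_by_pattern: dict[str, str]) -> list[tuple[str, str, str]]:
    """Build (method, path, JSON body) tuples for the data stream lifecycle PUT API."""
    requests = []
    for pattern, retention in retention_by_pattern.items():
        path = f"/_data_stream/{pattern}/_lifecycle"
        body = json.dumps({"data_retention": retention})
        requests.append(("PUT", path, body))
    return requests


if __name__ == "__main__":
    for method, path, body in lifecycle_requests(APM_DEFAULT_RETENTION):
        print(method, path, body)
```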

Provide logs (if relevant): N/A

@lahsivjar
Contributor Author

The fix for v8.15.1 has been merged via elastic/elasticsearch#112097. However, the same patch cannot be applied to v8.16.0; the details are tracked in elastic/elasticsearch#112137.

@axw
Member

axw commented Aug 30, 2024

I don't love elastic/elasticsearch#112097 as a solution, since it kinda implies we'll keep the old ILM config around in the index templates in perpetuity.

Perhaps we can remove that while also addressing serverless, by dynamically injecting the ILM config into the index template if there are existing indices with ILM config?

@axw
Member

axw commented Sep 2, 2024

We need to handle the following scenarios:

  • Create a new 8.15.1+ cluster, with defaults: DLM should be used
  • Create a new 8.15.1+ cluster, with customised ILM (https://www.elastic.co/guide/en/observability/current/apm-ilm-how-to.html#apm-data-streams-custom-three): customised ILM policy should be used
  • Upgrade 8.14.x to 8.15.1+ with defaults: ILM should continue to be used for old indices, DLM should be used for new indices
  • Upgrade 8.14.x to 8.15.1+ with customised ILM (same guide as above): ILM should continue to be used for old indices, and also for new indices
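
The four scenarios above reduce to a small decision table. A sketch of it as code, useful as an oracle when testing; the names `Expected` and `expected_managers` are hypothetical, and the manager strings mirror the `managed_by` values the data stream API reports.

```python
from typing import NamedTuple, Optional

ILM = "Index Lifecycle Management"
DSL = "Data stream lifecycle"


class Expected(NamedTuple):
    old_indices: Optional[str]  # None when the cluster is new and has no old indices
    new_indices: str


def expected_managers(upgraded_from_8_14: bool, custom_ilm: bool) -> Expected:
    """Return the expected manager for old and new backing indices on 8.15.1+."""
    # Indices created before the upgrade always stay on ILM.
    old = ILM if upgraded_from_8_14 else None
    # New indices use the customised ILM policy if one exists, otherwise DSL.
    new = ILM if custom_ilm else DSL
    return Expected(old, new)
```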

@lahsivjar
Contributor Author

lahsivjar commented Sep 2, 2024

With the current changes, custom ILM policies would break even for versions >= 8.15.1, because index templates cannot be overridden. elastic/elasticsearch#112432 fixes this by moving the fallback to a component template, which can be overridden as required. NOTE: this would also require us to update our documentation for configuring a custom ILM policy to include prefer_ilm: true for data streams created on or after 8.15.x.
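
For illustration, a `@custom` component template body that opts back into ILM might look like the sketch below. The thread does not show the actual template body, so the nesting is an assumption based on the `index.lifecycle.name` and `index.lifecycle.prefer_ilm` index settings; the function name is hypothetical.

```python
import json
from typing import Any


def custom_ilm_component_template(policy_name: str, prefer_ilm: bool = True) -> dict[str, Any]:
    """Build a component template body selecting an ILM policy (assumed layout)."""
    return {
        "template": {
            "settings": {
                "index": {
                    "lifecycle": {
                        "name": policy_name,
                        # Without this, DSL takes priority on data streams
                        # that also have a lifecycle configured.
                        "prefer_ilm": prefer_ilm,
                    }
                }
            }
        }
    }


if __name__ == "__main__":
    # Intended as the body of: PUT _component_template/traces-apm@custom
    print(json.dumps(custom_ilm_component_template("custom-ilm-override-dsl"), indent=2))
```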

Testing elastic/elasticsearch#112432 locally

Create a new 8.15.1+ cluster, with defaults: DLM should be used

✅ Tested (new cluster created locally)
{
  "data_streams": [
    {
      "name": "traces-apm-default",
      "timestamp_field": {
        "name": "@timestamp"
      },
      "indices": [
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000001",
          "index_uuid": "4rLT17ZQQIC6mFDlPS1SNA",
          "prefer_ilm": false,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Data stream lifecycle"
        }
      ],
      "generation": 1,
      "_meta": {
        "description": "Index template for traces-apm-*",
        "managed": true
      },
      "status": "YELLOW",
      "template": "traces-apm@template",
      "lifecycle": {
        "enabled": true,
        "data_retention": "10d"
      },
      "ilm_policy": "traces-apm.traces-default_policy",
      "next_generation_managed_by": "Data stream lifecycle",
      "prefer_ilm": false,
      "hidden": false,
      "system": false,
      "allow_custom_routing": false,
      "replicated": false,
      "rollover_on_write": false,
      "failure_store": {
        "enabled": false,
        "rollover_on_write": true,
        "indices": []
      }
    }
  ]
}

Create a new 8.15.1+ cluster, with customised ILM (https://www.elastic.co/guide/en/observability/current/apm-ilm-how-to.html#apm-data-streams-custom-three): customised ILM policy should be used

✅ Tested by creating a traces-apm@custom component template with a custom ILM policy but with prefer_ilm set to false

Since prefer_ilm is false, DSL took priority.

{
  "data_streams": [
    {
      "name": "traces-apm-default",
      "timestamp_field": {
        "name": "@timestamp"
      },
      "indices": [
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000001",
          "index_uuid": "4rLT17ZQQIC6mFDlPS1SNA",
          "prefer_ilm": false,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Data stream lifecycle"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000002",
          "index_uuid": "FeVicze2S7ex5o7R7yeLaQ",
          "prefer_ilm": false,
          "ilm_policy": "custom-ilm-override-dsl",
          "managed_by": "Data stream lifecycle"
        }
      ],
      "generation": 2,
      "_meta": {
        "description": "Index template for traces-apm-*",
        "managed": true
      },
      "status": "YELLOW",
      "template": "traces-apm@template",
      "lifecycle": {
        "enabled": true,
        "data_retention": "10d"
      },
      "ilm_policy": "custom-ilm-override-dsl",
      "next_generation_managed_by": "Data stream lifecycle",
      "prefer_ilm": false,
      "hidden": false,
      "system": false,
      "allow_custom_routing": false,
      "replicated": false,
      "rollover_on_write": false,
      "failure_store": {
        "enabled": false,
        "rollover_on_write": true,
        "indices": []
      }
    }
  ]
}
✅ Tested by updating the traces-apm@custom component template to set prefer_ilm to true
{
  "data_streams": [
    {
      "name": "traces-apm-default",
      "timestamp_field": {
        "name": "@timestamp"
      },
      "indices": [
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000001",
          "index_uuid": "4rLT17ZQQIC6mFDlPS1SNA",
          "prefer_ilm": false,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Data stream lifecycle"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000002",
          "index_uuid": "FeVicze2S7ex5o7R7yeLaQ",
          "prefer_ilm": false,
          "ilm_policy": "custom-ilm-override-dsl",
          "managed_by": "Data stream lifecycle"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000003",
          "index_uuid": "7ufMCy5RTBaxTBJ78dxC-A",
          "prefer_ilm": true,
          "ilm_policy": "custom-ilm-override-dsl",
          "managed_by": "Index Lifecycle Management"
        }
      ],
      "generation": 3,
      "_meta": {
        "description": "Index template for traces-apm-*",
        "managed": true
      },
      "status": "YELLOW",
      "template": "traces-apm@template",
      "lifecycle": {
        "enabled": true,
        "data_retention": "10d"
      },
      "ilm_policy": "custom-ilm-override-dsl",
      "next_generation_managed_by": "Index Lifecycle Management",
      "prefer_ilm": true,
      "hidden": false,
      "system": false,
      "allow_custom_routing": false,
      "replicated": false,
      "rollover_on_write": false,
      "failure_store": {
        "enabled": false,
        "rollover_on_write": true,
        "indices": []
      }
    }
  ]
}

Upgrade 8.14.x to 8.15.1+ with defaults: ILM should continue to be used for old indices, DLM should be used for new indices

@axw this case, as expected, did not work. The reason is that the data stream lifecycle needs to be explicitly configured for ALREADY created data streams. So any old cluster that has used APM and is upgraded to 8.15.1+ would continue to use ILM (even for new indices) unless DSL is explicitly configured using the PUT API.

❌ Tested by updating from 8.14.x to 8.15.1
{
  "data_streams": [
    {
      "name": "traces-apm-default",
      "timestamp_field": {
        "name": "@timestamp"
      },
      "indices": [
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000001",
          "index_uuid": "farCK-0lRI-H64IxGfRUuQ",
          "prefer_ilm": true,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000002",
          "index_uuid": "C_qI1J2FQ5O-81VuHMl82A",
          "prefer_ilm": true,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000003",
          "index_uuid": "HVQRbVrtSEm6UmeyPWj8bA",
          "prefer_ilm": false,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Index Lifecycle Management"
        }
      ],
      "generation": 3,
      "_meta": {
        "package": {
          "name": "apm"
        },
        "managed_by": "fleet",
        "managed": true
      },
      "status": "YELLOW",
      "template": "traces-apm@template",
      "ilm_policy": "traces-apm.traces-default_policy",
      "next_generation_managed_by": "Index Lifecycle Management",
      "prefer_ilm": false,
      "hidden": false,
      "system": false,
      "allow_custom_routing": false,
      "replicated": false,
      "rollover_on_write": false,
      "failure_store": {
        "enabled": false,
        "rollover_on_write": false,
        "indices": []
      }
    }
  ]
}
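
Note that the response above has no `lifecycle` object at the data stream level, unlike the responses from freshly created clusters. A small sketch for spotting such streams in a parsed `GET /_data_stream` response; the function name `streams_without_lifecycle` is hypothetical.

```python
from typing import Any


def streams_without_lifecycle(response: dict[str, Any]) -> list[str]:
    """Return names of data streams that have no data stream lifecycle configured."""
    return [
        stream["name"]
        for stream in response.get("data_streams", [])
        # A configured DSL shows up as a top-level "lifecycle" object.
        if "lifecycle" not in stream
    ]
```

Streams returned by this helper are the candidates for the explicit `PUT _data_stream/<name>/_lifecycle` call mentioned in the issue description.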

Upgrade 8.14.x to 8.15.1+ with customised ILM (same guide as above): ILM should continue to be used for old indices, and also for new indices

✅ Tested (cluster created locally with 8.14.3 and upgraded to 8.15.1)
{
  "data_streams": [
    {
      "name": "traces-apm-default",
      "timestamp_field": {
        "name": "@timestamp"
      },
      "indices": [
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000001",
          "index_uuid": "I5k3QG4QRGGZLdUnNAWpJQ",
          "prefer_ilm": true,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000002",
          "index_uuid": "IoauvZ7pQqC-xsRkDTLBkA",
          "prefer_ilm": true,
          "ilm_policy": "custom-ilmdlm-test",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000003",
          "index_uuid": "DDJu8FbnTUCapW6Xr13-Kw",
          "prefer_ilm": false,
          "ilm_policy": "custom-ilmdlm-test",
          "managed_by": "Index Lifecycle Management"
        }
      ],
      "generation": 3,
      "_meta": {
        "package": {
          "name": "apm"
        },
        "managed_by": "fleet",
        "managed": true
      },
      "status": "YELLOW",
      "template": "traces-apm@template",
      "ilm_policy": "custom-ilmdlm-test",
      "next_generation_managed_by": "Index Lifecycle Management",
      "prefer_ilm": false,
      "hidden": false,
      "system": false,
      "allow_custom_routing": false,
      "replicated": false,
      "rollover_on_write": false,
      "failure_store": {
        "enabled": false,
        "rollover_on_write": false,
        "indices": []
      }
    }
  ]
}

Upgrade 8.14.x to 8.15.0 with custom ILM policy: custom ILM policy should work as expected (no bugs for this case)

✅ Tested by upgrading from 8.14.3 with custom ILM to 8.15.0
{
  "data_streams": [
    {
      "name": "traces-apm-default",
      "timestamp_field": {
        "name": "@timestamp"
      },
      "indices": [
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000001",
          "index_uuid": "dFLoaPf3QZC4t7ttplDSqg",
          "prefer_ilm": true,
          "ilm_policy": "custom-ilmdlm-test",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000002",
          "index_uuid": "fJ__qQv7Tumu9n_j8hEljA",
          "prefer_ilm": true,
          "ilm_policy": "custom-ilmdlm-test",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000003",
          "index_uuid": "kNSWO6pMQ06eZo7yHqKQCw",
          "prefer_ilm": true,
          "ilm_policy": "custom-ilmdlm-test",
          "managed_by": "Index Lifecycle Management"
        }
      ],
      "generation": 3,
      "_meta": {
        "package": {
          "name": "apm"
        },
        "managed_by": "fleet",
        "managed": true
      },
      "status": "YELLOW",
      "template": "traces-apm@template",
      "ilm_policy": "custom-ilmdlm-test",
      "next_generation_managed_by": "Index Lifecycle Management",
      "prefer_ilm": true,
      "hidden": false,
      "system": false,
      "allow_custom_routing": false,
      "replicated": false,
      "rollover_on_write": false
    }
  ]
}

Upgrade 8.14.x to 8.15.0 with default ILM policy and then upgrade to 8.15.1

This, as expected, creates unmanaged indices. We would need to suggest a workaround for this in our changelog/release notes. The simplest way would be to have users with default ILM settings explicitly configure DSL for all data streams. Since the default ILM and DSL configurations are identical, there would be no impact on users. In future, if a user wants to move to ILM, they can do so by creating a custom component template. WDYT @axw ?

⚠️ Tested: leads to unmanaged indices for indices that were created in 8.15.0
{
  "data_streams": [
    {
      "name": "traces-apm-default",
      "timestamp_field": {
        "name": "@timestamp"
      },
      "indices": [
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000001",
          "index_uuid": "4MAl7phPSkexDJFda-38RQ",
          "prefer_ilm": true,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000002",
          "index_uuid": "hv1XnUBeQmim7VkOgvHAiw",
          "prefer_ilm": true,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000003",
          "index_uuid": "jb2UpzXgQ_CmcVPTW2ICwA",
          "prefer_ilm": true,
          "managed_by": "Unmanaged"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000004",
          "index_uuid": "4e9MCYGWSqKR8BZ2JYKZeA",
          "prefer_ilm": true,
          "managed_by": "Unmanaged"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000005",
          "index_uuid": "OlSp2sXGRA6ozl86UIe3Lw",
          "prefer_ilm": false,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Index Lifecycle Management"
        }
      ],
      "generation": 5,
      "_meta": {
        "package": {
          "name": "apm"
        },
        "managed_by": "fleet",
        "managed": true
      },
      "status": "YELLOW",
      "template": "traces-apm@template",
      "ilm_policy": "traces-apm.traces-default_policy",
      "next_generation_managed_by": "Index Lifecycle Management",
      "prefer_ilm": false,
      "hidden": false,
      "system": false,
      "allow_custom_routing": false,
      "replicated": false,
      "rollover_on_write": false,
      "failure_store": {
        "enabled": false,
        "rollover_on_write": false,
        "indices": []
      }
    }
  ]
}

@simitt
Contributor

simitt commented Sep 2, 2024

Reading through the test cases, can you clarify if this is true: customers on 8.14.3 with custom ILM policies, who upgrade to 8.15.1 will not have to do any manual interactions for their custom ILM policies to continue to be applied?

Please also provide test scenarios where users already have upgraded to 8.15.0 and then upgrade to 8.15.1.

@lahsivjar
Contributor Author

lahsivjar commented Sep 2, 2024

Reading through the test cases, can you clarify if this is true: customers on 8.14.3 with custom ILM policies, who upgrade to 8.15.1 will not have to do any manual interactions for their custom ILM policies to continue to be applied?

True, no changes need to be done for this case.

Please also provide test scenarios where users already have upgraded to 8.15.0 and then upgrade to 8.15.1.

For customers with custom ILM, all would be good. Even on version 8.15.0 they would be in the clear, i.e. all their indices would be managed by the configured custom ILM policy (already tested above).

However, if a customer with the default ILM policy has moved to 8.15.0 and then upgraded to 8.15.1, the indices created in 8.15.0 would remain unmanaged even after the upgrade to 8.15.1. Indices created after the upgrade to 8.15.1 would be fine, though. I was initially thinking of suggesting that users configure DSL, because when DSL is configured on a data stream, all of its unmanaged indices become managed by DSL. However, this would have the side effect of moving from ILM to DSL. I don't think this should be a big deal, as the solution would only be required for installations using the default ILM policies, but it would be good if others could validate this (CC:@Silvia, @axw)

([DONE] PS: I will update the test case comment with the details on 8.15.0 as a step in the upgrade path)
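
To see the mix of managers described in this comment at a glance, the backing indices of each data stream can be tallied by their `managed_by` value. A minimal sketch over a parsed `GET /_data_stream` response; the function name `managers_per_stream` is hypothetical.

```python
from collections import Counter
from typing import Any


def managers_per_stream(response: dict[str, Any]) -> dict[str, dict[str, int]]:
    """Count backing indices per manager for each data stream in the response."""
    summary: dict[str, dict[str, int]] = {}
    for stream in response.get("data_streams", []):
        counts = Counter(
            index.get("managed_by", "unknown")
            for index in stream.get("indices", [])
        )
        summary[stream["name"]] = dict(counts)
    return summary
```

A stream that went through 8.15.0 with the default policy would show a non-zero `Unmanaged` count alongside `Index Lifecycle Management`.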

@kruskall
Member

kruskall commented Sep 2, 2024

Test with 8.14.3 -> 8.15.0 -> 8.15.1

  • Create 8.14.3-SNAPSHOT deployment
  • send data
  • run GET /_data_stream/traces-apm-default
{
  "data_streams": [
    {
      "name": "traces-apm-default",
      "timestamp_field": {
        "name": "@timestamp"
      },
      "indices": [
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000001",
          "index_uuid": "DxvSZVQhQlO-caBMRWQEzQ",
          "prefer_ilm": true,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Index Lifecycle Management"
        }
      ],
      "generation": 1,
      "failure_indices": [],
      "_meta": {
        "package": {
          "name": "apm"
        },
        "managed": true,
        "managed_by": "fleet"
      },
      "status": "GREEN",
      "template": "traces-apm",
      "ilm_policy": "traces-apm.traces-default_policy",
      "next_generation_managed_by": "Index Lifecycle Management",
      "prefer_ilm": true,
      "hidden": false,
      "system": false,
      "allow_custom_routing": false,
      "replicated": false,
      "rollover_on_write": false,
      "failure_store": false
    }
  ]
}
  • perform rollover: POST /traces-apm-default/_rollover/
  • run GET /_data_stream/traces-apm-default
{
  "data_streams": [
    {
      "name": "traces-apm-default",
      "timestamp_field": {
        "name": "@timestamp"
      },
      "indices": [
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000001",
          "index_uuid": "DxvSZVQhQlO-caBMRWQEzQ",
          "prefer_ilm": true,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000002",
          "index_uuid": "oOlQOsG6RJ6rXbybz-sPtg",
          "prefer_ilm": true,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Index Lifecycle Management"
        }
      ],
      "generation": 2,
      "failure_indices": [],
      "_meta": {
        "package": {
          "name": "apm"
        },
        "managed_by": "fleet",
        "managed": true
      },
      "status": "GREEN",
      "template": "traces-apm",
      "ilm_policy": "traces-apm.traces-default_policy",
      "next_generation_managed_by": "Index Lifecycle Management",
      "prefer_ilm": true,
      "hidden": false,
      "system": false,
      "allow_custom_routing": false,
      "replicated": false,
      "rollover_on_write": false,
      "failure_store": false
    }
  ]
}
  • upgrade to 8.15.0-SNAPSHOT
  • run GET /_data_stream/traces-apm-default
{
  "data_streams": [
    {
      "name": "traces-apm-default",
      "timestamp_field": {
        "name": "@timestamp"
      },
      "indices": [
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000001",
          "index_uuid": "DxvSZVQhQlO-caBMRWQEzQ",
          "prefer_ilm": true,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000002",
          "index_uuid": "oOlQOsG6RJ6rXbybz-sPtg",
          "prefer_ilm": true,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Index Lifecycle Management"
        }
      ],
      "generation": 2,
      "_meta": {
        "package": {
          "name": "apm"
        },
        "managed_by": "fleet",
        "managed": true
      },
      "status": "GREEN",
      "template": "traces-apm@template",
      "next_generation_managed_by": "Unmanaged",
      "prefer_ilm": true,
      "hidden": false,
      "system": false,
      "allow_custom_routing": false,
      "replicated": false,
      "rollover_on_write": false,
      "failure_store": {
        "enabled": false,
        "rollover_on_write": false,
        "indices": []
      }
    }
  ]
}
  • perform rollover: POST /traces-apm-default/_rollover/
  • run GET /_data_stream/traces-apm-default
{
  "data_streams": [
    {
      "name": "traces-apm-default",
      "timestamp_field": {
        "name": "@timestamp"
      },
      "indices": [
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000001",
          "index_uuid": "DxvSZVQhQlO-caBMRWQEzQ",
          "prefer_ilm": true,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000002",
          "index_uuid": "oOlQOsG6RJ6rXbybz-sPtg",
          "prefer_ilm": true,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000003",
          "index_uuid": "1duuNfCwT6aS_55IEfzMAQ",
          "prefer_ilm": true,
          "managed_by": "Unmanaged"
        }
      ],
      "generation": 3,
      "_meta": {
        "package": {
          "name": "apm"
        },
        "managed_by": "fleet",
        "managed": true
      },
      "status": "GREEN",
      "template": "traces-apm@template",
      "next_generation_managed_by": "Unmanaged",
      "prefer_ilm": true,
      "hidden": false,
      "system": false,
      "allow_custom_routing": false,
      "replicated": false,
      "rollover_on_write": false,
      "failure_store": {
        "enabled": false,
        "rollover_on_write": false,
        "indices": []
      }
    }
  ]
}
  • upgrade to 8.15.1-SNAPSHOT
  • run GET /_data_stream/traces-apm-default
{
  "data_streams": [
    {
      "name": "traces-apm-default",
      "timestamp_field": {
        "name": "@timestamp"
      },
      "indices": [
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000001",
          "index_uuid": "DxvSZVQhQlO-caBMRWQEzQ",
          "prefer_ilm": true,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000002",
          "index_uuid": "oOlQOsG6RJ6rXbybz-sPtg",
          "prefer_ilm": true,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000003",
          "index_uuid": "1duuNfCwT6aS_55IEfzMAQ",
          "prefer_ilm": true,
          "managed_by": "Unmanaged"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000004",
          "index_uuid": "wVXW11CzTAavehkx1FksLw",
          "prefer_ilm": false,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Index Lifecycle Management"
        }
      ],
      "generation": 4,
      "_meta": {
        "package": {
          "name": "apm"
        },
        "managed_by": "fleet",
        "managed": true
      },
      "status": "GREEN",
      "template": "traces-apm@template",
      "ilm_policy": "traces-apm.traces-default_policy",
      "next_generation_managed_by": "Index Lifecycle Management",
      "prefer_ilm": false,
      "hidden": false,
      "system": false,
      "allow_custom_routing": false,
      "replicated": false,
      "rollover_on_write": false,
      "failure_store": {
        "enabled": false,
        "rollover_on_write": false,
        "indices": []
      }
    }
  ]
}
  • perform rollover: POST /traces-apm-default/_rollover/
  • run GET /_data_stream/traces-apm-default
{
  "data_streams": [
    {
      "name": "traces-apm-default",
      "timestamp_field": {
        "name": "@timestamp"
      },
      "indices": [
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000001",
          "index_uuid": "DxvSZVQhQlO-caBMRWQEzQ",
          "prefer_ilm": true,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000002",
          "index_uuid": "oOlQOsG6RJ6rXbybz-sPtg",
          "prefer_ilm": true,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000003",
          "index_uuid": "1duuNfCwT6aS_55IEfzMAQ",
          "prefer_ilm": true,
          "managed_by": "Unmanaged"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000004",
          "index_uuid": "wVXW11CzTAavehkx1FksLw",
          "prefer_ilm": false,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.02-000005",
          "index_uuid": "UJ8VufpYTgyKa-tXaB4BPA",
          "prefer_ilm": false,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Index Lifecycle Management"
        }
      ],
      "generation": 5,
      "_meta": {
        "package": {
          "name": "apm"
        },
        "managed_by": "fleet",
        "managed": true
      },
      "status": "GREEN",
      "template": "traces-apm@template",
      "ilm_policy": "traces-apm.traces-default_policy",
      "next_generation_managed_by": "Index Lifecycle Management",
      "prefer_ilm": false,
      "hidden": false,
      "system": false,
      "allow_custom_routing": false,
      "replicated": false,
      "rollover_on_write": false,
      "failure_store": {
        "enabled": false,
        "rollover_on_write": false,
        "indices": []
      }
    }
  ]
}

@axw
Member

axw commented Sep 2, 2024

This, as expected, creates unmanaged indices. We would need to suggest a workaround for this in our changelog/release notes. The simplest way would be to have users with default ILM settings explicitly configure DSL for all data streams. Since the default ILM and DSL configurations are identical, there would be no impact on users. In future, if a user wants to move to ILM, they can do so by creating a custom component template. WDYT @axw ?

It's the simplest code change, but I'm not convinced that anyone will go ahead and actively switch their data streams to DSL; I expect we'll end up with users indefinitely sticking with the legacy ILM policies and bifurcating the Serverless & Hosted experience. IMO we should take one of two routes:

  1. Automatically switch data streams with default ILM policy to DSL on upgrade. This should be non-breaking, since the retention periods match, and there's no use of hot/warm/cold in the default ILM policies.
  2. Automatically switch all data streams to DSL on upgrade. This would be breaking when users have customised ILM, so we would need to document how users can re-enable ILM by setting prefer_ilm: true in their @custom component templates.

I prefer (2) since users will need to start setting prefer_ilm: true in their @custom component templates if they want to use ILM. The only other alternative I can see is to revert the use of DSL, but I feel like the cat's already out of the bag with 8.15.0 being released.

@simitt
Contributor

simitt commented Sep 3, 2024

Upgrade 8.14.x to 8.15.0 with default ILM policy and then upgrade to 8.15.1

This, as expected, creates unmanaged indices. We would need to suggest a workaround for this in our changelog/release notes. The simplest way would be to have users with default ILM settings explicitly configure DSL for all data streams. Since the default ILM and DSL configurations are identical, there would be no impact on users. In future, if a user wants to move to ILM, they can do so by creating a custom component template. WDYT @axw ?

@lahsivjar it only creates unmanaged indices in 8.15.0, right? With the fix and changes in 8.15.1, my understanding is that as soon as users upgrade to this version, new, managed backing indices will be created. The 8.15.0 ones would still be unmanaged, but any new ones would not. Can you confirm this?

@lahsivjar
Contributor Author

@lahsivjar it only creates unmanaged indices in 8.15.0, right? With the fix and changes in 8.15.1, my understanding is that as soon as users upgrade to this version, new, managed backing indices will be created. The 8.15.0 ones would still be unmanaged, but any new ones would not. Can you confirm this?

Yes, this is correct. Only indices created with version 8.15.0 will remain unmanaged.
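
To find which backing indices were left behind by 8.15.0, the `GET /_data_stream/<name>` response can be scanned for `"managed_by": "Unmanaged"`. This is a small sketch, assuming the response has already been fetched and parsed into a dict; the sample below is trimmed to the fields the function reads, and the function name is made up for this example.

```python
def unmanaged_indices(response: dict) -> dict:
    """Map each data stream name to its backing indices reported as Unmanaged."""
    result = {}
    for stream in response.get("data_streams", []):
        names = [
            index["index_name"]
            for index in stream.get("indices", [])
            if index.get("managed_by") == "Unmanaged"
        ]
        if names:
            result[stream["name"]] = names
    return result


# Trimmed sample shaped like the GET /_data_stream/traces-apm-default output.
sample = {
    "data_streams": [
        {
            "name": "traces-apm-default",
            "indices": [
                {
                    "index_name": ".ds-traces-apm-default-2024.09.03-000002",
                    "managed_by": "Index Lifecycle Management",
                },
                {
                    "index_name": ".ds-traces-apm-default-2024.09.03-000003",
                    "managed_by": "Unmanaged",
                },
                {
                    "index_name": ".ds-traces-apm-default-2024.09.03-000004",
                    "managed_by": "Unmanaged",
                },
            ],
        }
    ]
}

found = unmanaged_indices(sample)
for stream_name, index_names in found.items():
    print(stream_name, index_names)
```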

@lahsivjar
Contributor Author

lahsivjar commented Sep 3, 2024

Testing with the build candidate (BC)

Create a new 8.15.1+ cluster, with defaults: data stream lifecycle (DSL) should be used

✅ Tested
{
  "data_streams": [
    {
      "name": "traces-apm-default",
      "timestamp_field": {
        "name": "@timestamp"
      },
      "indices": [
        {
          "index_name": ".ds-traces-apm-default-2024.09.03-000001",
          "index_uuid": "T5YO3keARTqvNSbSOKFuJQ",
          "prefer_ilm": false,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Data stream lifecycle"
        }
      ],
      "generation": 1,
      "_meta": {
        "managed": true,
        "description": "Index template for traces-apm-*"
      },
      "status": "YELLOW",
      "template": "traces-apm@template",
      "lifecycle": {
        "enabled": true,
        "data_retention": "10d"
      },
      "ilm_policy": "traces-apm.traces-default_policy",
      "next_generation_managed_by": "Data stream lifecycle",
      "prefer_ilm": false,
      "hidden": false,
      "system": false,
      "allow_custom_routing": false,
      "replicated": false,
      "rollover_on_write": false
    }
  ]
}

Create a new 8.15.1+ cluster, with customised ILM (https://www.elastic.co/guide/en/observability/current/apm-ilm-how-to.html#apm-data-streams-custom-three): customised ILM policy should be used

✅ Tested (requires setting `"prefer_ilm": true` in the `@custom` component template)
{
  "data_streams": [
    {
      "name": "traces-apm-default",
      "timestamp_field": {
        "name": "@timestamp"
      },
      "indices": [
        {
          "index_name": ".ds-traces-apm-default-2024.09.03-000001",
          "index_uuid": "T5YO3keARTqvNSbSOKFuJQ",
          "prefer_ilm": false,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Data stream lifecycle"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.03-000002",
          "index_uuid": "2gyoEUu7TvSdcEQm6NDJUw",
          "prefer_ilm": false,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Data stream lifecycle"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.03-000003",
          "index_uuid": "qRE3mbVORzKdhJ7Iy9TafQ",
          "prefer_ilm": true,
          "ilm_policy": "custom-ilmdlm",
          "managed_by": "Index Lifecycle Management"
        }
      ],
      "generation": 3,
      "_meta": {
        "managed": true,
        "description": "Index template for traces-apm-*"
      },
      "status": "YELLOW",
      "template": "traces-apm@template",
      "lifecycle": {
        "enabled": true,
        "data_retention": "10d"
      },
      "ilm_policy": "custom-ilmdlm",
      "next_generation_managed_by": "Index Lifecycle Management",
      "prefer_ilm": true,
      "hidden": false,
      "system": false,
      "allow_custom_routing": false,
      "replicated": false,
      "rollover_on_write": false
    }
  ]
}

Upgrade 8.14.x to 8.15.1+ with defaults: ILM should continue to be used for old indices, DSL should be used for new indices

⚠️ Tested (note that new indices continue to use ILM after the upgrade)
{
  "data_streams": [
    {
      "name": "traces-apm-default",
      "timestamp_field": {
        "name": "@timestamp"
      },
      "indices": [
        {
          "index_name": ".ds-traces-apm-default-2024.09.03-000001",
          "index_uuid": "Y4BYJa12S965zieMmkKP2g",
          "prefer_ilm": true,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.03-000002",
          "index_uuid": "ksh_hUWmSVCZpBNPdlePvQ",
          "prefer_ilm": true,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.03-000003",
          "index_uuid": "G7ujwqE8QiCY-SWxwlSitQ",
          "prefer_ilm": false,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Index Lifecycle Management"
        }
      ],
      "generation": 3,
      "_meta": {
        "package": {
          "name": "apm"
        },
        "managed_by": "fleet",
        "managed": true
      },
      "status": "YELLOW",
      "template": "traces-apm@template",
      "ilm_policy": "traces-apm.traces-default_policy",
      "next_generation_managed_by": "Index Lifecycle Management",
      "prefer_ilm": false,
      "hidden": false,
      "system": false,
      "allow_custom_routing": false,
      "replicated": false,
      "rollover_on_write": false
    }
  ]
}
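
The result above, where an index has `"prefer_ilm": false` yet is still managed by ILM, fits a simple rule that matches every output pasted in this comment. The sketch below encodes that rule as inferred from these outputs only; it is an assumption about the behaviour, not a copy of the Elasticsearch implementation, and the function name is made up for this example.

```python
from typing import Optional


def expected_managed_by(ilm_policy: Optional[str], prefer_ilm: bool, has_lifecycle: bool) -> str:
    """Guess the managed_by value for a backing index (rule inferred from this thread)."""
    # ILM wins when a policy is attached and either the index prefers ILM
    # or the data stream has no lifecycle to fall back on.
    if ilm_policy is not None and (prefer_ilm or not has_lifecycle):
        return "Index Lifecycle Management"
    # Otherwise the data stream lifecycle takes over, if one is configured.
    if has_lifecycle:
        return "Data stream lifecycle"
    # No usable ILM policy and no lifecycle: nothing manages the index.
    return "Unmanaged"


# Upgraded cluster with defaults: policy attached, prefer_ilm false, no lifecycle.
print(expected_managed_by("traces-apm.traces-default_policy", False, False))

# Index created in 8.15.0: no policy attached and no lifecycle on the data stream.
print(expected_managed_by(None, True, False))

# Fresh 8.15.1+ cluster: policy attached, prefer_ilm false, lifecycle configured.
print(expected_managed_by("traces-apm.traces-default_policy", False, True))
```

Read this way, the upgraded data stream keeps using ILM because it has no `lifecycle` block, not because of the `prefer_ilm` value on the new index.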

Upgrade 8.14.x to 8.15.1+ with customised ILM (same guide as above): ILM should continue to be used for old indices, and also for new indices

✅ Tested
{
  "data_streams": [
    {
      "name": "traces-apm-default",
      "timestamp_field": {
        "name": "@timestamp"
      },
      "indices": [
        {
          "index_name": ".ds-traces-apm-default-2024.09.03-000001",
          "index_uuid": "kU3XXLL0Riq603vV8rOHwA",
          "prefer_ilm": true,
          "ilm_policy": "custom-ilmdlm",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.03-000002",
          "index_uuid": "FPO8lBu4T8W29TcsxfAZ8g",
          "prefer_ilm": true,
          "ilm_policy": "custom-ilmdlm",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.03-000003",
          "index_uuid": "OWSm1CIVTBK8txzi4E23_g",
          "prefer_ilm": false,
          "ilm_policy": "custom-ilmdlm",
          "managed_by": "Index Lifecycle Management"
        }
      ],
      "generation": 3,
      "_meta": {
        "package": {
          "name": "apm"
        },
        "managed_by": "fleet",
        "managed": true
      },
      "status": "YELLOW",
      "template": "traces-apm@template",
      "ilm_policy": "custom-ilmdlm",
      "next_generation_managed_by": "Index Lifecycle Management",
      "prefer_ilm": false,
      "hidden": false,
      "system": false,
      "allow_custom_routing": false,
      "replicated": false,
      "rollover_on_write": false
    }
  ]
}

Upgrade 8.14.x to 8.15.0 with custom ILM policy: custom ILM policy should work as expected (no bugs for this case)

✅ Tested
{
  "data_streams": [
    {
      "name": "traces-apm-default",
      "timestamp_field": {
        "name": "@timestamp"
      },
      "indices": [
        {
          "index_name": ".ds-traces-apm-default-2024.09.03-000001",
          "index_uuid": "FF5029kASPiBFiXQIIpHFQ",
          "prefer_ilm": true,
          "ilm_policy": "custom-ilmdlm",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.03-000002",
          "index_uuid": "hSYm1-G1QOeqc2DmEyTVGw",
          "prefer_ilm": true,
          "ilm_policy": "custom-ilmdlm",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.03-000003",
          "index_uuid": "HURNF7zhTYG_8NQGSRTPRw",
          "prefer_ilm": true,
          "ilm_policy": "custom-ilmdlm",
          "managed_by": "Index Lifecycle Management"
        }
      ],
      "generation": 3,
      "_meta": {
        "package": {
          "name": "apm"
        },
        "managed_by": "fleet",
        "managed": true
      },
      "status": "YELLOW",
      "template": "traces-apm@template",
      "ilm_policy": "custom-ilmdlm",
      "next_generation_managed_by": "Index Lifecycle Management",
      "prefer_ilm": true,
      "hidden": false,
      "system": false,
      "allow_custom_routing": false,
      "replicated": false,
      "rollover_on_write": false
    }
  ]
}
✅ Tested (after upgrading the same cluster from 8.15.0 to 8.15.1, continues to use ILM)
{
  "data_streams": [
    {
      "name": "traces-apm-default",
      "timestamp_field": {
        "name": "@timestamp"
      },
      "indices": [
        {
          "index_name": ".ds-traces-apm-default-2024.09.03-000001",
          "index_uuid": "FF5029kASPiBFiXQIIpHFQ",
          "prefer_ilm": true,
          "ilm_policy": "custom-ilmdlm",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.03-000002",
          "index_uuid": "hSYm1-G1QOeqc2DmEyTVGw",
          "prefer_ilm": true,
          "ilm_policy": "custom-ilmdlm",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.03-000003",
          "index_uuid": "HURNF7zhTYG_8NQGSRTPRw",
          "prefer_ilm": true,
          "ilm_policy": "custom-ilmdlm",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.03-000004",
          "index_uuid": "D7YUM1PdSsSpH-taPns1vw",
          "prefer_ilm": false,
          "ilm_policy": "custom-ilmdlm",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.03-000005",
          "index_uuid": "gbSyICRJQl-x0T6Dceq1XA",
          "prefer_ilm": false,
          "ilm_policy": "custom-ilmdlm",
          "managed_by": "Index Lifecycle Management"
        }
      ],
      "generation": 5,
      "_meta": {
        "package": {
          "name": "apm"
        },
        "managed_by": "fleet",
        "managed": true
      },
      "status": "YELLOW",
      "template": "traces-apm@template",
      "ilm_policy": "custom-ilmdlm",
      "next_generation_managed_by": "Index Lifecycle Management",
      "prefer_ilm": false,
      "hidden": false,
      "system": false,
      "allow_custom_routing": false,
      "replicated": false,
      "rollover_on_write": false
    }
  ]
}

Upgrade 8.14.x to 8.15.0 with default ILM policy and then upgrade to 8.15.1

⚠️ Tested (indices created in 8.15.0 remain unmanaged but new ones are managed by ILM)
{
  "data_streams": [
    {
      "name": "traces-apm-default",
      "timestamp_field": {
        "name": "@timestamp"
      },
      "indices": [
        {
          "index_name": ".ds-traces-apm-default-2024.09.03-000001",
          "index_uuid": "HdvHJejIRM6BepeJpoZP9Q",
          "prefer_ilm": true,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.03-000002",
          "index_uuid": "oxcuDgYiQu6uEzwcbjRc0w",
          "prefer_ilm": true,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.03-000003",
          "index_uuid": "VIhBWb41RWyObWU06CDwDQ",
          "prefer_ilm": true,
          "managed_by": "Unmanaged"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.03-000004",
          "index_uuid": "WH4nZ_zwQySm5nqyg7fOYQ",
          "prefer_ilm": true,
          "managed_by": "Unmanaged"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.03-000005",
          "index_uuid": "IboaSTnBTtmgYpnCLS-hPw",
          "prefer_ilm": false,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Index Lifecycle Management"
        }
      ],
      "generation": 5,
      "_meta": {
        "package": {
          "name": "apm"
        },
        "managed_by": "fleet",
        "managed": true
      },
      "status": "YELLOW",
      "template": "traces-apm@template",
      "ilm_policy": "traces-apm.traces-default_policy",
      "next_generation_managed_by": "Index Lifecycle Management",
      "prefer_ilm": false,
      "hidden": false,
      "system": false,
      "allow_custom_routing": false,
      "replicated": false,
      "rollover_on_write": false
    }
  ]
}
✅ Tested (after explicitly applying the data stream lifecycle config)

The applied config is the one we recommend in the known-issues doc.

{
  "data_streams": [
    {
      "name": "traces-apm-default",
      "timestamp_field": {
        "name": "@timestamp"
      },
      "indices": [
        {
          "index_name": ".ds-traces-apm-default-2024.09.03-000001",
          "index_uuid": "HdvHJejIRM6BepeJpoZP9Q",
          "prefer_ilm": true,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.03-000002",
          "index_uuid": "oxcuDgYiQu6uEzwcbjRc0w",
          "prefer_ilm": true,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.03-000003",
          "index_uuid": "VIhBWb41RWyObWU06CDwDQ",
          "prefer_ilm": true,
          "managed_by": "Data stream lifecycle"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.03-000004",
          "index_uuid": "WH4nZ_zwQySm5nqyg7fOYQ",
          "prefer_ilm": true,
          "managed_by": "Data stream lifecycle"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.03-000005",
          "index_uuid": "IboaSTnBTtmgYpnCLS-hPw",
          "prefer_ilm": false,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Data stream lifecycle"
        },
        {
          "index_name": ".ds-traces-apm-default-2024.09.03-000006",
          "index_uuid": "YAC31m-rTzudlDtiD3LBPg",
          "prefer_ilm": false,
          "ilm_policy": "traces-apm.traces-default_policy",
          "managed_by": "Data stream lifecycle"
        }
      ],
      "generation": 6,
      "_meta": {
        "package": {
          "name": "apm"
        },
        "managed_by": "fleet",
        "managed": true
      },
      "status": "YELLOW",
      "template": "traces-apm@template",
      "lifecycle": {
        "enabled": true,
        "data_retention": "10d"
      },
      "ilm_policy": "traces-apm.traces-default_policy",
      "next_generation_managed_by": "Data stream lifecycle",
      "prefer_ilm": false,
      "hidden": false,
      "system": false,
      "allow_custom_routing": false,
      "replicated": false,
      "rollover_on_write": false
    }
  ]
}
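
For reference, the explicit lifecycle config comes down to one `PUT /_data_stream/<name>/_lifecycle` call per data stream. The sketch below only builds and prints those requests, assuming the caller sends them with a client of their choice. The 10d retention for `traces-apm-default` is taken from the output above; retention values for any other data stream would need to be filled in from the known-issues doc.

```python
import json


def lifecycle_request(data_stream: str, data_retention: str) -> tuple:
    """Return (method, path, body) for setting a data stream lifecycle."""
    path = f"/_data_stream/{data_stream}/_lifecycle"
    body = {"data_retention": data_retention}
    return "PUT", path, body


# Only the traces value is confirmed by this thread; extend this mapping as needed.
retention_by_stream = {"traces-apm-default": "10d"}

requests_to_send = [
    lifecycle_request(name, retention)
    for name, retention in retention_by_stream.items()
]

for method, path, body in requests_to_send:
    print(method, path)
    print(json.dumps(body))
```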
