[SLO] Rename indices without dot prefix #196340

Open
6 tasks
kdelemme opened this issue Oct 15, 2024 · 10 comments
Assignees
Labels
Team:obs-ux-management Observability Management User Experience Team

Comments

@kdelemme
Contributor

kdelemme commented Oct 15, 2024

🌮 Summary

Since https://github.com/elastic/serverless-gitops/pull/4615, users cannot create dot-prefixed indices, and therefore SLO transforms and ingest pipelines creating dot-prefixed indices for the rollup data are broken.

2024-10-15 13:39:57	Transform encountered an exception: [org.elasticsearch.xpack.transform.transforms.BulkIndexingException: Bulk index experienced [14] failures and at least 1 irrecoverable [Index [<.slo-observability.sli-v3.3.{2024-10-15||/M{yyyy-MM-dd|UTC}}>] name beginning with a dot (.) is not allowed].; java.lang.IllegalArgumentException: Index [<.slo-observability.sli-v3.3.{2024-10-15||/M{yyyy-MM-dd|UTC}}>] name beginning with a dot (.) is not allowed]; Will automatically retry [1/-1]

The current SLO version (v3.3) uses dot-prefixed indices for both the sli and summary data.

  • Every rollup ingest pipeline (one per SLO rollup transform) uses a date_index_name processor to create new monthly indices: .slo-observability.sli-v${SLO_RESOURCES_VERSION}.YYYY-MM-DD/M
  • Every rollup transform uses the dot-prefixed sli index and the rollup ingest pipeline as destination
  • Every summary transform uses the dot-prefixed sli index as source and the dot-prefixed summary index as destination
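The naming scheme described above can be sketched as follows. This is a minimal illustration, not the actual Kibana implementation: the `@timestamp` field and the helper names are assumptions, while the prefix, monthly rounding, date format, and UTC timezone are taken from the index expression in the error log.

```typescript
// Version taken from this issue (v3.3); the constant names are hypothetical.
const SLO_RESOURCES_VERSION = "3.3";
const SLI_INDEX_PREFIX = `.slo-observability.sli-v${SLO_RESOURCES_VERSION}.`;

interface DateIndexNameProcessor {
  date_index_name: {
    field: string;
    index_name_prefix: string;
    date_rounding: "y" | "M" | "w" | "d" | "h" | "m" | "s";
    index_name_format: string;
    timezone: string;
  };
}

// Builds a date_index_name processor that routes documents to monthly indices.
// "@timestamp" is an assumed source field, not confirmed by the issue.
function buildRollupProcessor(prefix: string): DateIndexNameProcessor {
  return {
    date_index_name: {
      field: "@timestamp",
      index_name_prefix: prefix,
      date_rounding: "M",
      index_name_format: "yyyy-MM-dd",
      timezone: "UTC",
    },
  };
}

// Approximates the concrete index name: the date rounded down to the first
// day of its UTC month, appended to the prefix.
function resolveMonthlyIndex(prefix: string, date: Date): string {
  const year = date.getUTCFullYear();
  const month = date.getUTCMonth() + 1;
  const mm = month < 10 ? `0${month}` : `${month}`;
  return `${prefix}${year}-${mm}-01`;
}

// Mirrors the rule in the error message: names starting with "." are rejected.
function startsWithDot(indexName: string): boolean {
  return indexName.charAt(0) === ".";
}
```

With the current prefix, every resolved index name starts with a dot, which is why each bulk write from the rollup transform fails once the restriction is active.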

We can't reset the SLOs because we need the user's API key to create the rollup transforms with the correct permission on the source index.

🎯 Acceptance criteria

  • Rename dot-prefixed indices to non-dot-prefixed indices
  • Bump version to v4.0 (maybe minor only, i.e. v3.4?)
  • New SLO uses the non-dot-prefixed indices
  • Handle backward compatibility: existing SLOs should keep working on stateful (but will break on serverless due to index validation)
  • Every read is done on dot-prefixed indices v3.3 and on non-dot-prefixed indices v3.4 (or v4.0)
  • Create/Update/Reset SLO should store the user's API key so unforeseen future upgrades can be done on behalf of the user. Where?
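A rough sketch of the dual-read criterion, assuming hypothetical index patterns (the issue has not settled on v3.4 versus v4.0, nor on the new index names). The optional remote-cluster prefix uses the standard `cluster:index` cross-cluster search syntax.

```typescript
// Hypothetical patterns: the legacy one matches the v3.3 dot-prefixed indices,
// the new one assumes a v4.0 bump, which is still undecided in this issue.
const LEGACY_SLI_PATTERN = ".slo-observability.sli-v3.3*";
const NEW_SLI_PATTERN = "slo-observability.sli-v4.0*";

// Returns a comma-separated index expression covering both generations,
// optionally repeated for each remote cluster (cluster:index syntax).
function sliReadExpression(remoteClusters: string[] = []): string {
  const local = [LEGACY_SLI_PATTERN, NEW_SLI_PATTERN];
  const remote: string[] = [];
  remoteClusters.forEach((cluster) => {
    local.forEach((pattern) => {
      remote.push(`${cluster}:${pattern}`);
    });
  });
  return local.concat(remote).join(",");
}
```

Reading from both patterns keeps existing v3.3 data visible while new SLOs write only to the non-dot-prefixed indices.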

Plan

to be defined

@botelastic botelastic bot added the needs-team Issues missing a team label label Oct 15, 2024
@kdelemme kdelemme added the Team:obs-ux-management Observability Management User Experience Team label Oct 15, 2024
@elasticmachine
Contributor

Pinging @elastic/obs-ux-management-team (Team:obs-ux-management)

@botelastic botelastic bot removed the needs-team Issues missing a team label label Oct 15, 2024
@kdelemme kdelemme self-assigned this Oct 15, 2024
@jasonrhodes
Member

Thanks for logging this, @kdelemme -- for right now, I don't know if we want to change SLO index names because dot-prefixed indices make logical sense for these indices. So for now, I think we won't move forward on THIS, but we'll need to check in with platform to see how they are going to enable this for us if the dot prefix restriction is turned on again.

What would be GREAT is if the secondary auth split ONLY used the stored user's credentials for reading from source indices, and used the internal user for everything else including the write/create side of the transform and ingest pipelines.

@dakrone
Member

dakrone commented Oct 30, 2024

@jasonrhodes you mentioned that dot-prefixed indices make sense for these — is there a plan to eventually make these system indices?

@dominiqueclarke dominiqueclarke self-assigned this Oct 30, 2024
@dominiqueclarke
Contributor

@dakrone I don't think I have a great understanding of what counts as a system index. Could you clarify? Our component templates are indeed marked as hidden. It's not clear to me what designates an index as a system index. Some setting in elasticsearch? Marking your component template as hidden? Naming it with a dot prefix? Thanks, I really appreciate the context.

@dakrone
Member

dakrone commented Oct 31, 2024

I don't think I have a great understanding of what counts as a system index. Could you clarify?

A system index is one where the user is not intended to perform actions directly on the index itself. Rather, the user would interact with Elasticsearch through specialized APIs designed to modify the index in some way. This prevents the user from doing something that could break the functionality of an area.

As an example, the .security index is a system index inside Elasticsearch that is used for storing users and roles. The user cannot directly access this index; instead, they have to use the put-role, get-role, put-user, get-user, etc. APIs. This allows us to lock things down where a user could potentially break the function of the cluster by doing something like deleting the index, modifying the mappings, or changing index settings.

@dominiqueclarke I'm not familiar enough with what data goes into the SLO indices to be able to make a recommendation here, perhaps you can elaborate on how a user interacts with these indices?

@jasonrhodes mentioned dot prefixes make logical sense for these indices, can you explain more on why you think these indices should begin with a dot? They seem to be internal, but a dot prefix is not a requirement for an index to be hidden, so I'd like to understand the motivation there as well.

@kdelemme
Contributor Author

@dakrone

I'm not familiar enough with what data goes into the SLO indices to be able to make a recommendation here, perhaps you can elaborate on how a user interacts with these indices?

Technically, the .slo-* related indices are only populated by the system, e.g. by some transforms. But a user can directly query these indices to graph the data, and we also query them from various Lens charts directly in Kibana.
Also, we use CCS to access the data (.slo-*) from remote clusters in order to display it in the local cluster.

So we cannot really make them system indices.

When we began working on SLO, we were under the impression that dot-prefixed indices were treated as hidden indices by default. And since we didn't want customers to delete the index by mistake, we went with that.

@dakrone
Member

dakrone commented Oct 31, 2024

Okay, it sounds like it would be best if these were indices that were hidden, but without a dot-prefix.
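For illustration, a hidden index does not need a dot prefix: the `index.hidden` setting can be applied through an index template. The sketch below is only an assumption of what such a template body might look like; the index pattern is hypothetical and not a name agreed on in this thread.

```typescript
// Minimal shape of an index template body, reduced to the fields used here.
interface IndexTemplateSketch {
  index_patterns: string[];
  template: { settings: { index: { hidden: boolean } } };
}

// Hypothetical non-dot-prefixed pattern; hidden is set explicitly rather
// than implied by the index name.
const hiddenSliTemplate: IndexTemplateSketch = {
  index_patterns: ["slo-observability.sli-v4.0*"],
  template: { settings: { index: { hidden: true } } },
};

// True when every pattern avoids a leading dot and the template marks
// matching indices as hidden.
function isHiddenWithoutDotPrefix(t: IndexTemplateSketch): boolean {
  const noDot = t.index_patterns.every((p) => p.charAt(0) !== ".");
  return noDot && t.template.settings.index.hidden;
}
```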

@dominiqueclarke
Contributor

@dakrone based on the original email announcing this change, we believe that the SLO app's use of dot-prefixed indices falls squarely within the intended usage. Without specifically quoting that internal email in a public issue, I'll enumerate my points:

  1. The change is intended to still allow Elastic solutions and products (eg apm, fleet, etc) to create hidden indices. SLO is an Elastic product on the same level as the two products mentioned.
  2. Currently, user-created indices may conflict with dot indices. The change to prevent users from creating dot indices resolves this. If we rename our index to a non-dot index, we risk the same level of conflict. We want to avoid conflicts with user-created indices by continuing to use dot indices.
  3. We want to retain control over the correct retention periods in serverless and leave open the possibility of more complex billing policies surrounding these indices
  4. These indices are never directly created by the user. They are created via an API that defines an SLO. SLOs use transforms as an implementation detail. Unfortunately, the user's credentials are captured and used when generating that transform, which requires the user to have read privileges for the source index and write privileges for the destination. The destination for the transform is a dot-prefixed index. We intend to reach out to the ML team regarding the ES secondary auth available for the transform API to see if we can somehow use the internal user's privileges for both creating the transform AND writing to/creating the dot-prefixed destination index, while only using the actual user's privileges for reading from the source index.
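As background for point 4, Elasticsearch accepts a secondary set of credentials through the `es-secondary-authorization` header, which uses the same syntax as `Authorization`. The sketch below only shows how such headers might be assembled; the function and parameter names are hypothetical, and whether the transform API can apply the secondary credentials to source reads only is exactly the open question raised above.

```typescript
// Hypothetical helper: combines the internal user's credentials (primary)
// with the end user's API key (secondary) for a create-transform request.
function buildTransformRequestHeaders(
  internalAuthorization: string,
  encodedUserApiKey: string
): Record<string, string> {
  return {
    // Primary credentials: the caller that issues the request.
    authorization: internalAuthorization,
    // Secondary credentials: same syntax as the Authorization header.
    "es-secondary-authorization": `ApiKey ${encodedUserApiKey}`,
  };
}
```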

@jasonrhodes
Member

+1 to everything @dominiqueclarke said, this is exactly why it seems like the SLO indices are a prime example of the way dot-prefixed indices are being described in the most recent email communications about them.

@jasonrhodes
Member

Note: I've created an issue to test out a possible solution to this situation - #200953


5 participants