[ABC Templates] Proposal on Implementation Details #14649

Open
mgodwan opened this issue Jul 4, 2024 · 1 comment · Fixed by #14659
Labels
enhancement Enhancement or improvement to existing feature or request Indexing:Performance RFC Issues requesting major changes Roadmap:Search Project-wide roadmap label v2.16.0 Issues and PRs related to version 2.16.0 v2.17.0

mgodwan commented Jul 4, 2024

Is your feature request related to a problem? Please describe

GitHub RFC: #12683

Overview

The templates proposed in RFC #12683 aim to provide a predefined set of system templates (made available as part of the distribution), which let users create indices/index templates for specific use cases without worrying about the granular tuning those use cases require. This document covers the decisions being proposed on the way to a final proposal for how these templates can be implemented.

Requirements

  1. Application Based Configuration (ABC) templates should provide a sensible set of defaults for use-cases to tune their performance (compute, storage) and usability (index state management)
  2. Suggest Schema based on standards so that different types of integrations can leverage the default settings.
  3. Changes to core settings should be taken care of in the ABC templates as part of new version releases.
  4. New features applicable to the use cases catered to by these templates should be reflected in the templates, and apply to the associated index/index template as well.
  5. ABC templates should be validated and made available for use as part of the bootstrap process. Any plugin dependent settings should be inferred and validated along with fallback options.

Proposed Implementation Details

Reusing Composable Index/Component Templates Resource

Background: Component templates allow settings and mappings to be declared, which can then be used to create composable index templates. Composable index templates are defined using a schema that allows a template definition along with multiple component templates from which it can be composed. These are applied to an index whose name matches the pattern defined in the index template definition. If multiple composable index templates match the index name pattern, the one with the highest priority is used. Some of the properties of index templates today are:

  1. Component templates today don't allow parameterization at all. Settings/mappings can be declared, but there is no way to ensure that users provide certain settings during index creation. Since these are used by index templates today through the composed_of mechanism, we cannot parameterize them without breaking compatibility.
  2. Index templates are used only prior to index creation; any later changes to an index template are not applied to indices that have already been created.
  3. Only the latest version of these templates is stored in the cluster state.
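
The priority rule described in the background can be illustrated with a minimal Python sketch. This is not OpenSearch code: `select_index_template` and the template names are hypothetical, patterns are matched with Python's `fnmatchcase` as a simplification of index-pattern wildcards, and breaking ties by name is an assumption made only to keep the example deterministic.

```python
from fnmatch import fnmatchcase


def select_index_template(index_name, templates):
    """Return the name of the highest-priority template matching index_name.

    `templates` maps a template name to a dict with "index_patterns" and an
    optional "priority" (treated as 0 when absent).
    """
    matching = [
        (body.get("priority", 0), name)
        for name, body in templates.items()
        if any(fnmatchcase(index_name, p) for p in body["index_patterns"])
    ]
    if not matching:
        return None
    # max() compares priority first; the name only breaks ties (assumption).
    return max(matching)[1]


templates = {
    "logs-default": {"index_patterns": ["logs-*"], "priority": 10},
    "logs-nginx": {"index_patterns": ["logs-nginx-*"], "priority": 50},
}
```

Here `select_index_template("logs-nginx-2024", templates)` returns `"logs-nginx"`, because both templates match but the second has the higher priority.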

Proposal: Use component templates as the building block (storage model), and exclude these templates from use in the composed_of declaration of composable index templates.

  • Component Templates can be used as the building block for ABC templates.
  • ABC Templates can be referred to from an index template or an index [We can start with index templates].
  • ABC Templates will not be tied to any index pattern directly but will need to come from the context definition while creating an index template/index.
  • We can also create specific templates (e.g. OTEL Schema indices) using ABC Templates through a context definition

If we go ahead with this modeling, we can reuse the existing component template storage model, but we will need to add support, within the application logic, for parameterized setting values in the template definition.

We can't add parameterization support to composable index templates directly as of today, since that would break the auto-create index flow during rollovers; this is why the new internal template layer is needed. Hence, any index template tied to an ABC template will need to declare concrete values for the parameters provided in the ABC template.

e.g.

logs-basic.json // ABC Template

{
  "template": {
    "settings": {
      "codec": "best_compression",
      "merge.policy": "log_byte_size",
      "refresh_interval": "60s"
    }
  },
  "_meta": {
    "__abc_template": true,
    "_version": "1.0.0"
  },
  "version": 1
}



# Users can do the following. Order of application is component template, followed by declared template, followed by context definition (the context definition supersedes all declarations).
PUT _index_template/my-logs
{
    "index_patterns": ["my-logs-*"],
    "template": {
        "settings": {
            "refresh_interval": "60s"
        }
    },
    "context": {
        "name": "logs-basic",
        "version": "_latest", # Version is optional; default is latest with upgrades enabled
        "params": {} # Required if params are present in the ABC template
    }
}
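
The requirement that an index template supplies concrete values for any parameters in the ABC template could be checked along the following lines. This is only a sketch: the proposal does not define a placeholder syntax, so the `${name}` form, the `resolve_params` helper, and the `refresh` parameter are all hypothetical.

```python
import re

# Hypothetical placeholder syntax; the proposal does not specify one.
PLACEHOLDER = re.compile(r"^\$\{(\w+)\}$")


def resolve_params(abc_settings, params):
    """Replace placeholder values in abc_settings with values from params.

    Raises ValueError when a parameter used by the template is not supplied.
    """
    resolved = {}
    missing = []
    for key, value in abc_settings.items():
        match = PLACEHOLDER.match(value) if isinstance(value, str) else None
        if match is None:
            resolved[key] = value  # plain setting, copied unchanged
        elif match.group(1) in params:
            resolved[key] = params[match.group(1)]
        else:
            missing.append(match.group(1))
    if missing:
        raise ValueError(f"missing context params: {sorted(missing)}")
    return resolved


abc_settings = {"codec": "best_compression", "refresh_interval": "${refresh}"}
resolved = resolve_params(abc_settings, {"refresh": "30s"})
```

With an empty `params` map the same call raises `ValueError`, which mirrors the rule that `params` is required whenever the ABC template declares parameters.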

Storage Model

  • We can reuse the cluster state component for component templates to store the ABC templates
  • ABC templates will have different application logic to ensure:
    • They are marked as distribution provided templates through a metadata field.
    • They cannot be updated via APIs (i.e. only changes in the repository can update the templates)
    • Validation logic for ABC templates will differ to account for support of parameterized settings.
  • Any unused template version can be purged over time (except for latest) if they remain unused in snapshots, indices, etc.
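
The storage rules above can be sketched as a few small checks. The `__abc_template` metadata key comes from the example template earlier in this proposal; the function names, the `from_repository` flag, and the use of integer versions are assumptions for illustration, not the actual implementation.

```python
def is_abc_template(component_template):
    """True when the template carries the distribution-provided marker."""
    return component_template.get("_meta", {}).get("__abc_template") is True


def validate_api_update(existing, from_repository=False):
    """Reject API-driven updates to an existing ABC template."""
    if existing is not None and is_abc_template(existing) and not from_repository:
        raise PermissionError(
            "ABC templates can only be updated from the template repository"
        )


def purgeable_versions(stored_versions, referenced_versions):
    """Versions that may be purged: neither the latest nor still referenced."""
    latest = max(stored_versions)
    return sorted(
        v for v in stored_versions
        if v != latest and v not in referenced_versions
    )


abc = {
    "template": {"settings": {"refresh_interval": "60s"}},
    "_meta": {"__abc_template": True},
    "version": 1,
}
```

For example, `purgeable_versions([1, 2, 3], {2})` returns `[1]`: version 3 is the latest and version 2 is still referenced.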

Applying Template Upgrades

Depending on the entity referencing an ABC template through its context, we will allow that entity's context to be updated when templates are upgraded to a new version.

Today, any updates to component templates are also applied to the index templates referring to them. With ABC templates, we will extend this to indices as well. By default, template upgrades will apply to the entity using them in its context.
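
One way to read the upgrade behaviour is that a context either follows the latest template version or stays on a pinned one. The sketch below assumes dotted numeric version strings (as in the `_version` field of the example template) and assumes that a pinned version opts out of upgrades; both points, and the function names, are illustrative rather than decided.

```python
LATEST = "_latest"  # sentinel used in the example context definition


def _version_key(version):
    """Turn "1.10.0" into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def follows_upgrades(context):
    """A context without a version, or with "_latest", tracks upgrades."""
    return context.get("version", LATEST) == LATEST


def resolve_context_version(context, available_versions):
    """Pick the template version a context should currently use."""
    requested = context.get("version", LATEST)
    if requested == LATEST:
        return max(available_versions, key=_version_key)
    if requested not in available_versions:
        raise LookupError(f"unknown template version: {requested}")
    return requested


versions = ["1.0.0", "1.10.0", "1.2.0"]
```

With these versions, `resolve_context_version({"name": "logs-basic"}, versions)` returns `"1.10.0"`, while a context pinned to `"1.2.0"` keeps resolving to `"1.2.0"`.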

Template hosting

  • We can start with a code-based config repository which holds the basic ABC templates
  • These templates will be exposed through an interface which allows for custom dynamic repositories, along with the loading logic applicable to each of them (this allows extension to support loading of generic resources)
  • The repository will support versioning on top of the templates it provides.
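
The hosting model above could take roughly the following shape. This is a sketch of the idea only: `SystemTemplate`, `TemplateRepository`, and `InCodeRepository` are hypothetical names, and the eventual interface in OpenSearch may look quite different.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemTemplate:
    name: str
    version: str
    body: dict


class TemplateRepository(ABC):
    """Interface that any template source (in-code, remote, ...) implements."""

    @abstractmethod
    def list_templates(self):
        """Return the (name, version) pairs this repository provides."""

    @abstractmethod
    def load(self, name, version):
        """Return the SystemTemplate for the given name and version."""


class InCodeRepository(TemplateRepository):
    """Simplest case: templates bundled with the distribution."""

    def __init__(self, templates):
        self._templates = {(t.name, t.version): t for t in templates}

    def list_templates(self):
        return sorted(self._templates)

    def load(self, name, version):
        return self._templates[(name, version)]


repo = InCodeRepository([
    SystemTemplate("logs-basic", "1.0.0", {"settings": {"refresh_interval": "60s"}}),
])
```

Keeping listing and loading separate lets a caller compare versions first and load template bodies only for the entries that actually changed.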

Overrides through other Templates/Index Settings

  • Creation: Ordering in index templates and indices will be as follows:
    • First, component templates are applied [applicable to index templates only]
    • Declared settings/mappings in index templates are applied thereafter
      • For any settings which coincide with ones declared in ABC templates, the ABC template settings are used
    • This is followed by applying ABC templates [i.e. ABC templates supersede any setting coming through component templates]. If there is a conflict between what is declared through an index template and through ABC templates, we can either fail during simulation or handle it gracefully.
  • Setting Updates:
    • Any settings which have been applied through the context should not be changed during the index lifecycle while the index continues to be associated with the context. We will need to add validation for this in index settings flows.
    • The context can be removed as a whole from the index/index templates if users do not want to be tied to the constraints which come with it. This can be implemented at a later point in time.
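
The ordering and update rules above can be sketched with plain dictionaries. The function names are hypothetical and the settings are flattened to simple key/value pairs; the sketch shows the layering (component templates, then declared settings, then the ABC context) and the two checks mentioned: detecting conflicts at creation time and rejecting later changes to context-owned settings.

```python
def compose_settings(component_layers, declared, context):
    """Apply layers in order; later layers override earlier ones."""
    merged = {}
    for layer in component_layers:  # component templates, in composed_of order
        merged.update(layer)
    merged.update(declared)  # settings declared in the index template
    merged.update(context)   # ABC template settings supersede everything
    return merged


def conflicting_settings(declared, context):
    """Settings declared by the user with a value that differs from the context."""
    return sorted(
        key for key in declared
        if key in context and declared[key] != context[key]
    )


def validate_settings_update(update, context):
    """Reject updates that touch settings applied through the context."""
    locked = sorted(key for key in update if key in context)
    if locked:
        raise ValueError(f"settings managed by the context: {locked}")


merged = compose_settings(
    [{"number_of_shards": 3, "refresh_interval": "1s"}],
    {"refresh_interval": "30s", "number_of_replicas": 2},
    {"refresh_interval": "60s", "codec": "best_compression"},
)
```

In this example `merged["refresh_interval"]` is `"60s"`: the context value wins over both the component template and the declared setting, and `conflicting_settings` would report `refresh_interval` so the request can be failed during simulation if that option is chosen.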

Handling Cluster upgrades

  • Whenever a new cluster manager gets elected, it can query the repository in which templates are hosted, and apply any differences from the last known version to the applicable resource (i.e. cluster state). With version checks in place, we can limit the additional work and only traverse and apply the actual changes.
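
The version check on cluster manager election might look like the following. It assumes each side can be summarized as a map from template name to its latest dotted version string; `pending_template_updates` and the template names are hypothetical.

```python
def _version_key(version):
    """Turn "1.10.0" into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def pending_template_updates(repository, cluster_state):
    """Return templates that are new or newer in the repository.

    Both arguments map a template name to its latest known version.
    """
    pending = {}
    for name, version in repository.items():
        known = cluster_state.get(name)
        if known is None or _version_key(version) > _version_key(known):
            pending[name] = version
    return pending


repository = {"logs-basic": "1.1.0", "metrics-basic": "1.0.0", "events-basic": "1.0.0"}
cluster_state = {"logs-basic": "1.0.0", "metrics-basic": "1.0.0"}
pending = pending_template_updates(repository, cluster_state)
```

Only `logs-basic` (upgraded) and `events-basic` (new) end up in `pending`, so the newly elected cluster manager skips templates that are already current.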

Initial Template Repository

Proposed Defaults

The template repository will focus on two things:

  1. Applying performance optimizations
  2. Adding default applications

Performance Optimization Settings [In Progress]

Request Logs

refresh_interval: "60s"
index.codec: "zstd_no_dict || best_compression"
merge_policy: "LOG_BYTE_SIZE"

Metrics

refresh_interval: "60s"
index.codec: "zstd_no_dict || best_compression"
merge_policy: "LOG_BYTE_SIZE"

Events

index.optimize_doc_id_lookup.fuzzy_set.enabled: true
index.optimize_doc_id_lookup.fuzzy_set.false_positive_probability: 0.10
index.merge.policy.deletes_pct_allowed: 5.0
merge_policy: "LOG_BYTE_SIZE"
index.codec: "zstd_no_dict || best_compression"

Related component

Indexing:Performance

@mgodwan mgodwan added enhancement Enhancement or improvement to existing feature or request untriaged labels Jul 4, 2024
@mgodwan mgodwan removed the untriaged label Jul 4, 2024
@vikasvb90 vikasvb90 added the RFC Issues requesting major changes label Jul 8, 2024
@mgodwan mgodwan added the v2.16.0 Issues and PRs related to version 2.16.0 label Jul 10, 2024
@github-project-automation github-project-automation bot moved this to Issues and PR's in OpenSearch Roadmap Jul 10, 2024
@mgodwan mgodwan reopened this Jul 17, 2024
@mgodwan mgodwan removed the untriaged label Jul 17, 2024
@getsaurabh02
Member

@mgodwan Do we need to update the label to 2.17?
