[Tiered caching][Milestone 1] Exposing a new cache stats API #12258

Closed
sgup432 opened this issue Feb 8, 2024 · 4 comments
Labels
enhancement Enhancement or improvement to existing feature or request Search:Performance

Comments

@sgup432
Contributor

sgup432 commented Feb 8, 2024

Is your feature request related to a problem? Please describe

Overview

As part of tiered caching milestone 1 (#10870), we are looking to extend IndicesRequestCache with additional caching tiers such as disk.
We will need to introduce new stats that expose "per tier" numbers in addition to the stats we already have.

Current state

Taking IndicesRequestCache as an example, its stats are exposed as part of node indices stats and index stats. Both accept a level query parameter that aggregates stats along dimensions such as shard or index. Examples below:

  • Node stats -

    • GET /_nodes/stats/indices/request_cache
      • Request cache stats for all nodes in the cluster.
    • GET /_nodes/stats/indices/request_cache?level=indices
      • Aggregating at indices level
    • GET /_nodes/stats/indices/request_cache?level=shards
      • Aggregating at shard level.
  • Index stats - Similar to the above, except that these are meant to aggregate only at the indices level.

    • GET /<index-name>/_stats/request_cache
    • GET /<index-name>/_stats/request_cache?level=shards

Describe the solution you'd like

Maintaining cache stats for respective tiers with key based dimension support

Since we are introducing new cache interfaces inside OpenSearch that can be used to implement any caching tier (onHeap, disk, etc.), we plan to add a new cache stats section under node stats. This way, each cache tier maintains its own stats, decoupling the stats logic from the consumers of these underlying caches.
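
To make the decoupling concrete, here is a minimal sketch (in Python, for brevity) of a tier that records its own stats per dimension key. The class and method names (`TierStatsHolder`, `on_hit`, `on_put`, and so on) are hypothetical and are not taken from the OpenSearch code base; only the counter names come from the proposed response body.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class CacheStats:
    """Counters named after the fields in the proposed response body."""
    memory_size_in_bytes: int = 0
    evictions: int = 0
    hit_count: int = 0
    miss_count: int = 0
    entries: int = 0


class TierStatsHolder:
    """Hypothetical holder owned by a cache tier, not by the tier's consumer."""

    def __init__(self, tier_name):
        self.tier_name = tier_name
        # One CacheStats per dimension key, e.g. ("index1", "index1[0]").
        self._stats = defaultdict(CacheStats)

    def on_hit(self, dims):
        self._stats[dims].hit_count += 1

    def on_miss(self, dims):
        self._stats[dims].miss_count += 1

    def on_put(self, dims, size_in_bytes):
        stats = self._stats[dims]
        stats.entries += 1
        stats.memory_size_in_bytes += size_in_bytes

    def on_evict(self, dims, size_in_bytes):
        stats = self._stats[dims]
        stats.evictions += 1
        stats.entries -= 1
        stats.memory_size_in_bytes -= size_in_bytes

    def snapshot(self):
        # Shallow copy of the key -> CacheStats mapping.
        return dict(self._stats)


# The consumer only passes dimension values; the tier does the bookkeeping.
on_heap = TierStatsHolder("onheap")
key = ("index1", "index1[0]")  # (index, shardId)
on_heap.on_miss(key)
on_heap.on_put(key, 2)
on_heap.on_hit(key)
```

A consumer such as IndicesRequestCache would then only supply the dimension values for each operation, and any tier (heap, disk) could reuse the same bookkeeping.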

Backward compatibility: To maintain backward compatibility, we will continue to support the existing indices request_cache stats as they exist today.

API Details below.

Cache stats API (new API):

  • Request

    • GET /_nodes/stats/caches?pretty
    • GET /_nodes/stats/caches/<cache_type>?pretty
    • GET /_nodes/stats/caches/<cache_type>?level=dimension1,dimension2&pretty
  • Path parameters

    • <cache_type>: (Optional, string). Limits the information to a specific cache type within OpenSearch. For example, IndicesRequestCache, QueryCache etc.
  • Query parameters

    • level: (Optional, string). Indicates one or more dimensions along which stats are aggregated for a specific cache type.
      • The tier value for level is shared across all cache types.
      • The remaining dimensions are specific to each cache type:
        • cache_type: request_cache
          • Valid values for level are:
            • shards (stats are keyed by shardId, e.g. index1[0])
            • tier
            • indices
            • tier, indices
            • tier, shards
  • Response body

    • caches - (Object) Contains stats for desired caches present in OpenSearch.
      • request_cache: (Object): Contains stats for IndicesRequestCache cache type.
        • memory_size_in_bytes
        • evictions
        • hit_count
        • miss_count
        • entries
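
As a rough illustration of the request shapes above, the following sketch builds the path for the proposed endpoint and rejects level combinations that are not in the list for request_cache. The helper name `build_cache_stats_path` and the validation behavior are assumptions made for illustration; the proposal does not specify how the server treats unlisted combinations.

```python
# Level combinations listed in the proposal for the request_cache cache type.
REQUEST_CACHE_LEVELS = {
    frozenset(combo)
    for combo in [
        ("shards",),
        ("tier",),
        ("indices",),
        ("tier", "indices"),
        ("tier", "shards"),
    ]
}

# Assumption: cache types without their own list only support the shared "tier" level.
ALLOWED_LEVELS = {"request_cache": REQUEST_CACHE_LEVELS}
SHARED_LEVELS = {frozenset({"tier"})}


def build_cache_stats_path(cache_type=None, levels=()):
    """Return the request path for the proposed cache stats API (hypothetical helper)."""
    path = "/_nodes/stats/caches"
    if cache_type is not None:
        path += "/" + cache_type
    if levels:
        allowed = ALLOWED_LEVELS.get(cache_type, SHARED_LEVELS)
        if frozenset(levels) not in allowed:
            raise ValueError(f"unsupported level {list(levels)} for {cache_type}")
        # Keep the caller's ordering, as in level=tier,shards.
        path += "?level=" + ",".join(levels)
    return path


all_caches = build_cache_stats_path()
by_tier_and_shard = build_cache_stats_path("request_cache", ["tier", "shards"])
```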

Examples:
RequestCache stats, Dimension is shard, tier:
GET /_nodes/stats/caches/request_cache?level=tier,shards&pretty

"caches": {
   "request_cache": { // one cache type
        "memory_size_in_bytes" : 3,
        "evictions" : 1,
        "hit_count" : 4,
        "miss_count" : 3,
        "entries" : 2,
        "shards": {
            "index1[0]": {
                "tier": {
                    "onheap": {
                        "memory_size_in_bytes" : 2,
                        "evictions" : 1,
                        "hit_count" : 2,
                        "miss_count" : 2,
                        "entries": 1
                    },
                    "disk": {
                        "memory_size_in_bytes" : 1,
                        "evictions" : 0,
                        "hit_count" : 2,
                        "miss_count" : 1,
                        "entries": 1
                    }
                }
            }
        }
    },
    "query_cache": {} // Other cache type
}
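
In the sample response, the top-level request_cache numbers equal the sum of the onheap and disk entries for the single shard. The sketch below checks that relationship, assuming totals are plain sums across tiers; the proposal does not state this rule explicitly, and the `roll_up` helper is hypothetical.

```python
STAT_FIELDS = (
    "memory_size_in_bytes",
    "evictions",
    "hit_count",
    "miss_count",
    "entries",
)


def roll_up(tiers):
    """Sum each stat across tiers (assumed aggregation rule)."""
    return {field: sum(stats[field] for stats in tiers.values()) for field in STAT_FIELDS}


# Per-tier values for shard index1[0], copied from the sample response.
shard_tiers = {
    "onheap": {"memory_size_in_bytes": 2, "evictions": 1, "hit_count": 2,
               "miss_count": 2, "entries": 1},
    "disk": {"memory_size_in_bytes": 1, "evictions": 0, "hit_count": 2,
             "miss_count": 1, "entries": 1},
}

totals = roll_up(shard_tiers)
# totals matches the top-level request_cache values in the sample:
# memory_size_in_bytes 3, evictions 1, hit_count 4, miss_count 3, entries 2
```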

Related component

Search:Performance

Describe alternatives you've considered

Update the existing Indices request stats to include per tier stats.
Something like

GET /_nodes/stats/indices/request_cache?pretty

"request_cache" : {
    "tier": {
        "onheap": {
            "memory_size_in_bytes" : 331136248,
            "evictions" : 47033,
            "hit_count" : 89953,
            "miss_count" : 3328005
        },
        "disk": {
            "memory_size_in_bytes" : 331136248,
            "evictions" : 47033,
            "hit_count" : 89953,
            "miss_count" : 3328005
        }
     }
 }

One drawback of this approach is that each consumer (like IndicesRequestCache) would have to handle its own stats-writing logic, instead of this being taken care of by the underlying caching tier (heap, disk, etc.) itself.

Additional context

No response

@sgup432 sgup432 added enhancement Enhancement or improvement to existing feature or request untriaged labels Feb 8, 2024
@sgup432
Contributor Author

sgup432 commented Feb 12, 2024

@msfroh

@peternied
Member

[Triage - attendees 1 2 3 4 5 6 7 8]
@sgup432 Thanks for filing this with the detailed proposal.

@msfroh
Collaborator

msfroh commented Feb 14, 2024

Thanks @sgup432 -- I like this approach 👍

Having each cache report its own stats makes a lot of sense (since the cache implementation knows what stats it has and how to structure them).

@sgup432
Contributor Author

sgup432 commented May 6, 2024

Closing this issue as all the cache stats PRs have been merged.

@sgup432 sgup432 closed this as completed May 6, 2024
@github-project-automation github-project-automation bot moved this from 🆕 New to ✅ Done in Search Project Board May 6, 2024
@github-project-automation github-project-automation bot moved this from In Progress to Done in Performance Roadmap May 6, 2024