Add prometheus.relabel external cache proposal
Signed-off-by: Wilfried Roset <[email protected]>
wilfriedroset committed Sep 3, 2024
1 parent 7a2030e commit 9f093a1
Showing 1 changed file with 155 additions and 0 deletions: docs/design/1600-prometheus-relabel-external-cache.md
# Proposal: Prometheus relabel external cache

- Author(s): Wilfried ROSET (@wilfriedroset), Pierre BAILHACHE (@pbailhache)
- Last updated: 2024-09-02
- Original issue: <https://github.com/grafana/alloy/issues/1600>

## Abstract

This proposal introduces a way to configure the `prometheus.relabel` component so that it can use an external cache such as `Redis` or `Memcached` instead of the `in-memory` one.

## Problem

The `prometheus.relabel` component rewrites the label set of each metric passed along to the exported receiver by applying one or more relabeling rules. To do so it uses a relabeling cache stored in memory, and Alloy creates a dedicated relabel cache for each `prometheus.relabel` component. This is not a huge issue per se, because multiple rules can be registered in one component to share a single cache.

However, if you horizontally scale an Alloy deployment with load balancing, you end up with one cache per Alloy instance, and those local caches overlap, increasing the footprint of each instance.
Moreover, the cache is tied to the pod, which means that a new Alloy process starts with an empty cache.

This is acceptable for a couple of horizontally scaled Alloy pods, but it is not sustainable if you plan to run many instances processing data in parallel.

## Proposal

Allow `Redis` or `Memcached` to be used as the cache instead of the `in-memory` one.

Use the [dskit](https://github.com/grafana/dskit/blob/main/cache/cache.go) cache code to manage connections and client configuration.

We could introduce an interface so that the component's logic does not change and the kind of cache in use is abstracted away from the component.
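
For illustration, below is a minimal sketch of such an interface together with an in-memory implementation. All names (`Cache`, `inMemoryCache`, `newInMemoryCache`) are hypothetical, not the final API, and the real component would keep its LRU-based eviction rather than the naive bound used here. A Redis or Memcached implementation would wrap the corresponding dskit client behind the same interface.

```golang
import (
	"context"
	"sync"
)

// Cache abstracts the relabel cache away from the component. The component
// only calls Get/Set/Remove; whether the data lives in memory, Redis or
// Memcached is an implementation detail.
type Cache interface {
	Get(ctx context.Context, key string) (value []byte, found bool, err error)
	Set(ctx context.Context, key string, value []byte) error
	Remove(ctx context.Context, key string) error
}

// inMemoryCache keeps today's behaviour: a process-local store. The real
// component uses an LRU bounded by max_cache_size; this sketch only applies
// a naive size bound.
type inMemoryCache struct {
	mu      sync.RWMutex
	maxSize int
	items   map[string][]byte
}

func newInMemoryCache(maxSize int) *inMemoryCache {
	return &inMemoryCache{maxSize: maxSize, items: map[string][]byte{}}
}

func (c *inMemoryCache) Get(_ context.Context, key string) ([]byte, bool, error) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	v, ok := c.items[key]
	return v, ok, nil
}

func (c *inMemoryCache) Set(_ context.Context, key string, value []byte) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, exists := c.items[key]; !exists && c.maxSize > 0 && len(c.items) >= c.maxSize {
		// Drop new entries once full; the real component evicts via LRU instead.
		return nil
	}
	c.items[key] = value
	return nil
}

func (c *inMemoryCache) Remove(_ context.Context, key string) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.items, key)
	return nil
}
```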

## Pros and cons

**Pros:**

- No change to the relabeling logic, so the impact on existing users is expected to be nil.
- Possibility to use an external cache when needed, including different caches for different relabeling components.

**Cons:**

- The configuration is a bit more complex than the current one.

## Alternative solutions

The alternative is to do nothing, if we deem this improvement unnecessary.

## Compatibility

This proposal offers to deprecate the old way of configuring the in-memory cache for a couple of releases, then drop it in release+2. Doing so gives users time to migrate their settings to the new `cache` block.

## Implementation

We should add a way to select the cache and its connection options through the component's arguments.

For example, based on what's done in [Mimir index cache](https://github.com/grafana/mimir/blob/main/pkg/storage/tsdb/index_cache.go#L47):

```golang
type Arguments struct {
// Where the relabelled metrics should be forwarded to.
ForwardTo []storage.Appendable `alloy:"forward_to,attr"`

// The relabelling rules to apply to each metric before it's forwarded.
MetricRelabelConfigs []*alloy_relabel.Config `alloy:"rule,block,optional"`

// DEPRECATED Use CacheConfig and set InMemoryRelabelCacheConfig
InMemoryCacheSizeDeprecated int `alloy:"max_cache_size,attr,optional"`

// The configuration of the cache used to store relabeling results.
CacheConfig RelabelCacheConfig `alloy:"cache,block,optional"`
}

type RelabelCacheConfig struct {
cache.BackendConfig `yaml:",inline"`
InMemory InMemoryRelabelCacheConfig `yaml:"inmemory"`
}

type InMemoryRelabelCacheConfig struct {
CacheSize int `yaml:"cache_size"`
}

// For reference, cache.BackendConfig from dskit:
type BackendConfig struct {
Backend string `yaml:"backend"`
Memcached MemcachedClientConfig `yaml:"memcached"`
Redis RedisClientConfig `yaml:"redis"`
}
```

The cache backend should default to the in-memory one (`backend = "inmemory"`).
`max_cache_size` should still be taken into account, but only when the in-memory backend is used. It should also be deprecated, and users should be redirected to the `InMemoryRelabelCacheConfig.CacheSize` field.
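
For illustration, here is a hypothetical sketch of how the component could pick a backend from the arguments above while still honoring the deprecated attribute. `Cache`, `newInMemoryCache`, `newRedisCache` and `newMemcachedCache` are the hypothetical helpers from the earlier sketch, with the latter two assumed to wrap the dskit Redis and Memcached clients.

```golang
// newCacheFromArguments is a hypothetical constructor: it selects a cache
// implementation based on the component arguments and keeps honoring the
// deprecated max_cache_size attribute for the in-memory backend.
func newCacheFromArguments(args Arguments) (Cache, error) {
	backend := args.CacheConfig.Backend
	if backend == "" {
		backend = "inmemory" // in-memory stays the default
	}

	switch backend {
	case "inmemory":
		size := args.CacheConfig.InMemory.CacheSize
		if size == 0 && args.InMemoryCacheSizeDeprecated > 0 {
			// Deprecated attribute, still accepted until release+2.
			size = args.InMemoryCacheSizeDeprecated
		}
		return newInMemoryCache(size), nil
	case "redis":
		// Wrap the dskit Redis client behind the Cache interface.
		return newRedisCache(args.CacheConfig.Redis)
	case "memcached":
		// Wrap the dskit Memcached client behind the Cache interface.
		return newMemcachedCache(args.CacheConfig.Memcached)
	default:
		return nil, fmt.Errorf("unsupported cache backend %q", backend)
	}
}
```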

Here are some examples:

- legacy config unchanged

```river
prometheus.relabel "legacy_config" {
forward_to = [...]
max_cache_size = 10000000
rule {
...
}
}
```

- redis config

```river
prometheus.relabel "redis_config" {
forward_to = [...]
cache {
backend = "redis"
redis {
endpoint = "redis.url"
username = "user"
password = "password"
...
}
}
...
}
```

- new in memory config

```river
prometheus.relabel "inmemory_config" {
forward_to = [...]
cache {
backend = "inmemory"
inmemory {
cache_size = 10000000
}
}
...
}
```

- memcached config

```river
prometheus.relabel "memcached_config" {
forward_to = [...]
cache {
backend = "memcached"
memcached {
addresses = "address1, address2"
timeout = 10
...
}
}
...
}
```

## Related open issues

N/A
