
Releases: grafana/mimir

2.14.0-rc.0

30 Sep 10:40
mimir-2.14.0-rc.0
7435049
2.14.0-rc.0 Pre-release

This release contains 593 PRs from 66 authors, including new contributors Adrian Berger, Albert Kerr, Alexander Davis, Alyssa Wada, Aofei Sheng, Bailhache Pierre, Bradley, David Stevens, Davin Kevin, Dennis Haney, Felipe Ferreira, Jeongseup, Nicholas Kress, Paul Farver, Pooya, Rajguru, Sephia Laureencia, Sviat Loginov, Taehyun Kim, Taylor C, Tito Lins, Willem Gillis, William Travis Holton, William Wernert, Yuri Tseretyan. Thank you!

Grafana Mimir version 2.14.0-rc.0 release notes

Grafana Labs is excited to announce version 2.14 of Grafana Mimir.

The highlights that follow include the top features, enhancements, and bug fixes in this release.
For the complete list of changes, refer to the CHANGELOG.

Features and enhancements

The streaming of chunks from store-gateways to queriers is now enabled by default. This reduces the memory usage in queriers. The feature has been experimental since Mimir 2.10 and is now considered stable.

Compactor adds a new cortex_compactor_disk_out_of_space_errors_total counter metric that tracks how many times a compaction fails due to the compactor being out of disk.

The distributor now replies with the Retry-After header on retryable errors by default. This protects Mimir from clients, including Prometheus, that default to retrying very quickly, making recovering from an outage easier. The feature was originally added as experimental in Mimir 2.11.
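A client that honors this header might compute its wait time along the following lines. This is a minimal sketch, not code from Mimir or Prometheus: the function name, the `fallback` default, and the `max_wait` cap are assumptions for illustration. It accepts both forms that HTTP allows for Retry-After (a number of seconds or an HTTP date).

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(header_value: Optional[str],
                        now: Optional[datetime] = None,
                        fallback: float = 1.0,
                        max_wait: float = 60.0) -> float:
    """Return how long to wait before retrying, in seconds.

    fallback and max_wait are hypothetical client-side settings,
    not Mimir configuration options.
    """
    if not header_value:
        return fallback  # header missing: use the client's own backoff
    value = header_value.strip()
    if value.isdigit():
        # Delta-seconds form, for example "5".
        wait = float(value)
    else:
        # HTTP-date form, for example "Wed, 21 Oct 2015 07:28:00 GMT".
        try:
            target = parsedate_to_datetime(value)
        except (TypeError, ValueError):
            return fallback  # unparseable value: ignore the header
        if target.tzinfo is None:
            target = target.replace(tzinfo=timezone.utc)
        current = now or datetime.now(timezone.utc)
        wait = (target - current).total_seconds()
    # Never wait a negative amount, and cap very long waits.
    return min(max(wait, 0.0), max_wait)
```

A retry loop would call `retry_after_seconds(response.headers.get("Retry-After"))` after a retryable error and sleep for the returned duration before sending the request again.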

Incoming OTLP requests were previously size-limited with the distributor's -distributor.max-recv-msg-size configuration. The distributor has a new -distributor.max-otlp-request-size configuration for limiting OTLP requests. The default value is 100 MiB.

Ingesters can be marked as read-only as part of their downscaling procedure. The new prepare-instance-ring-downscale endpoint updates the read-only status of an ingester in the ring.
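As a rough illustration, a downscaling script could prepare the call to this endpoint as shown below. The full path `/ingester/prepare-instance-ring-downscale`, the POST (mark read-only) and DELETE (revert) semantics, and the `ingester-0:8080` address are assumptions that these notes do not state. Verify them against the Mimir HTTP API reference before relying on this sketch.

```python
import urllib.request

# Assumption: the endpoint is served under /ingester/ on the ingester's HTTP port.
DOWNSCALE_PATH = "/ingester/prepare-instance-ring-downscale"


def build_downscale_request(base_url: str, read_only: bool = True) -> urllib.request.Request:
    """Build (but do not send) a request that toggles an ingester's read-only status.

    Assumption: POST marks the ingester read-only and DELETE reverts it.
    """
    method = "POST" if read_only else "DELETE"
    url = base_url.rstrip("/") + DOWNSCALE_PATH
    return urllib.request.Request(url, method=method)


# Hypothetical ingester address, used only for illustration.
request = build_downscale_request("http://ingester-0:8080")
```

Passing `request` to `urllib.request.urlopen(request, timeout=10)` would send it. Building the request separately keeps the sketch free of network side effects.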

Important changes

In Grafana Mimir 2.14, the following behavior has changed:

When running a remote read request, the querier honors the time range specified in the read hints.

The default inactivity timeout of active series in ingesters, controlled by the -ingester.active-series-metrics-idle-timeout configuration, is increased from 10m to 20m.
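The effect of the longer timeout can be pictured with a small model. This is a simplified sketch, not the ingester's implementation: the function name and the sample data are hypothetical, and it assumes a series counts as active if its last sample arrived within the idle timeout.

```python
from datetime import datetime, timedelta
from typing import Dict


def count_active_series(last_sample_times: Dict[str, datetime],
                        now: datetime,
                        idle_timeout: timedelta) -> int:
    """Count series whose most recent sample is no older than idle_timeout."""
    return sum(1 for ts in last_sample_times.values() if now - ts <= idle_timeout)


now = datetime(2024, 9, 30, 12, 0, 0)
# Hypothetical series that last received a sample 5, 15, and 25 minutes ago.
series = {
    "up{job='a'}": now - timedelta(minutes=5),
    "up{job='b'}": now - timedelta(minutes=15),
    "up{job='c'}": now - timedelta(minutes=25),
}

old_default = count_active_series(series, now, timedelta(minutes=10))
new_default = count_active_series(series, now, timedelta(minutes=20))
```

In this model the series that was idle for 15 minutes is counted only with the new 20m default, so active series counts for the same workload can be higher after upgrading.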

The following store-gateway defaults have changed: -blocks-storage.bucket-store.max-concurrent-queue-timeout is set to five seconds; -blocks-storage.bucket-store.index-header.lazy-loading-concurrency-queue-timeout is set to five seconds; -blocks-storage.bucket-store.max-concurrent is set to 200.

The experimental support for Redis caching is now deprecated and is set to be removed in the next major release. Users are encouraged
to switch to Memcached.

The following deprecated configuration options were removed in this release:

  • The -ingester.return-only-grpc-errors option in the ingester
  • The -ingester.client.circuit-breaker.* options in the ingester
  • The -ingester.limit-inflight-requests-using-grpc-method-limiter option in the ingester
  • The -ingester.client.report-grpc-codes-in-instrumentation-label-enabled option in the distributor and ruler
  • The -distributor.limit-inflight-requests-using-grpc-method-limiter option in the distributor
  • The -distributor.enable-otlp-metadata-storage option in the distributor
  • The -ruler.drain-notification-queue-on-shutdown option in the ruler
  • The -querier.max-query-into-future option in the querier
  • The -querier.prefer-streaming-chunks-from-store-gateways option in the querier and the store-gateway
  • The -query-scheduler.use-multi-algorithm-query-queue option in the query-scheduler
  • The YAML configuration frontend.align_queries_with_step in the query-frontend

Experimental features

Grafana Mimir 2.14 includes some features that are experimental and disabled by default.
Use these features with caution and report any issues that you encounter:

The ingester added an experimental -ingester.ignore-ooo-exemplars configuration. When set, out-of-order exemplars are no longer reported to the remote write client.

The querier supports the experimental limitk() and limit_ratio() PromQL functions. This feature is disabled by default,
but you can enable it with the -querier.promql-experimental-functions-enabled=true setting in the query-frontend and the querier.

Bug fixes

  • Alertmanager: fix configuration validation gap around unreferenced templates.
  • Alertmanager: fix goroutine leak when stored configuration fails to apply and there is no existing tenant alertmanager.
  • Alertmanager: fix receiver firewall to detect 0.0.0.0 and IPv6 interface-local multicast address as local addresses.
  • Alertmanager: fix per-tenant silence limits not reloaded during runtime.
  • Alertmanager: fix bugs in silences that could cause an existing silence to expire/be deleted when updating the silence fails. This could happen when the updated silence was invalid or exceeded limits.
  • Alertmanager: fix help message for utf-8-strict-mode.
  • Compactor: fix a race condition between different compactor replicas that may cause a deleted block to be referenced as non-deleted in the bucket index.
  • Configuration: multi-line environment variables are flattened during injection to be compatible with YAML syntax.
  • HA Tracker: store correct timestamp for the last-received request from the elected replica.
  • Ingester: fix the sporadic not found error causing an internal server error if label names are queried with matchers during head compaction.
  • Ingester, store-gateway: fix case insensitive regular expressions not correctly matching some Unicode characters.
  • Ingester: fixed timestamp reported in the "the sample has been rejected because its timestamp is too old" error when the write request contains only histograms.
  • Query-frontend: fix -querier.max-query-lookback and -compactor.blocks-retention-period enforcement in query-frontend when one of the two is not set.
  • Query-frontend: "query stats" log includes the actual status_code when the request fails due to an error occurring in the query-frontend itself.
  • Query-frontend: ensure that internal errors result in an HTTP 500 response code instead of a 422 response code.
  • Query-frontend: return annotations generated during evaluation of sharded queries.
  • Query-scheduler: fix a panic in request queueing.
  • Querier: fix the issue where "context canceled" is logged for trace spans for requests to store-gateways that return no series when chunks streaming is enabled.
  • Querier: fix issue where queries can return incorrect results if a single store-gateway returns overlapping chunks for a series.
  • Querier: do not return grpc: the client connection is closing errors as HTTP 499.
  • Querier: fix issue where some native histogram-related warnings were not emitted when rate() was used over native histograms.
  • Querier: fix invalid query results when multiple chunks are merged.
  • Querier: support optional start and end times on /prometheus/api/v1/labels, /prometheus/api/v1/label/<label>/values, and /prometheus/api/v1/series when max_query_into_future: 0.
  • Querier: fix issue where both recently compacted blocks and their source blocks can be skipped during querying if store-gateways are restarting.
  • Ruler: add support for draining any outstanding alert notifications before shutting down. Enable this setting with the -ruler.drain-notification-queue-on-shutdown=true CLI flag.
  • Store-gateway: fixed a case where, on a quick subsequent restart, the previous lazy-loaded index header snapshot was overwritten by a partially loaded one.
  • Store-gateway: store sparse index headers atomically to disk.
  • Ruler: map invalid org-id errors to the 400 status code.

Helm chart improvements

The Grafana Mimir and Grafana Enterprise Metrics Helm charts are released independently.
Refer to the Grafana Mimir Helm chart documentation.

Changelog

2.14.0-rc.0

Grafana Mimir

  • [CHANGE] Update minimal supported version of Go to 1.22. #9134
  • [CHANGE] Store-gateway / querier: enable streaming chunks from store-gateways to queriers by default. #6646
  • [CHANGE] Querier: honor the start/end time range specified in the read hints when executing a remote read request. #8431
  • [CHANGE] Querier: return only samples within the queried start/end time range when executing a remote read request using "SAMPLES" mode. Previously, samples outside of the range could have been returned. Samples outside of the queried time range may still be returned when executing a remote read request using "STREAMED_XOR_CHUNKS" mode. #8463
  • [CHANGE] Querier: Set minimum for -querier.max-concurrent to four to prevent queue starvation with querier-worker queue prioritization algorithm; values below the minimum four are ignored and set to the minimum. #9054
  • [CHANGE] Store-gateway: enabled -blocks-storage.bucket-store.max-concurrent-queue-timeout by default with a timeout of 5 seconds. #8496
  • [CHANGE] Store-gateway: enabled -blocks-storage.bucket-store.index-header.lazy-loading-concurrency-queue-timeout by default with a timeout of 5 seconds. #8667
  • [CHANGE] Distributor: Incoming OTLP requests were previously size-limited by using limit from -distributor.max-recv-msg-size option. We have added option -distributor.max-otlp-request-size for limiting OTLP requests, with default value of 100 MiB. #8574
  • [CHANGE] Distributor: remove metric cortex_distributor_sample_delay_seconds. #8698
  • [CHANGE] Query-frontend: Remove deprecated frontend.align_queries_with_step YAML configuration. The configuration option has been moved to per-tenant and default limits since Mimir 2.12. #8733 #8735
  • [CHANGE] Store-gateway: Change default of -blocks-storage.bucket-store.max-concurrent to 200. #8768
  • [CHANGE] Added new metric `cortex_compa...

2.13.0

05 Jul 18:13
mimir-2.13.0
4775ec1

This release contains 490 PRs from 67 authors, including new contributors Anthony Keydel, Armand Grillet, AvivGuiser, Daniel R. Dagfinrud, David Collier-Brown, David Grant, Dimitris Alo, Enrique Garbi, Erik Sommer, Ian Halliday, InventiveCoder, Kevin Mingtarja, Lasse Hels, Pangidoan Butar, Quentin Bisson, Rens Groothuijsen, René Gärtner, Ross Brunson, Santiago, Seiya, Spyros Panagiotopoulos, Yuri Nikolic, Zied ABID, kahirokunn, lasermoth. Thank you!

Grafana Mimir version 2.13.0 release notes

Grafana Labs is excited to announce version 2.13 of Grafana Mimir.

The highlights that follow include the top features, enhancements, and bug fixes in this release.
For the complete list of changes, refer to the CHANGELOG.

Features and enhancements

  • Improved CPU performance in processing queries with regular expressions that contain many alternations (for example foo|bar|baz|...).

  • Improved OTLP ingestion performance by using native translation, which reduces memory usage in the distributor by up to 30% and CPU usage by up to 8%.

  • Configurable S3 bucket lookup type improves interoperability with S3-compatible providers, such as Tencent and Alibaba. Configure lookup type via -<prefix>.s3.bucket-lookup-type or -common.storage.s3-bucket-lookup-type.

  • Configuration of TLS for S3 buckets is now possible via -<prefix>.s3.http.* and -common.storage.s3.http.* configuration options.

  • Mimirtool can now verify the validity of a Mimir runtime configuration file with the mimirtool runtime-config verify command.

  • Active series are now updated along with owned series. This means the number of active series for a tenant is more accurate after scaling out ingesters. As a result, ingesters now more precisely enforce tenants' series limits. This feature is only enabled when -ingester.track-ingester-owned-series or -ingester.use-ingester-owned-series-for-limits are enabled.

  • Store-gateway can be configured to explicitly disable or enable tenants. This is useful in cases where compaction slows down for a tenant to temporarily exclude another tenant while compaction catches up without affecting other tenants.

  • Remote read (/prometheus/api/v1/read) is becoming a first-class endpoint in Mimir with support in query stats logs and query blocking in the query-frontend. More coming in future releases.

Additionally, the following previously experimental features are now considered stable:

  • Rules tenant federation via the source_tenants field in rule groups.
  • Enabling recording and alerting rules evaluation on a per-tenant basis.
  • Limiting the number of tenants a federated query can query.

Important changes

In Grafana Mimir 2.13 the following behavior has changed:

  • The default Docker image grafana/mimir is now based on the distroless image gcr.io/distroless/static-debian12.
    See Debugging distroless container images for more details on how to
    work with distroless images.

  • Error logs in the ingester are now sampled at 10% by default. Sampled log lines contain (sampled 1/10). The cortex_discarded_samples_total metric still tracks all discarded samples.

  • Anonymous usage statistics now include actual CPU usage instead of available CPU cores.

  • Continuous-test is no longer a standalone binary and is now part of Mimir as its own target. The published Docker image has been updated to use the new packaging.
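The 1-in-10 sampling of ingester error logs described above can be sketched as follows. This is an illustrative model, not Mimir's implementation: the class name and the counter-based strategy are assumptions. The "(sampled 1/10)" suffix and the convention that a rate of 0 logs every error follow these release notes.

```python
from typing import Callable


class SampledErrorLogger:
    """Emit one out of every `rate` error messages; a rate of 0 emits all of them."""

    def __init__(self, rate: int, emit: Callable[[str], None] = print) -> None:
        self.rate = rate
        self.emit = emit
        self.count = 0

    def error(self, message: str) -> bool:
        """Log the message if it is selected by sampling; return True if it was emitted."""
        self.count += 1
        if self.rate <= 0:
            # Sampling disabled: log every error unchanged.
            self.emit(message)
            return True
        if (self.count - 1) % self.rate == 0:
            # Mark the line so readers know other errors were dropped.
            self.emit(f"{message} (sampled 1/{self.rate})")
            return True
        return False


lines = []
logger = SampledErrorLogger(rate=10, emit=lines.append)
for i in range(30):
    logger.error(f"sample rejected {i}")
```

After the loop, `lines` holds three entries (for errors 0, 10, and 20). Because the dropped errors are not logged, a metric such as cortex_discarded_samples_total remains the reliable source for total counts.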

The following deprecated configuration options are removed in Grafana Mimir 2.13:

  • The configuration option -log.buffered, which was deprecated in 2.11 and is now enabled by default.

Experimental features

Grafana Mimir 2.13 includes new features that are considered experimental and disabled by default.
Use them with caution and report any issues you encounter:

  • Experimental support for server-side circuit breakers in ingesters on read and write requests. This can be enabled using -ingester.push-circuit-breaker.enabled and -ingester.read-circuit-breaker.enabled options and further configured via the -ingester.push-circuit-breaker.* and -ingester.read-circuit-breaker.* options.

The following configuration options are deprecated and will be removed in a future Grafana Mimir release:

  • evaluation_delay field in the rule group configuration has been deprecated. Please use query_offset instead.

Bug fixes

  • OTLP ingestion: translate all HTTP 5xx errors into one of 502, 503, or 504, so that the otel-collector retries the failed requests.
  • OTLP ingestion: return properly formatted protobuf error messages when ingestion fails.
  • OTLP ingestion: generate target_info metric only when there are metrics in the request and there is at least one configured identifying label.
  • OTLP ingestion: don't discard timeseries paired with invalid exemplars. Instead, try to ingest the timeseries and discard only the invalid exemplars.
  • Subqueries: @ end() and @ start() now work correctly with queries split by time.
  • Native histograms: order exemplars before ingestion to improve success rate when ingesting multiple exemplars.
  • Native histograms: return HTTP 400 on invalid native histogram samples instead of HTTP 500. The metric cortex_discarded_samples_total{reason="invalid-native-histogram"} is now incremented on invalid histogram samples.

Helm chart improvements

The Grafana Mimir and Grafana Enterprise Metrics Helm charts are released independently.
Refer to the Grafana Mimir Helm chart documentation.

Changelog

2.13.0

Grafana Mimir

  • [CHANGE] Build: grafana/mimir docker image is now based on gcr.io/distroless/static-debian12 image. Alpine-based docker image is still available as grafana/mimir-alpine, until Mimir 2.15. #8204 #8235
  • [CHANGE] Ingester: /ingester/flush endpoint is now allowed to execute only while the ingester is in Running state. The 503 status code is returned if the endpoint is called while the ingester is not in Running state. #7486
  • [CHANGE] Distributor: Include label name in err-mimir-label-value-too-long error message. #7740
  • [CHANGE] Ingester: enabled 1 out of 10 errors log sampling by default. All the discarded samples will still be tracked by the cortex_discarded_samples_total metric. The feature can be configured via -ingester.error-sample-rate (0 to log all errors). #7807
  • [CHANGE] Query-frontend: Query results caching and experimental query blocking now utilize the PromQL string-formatted query format rather than the unvalidated query as submitted to the frontend. #7742
    • Query results caching should be more stable as all equivalent queries receive the same cache key, but there may be cache churn on first deploy with the updated format
    • Query blocking can no longer be circumvented with an equivalent query in a different format; see Configure queries to block
  • [CHANGE] Query-frontend: stop using -validation.create-grace-period to clamp how far into the future a query can span. #8075
  • [CHANGE] Clamp GOMAXPROCS to runtime.NumCPU. #8201
  • [CHANGE] Anonymous usage statistics tracking: add CPU usage percentage tracking. #8282
  • [CHANGE] Added new metric cortex_compactor_disk_out_of_space_errors_total which counts how many times a compaction failed due to the compactor being out of disk. #8237
  • [CHANGE] Anonymous usage statistics tracking: report active series in addition to in-memory series. #8279
  • [CHANGE] Ruler: evaluation_delay field in the rule group configuration has been deprecated. Please use query_offset instead (it has the same exact meaning and behaviour). #8295
  • [CHANGE] General: remove -log.buffered. The configuration option has been enabled by default and deprecated since Mimir 2.11. #8395
  • [CHANGE] Ruler: promote tenant federation from experimental to stable. #8400
  • [CHANGE] Ruler: promote -ruler.recording-rules-evaluation-enabled and -ruler.alerting-rules-evaluation-enabled from experimental to stable. #8400
  • [CHANGE] General: promote -tenant-federation.max-tenants from experimental to stable. #8400
  • [FEATURE] Continuous-test: now runnable as a module with mimir -target=continuous-test. #7747
  • [FEATURE] Store-gateway: Allow specific tenants to be enabled or disabled via -store-gateway.enabled-tenants or -store-gateway.disabled-tenants CLI flags or their corresponding YAML settings. #7653
  • [FEATURE] New -<prefix>.s3.bucket-lookup-type flag configures lookup style type, used to access bucket in s3 compatible providers. #7684
  • [FEATURE] Querier: add experimental streaming PromQL engine, enabled with -querier.promql-engine=mimir. #7693 #7898 #7899 #8023 #8058 #8096 #8121 #8197 #8230 #8247 #8270 #8276 #8277 #8291 #8303 #8340 #8256 #8348
  • [FEATURE] New /ingester/unregister-on-shutdown HTTP endpoint allows dynamic access to ingesters' -ingester.ring.unregister-on-shutdown configuration. #7739
  • [FEATURE] Server: added experimental PROXY protocol support. The PROXY protocol support can be enabled via -server.proxy-protocol-enabled=true. When enabled, the support is added both to HTTP and gRPC listening ports. #7698
  • [FEATURE] Query-frontend, querier: new experimental /cardinality/active_native_histogram_metrics API to get active native histogram metric names with statistics about active native histogram buckets. #7982 #7986 #...

2.13.0-rc.2

04 Jul 18:54
mimir-2.13.0-rc.2
bd8bde4
2.13.0-rc.2 Pre-release

This release contains 476 PRs from 67 authors, including new contributors Anthony Keydel, Armand Grillet, AvivGuiser, Daniel R. Dagfinrud, David Collier-Brown, David Grant, Dimitris Alo, Enrique Garbi, Erik Sommer, Ian Halliday, InventiveCoder, Kevin Mingtarja, Lasse Hels, Pangidoan Butar, Quentin Bisson, Rens Groothuijsen, René Gärtner, Ross Brunson, Santiago, Seiya, Spyros Panagiotopoulos, Yuri Nikolic, Zied ABID, kahirokunn, lasermoth. Thank you!

Grafana Mimir version 2.13.0-rc.2 release notes

Features and enhancements

  • Improved CPU performance in processing queries with regular expressions that contain many alternations (for example foo|bar|baz|...).

  • Improved OTLP ingestion performance by using native translation, which reduces memory usage in the distributor by up to 30% and CPU usage by up to 8%.

  • Configurable S3 bucket lookup type improves interoperability with S3-compatible providers, such as Tencent and Alibaba. Configure lookup type via -<prefix>.s3.bucket-lookup-type or -common.storage.s3-bucket-lookup-type.

  • Configuration of TLS for S3 buckets is now possible via -<prefix>.s3.http.* and -common.storage.s3.http.* configuration options.

  • Mimirtool can now verify the validity of a Mimir runtime configuration file with the mimirtool runtime-config verify command.

  • Active series are now updated along with owned series. This means the number of active series for a tenant is more accurate after scaling out ingesters. As a result, ingesters now more precisely enforce tenants' series limits. This feature is only enabled when -ingester.track-ingester-owned-series or -ingester.use-ingester-owned-series-for-limits are enabled.

  • Store-gateway can be configured to explicitly disable or enable tenants. This is useful in cases where compaction slows down for a tenant to temporarily exclude another tenant while compaction catches up without affecting other tenants.

  • Remote read (/prometheus/api/v1/read) is becoming a first-class endpoint in Mimir with support in query stats logs and query blocking in the query-frontend. More coming in future releases.

Additionally, the following previously experimental features are now considered stable:

  • Rules tenant federation via the source_tenants field in rule groups.
  • Enabling recording and alerting rules evaluation on a per-tenant basis.
  • Limiting the number of tenants a federated query can query.

Important changes

In Grafana Mimir 2.13 the following behavior has changed:

  • The default Docker image grafana/mimir is now based on the distroless image gcr.io/distroless/static-debian12.
    See Debugging distroless container images for more details on how to
    work with distroless images.

  • Error logs in the ingester are now sampled at 10% by default. Sampled log lines contain (sampled 1/10). The cortex_discarded_samples_total metric still tracks all discarded samples.

  • Anonymous usage statistics now include actual CPU usage instead of available CPU cores.

  • Continuous-test is no longer a standalone binary and is now part of Mimir as its own target. The published Docker image has been updated to use the new packaging.

The following deprecated configuration options are removed in Grafana Mimir 2.13:

  • The configuration option -log.buffered, which was deprecated in 2.11 and is now enabled by default.

Experimental features

Grafana Mimir 2.13 includes new features that are considered experimental and disabled by default.
Use them with caution and report any issues you encounter:

  • Experimental support for server-side circuit breakers in ingesters on read and write requests. This can be enabled using -ingester.push-circuit-breaker.enabled and -ingester.read-circuit-breaker.enabled options and further configured via the -ingester.push-circuit-breaker.* and -ingester.read-circuit-breaker.* options.

The following configuration options are deprecated and will be removed in a future Grafana Mimir release:

  • evaluation_delay field in the rule group configuration has been deprecated. Please use query_offset instead.

Bug fixes

  • OTLP ingestion: translate all HTTP 5xx errors into one of 502, 503, or 504, so that the otel-collector retries the failed requests.
  • OTLP ingestion: return properly formatted protobuf error messages when ingestion fails.
  • OTLP ingestion: generate target_info metric only when there are metrics in the request and there is at least one configured identifying label.
  • OTLP ingestion: don't discard timeseries paired with invalid exemplars. Instead, try to ingest the timeseries and discard only the invalid exemplars.
  • Subqueries: @ end() and @ start() now work correctly with queries split by time.
  • Native histograms: order exemplars before ingestion to improve success rate when ingesting multiple exemplars.
  • Native histograms: return HTTP 400 on invalid native histogram samples instead of HTTP 500. The metric cortex_discarded_samples_total{reason="invalid-native-histogram"} is now incremented on invalid histogram samples.

Changelog

2.13.0-rc.1

Grafana Mimir

  • [CHANGE] Build: grafana/mimir docker image is now based on gcr.io/distroless/static-debian12 image. Alpine-based docker image is still available as grafana/mimir-alpine, until Mimir 2.15. #8204 #8235
  • [CHANGE] Ingester: /ingester/flush endpoint is now allowed to execute only while the ingester is in Running state. The 503 status code is returned if the endpoint is called while the ingester is not in Running state. #7486
  • [CHANGE] Distributor: Include label name in err-mimir-label-value-too-long error message. #7740
  • [CHANGE] Ingester: enabled 1 out of 10 errors log sampling by default. All the discarded samples will still be tracked by the cortex_discarded_samples_total metric. The feature can be configured via -ingester.error-sample-rate (0 to log all errors). #7807
  • [CHANGE] Query-frontend: Query results caching and experimental query blocking now utilize the PromQL string-formatted query format rather than the unvalidated query as submitted to the frontend. #7742
    • Query results caching should be more stable as all equivalent queries receive the same cache key, but there may be cache churn on first deploy with the updated format
    • Query blocking can no longer be circumvented with an equivalent query in a different format; see Configure queries to block
  • [CHANGE] Query-frontend: stop using -validation.create-grace-period to clamp how far into the future a query can span. #8075
  • [CHANGE] Clamp GOMAXPROCS to runtime.NumCPU. #8201
  • [CHANGE] Anonymous usage statistics tracking: add CPU usage percentage tracking. #8282
  • [CHANGE] Added new metric cortex_compactor_disk_out_of_space_errors_total which counts how many times a compaction failed due to the compactor being out of disk. #8237
  • [CHANGE] Anonymous usage statistics tracking: report active series in addition to in-memory series. #8279
  • [CHANGE] Ruler: evaluation_delay field in the rule group configuration has been deprecated. Please use query_offset instead (it has the same exact meaning and behaviour). #8295
  • [CHANGE] General: remove -log.buffered. The configuration option has been enabled by default and deprecated since Mimir 2.11. #8395
  • [CHANGE] Ruler: promote tenant federation from experimental to stable. #8400
  • [CHANGE] Ruler: promote -ruler.recording-rules-evaluation-enabled and -ruler.alerting-rules-evaluation-enabled from experimental to stable. #8400
  • [CHANGE] General: promote -tenant-federation.max-tenants from experimental to stable. #8400
  • [FEATURE] Continuous-test: now runnable as a module with mimir -target=continuous-test. #7747
  • [FEATURE] Store-gateway: Allow specific tenants to be enabled or disabled via -store-gateway.enabled-tenants or -store-gateway.disabled-tenants CLI flags or their corresponding YAML settings. #7653
  • [FEATURE] New -<prefix>.s3.bucket-lookup-type flag configures lookup style type, used to access bucket in s3 compatible providers. #7684
  • [FEATURE] Querier: add experimental streaming PromQL engine, enabled with -querier.promql-engine=mimir. #7693 #7898 #7899 #8023 #8058 #8096 #8121 #8197 #8230 #8247 #8270 #8276 #8277 #8291 #8303 #8340 #8256 #8348
  • [FEATURE] New /ingester/unregister-on-shutdown HTTP endpoint allows dynamic access to ingesters' -ingester.ring.unregister-on-shutdown configuration. #7739
  • [FEATURE] Server: added experimental PROXY protocol support. The PROXY protocol support can be enabled via -server.proxy-protocol-enabled=true. When enabled, the support is added both to HTTP and gRPC listening ports. #7698
  • [FEATURE] Query-frontend, querier: new experimental /cardinality/active_native_histogram_metrics API to get active native histogram metric names with statistics about active native histogram buckets. #7982 #7986 #8008
  • [FEATURE] Alertmanager: Added -alertmanager.max-silences-count and -alertmanager.max-silence-size-bytes to set limits on per tenant silences. Disabled by default. #8241 #8249
  • [FEATURE] Ingester: add experimental support for the server-side circuit breakers when writing to and reading from ingesters. This can be enabled using -ingester.push-circuit-breaker.enabled and -ingester.read-circuit-breaker.enabled options. Further -ingester.push-circuit-breaker.* and -ingester.read-circuit-breaker.* options for configuring circuit-breaker are available. Added metrics cortex_ingester_circuit_breaker_results_total, `cortex_i...

2.13.0-rc.1

27 Jun 16:09
mimir-2.13.0-rc.1
ace7d6a
2.13.0-rc.1 Pre-release

This release contains 476 PRs from 67 authors, including new contributors Anthony Keydel, Armand Grillet, AvivGuiser, Daniel R. Dagfinrud, David Collier-Brown, David Grant, Dimitris Alo, Enrique Garbi, Erik Sommer, Ian Halliday, InventiveCoder, Kevin Mingtarja, Lasse Hels, Pangidoan Butar, Quentin Bisson, Rens Groothuijsen, René Gärtner, Ross Brunson, Santiago, Seiya, Spyros Panagiotopoulos, Yuri Nikolic, Zied ABID, kahirokunn, lasermoth. Thank you!

Grafana Mimir version 2.13.0-rc.1 release notes

Features and enhancements

  • Improved CPU performance in processing queries with regular expressions that contain many alternations (for example foo|bar|baz|...).

  • Improved OTLP ingestion performance by using native translation, which reduces memory usage in the distributor by up to 30% and CPU usage by up to 8%.

  • Configurable S3 bucket lookup type improves interoperability with S3-compatible providers, such as Tencent and Alibaba. Configure lookup type via -<prefix>.s3.bucket-lookup-type or -common.storage.s3-bucket-lookup-type.

  • Configuration of TLS for S3 buckets is now possible via -<prefix>.s3.http.* and -common.storage.s3.http.* configuration options.

  • Mimirtool can now verify the validity of a Mimir runtime configuration file with the mimirtool runtime-config verify command.

  • Active series are now updated along with owned series. This means the number of active series for a tenant is more accurate after scaling out ingesters. As a result, ingesters now more precisely enforce tenants' series limits. This feature is only enabled when -ingester.track-ingester-owned-series or -ingester.use-ingester-owned-series-for-limits are enabled.

  • Store-gateway can be configured to explicitly disable or enable tenants. This is useful in cases where compaction slows down for a tenant to temporarily exclude another tenant while compaction catches up without affecting other tenants.

  • Remote read (/prometheus/api/v1/read) is becoming a first-class endpoint in Mimir with support in query stats logs and query blocking in the query-frontend. More coming in future releases.

Additionally, the following previously experimental features are now considered stable:

  • Rules tenant federation via the source_tenants field in rule groups.
  • Enabling recording and alerting rules evaluation on a per-tenant basis.
  • Limiting the number of tenants a federated query can query.

Important changes

In Grafana Mimir 2.13 the following behavior has changed:

  • The default Docker image grafana/mimir is now based on the distroless image gcr.io/distroless/static-debian12.
    See Debugging distroless container images for more details on how to
    work with distroless images.

  • Error logs in the ingester are now sampled at 10% by default. Sampled log lines contain (sampled 1/10). The cortex_discarded_samples_total metric still tracks all discarded samples.

  • Anonymous usage statistics now include actual CPU usage instead of available CPU cores.

  • Continuous-test is no longer a standalone binary and is now part of Mimir as its own target. The published Docker image has been updated to use the new packaging.
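
The ingester error log sampling described above can be pictured as a counter-based sampler that still counts every event. This Python sketch is a simplified illustration, not Mimir's code: the class SampledErrorLogger, its sink parameter, and the choice to log the first error of every group of ten are all assumptions.

```python
class SampledErrorLogger:
    """Logs one error out of every `sample_rate`, but counts all of them."""

    def __init__(self, sample_rate=10, sink=print):
        self.sample_rate = sample_rate  # 0 (or less) means "log everything"
        self.sink = sink                # callable receiving each emitted line
        self.discarded_total = 0        # stands in for a metric counter
        self._seen = 0

    def error(self, message):
        # The counter is always incremented, even when the log line is dropped.
        self.discarded_total += 1
        if self.sample_rate <= 0:
            self.sink(message)
            return
        index = self._seen
        self._seen += 1
        if index % self.sample_rate == 0:
            self.sink(f"{message} (sampled 1/{self.sample_rate})")


lines = []
logger = SampledErrorLogger(sample_rate=10, sink=lines.append)
for i in range(25):
    logger.error(f"sample {i} rejected")
```

In this sketch 25 errors produce 3 log lines while the counter still reports 25, which mirrors the point that the metric remains the complete record.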

The following deprecated configuration options are removed in Grafana Mimir 2.13:

  • The configuration option -log.buffered, which was deprecated in 2.11 and is now enabled by default.

Experimental features

Grafana Mimir 2.13 includes new features that are considered experimental and disabled by default.
Use them with caution and report any issues you encounter:

  • Experimental support for server-side circuit breakers in ingesters on read and write requests. This can be enabled using -ingester.push-circuit-breaker.enabled and -ingester.read-circuit-breaker.enabled options and further configured via the -ingester.push-circuit-breaker.* and -ingester.read-circuit-breaker.* options.
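
As a rough illustration of what a server-side circuit breaker does, the Python sketch below opens after a number of consecutive failures and lets a trial request through after a cooldown. It is a generic textbook variant under stated assumptions: the class name, the consecutive-failure threshold, and the cooldown parameter are hypothetical and do not describe the options behind -ingester.push-circuit-breaker.* or -ingester.read-circuit-breaker.*.

```python
import time


class CircuitBreaker:
    """Minimal circuit breaker: closed -> open -> trial request after cooldown."""

    def __init__(self, failure_threshold=3, cooldown_seconds=10.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock          # injectable so the behavior can be tested
        self._failures = 0
        self._opened_at = None       # None means the breaker is closed

    def allow(self):
        """Return True if a request may proceed."""
        if self._opened_at is None:
            return True
        # Once the cooldown has elapsed, let a trial request through.
        return self._clock() - self._opened_at >= self.cooldown_seconds

    def record_success(self):
        # A success closes the breaker and resets the failure count.
        self._failures = 0
        self._opened_at = None

    def record_failure(self):
        self._failures += 1
        if self._failures >= self.failure_threshold:
            # Open (or re-open) the breaker and restart the cooldown.
            self._opened_at = self._clock()
```

A caller would check allow() before handling a request and report the outcome with record_success() or record_failure(); rejected requests fail fast instead of adding load to a struggling ingester.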

The following configuration options are deprecated and will be removed in a future Grafana Mimir release:

  • evaluation_delay field in the rule group configuration has been deprecated. Please use query_offset instead.

Bug fixes

  • OTLP ingestion: translate all HTTP 5xx errors into one of 502, 503, or 504, so that the otel-collector retries the failed requests.
  • OTLP ingestion: return properly formatted protobuf error messages when ingestion fails.
  • OTLP ingestion: generate target_info metric only when there are metrics in the request and there is at least one configured identifying label.
  • OTLP ingestion: don't discard timeseries paired with invalid exemplars. Instead, try to ingest the timeseries and discard only the invalid exemplars.
  • Subqueries: @ end() and @ start() now work correctly with queries split by time.
  • Native histograms: order exemplars before ingestion to improve success rate when ingesting multiple exemplars.
  • Native histograms: return HTTP 400 on invalid native histogram samples instead of HTTP 500. The metric cortex_discarded_samples_total{reason="invalid-native-histogram"} is now incremented on invalid histogram samples.

Changelog

2.13.0-rc.1

Grafana Mimir

  • [CHANGE] Build: grafana/mimir docker image is now based on gcr.io/distroless/static-debian12 image. Alpine-based docker image is still available as grafana/mimir-alpine, until Mimir 2.15. #8204 #8235
  • [CHANGE] Ingester: /ingester/flush endpoint is now allowed to execute only while the ingester is in Running state. The 503 status code is returned if the endpoint is called while the ingester is not in Running state. #7486
  • [CHANGE] Distributor: Include label name in err-mimir-label-value-too-long error message. #7740
  • [CHANGE] Ingester: enabled 1 out of 10 error log sampling by default. All the discarded samples will still be tracked by the cortex_discarded_samples_total metric. The feature can be configured via -ingester.error-sample-rate (0 to log all errors). #7807
  • [CHANGE] Query-frontend: Query results caching and experimental query blocking now utilize the PromQL string-formatted query format rather than the unvalidated query as submitted to the frontend. #7742
    • Query results caching should be more stable as all equivalent queries receive the same cache key, but there may be cache churn on first deploy with the updated format
    • Query blocking can no longer be circumvented with an equivalent query in a different format; see Configure queries to block
  • [CHANGE] Query-frontend: stop using -validation.create-grace-period to clamp how far into the future a query can span. #8075
  • [CHANGE] Clamp GOMAXPROCS to runtime.NumCPU. #8201
  • [CHANGE] Anonymous usage statistics tracking: add CPU usage percentage tracking. #8282
  • [CHANGE] Added new metric cortex_compactor_disk_out_of_space_errors_total which counts how many times a compaction failed due to the compactor being out of disk. #8237
  • [CHANGE] Anonymous usage statistics tracking: report active series in addition to in-memory series. #8279
  • [CHANGE] Ruler: evaluation_delay field in the rule group configuration has been deprecated. Please use query_offset instead (it has the same exact meaning and behaviour). #8295
  • [CHANGE] General: remove -log.buffered. The configuration option has been enabled by default and deprecated since Mimir 2.11. #8395
  • [CHANGE] Ruler: promote tenant federation from experimental to stable. #8400
  • [CHANGE] Ruler: promote -ruler.recording-rules-evaluation-enabled and -ruler.alerting-rules-evaluation-enabled from experimental to stable. #8400
  • [CHANGE] General: promote -tenant-federation.max-tenants from experimental to stable. #8400
  • [FEATURE] Continuous-test: now runnable as a module with mimir -target=continuous-test. #7747
  • [FEATURE] Store-gateway: Allow specific tenants to be enabled or disabled via -store-gateway.enabled-tenants or -store-gateway.disabled-tenants CLI flags or their corresponding YAML settings. #7653
  • [FEATURE] New -<prefix>.s3.bucket-lookup-type flag configures the bucket lookup style used to access buckets in S3-compatible providers. #7684
  • [FEATURE] Querier: add experimental streaming PromQL engine, enabled with -querier.promql-engine=mimir. #7693 #7898 #7899 #8023 #8058 #8096 #8121 #8197 #8230 #8247 #8270 #8276 #8277 #8291 #8303 #8340 #8256 #8348
  • [FEATURE] New /ingester/unregister-on-shutdown HTTP endpoint allows dynamic access to ingesters' -ingester.ring.unregister-on-shutdown configuration. #7739
  • [FEATURE] Server: added experimental PROXY protocol support. The PROXY protocol support can be enabled via -server.proxy-protocol-enabled=true. When enabled, the support is added both to HTTP and gRPC listening ports. #7698
  • [FEATURE] Query-frontend, querier: new experimental /cardinality/active_native_histogram_metrics API to get active native histogram metric names with statistics about active native histogram buckets. #7982 #7986 #8008
  • [FEATURE] Alertmanager: Added -alertmanager.max-silences-count and -alertmanager.max-silence-size-bytes to set limits on per tenant silences. Disabled by default. #8241 #8249
  • [FEATURE] Ingester: add experimental support for the server-side circuit breakers when writing to and reading from ingesters. This can be enabled using -ingester.push-circuit-breaker.enabled and -ingester.read-circuit-breaker.enabled options. Further -ingester.push-circuit-breaker.* and -ingester.read-circuit-breaker.* options for configuring circuit-breaker are available. Added metrics cortex_ingester_circuit_breaker_results_total, `cortex_i...

2.13.0-rc.0

21 Jun 18:20
mimir-2.13.0-rc.0
ac21f29
2.13.0-rc.0 Pre-release

This release contains 476 PRs from 67 authors, including new contributors Anthony Keydel, Armand Grillet, AvivGuiser, Daniel R. Dagfinrud, David Collier-Brown, David Grant, Dimitris Alo, Enrique Garbi, Erik Sommer, Ian Halliday, InventiveCoder, Kevin Mingtarja, Lasse Hels, Pangidoan Butar, Quentin Bisson, Rens Groothuijsen, René Gärtner, Ross Brunson, Santiago, Seiya, Spyros Panagiotopoulos, Yuri Nikolic, Zied ABID, kahirokunn, lasermoth. Thank you!

Grafana Mimir version 2.13.0-rc.0 release notes

Features and enhancements

  • Improved CPU performance in processing queries with regular expressions that contain many alternations (for example foo|bar|baz|...).

  • Improved OTLP ingestion performance by using native translation, which reduces memory usage in the distributor by up to 30% and CPU usage by up to 8%.

  • Configurable S3 bucket lookup type improves interoperability with S3-compatible providers, such as Tencent and Alibaba. Configure lookup type via -<prefix>.s3.bucket-lookup-type or -common.storage.s3-bucket-lookup-type.

  • Configuration of TLS for S3 buckets is now possible via -<prefix>.s3.http.* and -common.storage.s3.http.* configuration options.

  • Mimirtool can now verify the validity of a Mimir runtime configuration file with the mimirtool runtime-config verify command.

  • Active series are now updated along with owned series. This means the number of active series for a tenant is more accurate after scaling out ingesters. As a result, ingesters now more precisely enforce tenants' series limits. This feature is only enabled when -ingester.track-ingester-owned-series or -ingester.use-ingester-owned-series-for-limits is enabled.

  • Store-gateway can be configured to explicitly disable or enable tenants. This is useful when compaction slows down for a tenant: you can temporarily exclude a tenant while compaction catches up, without affecting other tenants.

  • Remote read (/prometheus/api/v1/read) is becoming a first-class endpoint in Mimir with support in query stats logs and query blocking in the query-frontend. More coming in future releases.

Additionally, the following previously experimental features are now considered stable:

  • Rules tenant federation via the source_tenants field in rule groups.
  • Enabling recording and alerting rules evaluation on a per-tenant basis.
  • Limiting the number of tenants a federated query can query.

Important changes

In Grafana Mimir 2.13 the following behavior has changed:

  • The default Docker image grafana/mimir is now based on the distroless image gcr.io/distroless/static-debian12.
    See Debugging distroless container images for more details on how to
    work with distroless images.

  • Error logs in the ingester are now sampled at 10% by default. Sampled log lines contain (sampled 1/10). The cortex_discarded_samples_total metric still tracks all discarded samples.

  • Anonymous usage statistics now include actual CPU usage instead of available CPU cores.

  • Continuous-test is no longer a standalone binary and is now part of Mimir as its own target. The published Docker image has been updated to use the new packaging.

The following deprecated configuration options are removed in Grafana Mimir 2.13:

  • The configuration option -log.buffered, which was deprecated in 2.11 and is now enabled by default.

Experimental features

Grafana Mimir 2.13 includes new features that are considered experimental and disabled by default.
Use them with caution and report any issues you encounter:

  • Experimental support for server-side circuit breakers in ingesters on read and write requests. This can be enabled using -ingester.push-circuit-breaker.enabled and -ingester.read-circuit-breaker.enabled options and further configured via the -ingester.push-circuit-breaker.* and -ingester.read-circuit-breaker.* options.

The following configuration options are deprecated and will be removed in a future Grafana Mimir release:

  • evaluation_delay field in the rule group configuration has been deprecated. Please use query_offset instead.

Bug fixes

  • OTLP ingestion: translate all HTTP 5xx errors into one of 502, 503, or 504, so that the otel-collector retries the failed requests.
  • OTLP ingestion: return properly formatted protobuf error messages when ingestion fails.
  • OTLP ingestion: generate target_info metric only when there are metrics in the request and there is at least one configured identifying label.
  • OTLP ingestion: don't discard timeseries paired with invalid exemplars. Instead, try to ingest the timeseries and discard only the invalid exemplars.
  • Subqueries: @ end() and @ start() now work correctly with queries split by time.
  • Native histograms: order exemplars before ingestion to improve success rate when ingesting multiple exemplars.
  • Native histograms: return HTTP 400 on invalid native histogram samples instead of HTTP 500. The metric cortex_discarded_samples_total{reason="invalid-native-histogram"} is now incremented on invalid histogram samples.

Changelog

2.13.0-rc.0

Grafana Mimir

  • [CHANGE] Build: grafana/mimir docker image is now based on gcr.io/distroless/static-debian12 image. Alpine-based docker image is still available as grafana/mimir-alpine, until Mimir 2.15. #8204 #8235
  • [CHANGE] Ingester: /ingester/flush endpoint is now allowed to execute only while the ingester is in Running state. The 503 status code is returned if the endpoint is called while the ingester is not in Running state. #7486
  • [CHANGE] Distributor: Include label name in err-mimir-label-value-too-long error message. #7740
  • [CHANGE] Ingester: enabled 1 out of 10 error log sampling by default. All the discarded samples will still be tracked by the cortex_discarded_samples_total metric. The feature can be configured via -ingester.error-sample-rate (0 to log all errors). #7807
  • [CHANGE] Query-frontend: Query results caching and experimental query blocking now utilize the PromQL string-formatted query format rather than the unvalidated query as submitted to the frontend. #7742
    • Query results caching should be more stable as all equivalent queries receive the same cache key, but there may be cache churn on first deploy with the updated format
    • Query blocking can no longer be circumvented with an equivalent query in a different format; see Configure queries to block
  • [CHANGE] Query-frontend: stop using -validation.create-grace-period to clamp how far into the future a query can span. #8075
  • [CHANGE] Clamp GOMAXPROCS to runtime.NumCPU. #8201
  • [CHANGE] Anonymous usage statistics tracking: add CPU usage percentage tracking. #8282
  • [CHANGE] Added new metric cortex_compactor_disk_out_of_space_errors_total which counts how many times a compaction failed due to the compactor being out of disk. #8237
  • [CHANGE] Anonymous usage statistics tracking: report active series in addition to in-memory series. #8279
  • [CHANGE] Ruler: evaluation_delay field in the rule group configuration has been deprecated. Please use query_offset instead (it has the same exact meaning and behaviour). #8295
  • [CHANGE] General: remove -log.buffered. The configuration option has been enabled by default and deprecated since Mimir 2.11. #8395
  • [CHANGE] Ruler: promote tenant federation from experimental to stable. #8400
  • [CHANGE] Ruler: promote -ruler.recording-rules-evaluation-enabled and -ruler.alerting-rules-evaluation-enabled from experimental to stable. #8400
  • [CHANGE] General: promote -tenant-federation.max-tenants from experimental to stable. #8400
  • [FEATURE] Continuous-test: now runnable as a module with mimir -target=continuous-test. #7747
  • [FEATURE] Store-gateway: Allow specific tenants to be enabled or disabled via -store-gateway.enabled-tenants or -store-gateway.disabled-tenants CLI flags or their corresponding YAML settings. #7653
  • [FEATURE] New -<prefix>.s3.bucket-lookup-type flag configures the bucket lookup style used to access buckets in S3-compatible providers. #7684
  • [FEATURE] Querier: add experimental streaming PromQL engine, enabled with -querier.promql-engine=mimir. #7693 #7898 #7899 #8023 #8058 #8096 #8121 #8197 #8230 #8247 #8270 #8276 #8277 #8291 #8303 #8340 #8256 #8348
  • [FEATURE] New /ingester/unregister-on-shutdown HTTP endpoint allows dynamic access to ingesters' -ingester.ring.unregister-on-shutdown configuration. #7739
  • [FEATURE] Server: added experimental PROXY protocol support. The PROXY protocol support can be enabled via -server.proxy-protocol-enabled=true. When enabled, the support is added both to HTTP and gRPC listening ports. #7698
  • [FEATURE] Query-frontend, querier: new experimental /cardinality/active_native_histogram_metrics API to get active native histogram metric names with statistics about active native histogram buckets. #7982 #7986 #8008
  • [FEATURE] Alertmanager: Added -alertmanager.max-silences-count and -alertmanager.max-silence-size-bytes to set limits on per tenant silences. Disabled by default. #8241 #8249
  • [FEATURE] Ingester: add experimental support for the server-side circuit breakers when writing to and reading from ingesters. This can be enabled using -ingester.push-circuit-breaker.enabled and -ingester.read-circuit-breaker.enabled options. Further -ingester.push-circuit-breaker.* and -ingester.read-circuit-breaker.* options for configuring circuit-breaker are available. Added metrics cortex_ingester_circuit_breaker_results_total, `cortex...

2.12.0

03 Apr 22:18
mimir-2.12.0
c7aab9e

This release contains 531 PRs from 60 authors, including new contributors Benoit Schipper, Derek Cadzow, Edwin, Itay Kalfon, Ivan Farré Vicente, Jan O. Rundshagen, Jorge Turrado Ferrero, Lukas Monkevicius, Mickaël Canévet, Rafael Sathler, Rajakavitha Kodhandapani, Tim Kotowski, Vladimir Varankin, Zach, Zach Day, Zirko, blut, github-actions[bot], ncharaf, zhehao-grafana. Thank you!

Grafana Mimir version 2.12.0 release notes

Grafana Labs is excited to announce version 2.12 of Grafana Mimir.

The highlights that follow include the top features, enhancements, and bug fixes in this release.
For the complete list of changes, refer to the CHANGELOG.

Features and enhancements

  • Added support to only count series that are considered active through the Cardinality API endpoint /api/v1/cardinality/label_names by passing the count_method parameter.
    If set to active it counts only series that are considered active according to the -ingester.active-series-metrics-idle-timeout flag setting rather than counting all in-memory series.

  • The "Store-gateway: bucket tenant blocks" admin page contains a new column "No Compact".
    If a block's no-compaction marker is set, the column shows the reason and the date the marker was added.

  • The estimated number of compaction jobs based on the current bucket-index is now computed by the compactor.
    The result is tracked by the new cortex_bucket_index_compaction_jobs metric.
    If this computation fails, the cortex_bucket_index_compaction_jobs_errors_total metric is updated instead.
    The estimated number of compaction jobs is also shown in Top tenants, Tenants, and Compactor dashboards.

  • Added mimir-distroless container image built upon a distroless image (gcr.io/distroless/static-debian12).
    This improvement minimizes attack surfaces and potential CVEs by trimming down the dependencies within the image.
    After comprehensive testing, the Mimir maintainers plan to shift from the current image to the distroless version.
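
To illustrate the count_method parameter on the cardinality endpoint described above, the following Python sketch builds the request URL and tenant header. The base URL, the tenant ID "demo", and the helper name are hypothetical; the sketch assumes the endpoint is served under the Prometheus HTTP prefix and only constructs the request without sending it.

```python
from urllib.parse import urlencode


def label_names_cardinality_request(base_url, tenant_id, count_method="active"):
    """Build the URL and headers for a label names cardinality query.

    `base_url` and `tenant_id` are placeholders for this example.
    """
    query = urlencode({"count_method": count_method})
    url = f"{base_url.rstrip('/')}/api/v1/cardinality/label_names?{query}"
    # Mimir identifies the tenant through the X-Scope-OrgID header.
    headers = {"X-Scope-OrgID": tenant_id}
    return url, headers


url, headers = label_names_cardinality_request(
    "http://mimir.example:8080/prometheus", tenant_id="demo"
)
```

The resulting URL and headers can be passed to any HTTP client; omitting count_method from the query leaves the endpoint's default counting behavior in place.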

Additionally, the following previously experimental features are now considered stable:

  • The number of pre-allocated workers used to forward push requests to the ingesters, configurable via the -distributor.reusable-ingester-push-workers CLI flag on distributors.
    It now defaults to 2000.
    Note that this is a performance optimization, and not a limiting feature.
    If not enough workers are available, new goroutines will be spawned.

  • The number of gRPC server workers used to serve the requests, configurable via the -server.grpc.num-workers CLI flag.
    It now defaults to 100.
    Note that this is the number of pre-allocated long-lived workers, and not a limiting feature.
    If not enough workers are available, new goroutines will be spawned.

  • The maximum number of concurrent index header loads across all tenants, configurable via the -blocks-storage.bucket-store.index-header.lazy-loading-concurrency CLI flag on store-gateways.
    It defaults to 4.

  • The maximum time to wait for the query-frontend to become ready before rejecting requests, configurable via the -query-frontend.not-running-timeout CLI flag on query-frontends.
    It now defaults to 2s.

  • The CLI flag that allows queriers to reduce pressure on ingesters by initially querying only the minimum set of ingesters required to reach quorum, -querier.minimize-ingester-requests.
    It is now enabled by default.

  • Spread-minimizing token-related CLI flags: -ingester.ring.token-generation-strategy, -ingester.ring.spread-minimizing-zones and -ingester.ring.spread-minimizing-join-ring-in-order.
    You can read more about this feature in our blog post.
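
The "pre-allocated workers, with new goroutines spawned when none are free" behavior described above for the push and gRPC workers can be sketched as follows. This is a Python analogy using threads rather than goroutines; the class PushWorkerPool and its bookkeeping are assumptions made for illustration and are not taken from Mimir's Go code.

```python
import queue
import threading


class PushWorkerPool:
    """Reuses long-lived workers when one is idle, otherwise spawns a new thread."""

    def __init__(self, size):
        self._tasks = queue.Queue()
        self._idle = size            # number of pre-allocated workers currently free
        self._lock = threading.Lock()
        self.spawned = 0             # how many times the pool had to spawn extra
        for _ in range(size):
            threading.Thread(target=self._run, daemon=True).start()

    def submit(self, fn):
        with self._lock:
            if self._idle > 0:
                # Reserve an idle worker and hand it the task.
                self._idle -= 1
                self._tasks.put(fn)
                return "pooled"
            self.spawned += 1
        # No idle worker: the pool is an optimization, not a limit.
        threading.Thread(target=fn, daemon=True).start()
        return "spawned"

    def _run(self):
        while True:
            fn = self._tasks.get()
            try:
                fn()
            finally:
                with self._lock:
                    self._idle += 1
```

The design point the sketch tries to capture is that the worker count never rejects or queues work: it only avoids the cost of creating a new execution context for the common case.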

Important changes

In Grafana Mimir 2.12 the following behavior has changed:

  • Store-gateway now persists a sparse version of the index-header to disk on construction and loads sparse index-headers from disk instead of the whole index-header.
    This improves the speed at which index headers are lazy-loaded from disk by up to 90%. The added disk usage is in the order of 1-2%.

  • Alertmanager deprecated the v1 API. All v1 API endpoints now respond with a JSON deprecation notice and a status code of 410.
    All endpoints have a v2 equivalent.
    The list of endpoints is:

    • <alertmanager-web.external-url>/api/v1/alerts
    • <alertmanager-web.external-url>/api/v1/receivers
    • <alertmanager-web.external-url>/api/v1/silence/{id}
    • <alertmanager-web.external-url>/api/v1/silences
    • <alertmanager-web.external-url>/api/v1/status
  • Exemplar's label traceID has been changed to trace_id to be consistent with the OpenTelemetry standard.

  • Errors returned by ingesters now contain only gRPC status codes.
    Previously they contained both gRPC and HTTP status codes.

    Warning: To guarantee backwards compatibility when migrating from a version prior to 2.11, first migrate to version 2.11, and then to version 2.12.
    Otherwise, some ingester errors with HTTP status code 4xx might not be recognized during the migration, and the corresponding requests will be repeated.

  • Responses with gRPC status codes are now reported as status_code labels in the cortex_request_duration_seconds and cortex_ingester_client_request_duration_seconds metrics.

  • Responses with HTTP 4xx status codes are now treated as errors and used in the status_code label of the request duration metric.

The default values of the following CLI flags have been changed:

  • -blocks-storage.tsdb.head-postings-for-matchers-cache-max-bytes from 10MB to 100MB.
  • -blocks-storage.tsdb.block-postings-for-matchers-cache-max-bytes from 10MB to 100MB.
  • -blocks-storage.bucket-store.tenant-sync-concurrency from 10 to 1.
  • -query-frontend.max-cache-freshness from 1m to 10m.
  • -distributor.write-requests-buffer-pooling-enabled from false to true.
  • -blocks-storage.bucket-store.block-sync-concurrency from 20 to 4.
  • -memberlist.stream-timeout from 10s to 2s.
  • -server.report-grpc-codes-in-instrumentation-label-enabled from false to true.

The following deprecated configuration options are removed in Grafana Mimir 2.12:

  • The YAML setting frontend.cache_unaligned_requests.
  • Experimental CLI flag -querier.prefer-streaming-chunks-from-ingesters.

The following configuration options are deprecated and will be removed in Grafana Mimir 2.14:

  • The CLI flag -ingester.limit-inflight-requests-using-grpc-method-limiter.
    It now defaults to true.

  • The CLI flag -ingester.return-only-grpc-errors.
    It now defaults to true.

    Warning: To guarantee backwards compatibility when migrating from a version prior to 2.11, first migrate to version 2.11, and then to version 2.12.
    Otherwise, some ingester errors with HTTP status code 4xx might not be recognized during the migration, and the corresponding requests will be repeated.

  • The CLI flag -ingester.client.report-grpc-codes-in-instrumentation-label-enabled.
    It now defaults to true.

  • The CLI flag -distributor.limit-inflight-requests-using-grpc-method-limiter.
    It now defaults to true.

  • The CLI flag -distributor.enable-otlp-metadata-storage.
    It now defaults to true.

  • The CLI flag -querier.max-query-into-future.

The following metrics are removed or deprecated:

  • cortex_bucket_store_blocks_loaded_by_duration has been removed.
  • cortex_distributor_sample_delay_seconds has been deprecated and will be removed in Mimir 2.14.

Experimental features

Grafana Mimir 2.12 includes new features that are considered experimental and disabled by default.
Use them with caution and report any issues you encounter:

  • The maximum number of tenant IDs that may be supplied for a federated query can be configured via the -tenant-federation.max-tenants CLI flag on query-frontends.
    By default, it's 0, meaning that the limit is disabled.

  • Sharding of active series queries can be enabled via the -query-frontend.shard-active-series-queries CLI flag on query-frontends.

  • Timely head compaction can be enabled via the -blocks-storage.tsdb.timely-head-compaction-enabled CLI flag on ingesters.
    If enabled, the head compaction happens when the min block range can no longer be appended, without requiring 1.5x the chunk range worth of data in the head.

  • Streaming of responses from querier to query-frontend can be enabled via the -querier.response-streaming-enabled CLI flag on queriers.
    This is currently supported only for responses from the /api/v1/cardinality/active_series endpoint.

  • The maximum response size for active series queries, in bytes, can be set via the -querier.active-series-results-max-size-bytes CLI flag on queriers.

  • Metric relabeling on a per-tenant basis can be forcefully disabled via the -distributor.metric-relabeling-enabled CLI flag on rulers.
    Metrics relabeling is enabled by default.

  • Query Queue Load Balancing by Query Component. Tenant query queues in the query-scheduler can now be split into subqueues by which query component is expected to be utilized to complete the query: ingesters, store-gateways, both, or uncategorized.
    Dequeuing queries for a given tenant will rotate through the query component subqueues via simple round-robin.
    If one of the query components (ingesters or store-gateways) experiences a slowdown, queries only utilizing the other query component can continue to be serviced.
    This feature is r...
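
The round-robin rotation across query component subqueues described above can be sketched as below. This Python example is a simplified model for a single tenant: the class TenantQueue, the component labels, and the rule of skipping empty subqueues are assumptions for illustration, not the query-scheduler's actual data structures.

```python
from collections import deque

# Labels for the four subqueues named in the text (spelling is illustrative).
COMPONENTS = ("ingesters", "store-gateways", "both", "uncategorized")


class TenantQueue:
    """One tenant's queue, split into subqueues by expected query component."""

    def __init__(self):
        self._subqueues = {name: deque() for name in COMPONENTS}
        self._next = 0  # index of the subqueue to try first on the next dequeue

    def enqueue(self, component, query):
        self._subqueues[component].append(query)

    def dequeue(self):
        # Visit subqueues in rotation, starting after the last one served.
        for offset in range(len(COMPONENTS)):
            index = (self._next + offset) % len(COMPONENTS)
            subqueue = self._subqueues[COMPONENTS[index]]
            if subqueue:
                self._next = (index + 1) % len(COMPONENTS)
                return subqueue.popleft()
        return None  # every subqueue is empty


q = TenantQueue()
for name in ("i1", "i2", "i3"):
    q.enqueue("ingesters", name)
q.enqueue("store-gateways", "s1")
order = [q.dequeue() for _ in range(5)]
```

In this model the store-gateway query is served second even though three ingester queries were enqueued ahead of it, which is the isolation effect the feature aims for.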


2.12.0-rc.1

20 Mar 12:21
mimir-2.12.0-rc.1
6c30057
2.12.0-rc.1 Pre-release

This release contains 525 PRs from 60 authors, including new contributors Benoit Schipper, Derek Cadzow, Edwin, Itay Kalfon, Ivan Farré Vicente, Jan O. Rundshagen, Jorge Turrado Ferrero, Lukas Monkevicius, Mickaël Canévet, Rafael Sathler, Rajakavitha Kodhandapani, Tim Kotowski, Vladimir Varankin, Zach, Zach Day, Zirko, blut, github-actions[bot], ncharaf, zhehao-grafana. Thank you!

Grafana Mimir version 2.12.0-rc.1 release notes

Grafana Labs is excited to announce version 2.12 of Grafana Mimir.

The highlights that follow include the top features, enhancements, and bug fixes in this release.
For the complete list of changes, refer to the CHANGELOG.

Features and enhancements

  • Added support to only count series that are considered active through the Cardinality API endpoint /api/v1/cardinality/label_names by passing the count_method parameter.
    If set to active it counts only series that are considered active according to the -ingester.active-series-metrics-idle-timeout flag setting rather than counting all in-memory series.

  • The "Store-gateway: bucket tenant blocks" admin page contains a new column "No Compact".
    If a block's no-compaction marker is set, the column shows the reason and the date the marker was added.

  • The estimated number of compaction jobs based on the current bucket-index is now computed by the compactor.
    The result is tracked by the new cortex_bucket_index_compaction_jobs metric.
    If this computation fails, the cortex_bucket_index_compaction_jobs_errors_total metric is updated instead.
    The estimated number of compaction jobs is also shown in Top tenants, Tenants, and Compactor dashboards.

  • Added mimir-distroless container image built upon a distroless image (gcr.io/distroless/static-debian12).
    This improvement minimizes attack surfaces and potential CVEs by trimming down the dependencies within the image.
    After comprehensive testing, the Mimir maintainers plan to shift from the current image to the distroless version.

Additionally, the following previously experimental features are now considered stable:

  • The number of pre-allocated workers used to forward push requests to the ingesters, configurable via the -distributor.reusable-ingester-push-workers CLI flag on distributors.
    It now defaults to 2000.
    Note that this is a performance optimization, and not a limiting feature.
    If not enough workers are available, new goroutines will be spawned.

  • The number of gRPC server workers used to serve the requests, configurable via the -server.grpc.num-workers CLI flag.
    It now defaults to 100.
    Note that this is the number of pre-allocated long-lived workers, and not a limiting feature.
    If not enough workers are available, new goroutines will be spawned.

  • The maximum number of concurrent index header loads across all tenants, configurable via the -blocks-storage.bucket-store.index-header.lazy-loading-concurrency CLI flag on store-gateways.
    It defaults to 4.

  • The maximum time to wait for the query-frontend to become ready before rejecting requests, configurable via the -query-frontend.not-running-timeout CLI flag on query-frontends.
    It now defaults to 2s.

  • The CLI flag that allows queriers to reduce pressure on ingesters by initially querying only the minimum set of ingesters required to reach quorum, -querier.minimize-ingester-requests.
    It is now enabled by default.

  • Spread-minimizing token-related CLI flags: -ingester.ring.token-generation-strategy, -ingester.ring.spread-minimizing-zones and -ingester.ring.spread-minimizing-join-ring-in-order.
    You can read more about this feature in our blog post.

Important changes

In Grafana Mimir 2.12 the following behavior has changed:

  • Store-gateway now persists a sparse version of the index-header to disk on construction and loads sparse index-headers from disk instead of the whole index-header.
    This improves the speed at which index headers are lazy-loaded from disk by up to 90%. The added disk usage is in the order of 1-2%.

  • Alertmanager deprecated the v1 API. All v1 API endpoints now respond with a JSON deprecation notice and a status code of 410.
    All endpoints have a v2 equivalent.
    The list of endpoints is:

    • <alertmanager-web.external-url>/api/v1/alerts
    • <alertmanager-web.external-url>/api/v1/receivers
    • <alertmanager-web.external-url>/api/v1/silence/{id}
    • <alertmanager-web.external-url>/api/v1/silences
    • <alertmanager-web.external-url>/api/v1/status
  • Exemplar's label traceID has been changed to trace_id to be consistent with the OpenTelemetry standard.

  • Errors returned by ingesters now contain only gRPC status codes.
    Previously they contained both gRPC and HTTP status codes.
    To guarantee backwards compatibility when migrating from a version prior to 2.11, it's necessary to first migrate to version 2.11, and then to version 2.12.
    Otherwise, it might happen that during the migration, some ingester errors with HTTP status code 4xx won't be recognized, and the corresponding request will be repeated.

  • Responses with gRPC status codes are now reported as status_code labels in the cortex_request_duration_seconds and cortex_ingester_client_request_duration_seconds metrics.

  • Responses with HTTP 4xx status codes are now treated as errors and used in the status_code label of the request duration metric.

The default values of the following CLI flags have been changed:

  • -blocks-storage.tsdb.head-postings-for-matchers-cache-max-bytes from 10MB to 100MB.
  • -blocks-storage.tsdb.block-postings-for-matchers-cache-max-bytes from 10MB to 100MB.
  • -blocks-storage.bucket-store.tenant-sync-concurrency from 10 to 1.
  • -query-frontend.max-cache-freshness from 1m to 10m.
  • -distributor.write-requests-buffer-pooling-enabled from false to true.
  • -blocks-storage.bucket-store.block-sync-concurrency from 20 to 4.
  • -memberlist.stream-timeout from 10s to 2s.
  • -server.report-grpc-codes-in-instrumentation-label-enabled from false to true.

The following deprecated configuration options are removed in Grafana Mimir 2.12:

  • The YAML setting frontend.cache_unaligned_requests.
  • Experimental CLI flag -querier.prefer-streaming-chunks-from-ingesters.

The following configuration options are deprecated and will be removed in Grafana Mimir 2.14:

  • The CLI flag -ingester.limit-inflight-requests-using-grpc-method-limiter.
    It now defaults to true.

  • The CLI flag -ingester.return-only-grpc-errors.
    It now defaults to true.
    To guarantee backwards compatibility when migrating from a version prior to 2.11, it's necessary to first migrate to version 2.11, and then to version 2.12.
    Otherwise, it might happen that during the migration, some ingester errors with HTTP status code 4xx won't be recognized, and the corresponding request will be repeated.

  • The CLI flag -ingester.client.report-grpc-codes-in-instrumentation-label-enabled.
    It now defaults to true.

  • The CLI flag -distributor.limit-inflight-requests-using-grpc-method-limiter.
    It now defaults to true.

  • The CLI flag -distributor.enable-otlp-metadata-storage.
    It now defaults to true.

  • The CLI flag -querier.max-query-into-future.

The following metrics are removed or deprecated:

  • cortex_bucket_store_blocks_loaded_by_duration has been removed.
  • cortex_distributor_sample_delay_seconds has been deprecated and will be removed in Mimir 2.14.

Experimental features

Grafana Mimir 2.12 includes new features that are considered experimental and disabled by default.
Use them with caution and report any issues you encounter:

  • The maximum number of tenant IDs that may be used for a federated query can be configured via the -tenant-federation.max-tenants CLI flag on query-frontends.
    By default, it's 0, meaning that the limit is disabled.

  • Sharding of active series queries can be enabled via the -query-frontend.shard-active-series-queries CLI flag on query-frontends.

  • Timely head compaction can be enabled via the -blocks-storage.tsdb.timely-head-compaction-enabled CLI flag on ingesters.
    If enabled, the head compaction happens when the min block range can no longer be appended, without requiring 1.5x the chunk range worth of data in the head.

  • Streaming of responses from querier to query-frontend can be enabled via the -querier.response-streaming-enabled CLI flag on queriers.
    This is currently supported only for responses from the /api/v1/cardinality/active_series endpoint.

  • The maximum response size for active series queries, in bytes, can be set via the -querier.active-series-results-max-size-bytes CLI flag on queriers.

  • Metric relabeling on a per-tenant basis can be forcefully disabled via the -distributor.metric-relabeling-enabled CLI flag on rulers.
    Metric relabeling is enabled by default.

  • Query Queue Load Balancing by Query Component. Tenant query queues in the query-scheduler can now be split into subqueues by which query component is expected to be utilized to complete the query: ingesters, store-gateways, both, or uncategorized.
    Dequeuing queries for a given tenant will rotate through the query component subqueues via simple round-robin.
    In the event that one of the query components (ingesters or store-gateways) experiences a slowdown, queries utilizing only the other query component can continue to be serviced.
    Enabling this feature is recommended.
    The following CLI flags must be set to true for the feature to take effect:

    • `-query...

2.12.0-rc.0

12 Mar 19:10
mimir-2.12.0-rc.0
d0ac52d
2.12.0-rc.0 Pre-release

This release contains 525 PRs from 60 authors, including new contributors Benoit Schipper, Derek Cadzow, Edwin, Itay Kalfon, Ivan Farré Vicente, Jan O. Rundshagen, Jorge Turrado Ferrero, Lukas Monkevicius, Mickaël Canévet, Rafael Sathler, Rajakavitha Kodhandapani, Tim Kotowski, Vladimir Varankin, Zach, Zach Day, Zirko, blut, github-actions[bot], ncharaf, zhehao-grafana. Thank you!

Grafana Mimir version 2.12.0-rc.0 release notes

Grafana Labs is excited to announce version 2.12 of Grafana Mimir.

The highlights that follow include the top features, enhancements, and bug fixes in this release.
For the complete list of changes, refer to the CHANGELOG.

Features and enhancements

  • Added support to only count series that are considered active through the Cardinality API endpoint /api/v1/cardinality/label_names by passing the count_method parameter.
    If set to active it counts only series that are considered active according to the -ingester.active-series-metrics-idle-timeout flag setting rather than counting all in-memory series.

  • The "Store-gateway: bucket tenant blocks" admin page contains a new column "No Compact".
    If a block's no-compaction marker is set, the column shows the reason and the date the marker was added.

  • The estimated number of compaction jobs based on the current bucket-index is now computed by the compactor.
    The result is tracked by the new cortex_bucket_index_compaction_jobs metric.
    If this computation fails, the cortex_bucket_index_compaction_jobs_errors_total metric is updated instead.
    The estimated number of compaction jobs is also shown in Top tenants, Tenants, and Compactor dashboards.

  • Added mimir-distroless container image built upon a distroless image (gcr.io/distroless/static-debian12).
    This improvement minimizes attack surfaces and potential CVEs by trimming down the dependencies within the image.
    After comprehensive testing, the Mimir maintainers plan to shift from the current image to the distroless version.
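
The count_method parameter described in the first item above is passed as a query parameter. The following is a minimal sketch of building such a request URL. The base URL is a placeholder for wherever the Mimir query API is exposed, and the helper name is hypothetical.

```python
from urllib.parse import urlencode


def label_names_cardinality_url(base_url: str, count_method: str = "active") -> str:
    """Build a URL for the label_names cardinality endpoint with count_method set."""
    # base_url is a placeholder; adjust it to the deployment's API prefix.
    path = "/api/v1/cardinality/label_names"
    query = urlencode({"count_method": count_method})
    return base_url.rstrip("/") + path + "?" + query
```

With count_method=active, the endpoint counts only series considered active according to -ingester.active-series-metrics-idle-timeout, as described above.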

Additionally, the following previously experimental features are now considered stable:

  • The number of pre-allocated workers used to forward push requests to the ingesters, configurable via the -distributor.reusable-ingester-push-workers CLI flag on distributors.
    It now defaults to 2000.
    Note that this is a performance optimization, and not a limiting feature.
    If not enough workers are available, new goroutines will be spawned.

  • The number of gRPC server workers used to serve the requests, configurable via the -server.grpc.num-workers CLI flag.
    It now defaults to 100.
    Note that this is the number of pre-allocated long-lived workers, and not a limiting feature.
    If not enough workers are available, new goroutines will be spawned.

  • The maximum number of concurrent index header loads across all tenants, configurable via the -blocks-storage.bucket-store.index-header.lazy-loading-concurrency CLI flag on store-gateways.
    It defaults to 4.

  • The maximum time to wait for the query-frontend to become ready before rejecting requests, configurable via the -query-frontend.not-running-timeout CLI flag on query-frontends.
    It now defaults to 2s.

  • Spread-minimizing token-related CLI flags: -ingester.ring.token-generation-strategy, -ingester.ring.spread-minimizing-zones and -ingester.ring.spread-minimizing-join-ring-in-order.
    You can read more about this feature in our blog post.

Important changes

In Grafana Mimir 2.12 the following behavior has changed:

  • Store-gateway now persists a sparse version of the index-header to disk on construction and loads sparse index-headers from disk instead of the whole index-header.
    This improves the speed at which index headers are lazy-loaded from disk by up to 90%. The added disk usage is in the order of 1-2%.

  • Alertmanager deprecated the v1 API. All v1 API endpoints now respond with a JSON deprecation notice and a status code of 410.
    All endpoints have a v2 equivalent.
    The list of endpoints is:

    • <alertmanager-web.external-url>/api/v1/alerts
    • <alertmanager-web.external-url>/api/v1/receivers
    • <alertmanager-web.external-url>/api/v1/silence/{id}
    • <alertmanager-web.external-url>/api/v1/silences
    • <alertmanager-web.external-url>/api/v1/status
  • Exemplar's label traceID has been changed to trace_id to be consistent with the OpenTelemetry standard.

  • Errors returned by ingesters now contain only gRPC status codes.
    Previously they contained both gRPC and HTTP status codes.
    To guarantee backwards compatibility when migrating from a version prior to 2.11, it's necessary to first migrate to version 2.11, and then to version 2.12.
    Otherwise, it might happen that during the migration, some ingester errors with HTTP status code 4xx won't be recognized, and the corresponding request will be repeated.

  • Responses with gRPC status codes are now reported as status_code labels in the cortex_request_duration_seconds and cortex_ingester_client_request_duration_seconds metrics.

  • Responses with HTTP 4xx status codes are now treated as errors and used in the status_code label of the request duration metric.
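
The ingester error change above implies a stepped upgrade for clusters older than 2.11. The following is a minimal sketch of that rule only; the function name is hypothetical and this is not a general Mimir upgrade planner.

```python
def upgrade_steps_to_2_12(current: str) -> list[str]:
    """Return the ordered versions to deploy when moving from `current` to 2.12.

    Encodes only the rule stated above: anything older than 2.11 goes to 2.11
    first, then to 2.12. Versions at or beyond 2.12 need no steps.
    """
    major, minor = (int(part) for part in current.split(".")[:2])
    version = (major, minor)

    steps = []
    if version < (2, 11):
        steps.append("2.11")
    if version < (2, 12):
        steps.append("2.12")
    return steps
```

For example, a cluster on 2.10 gets two steps (2.11, then 2.12), while a cluster already on 2.11 gets one.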

The default values of the following CLI flags have been changed:

  • -blocks-storage.tsdb.head-postings-for-matchers-cache-max-bytes from 10MB to 100MB.
  • -blocks-storage.tsdb.block-postings-for-matchers-cache-max-bytes from 10MB to 100MB.
  • -blocks-storage.bucket-store.tenant-sync-concurrency from 10 to 1.
  • -query-frontend.max-cache-freshness from 1m to 10m.
  • -distributor.write-requests-buffer-pooling-enabled from false to true.
  • -blocks-storage.bucket-store.block-sync-concurrency from 20 to 4.
  • -memberlist.stream-timeout from 10s to 2s.
  • -server.report-grpc-codes-in-instrumentation-label-enabled from false to true.

The following deprecated configuration options are removed in Grafana Mimir 2.12:

  • The YAML setting frontend.cache_unaligned_requests.

The following configuration options are deprecated and will be removed in Grafana Mimir 2.14:

  • The CLI flag -ingester.limit-inflight-requests-using-grpc-method-limiter.
    It now defaults to true.

  • The CLI flag -ingester.return-only-grpc-errors.
    It now defaults to true.
    To guarantee backwards compatibility when migrating from a version prior to 2.11, it's necessary to first migrate to version 2.11, and then to version 2.12.
    Otherwise, it might happen that during the migration, some ingester errors with HTTP status code 4xx won't be recognized, and the corresponding request will be repeated.

  • The CLI flag -ingester.client.report-grpc-codes-in-instrumentation-label-enabled.
    It now defaults to true.

  • The CLI flag -distributor.limit-inflight-requests-using-grpc-method-limiter.
    It now defaults to true.

  • The CLI flag -distributor.enable-otlp-metadata-storage.
    It now defaults to true.

  • The CLI flag -querier.max-query-into-future.

The following metrics are removed or deprecated:

  • cortex_bucket_store_blocks_loaded_by_duration has been removed.
  • cortex_distributor_sample_delay_seconds has been deprecated and will be removed in Mimir 2.14.

Experimental features

Grafana Mimir 2.12 includes new features that are considered experimental and disabled by default.
Use them with caution and report any issues you encounter:

  • The maximum number of tenant IDs that may be used for a federated query can be configured via the -tenant-federation.max-tenants CLI flag on query-frontends.
    By default, it's 0, meaning that the limit is disabled.

  • Sharding of active series queries can be enabled via the -query-frontend.shard-active-series-queries CLI flag on query-frontends.

  • Timely head compaction can be enabled via the -blocks-storage.tsdb.timely-head-compaction-enabled CLI flag on ingesters.
    If enabled, the head compaction happens when the min block range can no longer be appended, without requiring 1.5x the chunk range worth of data in the head.

  • Streaming of responses from querier to query-frontend can be enabled via the -querier.response-streaming-enabled CLI flag on queriers.
    This is currently supported only for responses from the /api/v1/cardinality/active_series endpoint.

  • The maximum response size for active series queries, in bytes, can be set via the -querier.active-series-results-max-size-bytes CLI flag on queriers.

  • Metric relabeling on a per-tenant basis can be forcefully disabled via the -distributor.metric-relabeling-enabled CLI flag on rulers.
    Metric relabeling is enabled by default.

  • Query Queue Load Balancing by Query Component. Tenant query queues in the query-scheduler can now be split into subqueues by which query component is expected to be utilized to complete the query: ingesters, store-gateways, both, or uncategorized.
    Dequeuing queries for a given tenant will rotate through the query component subqueues via simple round-robin.
    In the event that one of the query components (ingesters or store-gateways) experiences a slowdown, queries utilizing only the other query component can continue to be serviced.
    Enabling this feature is recommended.
    The following CLI flags must be set to true for the feature to take effect:

    • -query-frontend.additional-query-queue-dimensions-enabled on the query-frontend.
    • -query-scheduler.additional-query-queue-dimensions-enabled on the query-scheduler.
  • Owned series tracking in ingesters can be enabled via the -ingester.track-ingester-owned-series CLI flag.
    When enabled, inge...
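
The round-robin rotation across query component subqueues described above can be pictured with a small model. This is a minimal sketch under stated assumptions, not the query-scheduler's actual implementation: the class name, method names, and subqueue ordering are hypothetical.

```python
from collections import deque

# The four categories named in the notes; the order here is arbitrary.
COMPONENTS = ("ingester", "store-gateway", "both", "uncategorized")


class TenantQueue:
    """Per-tenant queue split into one FIFO subqueue per query component."""

    def __init__(self):
        self._subqueues = {component: deque() for component in COMPONENTS}
        self._next = 0  # index of the subqueue to try first on the next dequeue

    def enqueue(self, query, component="uncategorized"):
        self._subqueues[component].append(query)

    def dequeue(self):
        """Return the next query, rotating across subqueues; None if all are empty."""
        for offset in range(len(COMPONENTS)):
            idx = (self._next + offset) % len(COMPONENTS)
            subqueue = self._subqueues[COMPONENTS[idx]]
            if subqueue:
                # Resume from the following subqueue next time.
                self._next = (idx + 1) % len(COMPONENTS)
                return subqueue.popleft()
        return None
```

In this model, a backlog of ingester-only queries does not hold back a store-gateway-only query: the rotation reaches the store-gateway subqueue after serving at most one ingester query.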


2.11.0

26 Dec 17:26
mimir-2.11.0
c8939ea

This release contains 532 PRs from 55 authors, including new contributors Benjamin, Dominik Kepinski, Jonathan Donzallaz, Juraj Michálek, Kai.Ke, Ludovic Terrier, Luke, Maciej Lech, Matthew Penner, Michael Potter, Mihai Țimbota-Belin, Rasmus Werner Salling, Ying WANG, chencs, fayzal-g, kalle (jag), sarthaktyagi-505, whoami. Thank you!

Grafana Mimir version 2.11.0 release notes

Grafana Labs is excited to announce version 2.11 of Grafana Mimir.

The highlights that follow include the top features, enhancements, and bugfixes in this release. For the complete list of changes, see the changelog.

Features and enhancements

  • Sampled logging of errors in the ingester. A high-traffic Mimir cluster can occasionally become bogged down logging high volumes of repeated errors. You can now reduce the number of errors written to the logs by setting a sample rate via the -ingester.error-sample-rate CLI flag.
  • Add total request size instance limit for ingesters. This limit protects the ingesters against requests that together may cause an OOM. Enable this feature by setting the -ingester.instance-limits.max-inflight-push-requests-bytes CLI flag in combination with the -ingester.limit-inflight-requests-using-grpc-method-limiter CLI flag.
  • Reduce the resolution of incoming native histograms samples if the incoming sample has too many buckets compared to -validation.max-native-histogram-buckets. This is enabled by default but can be turned off by setting the -validation.reduce-native-histogram-over-max-buckets CLI flag to false.
  • Improved query-scheduler performance under load. This is particularly apparent for clusters with large numbers of queriers.
  • Ingester to querier chunks streaming reduces the memory utilization of queriers and reduces the likelihood of OOMs.
  • Ingester query request minimization reduces the number of query requests to ingesters, improving performance and resource utilization for both ingesters and queriers.
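
Sampled error logging, the first item above, can be pictured as emitting one out of every N occurrences of an error. The following is a minimal sketch assuming 1-in-N semantics for the sample rate; the notes do not spell out the exact semantics, and the class below is hypothetical rather than the ingester's implementation.

```python
class SampledLogger:
    """Emit only every `rate`-th error; a rate of 1 or less emits everything."""

    def __init__(self, rate: int, sink=print):
        self.rate = rate
        self.sink = sink  # callable that receives each emitted log line
        self._seen = 0

    def error(self, msg: str) -> bool:
        """Record one error occurrence; return True if it was actually logged."""
        self._seen += 1
        if self.rate <= 1 or self._seen % self.rate == 0:
            self.sink(f"{msg} (sampled 1/{max(self.rate, 1)})")
            return True
        return False
```

Tagging each emitted line with the sample rate keeps the reduced log volume interpretable, since a reader can estimate the true error count.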

Experimental features

Grafana Mimir 2.11 includes new features that are considered experimental and disabled by default. Please use them with caution and report any issue you encounter:

  • Block specified queries on a per-tenant basis. This is configured via the blocked_queries limit. See the docs for more information.
  • Store metadata when ingesting metrics via OTLP. This makes metric description and type available when ingesting metrics via OTLP. You can enable this feature by setting the CLI flag -distributor.enable-otlp-metadata-storage to true.
  • Reject gRPC push requests that the ingester/distributor is unable to accept before reading them into memory. You can enable this feature by using the -ingester.limit-inflight-requests-using-grpc-method-limiter and/or the -distributor.limit-inflight-requests-using-grpc-method-limiter CLI flags for the ingester and/or the distributor, respectively.
  • Customize the memcached client write and read buffer size. The buffer allocated for each memcached connection can be configured via the following CLI flags:
    • For the blocks storage:
      • -blocks-storage.bucket-store.chunks-cache.memcached.read-buffer-size-bytes
      • -blocks-storage.bucket-store.chunks-cache.memcached.write-buffer-size-bytes
      • -blocks-storage.bucket-store.index-cache.memcached.read-buffer-size-bytes
      • -blocks-storage.bucket-store.index-cache.memcached.write-buffer-size-bytes
      • -blocks-storage.bucket-store.metadata-cache.memcached.read-buffer-size-bytes
      • -blocks-storage.bucket-store.metadata-cache.memcached.write-buffer-size-bytes
    • For the query frontend:
      • -query-frontend.results-cache.memcached.read-buffer-size-bytes
      • -query-frontend.results-cache.memcached.write-buffer-size-bytes
    • For the ruler storage:
      • -ruler-storage.cache.memcached.read-buffer-size-bytes
      • -ruler-storage.cache.memcached.write-buffer-size-bytes
  • Configure the number of long-living workers used to process gRPC requests. This can decrease CPU usage by reducing the number of stack allocations. Configure this feature by using the -server.grpc.num-workers CLI flag.
  • Enforce a limit in bytes on the PostingsForMatchers cache used by ingesters. This limit can be configured via the -blocks-storage.tsdb.head-postings-for-matchers-cache-max-bytes and -blocks-storage.tsdb.block-postings-for-matchers-cache-max-bytes CLI flags.
  • Pre-allocate the pool of workers in the distributor that are used to send push requests to ingesters. This can decrease CPU usage by reducing the number of stack allocations. You can enable this feature by using the -distributor.reusable-ingester-push-workers CLI flag.
  • Include a Retry-After header in recoverable error responses from the distributor. This can protect your Mimir cluster from clients including Prometheus that default to retrying very quickly. Enable this feature by setting the -distributor.retry-after-header.enabled CLI flag.
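
On the client side, the Retry-After header from the last item above can drive the retry delay. The following is a minimal sketch of a well-behaved client, assuming the header carries a number of seconds. The function name and the backoff defaults are hypothetical, and the HTTP-date form of the header is not handled.

```python
def retry_delay_seconds(headers: dict, attempt: int,
                        base: float = 0.5, cap: float = 30.0) -> float:
    """Pick how long to wait before retrying a failed push.

    Prefers the server's Retry-After value (in seconds) when present and
    numeric; otherwise falls back to capped exponential backoff.
    """
    value = headers.get("Retry-After")  # exact-case lookup; normalize keys upstream
    if value is not None:
        try:
            return max(0.0, float(value))
        except ValueError:
            pass  # HTTP-date form is not handled in this sketch
    return min(cap, base * (2 ** attempt))
```

Honoring the server's hint rather than retrying immediately is what lets the cluster recover from an overload.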

Helm chart improvements

The Grafana Mimir and Grafana Enterprise Metrics Helm chart is now released independently. See the Grafana Mimir Helm chart documentation.

Important changes

In Grafana Mimir 2.11 the following behavior has changed:

  • The utilization-based read path limiter now operates on Go heap size instead of RSS from the Linux proc file system.

The following configuration options had been previously deprecated and are removed in Grafana Mimir 2.11:

  • The CLI flag -querier.iterators.
  • The CLI flag -querier.batch-iterators.
  • The CLI flag -blocks-storage.bucket-store.bucket-index.enabled.
  • The CLI flag -blocks-storage.bucket-store.chunk-pool-min-bucket-size-bytes.
  • The CLI flag -blocks-storage.bucket-store.chunk-pool-max-bucket-size-bytes.
  • The CLI flag -blocks-storage.bucket-store.max-chunk-pool-bytes.

The following configuration options are deprecated and will be removed in Grafana Mimir 2.13:

  • The CLI flag -log.buffered; this is now the default behavior.

The following metrics are removed:

  • cortex_query_frontend_workers_enqueued_requests_total; use cortex_query_frontend_enqueue_duration_seconds_count instead.

The following configuration option defaults were changed:

  • The CLI flag -blocks-storage.bucket-store.index-header.sparse-persistence-enabled now defaults to true.
  • The default value for the CLI flag -blocks-storage.bucket-store.index-header.lazy-loading-concurrency was changed from 0 to 4.
  • The default value for the CLI flag -blocks-storage.tsdb.series-hash-cache-max-size-bytes was changed from 1GB to 350MB.
  • The default value for the CLI flag -blocks-storage.tsdb.early-head-compaction-min-estimated-series-reduction-percentage was changed from 10 to 15.

Bug fixes

  • Ingester: Respect context cancelation during query execution. PR 6085
  • Distributor: Return 529 when ingestion rate limit is hit and the distributor.service_overload_status_code_on_rate_limit_enabled flag is active. PR 6549
  • Query-scheduler: Prevent accumulation of stale querier connections. PR 6100
  • Packaging: Fix preremove script preventing upgrades on RHEL based OS. PR 6067

Changelog

2.11.0

Grafana Mimir

  • [CHANGE] The following deprecated configurations have been removed: #6673 #6779 #6808 #6814
    • -querier.iterators
    • -querier.batch-iterators
    • -blocks-storage.bucket-store.max-chunk-pool-bytes
    • -blocks-storage.bucket-store.chunk-pool-min-bucket-size-bytes
    • -blocks-storage.bucket-store.chunk-pool-max-bucket-size-bytes
    • -blocks-storage.bucket-store.bucket-index.enabled
  • [CHANGE] Querier: Split worker GRPC config into separate client configs for the frontend and scheduler to allow TLS to be configured correctly when specifying the tls_server_name. The GRPC config specified under -querier.frontend-client.* will no longer apply to the scheduler client, and will need to be set explicitly under -querier.scheduler-client.*. #6445 #6573
  • [CHANGE] Store-gateway: enable sparse index headers by default. Sparse index headers reduce the time to load an index header by up to 90%. #6005
  • [CHANGE] Store-gateway: lazy-loading concurrency limit default value is now 4. #6004
  • [CHANGE] General: enabled -log.buffered by default. The -log.buffered flag has been deprecated and will be removed in Mimir 2.13. #6131
  • [CHANGE] Ingester: changed default -blocks-storage.tsdb.series-hash-cache-max-size-bytes setting from 1GB to 350MB. The new default cache size is enough to store the hashes for all series in an ingester, assuming up to 2M in-memory series per ingester and using the default 13h retention period for local TSDB blocks in the ingesters. #6130
  • [CHANGE] Query-frontend: removed cortex_query_frontend_workers_enqueued_requests_total. Use cortex_query_frontend_enqueue_duration_seconds_count instead. #6121
  • [CHANGE] Ingester / querier: enable ingester to querier chunks streaming by default and mark it as stable. #6174
  • [CHANGE] Ingester / querier: enable ingester query request minimisation by default and mark it as stable. #6174
  • [CHANGE] Ingester: changed the default value for the experimental configuration parameter -blocks-storage.tsdb.early-head-compaction-min-estimated-series-reduction-percentage from 10 to 15. #6186
  • [CHANGE] Ingester: /ingester/push HTTP endpoint has been removed. This endpoint was added for testing and troubleshooting, but was never documented or used for anything. #6299
  • [CHANGE] Experimental...

2.11.0-rc.0

20 Dec 20:46
mimir-2.11.0-rc.0
229cba4
2.11.0-rc.0 Pre-release

This release contains 531 PRs from 55 authors, including new contributors Benjamin, Dominik Kepinski, Jonathan Donzallaz, Juraj Michálek, Kai.Ke, Ludovic Terrier, Luke, Maciej Lech, Matthew Penner, Michael Potter, Mihai Țimbota-Belin, Rasmus Werner Salling, Ying WANG, chencs, fayzal-g, kalle (jag), renovate[bot], sarthaktyagi-505, whoami. Thank you!

Grafana Mimir version 2.11.0-rc.0 release notes

Grafana Labs is excited to announce version 2.11 of Grafana Mimir.

The highlights that follow include the top features, enhancements, and bugfixes in this release. For the complete list of changes, see the changelog.

Features and enhancements

  • Sampled logging of errors in the ingester. A high-traffic Mimir cluster can occasionally become bogged down logging high volumes of repeated errors. You can now reduce the number of errors written to the logs by setting a sample rate via the -ingester.error-sample-rate CLI flag.
  • Add total request size instance limit for ingesters. This limit protects the ingesters against requests that together may cause an OOM. Enable this feature by setting the -ingester.instance-limits.max-inflight-push-requests-bytes CLI flag in combination with the -ingester.limit-inflight-requests-using-grpc-method-limiter CLI flag.
  • Reduce the resolution of incoming native histograms samples if the incoming sample has too many buckets compared to -validation.max-native-histogram-buckets. This is enabled by default but can be turned off by setting the -validation.reduce-native-histogram-over-max-buckets CLI flag to false.
  • Improved query-scheduler performance under load. This is particularly apparent for clusters with large numbers of queriers.
  • Ingester to querier chunks streaming reduces the memory utilization of queriers and reduces the likelihood of OOMs.
  • Ingester query request minimization reduces the number of query requests to ingesters, improving performance and resource utilization for both ingesters and queriers.
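
One way to picture the native histogram resolution reduction mentioned above is repeated halving of the bucket resolution until the bucket count fits the limit. The following is a minimal sketch, assuming the standard exponential bucket layout in which lowering the schema by one merges each pair of adjacent buckets, and assuming a lowest schema of -4. It handles one side (positive or negative buckets) only, and the function name is hypothetical.

```python
def reduce_resolution(buckets: dict[int, int], schema: int,
                      max_buckets: int, min_schema: int = -4):
    """Halve resolution until the bucket count fits or min_schema is reached.

    `buckets` maps bucket index -> observation count. Lowering the schema by
    one merges buckets 2k-1 and 2k into bucket k, so the total count is kept.
    Returns the (possibly merged) buckets and the resulting schema.
    """
    while len(buckets) > max_buckets and schema > min_schema:
        merged: dict[int, int] = {}
        for idx, count in buckets.items():
            new_idx = (idx + 1) // 2  # floor division, also right for negative indices
            merged[new_idx] = merged.get(new_idx, 0) + count
        buckets, schema = merged, schema - 1
    return buckets, schema
```

Each halving trades precision for a smaller bucket count while preserving the total number of observations.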

Experimental features

Grafana Mimir 2.11 includes new features that are considered experimental and disabled by default. Please use them with caution and report any issue you encounter:

  • Block specified queries on a per-tenant basis. This is configured via the blocked_queries limit. See the docs for more information.
  • Store metadata when ingesting metrics via OTLP. This makes metric description and type available when ingesting metrics via OTLP. You can enable this feature by setting the CLI flag -distributor.enable-otlp-metadata-storage to true.
  • Reject gRPC push requests that the ingester/distributor is unable to accept before reading them into memory. You can enable this feature by using the -ingester.limit-inflight-requests-using-grpc-method-limiter and/or the -distributor.limit-inflight-requests-using-grpc-method-limiter CLI flags for the ingester and/or the distributor, respectively.
  • Customize the memcached client write and read buffer size. The buffer allocated for each memcached connection can be configured via the following CLI flags:
    • For the blocks storage:
      • -blocks-storage.bucket-store.chunks-cache.memcached.read-buffer-size-bytes
      • -blocks-storage.bucket-store.chunks-cache.memcached.write-buffer-size-bytes
      • -blocks-storage.bucket-store.index-cache.memcached.read-buffer-size-bytes
      • -blocks-storage.bucket-store.index-cache.memcached.write-buffer-size-bytes
      • -blocks-storage.bucket-store.metadata-cache.memcached.read-buffer-size-bytes
      • -blocks-storage.bucket-store.metadata-cache.memcached.write-buffer-size-bytes
    • For the query frontend:
      • -query-frontend.results-cache.memcached.read-buffer-size-bytes
      • -query-frontend.results-cache.memcached.write-buffer-size-bytes
    • For the ruler storage:
      • -ruler-storage.cache.memcached.read-buffer-size-bytes
      • -ruler-storage.cache.memcached.write-buffer-size-bytes
  • Configure the number of long-living workers used to process gRPC requests. This can decrease CPU usage by reducing the number of stack allocations. Configure this feature by using the -server.grpc.num-workers CLI flag.
  • Enforce a limit in bytes on the PostingsForMatchers cache used by ingesters. This limit can be configured via the -blocks-storage.tsdb.head-postings-for-matchers-cache-max-bytes and -blocks-storage.tsdb.block-postings-for-matchers-cache-max-bytes CLI flags.
  • Pre-allocate the pool of workers in the distributor that are used to send push requests to ingesters. This can decrease CPU usage by reducing the number of stack allocations. You can enable this feature by using the -distributor.reusable-ingester-push-workers CLI flag.
  • Include a Retry-After header in recoverable error responses from the distributor. This can protect your Mimir cluster from clients including Prometheus that default to retrying very quickly. Enable this feature by setting the -distributor.retry-after-header.enabled CLI flag.

Helm chart improvements

The Grafana Mimir and Grafana Enterprise Metrics Helm chart is now released independently. See the Grafana Mimir Helm chart documentation.

Important changes

In Grafana Mimir 2.11 the following behavior has changed:

  • The utilization-based read path limiter now operates on Go heap size instead of RSS from the Linux proc file system.

The following configuration options had been previously deprecated and are removed in Grafana Mimir 2.11:

  • The CLI flag -querier.iterators.
  • The CLI flag -querier.batch-iterators.
  • The CLI flag -blocks-storage.bucket-store.bucket-index.enabled.
  • The CLI flag -blocks-storage.bucket-store.chunk-pool-min-bucket-size-bytes.
  • The CLI flag -blocks-storage.bucket-store.chunk-pool-max-bucket-size-bytes.
  • The CLI flag -blocks-storage.bucket-store.max-chunk-pool-bytes.

The following configuration options are deprecated and will be removed in Grafana Mimir 2.13:

  • The CLI flag -log.buffered; this is now the default behavior.

The following metrics are removed:

  • cortex_query_frontend_workers_enqueued_requests_total; use cortex_query_frontend_enqueue_duration_seconds_count instead.

The following configuration option defaults were changed:

  • The CLI flag -blocks-storage.bucket-store.index-header.sparse-persistence-enabled now defaults to true.
  • The default value for the CLI flag -blocks-storage.bucket-store.index-header.lazy-loading-concurrency was changed from 0 to 4.
  • The default value for the CLI flag -blocks-storage.tsdb.series-hash-cache-max-size-bytes was changed from 1GB to 350MB.
  • The default value for the CLI flag -blocks-storage.tsdb.early-head-compaction-min-estimated-series-reduction-percentage was changed from 10 to 15.

Bug fixes

  • Ingester: Respect context cancelation during query execution. PR 6085
  • Distributor: Return 529 when ingestion rate limit is hit and the distributor.service_overload_status_code_on_rate_limit_enabled flag is active. PR 6549
  • Query-scheduler: Prevent accumulation of stale querier connections. PR 6100
  • Packaging: Fix preremove script preventing upgrades on RHEL based OS. PR 6067

Changelog

2.11.0-rc.0

Grafana Mimir

  • [CHANGE] The following deprecated configurations have been removed: #6673 #6779 #6808 #6814
    • -querier.iterators
    • -querier.batch-iterators
    • -blocks-storage.bucket-store.max-chunk-pool-bytes
    • -blocks-storage.bucket-store.chunk-pool-min-bucket-size-bytes
    • -blocks-storage.bucket-store.chunk-pool-max-bucket-size-bytes
    • -blocks-storage.bucket-store.bucket-index.enabled
  • [CHANGE] Querier: Split worker GRPC config into separate client configs for the frontend and scheduler to allow TLS to be configured correctly when specifying the tls_server_name. The GRPC config specified under -querier.frontend-client.* will no longer apply to the scheduler client, and will need to be set explicitly under -querier.scheduler-client.*. #6445 #6573
  • [CHANGE] Store-gateway: enable sparse index headers by default. Sparse index headers reduce the time to load an index header by up to 90%. #6005
  • [CHANGE] Store-gateway: lazy-loading concurrency limit default value is now 4. #6004
  • [CHANGE] General: enabled -log.buffered by default. The -log.buffered flag has been deprecated and will be removed in Mimir 2.13. #6131
  • [CHANGE] Ingester: changed default -blocks-storage.tsdb.series-hash-cache-max-size-bytes setting from 1GB to 350MB. The new default cache size is enough to store the hashes for all series in an ingester, assuming up to 2M in-memory series per ingester and using the default 13h retention period for local TSDB blocks in the ingesters. #6130
  • [CHANGE] Query-frontend: removed cortex_query_frontend_workers_enqueued_requests_total. Use cortex_query_frontend_enqueue_duration_seconds_count instead. #6121
  • [CHANGE] Ingester / querier: enable ingester to querier chunks streaming by default and mark it as stable. #6174
  • [CHANGE] Ingester / querier: enable ingester query request minimisation by default and mark it as stable. #6174
  • [CHANGE] Ingester: changed the default value for the experimental configuration parameter -blocks-storage.tsdb.early-head-compaction-min-estimated-series-reduction-percentage from 10 to 15. #6186
  • [CHANGE] Ingester: /ingester/push HTTP endpoint has been removed. This endpoint was added for testing and troubleshooting, but was never documented or used for anything. #6299...