Releases: thanos-io/thanos
v0.32.0-rc.1
- #6612: Store: Fix missing flush when handling pushed down queries
v0.32.0-rc.0
v0.32.0-rc.0 is out after a long wait, as we were busy fixing a rather challenging issue!
Thank you to all contributors who have contributed to this release. It wouldn't be possible without you.
Some of the highlights include support for PromQL query explanations in the UI when using the thanos
PromQL engine, AZ-aware replication for Receive and other new flags, tools bucket replicate
improvements, and lots of optimizations and bug/race fixes!
Do take note of some of the breaking metric name changes and the change in container image user.
You can find the changelog with all of the details below. Let's also celebrate all our new contributors!
Please try it out and let us know if you spot any problems! Full-release/next rc will be in 3 days!
Changes
Added
- #6437 Receive: make tenant stats limit configurable
- #6369 Receive: add az-aware replication support for Ketama algorithm
- #6185 Tracing: tracing in OTLP supports configuring service_name.
- #6192 Store: add flag `bucket-web-label` to select the label to use as timeline title in web UI
- #6195 Receive: add flag `tsdb.too-far-in-future.time-window` to prevent clock-skewed samples from polluting the TSDB head and blocking all valid incoming samples.
- #6273 Mixin: Allow specifying an instance name filter in dashboards
- #6163 Receiver: Add hidden flag `--receive-forward-max-backoff` to configure the max backoff for forwarding requests.
- #5777 Receive: Allow specifying tenant-specific external labels in Router Ingestor.
- #6352 Store: Expose store gateway query stats in series response hints.
- #6420 Index Cache: Cache expanded postings.
- #6441 Compact: Compactor will set `index_stats` in `meta.json` file with max series and chunk size information.
- #6466 Mixin (Receive): add limits alerting for configuration reload and meta-monitoring.
- #6467 Mixin (Receive): add alert for tenant reaching head series limit.
- #6528 Index Cache: Add histogram metric `thanos_store_index_cache_stored_data_size_bytes` for item size.
- #6560 Thanos ruler: add flag to optionally disable adding Thanos params when querying metrics
- #6574 Tools: Add min and max compactions range flags to `bucket replicate` command.
- #6593 Store: Add `thanos_bucket_store_chunk_refetches_total` metric to track number of chunk refetches.
- #6264 Query: Add Thanos logo in navbar
- #6234 Query: Add ability to switch between `thanos` and `prometheus` engines dynamically via UI and API.
- #6346 Query: Add ability to generate SQL-like query explanations when `thanos` engine is used.
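To illustrate what the `tsdb.too-far-in-future.time-window` flag from #6195 guards against, here is a minimal Python sketch of the idea. It is not the Thanos implementation (which is written in Go): the function names, the millisecond units, the rule "reject when the timestamp exceeds now plus the window", and treating a zero window as "check disabled" are all assumptions made for this example.

```python
from datetime import timedelta


def is_too_far_in_future(sample_ts_ms: int, now_ms: int, window: timedelta) -> bool:
    """Return True when a sample lies further ahead of `now_ms` than `window`.

    Hypothetical helper: a zero window is treated as "check disabled".
    """
    window_ms = int(window.total_seconds() * 1000)
    if window_ms == 0:
        return False
    return sample_ts_ms > now_ms + window_ms


def filter_samples(samples, now_ms, window):
    """Split (timestamp_ms, value) pairs into accepted and rejected lists."""
    accepted, rejected = [], []
    for ts, value in samples:
        if is_too_far_in_future(ts, now_ms, window):
            rejected.append((ts, value))
        else:
            accepted.append((ts, value))
    return accepted, rejected
```

In this sketch, a sample from a sender whose clock runs an hour ahead would be rejected under a five-minute window, while samples from well-behaved senders pass through unchanged.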
Fixed
- #6503 *: Change the engine behind `ContentPathReloader` to be completely independent of any filesystem concept. This effectively fixes this configuration reload when used with Kubernetes ConfigMaps, Secrets, or other volume mounts.
- #6456 Store: fix crash when computing set matches from regex pattern
- #6427 Receive: increased log level for failed uploads to `error`
- #6172 query-frontend: return JSON formatted errors for invalid PromQL expression in the split by interval middleware.
- #6171 Store: fix error handling on limits.
- #6183 Receiver: fix off by one in multitsdb flush that will result in empty blocks if the head only contains one sample
- #6197 Exemplar OTel: Fix exemplar for otel to use traceId instead of spanId and sample only if trace is sampled
- #6207 Receive: Remove the shipper once a tenant has been pruned.
- #6216 Receiver: removed hard-coded value of EnableExemplarStorage flag and set it according to max-exemplar value.
- #6222 mixin(Receive): Fix tenant series received dashboard widget.
- #6218 mixin(Store): handle ResourceExhausted as a non-server error. As a consequence, this error won't contribute to Store's grpc errors alerts.
- #6271 Receive: Fix segfault in `LabelValues` during head compaction.
- #6306 Tracing: tracing in OTLP utilizes the OTEL_TRACES_SAMPLER env variable
- #6330 Store: Fix inconsistent error for series limits.
- #6342 Cache/Redis: Upgrade `rueidis` to v1.0.2 to improve error handling while shrinking a redis cluster.
- #6325 Store: return gRPC resource exhausted error for byte limiter.
- #6399 *: Fix double-counting bug in http_request_duration metric
- #6428 Report gRPC connection errors in the logs.
- #6519 Reloader: Use timeout for initial apply.
- #6509 Store Gateway: Remove `memWriter` from `fileWriter` to reduce memory usage when syncing index headers.
- #6556 Thanos compact: respect block-files-concurrency setting when downsampling
- #6592 Query Frontend: fix bugs in vertical sharding `without` and `union` function to allow more queries to be shardable.
- #6317 *: Fix internal label deduplication bug, by resorting store response set.
- #6189 Rule: Fix panic when calling API `/api/v1/rules?type=alert`.
Changed
- #6049 Compact: breaking ⚠️ Replace group with resolution in compact metrics to avoid cardinality explosion on compact metrics for large numbers of groups.
- #6168 Receiver: Make ketama hashring fail early when configured with number of nodes lower than the replication factor.
- #6201 Query-Frontend: Disable absent and absent_over_time for vertical sharding.
- #6212 Query-Frontend: Disable scalar for vertical sharding.
- #6107 breaking ⚠️ Change default user id in container image from 0(root) to 1001
- #6228 Conditionally generate debug messages in ProxyStore to avoid memory bloat.
- #6231 mixins: Add code/grpc-code dimension to error widgets.
- #6244 mixin(Rule): Add rule evaluation failures to the Rule dashboard.
- #6303 Store: added and started using streamed snappy encoding for postings list instead of block based one. This leads to constant memory usage during decompression. This approximately halves memory usage when decompressing a postings list in index cache.
- #6071 Query Frontend: breaking ⚠️ Add experimental native histogram support for which we updated and aligned with the Prometheus common model, which is used for caching so a cache reset is required.
- #6163 Receiver: changed default max backoff from 30s to 5s for forwarding requests. Can be configured with `--receive-forward-max-backoff`.
- #6327 *: breaking ⚠️ Use histograms instead of summaries for instrumented handlers.
- #6322 Logging: Avoid expensive log.Valuer evaluation for disallowed levels.
- #6358 Query: Add +Inf bucket to query duration metrics
- #6363 Store: Check context error when expanding postings.
- #6405 Index Cache: Change postings cache key to include the encoding format used so that older Thanos versions would not try to decode it during the deployment of a new version.
- #6479 Store: breaking ⚠️ Rename `thanos_bucket_store_cached_series_fetch_duration_seconds` to `thanos_bucket_store_series_fetch_duration_seconds` and `thanos_bucket_store_cached_postings_fetch_duration_seconds` to `thanos_bucket_store_postings_fetch_duration_seconds`.
- [#6474](https://github.com/thanos-io/...
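
Dashboards and alert rules that reference the metrics renamed in #6479 need updating after the upgrade. The following is a small, hypothetical Python helper (not part of Thanos) that rewrites the old names in a PromQL expression string. The mapping comes from the #6479 entry; the function name and the plain substring-replacement approach are assumptions for illustration.

```python
# Old -> new metric name prefixes, taken from the #6479 changelog entry.
RENAMED_METRICS = {
    "thanos_bucket_store_cached_series_fetch_duration_seconds":
        "thanos_bucket_store_series_fetch_duration_seconds",
    "thanos_bucket_store_cached_postings_fetch_duration_seconds":
        "thanos_bucket_store_postings_fetch_duration_seconds",
}


def rewrite_expr(expr: str) -> str:
    """Replace renamed metric names in a PromQL expression string.

    Plain substring replacement, so histogram suffixes such as
    `_bucket`, `_sum` and `_count` are carried over unchanged.
    """
    for old, new in RENAMED_METRICS.items():
        expr = expr.replace(old, new)
    return expr
```

For example, `rewrite_expr("rate(thanos_bucket_store_cached_series_fetch_duration_seconds_bucket[5m])")` yields the same expression with `thanos_bucket_store_series_fetch_duration_seconds_bucket`. Expressions that do not mention the old names are returned untouched.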
v0.31.0
What's Changed
Added
- #5990 Cache/Redis: add support for Redis Sentinel via new option `master_name`.
- #6008 *: Add counter metric `gate_queries_total` to gate.
- #5926 Receiver: Add experimental string interning in writer. Can be enabled with a hidden flag `--writer.intern`.
- #5773 Store: Support disabling cache index header file by setting `--disable-caching-index-header-file`. When toggled, Stores can run without needing persistent disks.
- #5653 Receive: Allow setting hashing algorithm per tenant in hashrings config.
- #6074 *: Add histogram metrics `thanos_store_server_series_requested` and `thanos_store_server_chunks_requested` to all Stores.
- #6074 *: Allow configuring series and sample limits per `Series` request for all Stores.
- #6104 Store: Support S3 session token.
- #5548 Query: Add experimental support for load balancing across multiple Store endpoints.
- #6148 Query-frontend: Add `traceID` to slow query detected log line.
- #6153 Query-frontend: Add `remote_user` (from http basic auth) and `remote_addr` to slow query detected log line.
Fixed
- #5995 Sidecar: Loads TLS certificate during startup.
- #6044 Receive: Mark out-of-window errors as conflict when out-of-window samples ingestion is used.
- #6050 Store: Re-try bucket store initial sync upon failure.
- #6067 Receive: Fix panic when querying uninitialized TSDBs.
- #6082 Query: Don't error when no stores are matched.
- #6098 Cache/Redis: Upgrade `rueidis` to v0.0.93 to fix potential panic when the client-side caching is disabled.
- #6103 Mixins(Rule): Fix expression for long rule evaluations.
- #6121 Receive: Deduplicate meta-monitoring queries for Active Series Limiting.
- #6137 Downsample: Repair of non-empty XOR chunks during 1h downsampling.
- #6125 Query Frontend: Fix vertically shardable instant queries not producing sorted results for `sort`, `sort_desc`, `topk` and `bottomk` functions.
- #6203 Receive: Fix panic in head compaction under high query load.
Changed
- #6010 *: Upgrade Prometheus to v0.42.0.
- #5999 *: Upgrade Alertmanager dependency to v0.25.0.
- #5887 Tracing: Make sure rate limiting sampler is the default, as was the case in version pre-0.29.0.
- #5997 Rule: switch to miekgdns DNS resolver as the default one.
- #6035 Tools (replicate): Support all types of matchers to match blocks for replication. Change matcher parameter from string slice to a single string.
- #6131 Store: breaking ⚠️ Use Histograms instead of Summaries for bucket metrics.
v0.31.0-rc.1
- #6203 Receive: Fix panic in head compaction under high query load.
v0.31.0-rc.0
What's Changed
Added
- #5990 Cache/Redis: add support for Redis Sentinel via new option `master_name`.
- #6008 *: Add counter metric `gate_queries_total` to gate.
- #5926 Receiver: Add experimental string interning in writer. Can be enabled with a hidden flag `--writer.intern`.
- #5773 Store: Support disabling cache index header file by setting `--disable-caching-index-header-file`. When toggled, Stores can run without needing persistent disks.
- #5653 Receive: Allow setting hashing algorithm per tenant in hashrings config.
- #6074 *: Add histogram metrics `thanos_store_server_series_requested` and `thanos_store_server_chunks_requested` to all Stores.
- #6074 *: Allow configuring series and sample limits per `Series` request for all Stores.
- #6104 Store: Support S3 session token.
- #5548 Query: Add experimental support for load balancing across multiple Store endpoints.
- #6148 Query-frontend: Add `traceID` to slow query detected log line.
- #6153 Query-frontend: Add `remote_user` (from http basic auth) and `remote_addr` to slow query detected log line.
Fixed
- #5995 Sidecar: Loads TLS certificate during startup.
- #6044 Receive: Mark out-of-window errors as conflict when out-of-window samples ingestion is used.
- #6050 Store: Re-try bucket store initial sync upon failure.
- #6067 Receive: Fix panic when querying uninitialized TSDBs.
- #6082 Query: Don't error when no stores are matched.
- #6098 Cache/Redis: Upgrade `rueidis` to v0.0.93 to fix potential panic when the client-side caching is disabled.
- #6103 Mixins(Rule): Fix expression for long rule evaluations.
- #6121 Receive: Deduplicate meta-monitoring queries for Active Series Limiting.
- #6137 Downsample: Repair of non-empty XOR chunks during 1h downsampling.
- #6125 Query Frontend: Fix vertically shardable instant queries not producing sorted results for `sort`, `sort_desc`, `topk` and `bottomk` functions.
Changed
- #6010 *: Upgrade Prometheus to v0.42.0.
- #5999 *: Upgrade Alertmanager dependency to v0.25.0.
- #5887 Tracing: Make sure rate limiting sampler is the default, as was the case in version pre-0.29.0.
- #5997 Rule: switch to miekgdns DNS resolver as the default one.
- #6035 Tools (replicate): Support all types of matchers to match blocks for replication. Change matcher parameter from string slice to a single string.
- #6131 Store: breaking ⚠️ Use Histograms instead of Summaries for bucket metrics.
New Contributors
- @JoaoBraveCoding made their first contribution in #5997
- @rueian made their first contribution in #5998
- @ashwinsrinivasmurthy made their first contribution in #6042
- @farodin91 made their first contribution in #6044
- @danielmellado made their first contribution in #6052
- @xBazilio made their first contribution in #6066
- @PradyumnaKrishna made their first contribution in #6063
- @Kartik-Garg made their first contribution in #6050
- @pmoncadaisla made their first contribution in #6084
- @sshantel made their first contribution in #6087
- @jkowall made their first contribution in #6078
- @m-messiah made their first contribution in #6058
- @abaguas made their first contribution in #6121
- @domjaeg made their first contribution in #6136
- @romanegunkov made their first contribution in #6134
- @harry671003 made their first contribution in #6125
Full Changelog: v0.30.0...v0.31.0-rc.0
v0.30.2
v0.30.1
This release contains a very small fix for the new Redis client. In the previous release, it was impossible to enable multiple caches using the new Redis client because it tries to register metrics more than once. As a result, for example, it was impossible to use Redis in Thanos Store with index cache and caching bucket enabled.
What's Changed
Full Changelog: v0.30.0...v0.30.1
v0.30.0
v0.30 brings many important fixes & optimizations to compaction, store gateway, receive replication and querying. Make sure to try the new PromQL engine which is more & more efficient every week.
NOTE: Querier's `query.promql-engine` flag enabling the new PromQL engine is now unhidden. We encourage users to use the new experimental PromQL engine for efficiency reasons.
Furthermore, we recommend you use Redis as a caching client (if you use store GW or query frontend caching) and Ketama as the receiver hashing algorithm (`--receive.hashrings-algorithm=ketama`, which introduces consistent hashing to the receiver).
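
For readers unfamiliar with consistent hashing, the following Python sketch shows the general idea behind the recommendation: adding a node to the ring moves only the keys that the new node takes over. This is an illustration of the generic technique, not Thanos's Ketama implementation. The class name, the node names, the MD5-based hash and the number of virtual nodes are all assumptions chosen for the example.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    # Any stable hash works for the illustration; MD5 is used for spread, not security.
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        # Each node is placed on the ring at `vnodes` pseudo-random points.
        self._points = sorted(
            (_hash(f"{node}-{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._keys = [h for h, _ in self._points]
        self._node_count = len(set(nodes))

    def get(self, key: str, replicas: int = 1):
        """Return `replicas` distinct nodes, walking clockwise from the key's hash."""
        if not 1 <= replicas <= self._node_count:
            raise ValueError("replicas must be between 1 and the number of nodes")
        start = bisect.bisect_right(self._keys, _hash(key))
        chosen = []
        for offset in range(len(self._points)):
            node = self._points[(start + offset) % len(self._points)][1]
            if node not in chosen:
                chosen.append(node)
                if len(chosen) == replicas:
                    break
        return chosen
```

Comparing `HashRing(["receive-0", "receive-1", "receive-2"])` with a ring that also contains `"receive-3"`, every key either keeps its first owner or is reassigned to `receive-3`; no key moves between the existing nodes. A modulo-based scheme, by contrast, would reshuffle most keys when the node count changes.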
Changes
Fixed
- #5716 DNS: Fix miekgdns resolver LookupSRV to work with CNAME records.
- #5844 Query Frontend: Fixes @ modifier time range when splitting queries by interval.
- #5854 Query Frontend: `lookback_delta` param is now handled in query frontend.
- #5860 Query: Fixed bug of not showing query warnings in Thanos UI.
- #5856 Store: Fixed handling of debug logging flag.
- #5230 Rule: Stateless ruler supports restoring `for` state from query API servers. The query API servers should be able to access the remote write storage.
- #5880 Query Frontend: Fixes some edge cases of query sharding analysis.
- #5893 Cache: Fixed redis client not respecting `SetMultiBatchSize` config value.
- #5966 Query: Stop relying on non-existent hints for mint and maxt when selecting series for the `api/v1/series` HTTP endpoint.
- #5948 Store: Fixed wrong `chunks_fetched_duration` calculation.
- #5910: Receive: Fixed ketama quorum bug that could cause a success response for failed replication. This also heavily optimizes receiver CPU use.
Added
- #5814 Store: Added metric `thanos_bucket_store_postings_size_bytes` that shows the distribution of how many postings (in bytes) were needed for each Series() call in Thanos Store. Useful for determining limits.
- #5703 StoreAPI: Added `hash` field to series' chunks. Store gateway and receive implement that field and proxy leverages it for quicker deduplication.
- #5801 Store: Added a new flag `--store.grpc.downloaded-bytes-limit` that limits the number of bytes downloaded in each Series/LabelNames/LabelValues call. Use `thanos_bucket_store_postings_size_bytes` for determining the limits.
- #5836 Receive: Added hidden flag `tsdb.memory-snapshot-on-shutdown` to enable experimental TSDB feature to snapshot on shutdown. This is intended to speed up receiver restart.
- #5839 Receive: Added parameter `--tsdb.out-of-order.time-window` to set time window for experimental out-of-order samples ingestion. Disabled by default (set to 0s). Please note if you enable this option and you use compactor, make sure you set the `--enable-vertical-compaction` flag, otherwise you might risk compactor halt.
- #5889 Query Frontend: Added support for vertical sharding `label_replace` and `label_join` functions.
- #5865 Compact: Retry on sync metas error.
- #5819 Store: Added a few objectives for Store's data summaries (touched/fetched amount and sizes). They are: 50, 95, and 99 quantiles.
- #5837 Store: Added streaming retrieval of series from object storage.
- #5940 Objstore: Support for authenticating to Swift using application credentials.
- #5945 Tools: Added new `no-downsample` marker to skip blocks when downsampling via `thanos tools bucket mark --marker=no-downsample-mark.json`. This will skip downsampling for blocks with the new marker.
- #5977 Tools: Added remove flag on bucket mark command to remove deletion, no-downsample or no-compact markers on the block
Changed
- #5785 Query: `thanos_store_nodes_grpc_connections` now trims `external_labels` label name longer than 1000 characters. It also allows customizations in what labels to preserve using `query.conn-metric.label` flag.
- #5542 Mixin: Added query concurrency panel to Querier dashboard.
- #5846 Query Frontend: vertical query sharding supports subqueries.
- #5909 Receive: Compact tenant head after no appends have happened for 1.5 `tsdb.max-block-size`.
- #5593 Cache: Switched Redis client to Rueidis. Rueidis is faster and provides client-side caching. It is highly recommended to use it so that repeated requests for the same key would not be needed.
- #5896 *: Upgraded Prometheus to v0.40.7 without implementing native histogram support. Querying native histograms will fail with `Error executing query: invalid chunk encoding "<unknown>"` and native histograms in write requests are ignored.
- #5838 Mixin: Added data touched type to Store dashboard.
- #5922 Compact: Retry on clean, partial marked errors when possible.
Removed
- #5824 Mixin: Remove noisy `ThanosReceiveTrafficBelowThreshold` alert.
New Contributors
- @rajivharlalka made their first contribution in #5631
- @Atharva-Shinde made their first contribution in #5716
- @clwluvw made their first contribution in #5856
- @VicThomas made their first contribution in #5884
- @karster made their first contribution in #5886
- @sumanpaikdev made their first contribution in #5868
- @abbyssoul made their first contribution in #5893
- @juanrh made their first contribution in #5795
- @hyder made their first contribution in #5928
- @aarnq made their first contribution in #5940
- @4orty made their first contribution in #5953
- @jatinagwal made their first contribution in #5967
- @RohitKochhar made their first contribution in #5945
- @rabenhorst made their first contribution in #5896
- @kama910 made their first contribution in #5981
- @Vishvsalvi made their first contribution in #5979
- @maheshbaliga made their first contribution in #5977
Commits
- CHANGELOG: mark 0.29.0 as in progress by @GiedriusS in #5808
- store: add histogram for postings size by @GiedriusS in #5814
- Store/Receivers: Calculating chunk hashes on stores/receivers by @pedro-stanaka in #5703
- Use pre-calculated hashes by @fpetkovski in #5817
- Short-circuit chunk dedup in proxy by @fpetkovski in #5816
- deps: Updated promql-engine to latest. by @bwplotka in #5821
- Query: Trim very long external labels and add cmd flag to optionally specify metric labels to collect by @utukJ in #5785
- CircleCI: Replace checkout step with custom command by @matej-g in #5829
- store: add downloaded bytes limit by @GiedriusS in #5801
- Mixin: Remove low ingestion rate warning for receiver by @matej-g in #5824
- Mixin: Remove low ingestion rate warning for receiver (fix tests) by @matej-g in #5831
- Fix Typo's in recieve.md by @rajivharlalka in #5631
- add panel Query Concurrency to dashboard mixin. by @raptorsun in #5542
- docs: Added guide for Community Office Hours shepherding. by @bwplotka in #5568
- *: Clean up stale bot config file by @matej-g in #5834
- Receive: Add experimental snapshot on shutdown by @matej-g in #5836
- Feature...
v0.30.0-rc.0
v0.30 brings many important fixes & optimizations to compaction, store gateway, receive replication and querying. Make sure to try the new PromQL engine which is more & more efficient every week.
NOTE: Querier's `query.promql-engine` flag enabling the new PromQL engine is now unhidden. We encourage users to use the new experimental PromQL engine for efficiency reasons.
Furthermore, we recommend you use Redis as a caching client (if you use store GW or query frontend caching) and Ketama as the receiver hashing algorithm (`--receive.hashrings-algorithm=ketama`, which introduces consistent hashing to the receiver).
Enjoy & Happy Christmas Holidays! 🎉
Changes
Fixed
- #5716 DNS: Fix miekgdns resolver LookupSRV to work with CNAME records.
- #5844 Query Frontend: Fixes @ modifier time range when splitting queries by interval.
- #5854 Query Frontend: `lookback_delta` param is now handled in query frontend.
- #5860 Query: Fixed bug of not showing query warnings in Thanos UI.
- #5856 Store: Fixed handling of debug logging flag.
- #5230 Rule: Stateless ruler supports restoring `for` state from query API servers. The query API servers should be able to access the remote write storage.
- #5880 Query Frontend: Fixes some edge cases of query sharding analysis.
- #5893 Cache: Fixed redis client not respecting `SetMultiBatchSize` config value.
- #5966 Query: Stop relying on non-existent hints for mint and maxt when selecting series for the `api/v1/series` HTTP endpoint.
- #5948 Store: Fixed wrong `chunks_fetched_duration` calculation.
- #5910: Receive: Fixed ketama quorum bug that could cause a success response for failed replication. This also heavily optimizes receiver CPU use.
Added
- #5814 Store: Added metric `thanos_bucket_store_postings_size_bytes` that shows the distribution of how many postings (in bytes) were needed for each Series() call in Thanos Store. Useful for determining limits.
- #5703 StoreAPI: Added `hash` field to series' chunks. Store gateway and receive implement that field and proxy leverages it for quicker deduplication.
- #5801 Store: Added a new flag `--store.grpc.downloaded-bytes-limit` that limits the number of bytes downloaded in each Series/LabelNames/LabelValues call. Use `thanos_bucket_store_postings_size_bytes` for determining the limits.
- #5836 Receive: Added hidden flag `tsdb.memory-snapshot-on-shutdown` to enable experimental TSDB feature to snapshot on shutdown. This is intended to speed up receiver restart.
- #5839 Receive: Added parameter `--tsdb.out-of-order.time-window` to set time window for experimental out-of-order samples ingestion. Disabled by default (set to 0s). Please note if you enable this option and you use compactor, make sure you set the `--enable-vertical-compaction` flag, otherwise you might risk compactor halt.
- #5889 Query Frontend: Added support for vertical sharding `label_replace` and `label_join` functions.
- #5865 Compact: Retry on sync metas error.
- #5819 Store: Added a few objectives for Store's data summaries (touched/fetched amount and sizes). They are: 50, 95, and 99 quantiles.
- #5837 Store: Added streaming retrieval of series from object storage.
- #5940 Objstore: Support for authenticating to Swift using application credentials.
- #5945 Tools: Added new `no-downsample` marker to skip blocks when downsampling via `thanos tools bucket mark --marker=no-downsample-mark.json`. This will skip downsampling for blocks with the new marker.
- #5977 Tools: Added remove flag on bucket mark command to remove deletion, no-downsample or no-compact markers on the block
Changed
- #5785 Query: `thanos_store_nodes_grpc_connections` now trims `external_labels` label name longer than 1000 characters. It also allows customizations in what labels to preserve using `query.conn-metric.label` flag.
- #5542 Mixin: Added query concurrency panel to Querier dashboard.
- #5846 Query Frontend: vertical query sharding supports subqueries.
- #5909 Receive: Compact tenant head after no appends have happened for 1.5 `tsdb.max-block-size`.
- #5593 Cache: Switched Redis client to Rueidis. Rueidis is faster and provides client-side caching. It is highly recommended to use it so that repeated requests for the same key would not be needed.
- #5896 *: Upgraded Prometheus to v0.40.7 without implementing native histogram support. Querying native histograms will fail with `Error executing query: invalid chunk encoding "<unknown>"` and native histograms in write requests are ignored.
- #5838 Mixin: Added data touched type to Store dashboard.
- #5922 Compact: Retry on clean, partial marked errors when possible.
Removed
- #5824 Mixin: Remove noisy `ThanosReceiveTrafficBelowThreshold` alert.
New Contributors
- @rajivharlalka made their first contribution in #5631
- @Atharva-Shinde made their first contribution in #5716
- @clwluvw made their first contribution in #5856
- @VicThomas made their first contribution in #5884
- @karster made their first contribution in #5886
- @sumanpaikdev made their first contribution in #5868
- @abbyssoul made their first contribution in #5893
- @juanrh made their first contribution in #5795
- @hyder made their first contribution in #5928
- @aarnq made their first contribution in #5940
- @4orty made their first contribution in #5953
- @jatinagwal made their first contribution in #5967
- @RohitKochhar made their first contribution in #5945
- @rabenhorst made their first contribution in #5896
- @kama910 made their first contribution in #5981
- @Vishvsalvi made their first contribution in #5979
- @maheshbaliga made their first contribution in #5977
Commits
- CHANGELOG: mark 0.29.0 as in progress by @GiedriusS in #5808
- store: add histogram for postings size by @GiedriusS in #5814
- Store/Receivers: Calculating chunk hashes on stores/receivers by @pedro-stanaka in #5703
- Use pre-calculated hashes by @fpetkovski in #5817
- Short-circuit chunk dedup in proxy by @fpetkovski in #5816
- deps: Updated promql-engine to latest. by @bwplotka in #5821
- Query: Trim very long external labels and add cmd flag to optionally specify metric labels to collect by @utukJ in #5785
- CircleCI: Replace checkout step with custom command by @matej-g in #5829
- store: add downloaded bytes limit by @GiedriusS in #5801
- Mixin: Remove low ingestion rate warning for receiver by @matej-g in #5824
- Mixin: Remove low ingestion rate warning for receiver (fix tests) by @matej-g in #5831
- Fix Typo's in recieve.md by @rajivharlalka in #5631
- add panel Query Concurrency to dashboard mixin. by @raptorsun in #5542
- docs: Added guide for Community Office Hours shepherding. by @bwplotka in #5568
- *: Clean up stale bot config file by @matej-g in #5834
- Receive: Add experimental snapshot on shutdown by @matej-g in https://github.co...
v0.29.0
`v0.29.0` is out after 69 days of work since `v0.28.0`! Thank you to all 35 contributors who have contributed to this release. It wouldn't be the same without you. `v0.29.0` has no changes since the release candidate.
Some of the highlights include OpenTelemetry support, improved Azure support with a new SDK, increased query speed, and new Receive features to limit series per tenant.
First, let's celebrate new contributors; the changelog with all of the details follows. Please try it out and let us know if you spot any problems!
New Contributors
- @nikitapecasa made their first contribution in #5448
- @shenxn made their first contribution in #5455
- @SrushtiSapkale made their first contribution in #5447
- @chris-ng-scmp made their first contribution in #5466
- @olasd made their first contribution in #5477
- @BouchaaraAdil made their first contribution in #5465
- @eharcevs made their first contribution in #5453
- @bishal7679 made their first contribution in #5486
- @naveensrinivasan made their first contribution in #5364
- @Firxiao made their first contribution in #5496
- @Akshit42-hue made their first contribution in #5529
- @audig made their first contribution in #5534
- @Juneezee made their first contribution in #5574
- @raptorsun made their first contribution in #5439
- @oronsh made their first contribution in #5596
- @jzelinskie made their first contribution in #5611
- @tusharxoxoxo made their first contribution in #5620
- @zvlb made their first contribution in #5573
- @padhiar-aditya made their first contribution in #5670
- @pedro-stanaka made their first contribution in #5666
- @prajain12 made their first contribution in #5678
- @xdavidwu made their first contribution in #5656
- @Abirdcfly made their first contribution in #5660
- @sdufel made their first contribution in #5684
- @mtlang made their first contribution in #5690
- @vhbfernandes made their first contribution in #5696
- @davinci26 made their first contribution in #5702
- @haanhvu made their first contribution in #5641
- @wanjunlei made their first contribution in #5723
- @dbut023 made their first contribution in #5674
- @utukJ made their first contribution in #5738
- @isantospardo made their first contribution in #5744
- @amincheloh made their first contribution in #5769
- @Rahulkumar2002 made their first contribution in #5749
- @aarontams made their first contribution in #5778
Fixed
- #5642 Receive: Log labels correctly in writer debug messages.
- #5655 Receive: Fix recreating already pruned tenants.
- #5702 Store: Upgrade minio-go/v7 to fix panic caused by leaked goroutines.
- #5736 Compact: Fix crash in GatherNoCompactionMarkFilter.NoCompactMarkedBlocks.
- #5763 Compact: Enable metadata cache.
- #5759 Compact: Fix missing duration log key.
- #5799 Query Frontend: Fixed sharding behaviour for vector matches. Now queries with sharding should work properly where the query looks like: `foo and without (lbl) bar`.
Added
- #5565 Receive: Allow remote write request limits to be defined per file and tenant (experimental).
- #5654 Query: add `--grpc-compression` flag that controls the compression used in gRPC client. With the flag it is now possible to compress the traffic between Query and StoreAPI nodes - you get lower network usage in exchange for a bit higher CPU/RAM usage.
- #5650 Query Frontend: Add sharded queries metrics. `thanos_frontend_sharding_middleware_queries_total` shows how many queries were sharded or not sharded.
- #5658 Query Frontend: Introduce new optional parameters (`query-range.min-split-interval`, `query-range.max-split-interval`, `query-range.horizontal-shards`) to implement more dynamic horizontal query splitting.
- #5721 Store: Add metric `thanos_bucket_store_empty_postings_total` for number of empty postings when fetching series.
- #5723 Compactor: Support disable block viewer UI.
- #5674 Query Frontend/Store: Add support connecting to redis using TLS.
- #5734 Store: Support disable block viewer UI.
- #5411 Tracing: Add OpenTelemetry Protocol exporter.
- #5779 Objstore: Support specifying S3 storage class.
- #5741 Query: add metrics on how much data is being selected by downstream Store APIs.
- #5673 Receive: Reload tenant limit configuration on file change.
- #5749 Query Frontend: Added small LRU cache to cache query analysis results.
Changed
- #5738 Global: replace `crypto/sha256` with `minio/sha256-simd` to make hash calculation faster in metadata and reloader packages.
- #5648 Query Frontend: cache vertical shards in query-frontend.
- #5753 Build with Go 1.19.
- #5255 Query: Use k-way merging for the proxying logic. The proxying sub-system now uses much less resources (~25-80% less CPU usage, ~30-50% less RAM usage according to our benchmarks). Reduces query duration by a few percent on queries with lots of series.
- #5690 Compact: update `--debug.accept-malformed-index` flag to apply to downsampling. Previously the flag only applied to compaction, and fatal errors would still occur when downsampling was attempted.
- #5707 Objstore: Update objstore to latest version which includes a refactored Azure Storage Account implementation with a new SDK.
- #5641 Store: Remove hardcoded labels in shard matcher.
- #5641 Query: Inject unshardable le label in query analyzer.
- #5685 Receive: Make active/head series limiting configuration per tenant by adding it to new limiting config.
- #5411 Tracing: Change Jaeger exporter from OpenTracing to OpenTelemetry. Options `RPC Metrics`, `Gen128Bit` and `Disabled` are now deprecated and won't have any effect when set ⚠️.
- #5767 *: Upgrade Prometheus to v2.39.0.
- #5771 *: Upgrade Prometheus to v2.39.1.
Full Changelog: v0.28.1...v0.29.0
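
For readers unfamiliar with the k-way merging mentioned in #5255, here is a minimal Python sketch of the technique. It is only an illustration: the actual proxy merges gRPC series responses in Go, and the function name, the use of plain strings as stand-ins for series, and the de-duplication of adjacent identical items are assumptions made for this example.

```python
import heapq


def k_way_merge(sorted_streams):
    """Merge already-sorted iterables into one sorted, de-duplicated list.

    A min-heap holds at most one pending item per stream, so each step costs
    O(log k) for k streams instead of re-scanning every stream head.
    Items must be comparable and must not be None (None marks exhaustion).
    """
    iterators = [iter(s) for s in sorted_streams]
    heap = []
    for idx, it in enumerate(iterators):
        first = next(it, None)
        if first is not None:
            heap.append((first, idx))
    heapq.heapify(heap)

    merged = []
    while heap:
        item, idx = heapq.heappop(heap)
        # Identical items arriving from several streams are emitted once.
        if not merged or merged[-1] != item:
            merged.append(item)
        nxt = next(iterators[idx], None)
        if nxt is not None:
            heapq.heappush(heap, (nxt, idx))
    return merged
```

Given three sorted inputs such as `["a", "c"]`, `["b", "c"]` and `["a", "d"]`, the function walks them in lockstep and never needs to buffer and re-sort the combined result, which is the property that makes the approach attractive when many stores return large sorted series sets.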