Commit
Merge `prometheus.write.queue` into main. (#1564)
* readme * fix readme * Add filequeue functionality (#1601) * Checkin for file queue * add comment * Update internal/component/prometheus/remote/queue/filequeue/filequeue.go Co-authored-by: Piotr <[email protected]> * Update internal/component/prometheus/remote/queue/filequeue/filequeue.go Co-authored-by: Piotr <[email protected]> * Update internal/component/prometheus/remote/queue/filequeue/filequeue.go Co-authored-by: Piotr <[email protected]> * Update internal/component/prometheus/remote/queue/filequeue/filequeue.go Co-authored-by: Piotr <[email protected]> * Update internal/component/prometheus/remote/queue/filequeue/filequeue.go Co-authored-by: Piotr <[email protected]> * Update internal/component/prometheus/remote/queue/filequeue/filequeue.go Co-authored-by: Piotr <[email protected]> * Update internal/component/prometheus/remote/queue/filequeue/filequeue.go Co-authored-by: Piotr <[email protected]> * naming and error handling feedback from PR * Update internal/component/prometheus/remote/queue/filequeue/filequeue.go Co-authored-by: Piotr <[email protected]> * Update internal/component/prometheus/remote/queue/filequeue/filequeue.go Co-authored-by: Piotr <[email protected]> * Update internal/component/prometheus/remote/queue/filequeue/filequeue.go Co-authored-by: Piotr <[email protected]> * drop benchmark * rename get to pop --------- Co-authored-by: Piotr <[email protected]> * Adding the serialization features. (#1666) * Adding the serialization features. * Dont test this with race condition since we access vars directly. * Fix test. * Fix typo in file name and return early in DeserializeToSeriesGroup. * Update internal/component/prometheus/remote/queue/serialization/appender.go Co-authored-by: Piotr <[email protected]> * Update internal/component/prometheus/remote/queue/serialization/serializer.go Co-authored-by: Piotr <[email protected]> * Rename to indicate that TimeSeries are Put/Get from a pool. 
* Remove func that was about the same number of lines as inlining. * Update internal/component/prometheus/remote/queue/types/serialization.go Co-authored-by: Piotr <[email protected]> * Update internal/component/prometheus/remote/queue/serialization/serializer.go Co-authored-by: Piotr <[email protected]> * Change benchmark to be more specific. --------- Co-authored-by: Piotr <[email protected]> * Network wal pr (#1717) * Checkin the networking items. * Fix for config updating and tests. * Update internal/component/prometheus/remote/queue/network/loop.go Co-authored-by: William Dumont <[email protected]> * Update internal/component/prometheus/remote/queue/network/loop.go Co-authored-by: Piotr <[email protected]> * pr feedback * pr feedback * simplify stats * PR feedback --------- Co-authored-by: William Dumont <[email protected]> Co-authored-by: Piotr <[email protected]> * Component (#1823) * Checkin the networking items. * Fix for config updating and tests. * Update internal/component/prometheus/remote/queue/network/loop.go Co-authored-by: William Dumont <[email protected]> * Update internal/component/prometheus/remote/queue/network/loop.go Co-authored-by: Piotr <[email protected]> * pr feedback * pr feedback * simplify stats * simplify stats * Initial push. 
* docs and some renaming * Update docs/sources/reference/components/prometheus/prometheus.remote.queue.md Co-authored-by: Clayton Cornell <[email protected]> * Update docs/sources/reference/components/prometheus/prometheus.remote.queue.md Co-authored-by: Clayton Cornell <[email protected]> * Update docs/sources/reference/components/prometheus/prometheus.remote.queue.md Co-authored-by: Clayton Cornell <[email protected]> * Update docs/sources/reference/components/prometheus/prometheus.remote.queue.md Co-authored-by: Clayton Cornell <[email protected]> * Update docs/sources/reference/components/prometheus/prometheus.remote.queue.md Co-authored-by: Clayton Cornell <[email protected]> * Update docs/sources/reference/components/prometheus/prometheus.remote.queue.md Co-authored-by: Clayton Cornell <[email protected]> * Update docs/sources/reference/components/prometheus/prometheus.remote.queue.md Co-authored-by: Clayton Cornell <[email protected]> * Changes and testing. * Update docs. * Update docs. * Fix race conditions in unit tests. * Tweaking unit tests. * lower threshold more. * lower threshold more. * Fix deadlock in manager tests. 
* rollback to previous * Update docs/sources/reference/components/prometheus/prometheus.remote.queue.md Co-authored-by: Clayton Cornell <[email protected]> * Update docs/sources/reference/components/prometheus/prometheus.remote.queue.md Co-authored-by: Clayton Cornell <[email protected]> * Update docs/sources/reference/components/prometheus/prometheus.remote.queue.md Co-authored-by: Clayton Cornell <[email protected]> * Update docs/sources/reference/components/prometheus/prometheus.remote.queue.md Co-authored-by: Clayton Cornell <[email protected]> * Update docs/sources/reference/components/prometheus/prometheus.remote.queue.md Co-authored-by: Clayton Cornell <[email protected]> * Update docs/sources/reference/components/prometheus/prometheus.remote.queue.md Co-authored-by: Clayton Cornell <[email protected]> * Update docs/sources/reference/components/prometheus/prometheus.remote.queue.md Co-authored-by: Clayton Cornell <[email protected]> * Update docs/sources/reference/components/prometheus/prometheus.remote.queue.md Co-authored-by: Paulin Todev <[email protected]> * Docs PR feedback * Update docs/sources/reference/components/prometheus/prometheus.remote.queue.md Co-authored-by: Piotr <[email protected]> * PR feedback * PR feedback * PR feedback * PR feedback * Fix typo * Fix typo * Fix bug. * Fix docs --------- Co-authored-by: William Dumont <[email protected]> Co-authored-by: Piotr <[email protected]> Co-authored-by: Clayton Cornell <[email protected]> Co-authored-by: Paulin Todev <[email protected]> * Change name to write instead of remote. * Fix issue. * Fix issue. * Dont depend on random sync.pool behavior. * small clarification on changelog. * PR feedback --------- Co-authored-by: Piotr <[email protected]> Co-authored-by: William Dumont <[email protected]> Co-authored-by: Clayton Cornell <[email protected]> Co-authored-by: Paulin Todev <[email protected]>
1 parent 843afc3, commit eb1c840
Showing 46 changed files with 9,547 additions and 31 deletions.
280 changes: 280 additions & 0 deletions
docs/sources/reference/components/prometheus/prometheus.write.queue.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,280 @@ | ||
--- | ||
canonical: https://grafana.com/docs/alloy/latest/reference/components/prometheus/prometheus.write.queue/ | ||
description: Learn about prometheus.write.queue | ||
title: prometheus.write.queue | ||
--- | ||
|
||
|
||
<span class="badge docs-labels__stage docs-labels__item">Experimental</span> | ||
|
||
# prometheus.write.queue | ||
|
||
`prometheus.write.queue` collects metrics sent from other components into a | ||
Write-Ahead Log (WAL) and forwards them over the network to a series of | ||
user-supplied endpoints. Metrics are sent over the network using the | ||
[Prometheus Remote Write protocol][remote_write-spec]. | ||
|
||
You can specify multiple `prometheus.write.queue` components by giving them different labels. | ||
|
||
Consider everything described here experimental and highly subject to change. | ||
|
||
[remote_write-spec]: https://prometheus.io/docs/specs/remote_write_spec/ | ||
|
||
|
||
|
||
## Usage | ||
|
||
```alloy | ||
prometheus.write.queue "LABEL" { | ||
endpoint "default" { | ||
url = REMOTE_WRITE_URL | ||
... | ||
} | ||
... | ||
} | ||
``` | ||
|
||
## Arguments | ||
|
||
The following arguments are supported: | ||
|
||
Name | Type | Description | Default | Required | ||
---- | ---- | ----------- | ------- | -------- | ||
`ttl` | `duration` | How long the samples can be queued for before they are discarded. | `2h` | no | ||
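
As a minimal sketch of the `ttl` argument, the following configuration discards samples that have been queued for more than one hour. The component label `example` and the URL are placeholders chosen for illustration.

```alloy
prometheus.write.queue "example" {
  // Discard samples that have been queued for longer than one hour.
  // The default is 2h.
  ttl = "1h"

  endpoint "default" {
    // Placeholder URL; replace with your remote write endpoint.
    url = "http://localhost:9009/api/v1/push"
  }
}
```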
|
||
## Blocks | ||
|
||
The following blocks are supported inside the definition of | ||
`prometheus.write.queue`: | ||
|
||
Hierarchy | Block | Description | Required | ||
--------- | ----- | ----------- | -------- | ||
persistence | [persistence][] | Configuration for persistence | no | ||
endpoint | [endpoint][] | Location to send metrics to. | no | ||
endpoint > basic_auth | [basic_auth][] | Configure basic_auth for authenticating to the endpoint. | no | ||
|
||
The `>` symbol indicates deeper levels of nesting. For example, `endpoint > | ||
basic_auth` refers to a `basic_auth` block defined inside an | ||
`endpoint` block. | ||
|
||
[endpoint]: #endpoint-block | ||
[basic_auth]: #basic_auth-block | ||
[persistence]: #persistence-block | ||
|
||
### persistence block | ||
|
||
The `persistence` block describes how often and at what limits signals are written to disk. Persistence settings | ||
are shared by all `endpoint` blocks. | ||
|
||
The following arguments are supported: | ||
|
||
Name | Type | Description | Default | Required | ||
---- | ---- |-------------------------------------------------------------------------------|---------| -------- | ||
`max_signals_to_batch` | `uint` | The maximum number of signals before they are batched to disk. | `10000` | no | ||
`batch_interval` | `duration` | How often to batch signals to disk if `max_signals_to_batch` is not reached. | `5s` | no | ||
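
The two arguments work together: a batch goes to disk when `max_signals_to_batch` is reached, or when `batch_interval` elapses first. A hedged sketch, with illustrative values and a placeholder label and URL:

```alloy
prometheus.write.queue "example" {
  persistence {
    // Write a batch to disk once 5000 signals have accumulated.
    max_signals_to_batch = 5000
    // If fewer signals have arrived, write whatever is pending every 10 seconds.
    batch_interval = "10s"
  }

  endpoint "default" {
    // Placeholder URL; replace with your remote write endpoint.
    url = "http://localhost:9009/api/v1/push"
  }
}
```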
|
||
|
||
### endpoint block | ||
|
||
The `endpoint` block describes a single location to send metrics to. Multiple | ||
`endpoint` blocks can be provided to send metrics to multiple locations. Each | ||
`endpoint` will have its own WAL folder. | ||
|
||
The following arguments are supported: | ||
|
||
Name | Type | Description | Default | Required | ||
---- | ---- |------------------------------------------------------------------| ------ | -------- | ||
`url` | `string` | Full URL to send metrics to. | | yes | ||
`write_timeout` | `duration` | Timeout for requests made to the URL. | `30s` | no | ||
`retry_backoff` | `duration` | How long to wait between retries. | `1s` | no | ||
`max_retry_attempts` | `uint` | Maximum number of retries before dropping the batch. | `0` | no | ||
`batch_count` | `uint` | How many series to queue before sending a batch. | `1000` | no | ||
`flush_interval` | `duration` | How long to wait before sending if `batch_count` is not reached. | `1s` | no | ||
`parallelism` | `uint` | How many parallel batches to write. | `10` | no | ||
`external_labels` | `map(string)` | Labels to add to metrics sent over the network. | | no | ||
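
The following sketch sets several of the arguments from the table above on a single `endpoint`. The values, the label names inside `external_labels`, and the URL are assumptions for illustration, not recommendations.

```alloy
prometheus.write.queue "example" {
  endpoint "default" {
    // Placeholder URL; replace with your remote write endpoint.
    url = "http://localhost:9009/api/v1/push"

    // Give up on a single request after 15 seconds.
    write_timeout = "15s"

    // Send once 2000 series are queued, or after 5 seconds if that
    // count is not reached.
    batch_count    = 2000
    flush_interval = "5s"

    // Write up to 4 batches in parallel.
    parallelism = 4

    // Hypothetical labels added to every metric sent to this endpoint.
    external_labels = {
      cluster = "dev",
      region  = "local",
    }
  }
}
```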
|
||
### basic_auth block | ||
|
||
{{< docs/shared lookup="reference/components/basic-auth-block.md" source="alloy" version="<ALLOY_VERSION>" >}} | ||
|
||
|
||
## Exported fields | ||
|
||
The following fields are exported and can be referenced by other components: | ||
|
||
Name | Type | Description | ||
---- | ---- | ----------- | ||
`receiver` | `MetricsReceiver` | A value that other components can use to send metrics to. | ||
|
||
## Component health | ||
|
||
`prometheus.write.queue` is only reported as unhealthy if given an invalid | ||
configuration. In those cases, exported fields are kept at their last healthy | ||
values. | ||
|
||
## Debug information | ||
|
||
`prometheus.write.queue` does not expose any component-specific debug | ||
information. | ||
|
||
## Debug metrics | ||
|
||
The following metrics are provided for backward compatibility. | ||
They generally behave the same, but there are likely edge cases where they differ. | ||
|
||
* `prometheus_remote_write_wal_storage_created_series_total` (counter): Total number of created | ||
series appended to the WAL. | ||
* `prometheus_remote_write_wal_storage_removed_series_total` (counter): Total number of series | ||
removed from the WAL. | ||
* `prometheus_remote_write_wal_samples_appended_total` (counter): Total number of samples | ||
appended to the WAL. | ||
* `prometheus_remote_write_wal_exemplars_appended_total` (counter): Total number of exemplars | ||
appended to the WAL. | ||
* `prometheus_remote_storage_samples_total` (counter): Total number of samples | ||
sent to remote storage. | ||
* `prometheus_remote_storage_exemplars_total` (counter): Total number of | ||
exemplars sent to remote storage. | ||
* `prometheus_remote_storage_metadata_total` (counter): Total number of | ||
metadata entries sent to remote storage. | ||
* `prometheus_remote_storage_samples_failed_total` (counter): Total number of | ||
samples that failed to send to remote storage due to non-recoverable errors. | ||
* `prometheus_remote_storage_exemplars_failed_total` (counter): Total number of | ||
exemplars that failed to send to remote storage due to non-recoverable errors. | ||
* `prometheus_remote_storage_metadata_failed_total` (counter): Total number of | ||
metadata entries that failed to send to remote storage due to | ||
non-recoverable errors. | ||
* `prometheus_remote_storage_samples_retried_total` (counter): Total number of | ||
samples that failed to send to remote storage but were retried due to | ||
recoverable errors. | ||
* `prometheus_remote_storage_exemplars_retried_total` (counter): Total number of | ||
exemplars that failed to send to remote storage but were retried due to | ||
recoverable errors. | ||
* `prometheus_remote_storage_metadata_retried_total` (counter): Total number of | ||
metadata entries that failed to send to remote storage but were retried due | ||
to recoverable errors. | ||
* `prometheus_remote_storage_samples_dropped_total` (counter): Total number of | ||
samples which were dropped after being read from the WAL before being sent to | ||
remote_write because of an unknown reference ID. | ||
* `prometheus_remote_storage_exemplars_dropped_total` (counter): Total number | ||
of exemplars that were dropped after being read from the WAL before being | ||
sent to remote_write because of an unknown reference ID. | ||
* `prometheus_remote_storage_enqueue_retries_total` (counter): Total number of | ||
times enqueue has failed because a shard's queue was full. | ||
* `prometheus_remote_storage_sent_batch_duration_seconds` (histogram): Duration | ||
of send calls to remote storage. | ||
* `prometheus_remote_storage_queue_highest_sent_timestamp_seconds` (gauge): | ||
Unix timestamp of the latest WAL sample successfully sent by a queue. | ||
* `prometheus_remote_storage_samples_pending` (gauge): The number of samples | ||
pending in shards to be sent to remote storage. | ||
* `prometheus_remote_storage_exemplars_pending` (gauge): The number of | ||
exemplars pending in shards to be sent to remote storage. | ||
* `prometheus_remote_storage_samples_in_total` (counter): Samples read into | ||
remote storage. | ||
* `prometheus_remote_storage_exemplars_in_total` (counter): Exemplars read into | ||
remote storage. | ||
|
||
The following metrics are new to `prometheus.write.queue` and are highly subject to change. | ||
|
||
* `alloy_queue_series_serializer_incoming_signals` (counter): Total number of series written to serialization. | ||
* `alloy_queue_metadata_serializer_incoming_signals` (counter): Total number of metadata written to serialization. | ||
* `alloy_queue_series_serializer_incoming_timestamp_seconds` (gauge): Highest timestamp of incoming series. | ||
* `alloy_queue_series_serializer_errors` (gauge): Number of errors for series written to serializer. | ||
* `alloy_queue_metadata_serializer_errors` (gauge): Number of errors for metadata written to serializer. | ||
* `alloy_queue_series_network_timestamp_seconds` (gauge): Highest timestamp written to an endpoint. | ||
* `alloy_queue_series_network_sent` (counter): Number of series sent successfully. | ||
* `alloy_queue_metadata_network_sent` (counter): Number of metadata sent successfully. | ||
* `alloy_queue_network_series_failed` (counter): Number of series failed. | ||
* `alloy_queue_network_metadata_failed` (counter): Number of metadata failed. | ||
* `alloy_queue_network_series_retried` (counter): Number of series retried due to network issues. | ||
* `alloy_queue_network_metadata_retried` (counter): Number of metadata retried due to network issues. | ||
* `alloy_queue_network_series_retried_429` (counter): Number of series retried due to status code 429. | ||
* `alloy_queue_network_metadata_retried_429` (counter): Number of metadata retried due to status code 429. | ||
* `alloy_queue_network_series_retried_5xx` (counter): Number of series retried due to status code 5xx. | ||
* `alloy_queue_network_metadata_retried_5xx` (counter): Number of metadata retried due to status code 5xx. | ||
* `alloy_queue_network_series_network_duration_seconds` (histogram): Duration writing series to endpoint. | ||
* `alloy_queue_network_metadata_network_duration_seconds` (histogram): Duration writing metadata to endpoint. | ||
* `alloy_queue_network_series_network_errors` (counter): Number of errors writing series to network. | ||
* `alloy_queue_network_metadata_network_errors` (counter): Number of errors writing metadata to network. | ||
|
||
## Examples | ||
|
||
The following examples show you how to create `prometheus.write.queue` components that send metrics to different destinations. | ||
|
||
### Send metrics to a local Mimir instance | ||
|
||
You can create a `prometheus.write.queue` component that sends your metrics to a local Mimir instance: | ||
|
||
```alloy | ||
prometheus.write.queue "staging" { | ||
// Send metrics to a locally running Mimir. | ||
endpoint "mimir" { | ||
url = "http://mimir:9009/api/v1/push" | ||
basic_auth { | ||
username = "example-user" | ||
password = "example-password" | ||
} | ||
} | ||
} | ||
// Configure a prometheus.scrape component to send metrics to | ||
// prometheus.write.queue component. | ||
prometheus.scrape "demo" { | ||
targets = [ | ||
// Collect metrics from the default HTTP listen address. | ||
{"__address__" = "127.0.0.1:12345"}, | ||
] | ||
forward_to = [prometheus.write.queue.staging.receiver] | ||
} | ||
``` | ||
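
Because multiple `endpoint` blocks can be provided, a variation on the example above could send the same scraped metrics to two destinations, each with its own WAL folder. This is a sketch only: the `fanout` label, the endpoint labels, and the second URL are hypothetical.

```alloy
prometheus.write.queue "fanout" {
  // First destination.
  endpoint "primary" {
    url = "http://mimir:9009/api/v1/push"
  }

  // Second destination (hypothetical host name).
  endpoint "secondary" {
    url = "http://mimir-backup:9009/api/v1/push"
  }
}

prometheus.scrape "demo" {
  targets = [
    {"__address__" = "127.0.0.1:12345"},
  ]
  // Every endpoint defined above receives the metrics sent to this receiver.
  forward_to = [prometheus.write.queue.fanout.receiver]
}
```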
|
||
## Technical details | ||
|
||
`prometheus.write.queue` uses [snappy][] for compression. | ||
`prometheus.write.queue` sends native histograms by default. | ||
Any labels that start with `__` will be removed before sending to the endpoint. | ||
|
||
### Data retention | ||
|
||
Data is written to disk in blocks utilizing [snappy][] compression. These blocks are read on startup and resent if they are still within the TTL. | ||
Any data that has not been written to disk, or that is in the network queues is lost if {{< param "PRODUCT_NAME" >}} is restarted. | ||
|
||
### Retries | ||
|
||
`prometheus.write.queue` will retry sending data if the following errors or HTTP status codes are returned: | ||
|
||
* Network errors. | ||
* HTTP 429 errors. | ||
* HTTP 5XX errors. | ||
|
||
`prometheus.write.queue` will not retry sending data if any other unsuccessful status codes are returned. | ||
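
Retry behavior for the retryable cases listed above is controlled per endpoint with `retry_backoff` and `max_retry_attempts`. A minimal sketch, with illustrative values and a placeholder label and URL:

```alloy
prometheus.write.queue "example" {
  endpoint "default" {
    // Placeholder URL; replace with your remote write endpoint.
    url = "http://localhost:9009/api/v1/push"

    // Wait 5 seconds between retries after a network error,
    // an HTTP 429, or an HTTP 5XX response.
    retry_backoff = "5s"

    // Drop the batch after 10 retries.
    max_retry_attempts = 10
  }
}
```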
|
||
### Memory | ||
|
||
`prometheus.write.queue` is meant to be memory efficient. | ||
You can adjust the `max_signals_to_batch`, `parallelism`, and `batch_count` to control how much memory is used. | ||
A higher `max_signals_to_batch` allows for more efficient disk compression. | ||
A higher `parallelism` allows more parallel writes, and a higher `batch_count` allows more data to be sent at one time. | ||
This can allow greater throughput at the cost of more memory on both {{< param "PRODUCT_NAME" >}} and the endpoint. | ||
The defaults are suitable for most common usages. | ||
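
As an illustration of trading throughput for memory, the following sketch lowers all three settings below their defaults, using the `batch_count` argument from the `endpoint` block. The specific values are assumptions, not tuned recommendations, and the label and URL are placeholders.

```alloy
prometheus.write.queue "low_memory" {
  persistence {
    // Default is 10000. Fewer signals are held in memory before being
    // written to disk, at the cost of less efficient compression.
    max_signals_to_batch = 2000
  }

  endpoint "default" {
    // Placeholder URL; replace with your remote write endpoint.
    url = "http://localhost:9009/api/v1/push"

    // Default is 10. Fewer parallel writes are in flight at once.
    parallelism = 2

    // Default is 1000. Each request carries fewer series.
    batch_count = 500
  }
}
```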
|
||
<!-- START GENERATED COMPATIBLE COMPONENTS --> | ||
|
||
## Compatible components | ||
|
||
`prometheus.write.queue` has exports that can be consumed by the following components: | ||
|
||
- Components that consume [Prometheus `MetricsReceiver`](../../../compatibility/#prometheus-metricsreceiver-consumers) | ||
|
||
{{< admonition type="note" >}} | ||
Connecting some components may not be sensible or components may require further configuration to make the connection work correctly. | ||
Refer to the linked documentation for more details. | ||
{{< /admonition >}} | ||
|
||
<!-- END GENERATED COMPATIBLE COMPONENTS --> | ||
|
||
[snappy]: https://en.wikipedia.org/wiki/Snappy_(compression) | ||
[Stop]: ../../../../set-up/run/ | ||
[run]: ../../../cli/run/ |