Update Explorer monitoring documentation to include the object cleaner metrics.

Update the Explorer dashboard screenshot.
opudrovs committed Nov 17, 2023
1 parent 3523a6d commit ea162ab
Showing 2 changed files with 47 additions and 1 deletion.
Binary file modified website/docs/explorer/imgs/explorer-query-metrics.png
48 changes: 47 additions & 1 deletion website/docs/explorer/operations.mdx
@@ -141,7 +141,7 @@ The following metrics are available to monitor its health.

##### Cluster Watcher

The metric `collector_cluster_watcher` provides the number of the cluster watchers it the following `status`:
The metric `collector_cluster_watcher` provides the number of the cluster watchers in the following `status`:
- Starting: a cluster watcher is starting after detecting that a new cluster has been registered.
- Started: the cluster watcher has started and is collecting events from the remote cluster. This is the stable state.
- Stopping: a cluster has been deregistered, so its cluster watcher is no longer required and is being stopped.
@@ -225,6 +225,52 @@
```
indexer_inflight_requests{action="Add"} 0
indexer_inflight_requests{action="Remove"} 0
```

#### Management

The Explorer management path is composed of two components that export metrics:

- Objects Cleaner Status
- Objects Cleaner Remove Objects Requests

The following metrics are available to monitor its health.

##### Objects Cleaner Status

The metric `objects_cleaner_status` reports the objects cleaner's `status`, which takes one of the following values:
- Starting: the objects cleaner is starting up after the API server has started.
- Started: the objects cleaner has started and is watching for objects that have expired according to their `RetentionPolicy`, so that it can remove them from the datastore or indexer.
- Stopped: the objects cleaner has been stopped after collection was stopped.

```
objects_cleaner_status{status="started"} 1
objects_cleaner_status{status="starting"} 0
```
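
As a rough illustration of how this metric might be consumed, the sketch below parses the exposition text shown above and returns the status whose sample value is 1. It assumes that a value of 1 marks the active status, which is inferred from the sample output rather than stated by Explorer, and the helper name `current_cleaner_status` is hypothetical.

```python
import re

# Matches one exposition line such as: objects_cleaner_status{status="started"} 1
SAMPLE_RE = re.compile(
    r'^objects_cleaner_status\{status="(?P<status>[^"]+)"\}\s+(?P<value>\S+)$'
)


def current_cleaner_status(exposition_text):
    """Return the status label whose sample value is 1, or None if there is none.

    Assumption: exactly one status carries the value 1 at a time.
    """
    for line in exposition_text.splitlines():
        match = SAMPLE_RE.match(line.strip())
        if match and float(match.group("value")) == 1.0:
            return match.group("status")
    return None


sample = """\
objects_cleaner_status{status="started"} 1
objects_cleaner_status{status="starting"} 0
"""
print(current_cleaner_status(sample))  # prints "started"
```

In practice the text would come from scraping the metrics endpoint; the sketch works on a string so that it stays independent of how the endpoint is exposed.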

##### Objects Cleaner Remove Objects Requests

**Request Latency:** histogram of the latency of the objects cleaner's remove objects requests.

- `action` is the `RemoveObjects` operation
- `status` is the result of the operation, either `success` or `error`

```
objects_cleaner_latency_seconds_bucket{action="RemoveObjects",status="success",le="0.01"} 5
```
```
objects_cleaner_latency_seconds_sum{action="RemoveObjects",status="success"} 0.013658576
```
```
objects_cleaner_latency_seconds_count{action="RemoveObjects",status="success"} 5
```
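
The `_sum` and `_count` series of a histogram can be combined into an average latency. The sketch below does this for the sample values above; the helper names `parse_samples` and `mean_latency_seconds` are hypothetical, and the parser is a simplification that assumes one sample per line with no spaces inside label values.

```python
def parse_samples(exposition_text):
    """Map each 'name{labels}' series string to its float value."""
    samples = {}
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


def mean_latency_seconds(samples, labels):
    """Average latency = sum of observed latencies / number of observations."""
    total = samples["objects_cleaner_latency_seconds_sum" + labels]
    count = samples["objects_cleaner_latency_seconds_count" + labels]
    return total / count if count else None


text = """\
objects_cleaner_latency_seconds_bucket{action="RemoveObjects",status="success",le="0.01"} 5
objects_cleaner_latency_seconds_sum{action="RemoveObjects",status="success"} 0.013658576
objects_cleaner_latency_seconds_count{action="RemoveObjects",status="success"} 5
"""
labels = '{action="RemoveObjects",status="success"}'
mean = mean_latency_seconds(parse_samples(text), labels)
print(f"{mean:.6f} s")  # roughly 0.002732 s per request
```

With the sample data, five successful requests took about 13.7 ms in total, or about 2.7 ms each, which is consistent with all five falling in the `le="0.01"` bucket.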

**Requests In Flight:** gauge with the number of requests currently in flight.

- `action` is the `RemoveObjects` operation

```
objects_cleaner_inflight_requests{action="RemoveObjects"} 0
```
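
One way to use this gauge is to check whether it stays above zero across several consecutive scrapes, which could indicate remove requests that are not completing. The sketch below illustrates that idea; the function name `stuck_in_flight` and the threshold of three scrapes are assumptions made for the example, not values recommended by Explorer.

```python
def stuck_in_flight(gauge_values, min_consecutive=3):
    """Return True if the gauge stayed above zero for min_consecutive scrapes in a row.

    gauge_values: values of objects_cleaner_inflight_requests, oldest first.
    min_consecutive: hypothetical threshold; tune it to the scrape interval.
    """
    streak = 0
    for value in gauge_values:
        # Extend the streak while requests are in flight, reset it otherwise.
        streak = streak + 1 if value > 0 else 0
        if streak >= min_consecutive:
            return True
    return False


print(stuck_in_flight([0, 1, 0, 2, 0]))  # False: requests drain between scrapes
print(stuck_in_flight([0, 1, 1, 1]))     # True: non-zero for three scrapes in a row
```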

### Dashboard

Use the Explorer dashboard to monitor its [golden signals](https://sre.google/sre-book/monitoring-distributed-systems/#xref_monitoring_golden-signals)