Commit
Align formatting of APM rules with other rules docs (#4366)
* split apm alerts and format like other rule docs
* fill out individual apm rule pages
* address feedback
* apm ui -> applications ui

---------

Co-authored-by: bmorelli25 <[email protected]>
1 parent
ec217ce
commit 49453be
Showing 29 changed files with 601 additions and 186 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,91 @@ | ||
[[apm-anomaly-rule]] | ||
= APM Anomaly rule | ||
|
||
APM Anomaly rules trigger when the latency, throughput, or failed transaction rate of a service is abnormal. | ||
|
||
[discrete] | ||
[[apm-anomaly-rule-filters-conditions]] | ||
== Filters and conditions | ||
|
||
Because some parts of an application may be more important than others, you might have a different tolerance | ||
for abnormal performance across services in your application. You can filter the services in your application to | ||
apply an APM Anomaly rule to specific services (`SERVICE`), transaction types (`TYPE`), and environments (`ENVIRONMENT`). | ||
|
||
Then, you can specify which conditions should result in an alert. This includes specifying: | ||
|
||
* The types of anomalies that are detected (`DETECTOR TYPES`): `latency`, `throughput`, and/or `failed transaction rate`. | ||
* The severity level (`HAS ANOMALY WITH SEVERITY`): `critical`, `major`, `minor`, `warning`. | ||
|
||
.Example | ||
**** | ||
This example creates a rule for all production services that would result in an alert when a critical latency | ||
anomaly is detected: | ||
image::apm-anomaly-rule-filters-conditions.png[width=600] | ||
**** | ||
|
||
[discrete] | ||
== Rule schedule | ||
|
||
include::../shared/alerting-and-rules/generic-apm-rule-schedule.asciidoc[] | ||
|
||
[discrete] | ||
== Advanced options | ||
|
||
include::../shared/alerting-and-rules/generic-apm-advanced-options.asciidoc[] | ||
|
||
[discrete] | ||
== Actions | ||
|
||
Extend your rules by connecting them to actions that use built-in integrations. | ||
|
||
[discrete] | ||
=== Action types | ||
|
||
Supported built-in integrations include: | ||
|
||
include::../shared/alerting-and-rules/alerting-connectors.asciidoc[] | ||
|
||
[discrete] | ||
=== Action frequency | ||
|
||
include::../shared/alerting-and-rules/generic-apm-action-frequency.asciidoc[] | ||
|
||
[discrete] | ||
[[apm-anomaly-rule-action-variables]] | ||
=== Action variables | ||
|
||
A default message is provided as a starting point for your alert. | ||
If you want to customize the message, add more context to the message by clicking the icon above | ||
the message text box and selecting from a list of available variables. | ||
|
||
TIP: To add variables to alert messages, use https://mustache.github.io/[Mustache] template syntax, for example `{{variable.name}}`. | ||
|
||
image::apm-anomaly-rule-action-variables.png[width=600] | ||
|
||
The following variables are specific to this rule type. | ||
You can also specify {kibana-ref}/rule-action-variables.html[variables common to all rules]. | ||
|
||
`context.alertDetailsUrl`:: | ||
Link to the alert troubleshooting view for further context and details. This will be an empty string if the server.publicBaseUrl is not configured. | ||
|
||
`context.environment`:: | ||
The environment the alert is created for. | ||
|
||
`context.reason`:: | ||
A concise description of the reason for the alert. | ||
|
||
`context.serviceName`:: | ||
The service the alert is created for. | ||
|
||
`context.threshold`:: | ||
Any trigger value above this value will cause the alert to fire. | ||
|
||
`context.transactionType`:: | ||
The transaction type the alert is created for. | ||
|
||
`context.triggerValue`:: | ||
The value that breached the threshold and triggered the alert. | ||
|
||
`context.viewInAppUrl`:: | ||
Link to the alert source. |
121 changes: 121 additions & 0 deletions
docs/en/observability/apm-error-count-threshold-rule.asciidoc
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,121 @@ | ||
[[apm-error-count-threshold-rule]] | ||
= Error count threshold rule | ||
|
||
Alert when the number of errors in a service exceeds a defined threshold. Error count rules can be set at the | ||
environment level, service level, and error group level. | ||
|
||
[discrete] | ||
[[apm-error-count-threshold-rule-filters-conditions]] | ||
== Filters and conditions | ||
|
||
Filter the errors coming from your application to apply an Error count threshold rule to a specific | ||
service (`SERVICE`), environment (`ENVIRONMENT`), or error grouping key (`ERROR GROUPING KEY`). | ||
Alternatively, you can use a {kibana-ref}/kuery-query.html[KQL filter] to limit the scope of the alert | ||
by toggling on the *Use KQL Filter* option. | ||
|
||
[TIP] | ||
==== | ||
Similar errors are grouped together to make it easy to quickly see which errors are affecting your services and to take actions to rectify them. Each group of errors has a unique _error grouping key_ — a hash of the stack trace and other properties. | ||
==== | ||
|
||
Then, you can specify which conditions should result in an alert. This includes specifying: | ||
|
||
* The number of errors that occurred (`IS ABOVE`). | ||
* The timeframe in which the errors must occur (`FOR THE LAST`) in seconds, minutes, hours, or days. | ||
|
||
.Example | ||
**** | ||
This example creates a rule for all production services that would result in an alert when there are 25 errors | ||
in the last five minutes: | ||
image::apm-error-count-rule-filters-conditions.png[width=600] | ||
Alternatively, you can use a KQL filter to limit the scope of the alert: | ||
. Toggle on *Use KQL Filter*. | ||
. Add a filter: | ||
+ | ||
[source,txt] | ||
------ | ||
service.environment:"Production" | ||
------ | ||
**** | ||
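The condition in the example above can be sketched in a few lines of Python. This is only an illustration of the `IS ABOVE` and `FOR THE LAST` logic, not how the rule is evaluated internally. The function name `error_count_exceeds` and the sample timestamps are hypothetical, and the sketch assumes that `IS ABOVE` is a strict comparison, so exactly 25 errors would not trigger an alert.

```python
from datetime import datetime, timedelta


def error_count_exceeds(timestamps, now, window, threshold):
    """Return True when more than `threshold` errors fall in the window ending at `now`."""
    start = now - window
    # Count only errors inside the lookback window (FOR THE LAST).
    count = sum(1 for ts in timestamps if start < ts <= now)
    # Assumption: IS ABOVE is a strict "greater than" comparison.
    return count > threshold


# Hypothetical data: 26 errors, one every 10 seconds, all within the last 5 minutes.
now = datetime(2024, 1, 1, 12, 0, 0)
errors = [now - timedelta(seconds=10 * i) for i in range(26)]

print(error_count_exceeds(errors, now, timedelta(minutes=5), 25))
```

With the sample data, 26 errors fall inside the five-minute window, so the check passes; dropping one error brings the count to the threshold and the check fails.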
|
||
[discrete] | ||
== Groups | ||
|
||
include::../shared/alerting-and-rules/generic-apm-group-by.asciidoc[] | ||
|
||
[discrete] | ||
== Rule schedule | ||
|
||
include::../shared/alerting-and-rules/generic-apm-rule-schedule.asciidoc[] | ||
|
||
[discrete] | ||
== Advanced options | ||
|
||
include::../shared/alerting-and-rules/generic-apm-advanced-options.asciidoc[] | ||
|
||
[discrete] | ||
== Actions | ||
|
||
Extend your rules by connecting them to actions that use built-in integrations. | ||
|
||
[discrete] | ||
=== Action types | ||
|
||
Supported built-in integrations include: | ||
|
||
include::../shared/alerting-and-rules/alerting-connectors.asciidoc[] | ||
|
||
[discrete] | ||
=== Action frequency | ||
|
||
include::../shared/alerting-and-rules/generic-apm-action-frequency.asciidoc[] | ||
|
||
[discrete] | ||
=== Action variables | ||
|
||
A default message is provided as a starting point for your alert. | ||
If you want to customize the message, add more context to the message by clicking the icon above | ||
the message text box and selecting from a list of available variables. | ||
|
||
TIP: To add variables to alert messages, use https://mustache.github.io/[Mustache] template syntax, for example `{{variable.name}}`. | ||
|
||
image::apm-error-count-rule-action-variables.png[width=600] | ||
|
||
The following variables are specific to this rule type. | ||
You can also specify {kibana-ref}/rule-action-variables.html[variables common to all rules]. | ||
|
||
`context.alertDetailsUrl`:: | ||
Link to the alert troubleshooting view for further context and details. This will be an empty string if the server.publicBaseUrl is not configured. | ||
|
||
`context.environment`:: | ||
The environment the alert is created for. | ||
|
||
`context.errorGroupingKey`:: | ||
The error grouping key the alert is created for. | ||
|
||
`context.errorGroupingName`:: | ||
The error grouping name the alert is created for. | ||
|
||
`context.interval`:: | ||
The length and unit of the time period where the alert conditions were met. | ||
|
||
`context.reason`:: | ||
A concise description of the reason for the alert. | ||
|
||
`context.serviceName`:: | ||
The service the alert is created for. | ||
|
||
`context.threshold`:: | ||
Any trigger value above this value will cause the alert to fire. | ||
|
||
`context.transactionName`:: | ||
The transaction name the alert is created for. | ||
|
||
`context.triggerValue`:: | ||
The value that breached the threshold and triggered the alert. | ||
|
||
`context.viewInAppUrl`:: | ||
Link to the alert source. |
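To see how these variables might fit into a custom message, the following Python sketch substitutes `{{context.*}}` tags in a message template. It is a simplified stand-in for Mustache that handles only plain variable tags (no sections, partials, or escaping), and it is not the renderer the alerting framework uses. The `render` helper, the message wording, and all sample values are hypothetical.

```python
import re


def render(template, variables):
    """Replace {{dotted.name}} tags with values from a nested dict."""

    def lookup(match):
        value = variables
        for key in match.group(1).split("."):
            if not isinstance(value, dict) or key not in value:
                return ""  # Like Mustache, a missing variable renders as an empty string.
            value = value[key]
        return str(value)

    return re.sub(r"\{\{\s*([\w.]+)\s*\}\}", lookup, template)


# Hypothetical message built from variables listed above.
template = (
    "{{context.serviceName}} had {{context.triggerValue}} errors "
    "(threshold: {{context.threshold}}) in the last {{context.interval}}."
)
# Hypothetical values; real values are supplied when the alert fires.
values = {
    "context": {
        "serviceName": "checkout",
        "triggerValue": 30,
        "threshold": 25,
        "interval": "5 mins",
    }
}

print(render(template, values))
```

Previewing a message this way can help catch misspelled variable names, which render as empty strings rather than raising an error.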