From 2b98dfdd09a1798066df712bec9fae215e1ee807 Mon Sep 17 00:00:00 2001
From: DeDe Morton
Date: Mon, 29 Jan 2024 11:05:21 -0800
Subject: [PATCH] [DOCS] Update Observability docs to fix problems found during testing (#175636)

## Summary

Adds fixes that have already been made in the serverless docs.

Closes https://github.com/elastic/staging-serverless-observability-docs/issues/181

### Checklist

n/a

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
---
 docs/apm/advanced-queries.asciidoc |  4 ++--
 docs/apm/filters.asciidoc          |  4 ++--
 docs/apm/infrastructure.asciidoc   |  4 ++--
 docs/apm/machine-learning.asciidoc |  8 ++++----
 docs/apm/service-maps.asciidoc     |  9 ++++-----
 docs/apm/services.asciidoc         | 12 ++++++++----
 docs/apm/transactions.asciidoc     | 10 +++++-----
 7 files changed, 27 insertions(+), 24 deletions(-)

diff --git a/docs/apm/advanced-queries.asciidoc b/docs/apm/advanced-queries.asciidoc
index 8aac22c742433..c74615f40a647 100644
--- a/docs/apm/advanced-queries.asciidoc
+++ b/docs/apm/advanced-queries.asciidoc
@@ -57,7 +57,7 @@ and *Discover* supports all of the example APM app queries shown on this page.
 [[discover-queries]]
 ==== Discover queries
 
-One example where you may want to make use of *Discover*,
+One example where you may want to make use of *Discover*
 is to view _all_ transactions for an endpoint instead of just a sample.
 
 TIP: Starting in v7.6, you can view ten samples per bucket in the APM app, instead of just one.
@@ -77,7 +77,7 @@ that took between 13 and 14 milliseconds. Here's what Discover returns:
 image::apm/images/advanced-discover.png[View all transactions in bucket]
 
 You can now explore the data until you find a specific transaction that you're interested in.
-Copy that transaction's `transaction.id`, and paste it into the APM app to view the data in the context of the APM app:
+Copy that transaction's `transaction.id` and paste it into the APM app to view the data in the context of the APM app:
 
 [role="screenshot"]
 image::apm/images/specific-transaction-search.png[View specific transaction in apm app]
diff --git a/docs/apm/filters.asciidoc b/docs/apm/filters.asciidoc
index 8ff39e3c1dcf0..e3d085b771a85 100644
--- a/docs/apm/filters.asciidoc
+++ b/docs/apm/filters.asciidoc
@@ -7,8 +7,8 @@
 ++++
 
 Global filters are ways you can filter data across the APM app based on a specific
-time range or environment. They are available in the Services, Transactions, Errors,
-Metrics, and Traces views, and any filter applied will persist as you move between pages.
+time range or environment. When viewing a specific service, the filter persists
+as you move between tabs.
 
 [role="screenshot"]
 image::apm/images/global-filters.png[Global filters available in the APM app in Kibana]
diff --git a/docs/apm/infrastructure.asciidoc b/docs/apm/infrastructure.asciidoc
index 8ca919ffca6c4..ff6343061ca24 100644
--- a/docs/apm/infrastructure.asciidoc
+++ b/docs/apm/infrastructure.asciidoc
@@ -4,7 +4,7 @@
 
 beta::[]
 
-The *Infrastructure* tab provides information about the containers, pods, and hosts,
+The *Infrastructure* tab provides information about the containers, pods, and hosts
 that the selected service is linked to.
 
 [role="screenshot"]
@@ -12,4 +12,4 @@ image::apm/images/infra.png[Example view of the Infrastructure tab in APM app in
 
 IT ops and software reliability engineers (SREs) can use this tab
 to quickly find a service's underlying infrastructure resources when debugging a problem.
-Knowing what infrastructure is related to a service allows you to remediate issues by restarting, killing hanging instances, changing configuration, rolling back deployments, scaling up, scaling out, etc.
\ No newline at end of file
+Knowing what infrastructure is related to a service allows you to remediate issues by restarting, killing hanging instances, changing configuration, rolling back deployments, scaling up, scaling out, and so on.
\ No newline at end of file
diff --git a/docs/apm/machine-learning.asciidoc b/docs/apm/machine-learning.asciidoc
index 8f82cb2a00f10..f01cdf70b6e05 100644
--- a/docs/apm/machine-learning.asciidoc
+++ b/docs/apm/machine-learning.asciidoc
@@ -10,7 +10,7 @@ The Machine learning integration initiates a new job predefined to calculate ano
 With this integration, you can quickly pinpoint anomalous transactions and see the health of
 any upstream and downstream services.
 
-Machine learning jobs are created per environment, and are based on a service's average response time.
+Machine learning jobs are created per environment and are based on a service's average response time.
 Because jobs are created at the environment level,
 you can add new services to your existing environments without the need for additional machine learning jobs.
 
@@ -40,7 +40,7 @@ To enable machine learning anomaly detection:
 . From the Services overview, Traces overview, or Service Map tab,
 select **Anomaly detection**.
 
-. Click **Create ML Job**.
+. Click **Create Job**.
 
 . Machine learning jobs are created at the environment level.
 Select all of the service environments that you want to enable anomaly detection in.
@@ -50,7 +50,7 @@ Anomalies will surface for all services and transaction types within the selecte
 
 That's it! After a few minutes, the job will begin calculating results;
 it might take additional time for results to appear on your service maps.
-Existing jobs can be managed in *Machine Learning jobs management*.
+To manage existing jobs, click **Manage jobs**.
 
 [float]
 [[warning-ml-integration]]
@@ -66,7 +66,7 @@ image::apm/images/apm-anomaly-alert.png[Example view of anomaly alert in the APM
 [[unkown-ml-integration]]
 === Unknown service health
 
-After enabling anomaly detection, service health may display as "Unknown". There are three reasons why this can occur:
+After enabling anomaly detection, service health may display as "Unknown". Here are some reasons why this can occur:
 
 1. No machine learning job exists. See <> to enable anomaly detection and create a machine learning job.
 2. There is no machine learning data for the job. If you just created the machine learning job you'll need to wait a few minutes for data to be available. Alternatively, if the service or its enviroment are new, you'll need to wait for more trace data.
diff --git a/docs/apm/service-maps.asciidoc b/docs/apm/service-maps.asciidoc
index a0c9dda2188cb..85c8efa4adb5d 100644
--- a/docs/apm/service-maps.asciidoc
+++ b/docs/apm/service-maps.asciidoc
@@ -62,9 +62,8 @@ This can be useful if you have two or more services, in separate environments, b
 Use the environment drop-down to only see the data you're interested in, like `dev` or `production`.
 
 If there's a specific service that interests you, select that service to highlight its connections.
-Clicking **Focus map** will refocus the map on that specific service and lock the connection highlighting.
-From here, select **Service Details**, or click on the **Transaction** tab to jump to the Transaction overview
-for the selected service.
+Click **Focus map** to refocus the map on the selected service and lock the connection highlighting.
+From here, select **Service Details**, or click the **Transactions** tab to jump to the Transaction overview for the selected service.
 You can also use the tabs at the top of the page to easily jump to the **Errors** or **Metrics** overview.
 
 [role="screenshot"]
@@ -74,7 +73,7 @@ image::apm/images/service-maps-java.png[Example view of service maps in the APM
 [[service-map-anomaly-detection]]
 === Anomaly detection with machine learning
 
-Machine learning jobs can be created to calculate anomaly scores on APM transaction durations within the selected service.
+You can create machine learning jobs to calculate anomaly scores on APM transaction durations within the selected service.
 When these jobs are active, service maps will display a color-coded anomaly indicator based on the detected anomaly score:
 
 [horizontal]
@@ -85,7 +84,7 @@ image:apm/images/red-service.png[APM red service]:: Max anomaly score **≥75**.
 [role="screenshot"]
 image::apm/images/apm-service-map-anomaly.png[Example view of anomaly scores on service maps in the APM app]
 
-If an anomaly has been detected, click *view anomalies* to view the anomaly detection metric viewer in the Machine learning app.
+If an anomaly has been detected, click *View anomalies* to view the anomaly detection metric viewer in the Machine learning app.
 This time series analysis will display additional details on the severity and time of the detected anomalies.
 
 To learn how to create a machine learning job, see <>.
diff --git a/docs/apm/services.asciidoc b/docs/apm/services.asciidoc
index 7ce3354ecbf7e..c4deeb7e40322 100644
--- a/docs/apm/services.asciidoc
+++ b/docs/apm/services.asciidoc
@@ -33,9 +33,14 @@ image::apm/images/apm-service-group.png[Example view of service group in the APM
 
 To enable Service groups, open {kib} and navigate to **Stack Management** > **Advanced Settings** > **Observability**, and enable the **Service groups feature**.
 
-To create a service group, navigate to **Observability** > **APM** > **Services** and select **Create group**.
-Specify a name, color, and description.
-Then, using the <>, specify a query to select services for the group.
+To create a service group:
+
+. Navigate to **Observability** > **APM** > **Services**.
+. Switch to **Service groups**.
+. Click **Create group**.
+. Specify a name, color, and description.
+. Click **Select services**.
+. Specify a <> query to select services for the group.
 Services that match the query within the last 24 hours will be assigned to the group.
 
 [NOTE]
@@ -54,4 +59,3 @@ Not sure where to get started? Here are some sample queries you can build from:
 
 * Group services by environment--in this example, "production": `service.environment : "production"`
 * Group services by name--this example groups those that end in "beat": `service.name : *beat` (matches services named "Auditbeat", "Heartbeat", "Filebeat", etc.)
-* Group services with a high transaction duration in the last 24 hours: `transaction.duration.us >= 50000000`
diff --git a/docs/apm/transactions.asciidoc b/docs/apm/transactions.asciidoc
index b0a1c9ee858de..284d28c7b60c0 100644
--- a/docs/apm/transactions.asciidoc
+++ b/docs/apm/transactions.asciidoc
@@ -8,7 +8,7 @@ APM agents automatically collect performance metrics on HTTP requests, database
 [role="screenshot"]
 image::apm/images/apm-transactions-overview.png[Example view of transactions table in the APM app in Kibana]
 
-The *Latency*, *Throughput*, *Failed transaction rate*, *Average duration by span type*, and *Cold start rate*
+The *Latency*, *Throughput*, *Failed transaction rate*, *Time spent by span type*, and *Cold start rate*
 charts display information on all transactions associated with the selected service:
 
 *Latency*::
@@ -38,7 +38,7 @@ These spans will set `event.outcome=failure` and increase the failed transaction
 If there is no HTTP status, both transactions and spans are considered successful unless an error is reported.
 ====
 
-*Average duration by span type*::
+*Time spent by span type*::
 Visualize where your application is spending most of its time.
 For example, is your app spending time in external calls, database processing, or application code execution?
 +
@@ -106,10 +106,10 @@ image::apm/images/apm-transactions-overview.png[Example view of response time di
 [[transaction-duration-distribution]]
 ==== Latency distribution
 
-A plot of all transaction durations for the given time period.
-The screenshot below shows a typical distribution,
+The latency distribution shows a plot of all transaction durations for the given time period.
+The following screenshot shows a typical distribution
 and indicates most of our requests were served quickly -- awesome!
-It's the requests on the right, the ones taking longer than average, that we probably need to focus on.
+The requests on the right are taking longer than average; we probably need to focus on them.
 
 [role="screenshot"]
 image::apm/images/apm-transaction-duration-dist.png[Example view of latency distribution graph]