Skip to content

Commit

Permalink
Create Filter and Aggregate log data section (#3246)
Browse files Browse the repository at this point in the history
  • Loading branch information
mdbirnstiehl authored Oct 10, 2023
1 parent da83710 commit af0a069
Show file tree
Hide file tree
Showing 8 changed files with 1,239 additions and 922 deletions.
Binary file added docs/en/observability/images/logs-kql-filter.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/en/observability/images/time-filter-icon.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
4 changes: 4 additions & 0 deletions docs/en/observability/index.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,10 @@ include::logs-checklist.asciidoc[leveloffset=+2]

include::logs-stream.asciidoc[leveloffset=+2]

include::logs-parse.asciidoc[leveloffset=+2]

include::logs-filter.asciidoc[leveloffset=+2]

include::monitor-logs.asciidoc[leveloffset=+2]

include::tail-logs.asciidoc[leveloffset=+3]
Expand Down
336 changes: 336 additions & 0 deletions docs/en/observability/logs-filter.asciidoc
Original file line number Diff line number Diff line change
@@ -0,0 +1,336 @@
[[logs-filter-and-aggregate]]
= Filter and aggregate logs

Filter and aggregate your log data to find specific information, gain insight, and monitor your systems more efficiently. You can filter and aggregate based on structured fields like timestamps, log levels, and IP addresses that you've extracted from your log data.

This guide shows you how to:

* <<logs-filter>> — Narrow down your log data by applying specific criteria.
* <<logs-aggregate>> — Analyze and summarize data to find patterns and gain insight.

[discrete]
[[logs-filter-and-aggregate-prereq]]
== Before you get started

The examples on this page use the following ingest pipeline and index template, which you can set in *Dev Tools*. If you haven't used ingest pipelines and index templates to parse your log data and extract structured fields yet, start with the <<logs-parse>> documentation.

Set the ingest pipeline with the following command:

[source,console]
----
PUT _ingest/pipeline/logs-example-default
{
"description": "Extracts the timestamp log level and host ip",
"processors": [
{
"dissect": {
"field": "message",
"pattern": "%{@timestamp} %{log.level} %{host.ip} %{message}"
}
}
]
}
----

Set the index template with the following command:

[source,console]
----
PUT _index_template/logs-example-default-template
{
"index_patterns": [ "logs-example-*" ],
"data_stream": { },
"priority": 500,
"template": {
"settings": {
"index.default_pipeline":"logs-example-default"<4>
}
},
"composed_of": [
"logs-mappings",
"logs-settings",
"logs@custom",
"ecs@dynamic_templates"
],
"ignore_missing_component_templates": ["logs@custom"],
}
----

[discrete]
[[logs-filter]]
== Filter logs

Filter your data using the fields you've extracted so you can focus on log data with specific log levels, timestamp ranges, or host IPs. You can filter your log data in different ways:

- <<logs-filter-logs-explorer>> – Filter and visualize log data in {kib} using Logs Explorer.
- <<logs-filter-qdsl>> – Filter log data from Dev Tools using Query DSL.

[discrete]
[[logs-filter-logs-explorer]]
=== Filter logs in Logs Explorer

Logs Explorer is a {kib} tool that automatically provides views of your log data based on integrations and data streams. You can find Logs Explorer in the Observability menu under *Logs*. Use the {kibana-ref}/kuery-query.html[{kib} Query Language (KQL)] in the search bar to narrow down the log data displayed in Logs Explorer.

For example, you might want to look into an event that occurred within a specific time range (between September 14th and 15th).

Add some logs with varying timestamps and log levels to your data stream:

. In {kib}, go to *Management* -> *Dev Tools*.
. In the *Console* tab, run the following command:

[source,console]
----
POST logs-example-default/_bulk
{ "create": {} }
{ "message": "2023-09-15T08:15:20.234Z WARN 192.168.1.101 Disk usage exceeds 90%." }
{ "create": {} }
{ "message": "2023-09-14T10:30:45.789Z ERROR 192.168.1.102 Critical system failure detected." }
{ "create": {} }
{ "message": "2023-09-10T14:20:45.789Z ERROR 192.168.1.105 Database connection lost." }
{ "create": {} }
{ "message": "2023-09-20T09:40:32.345Z INFO 192.168.1.106 User logout initiated." }
----

From Logs Explorer, add the following KQL query in the search bar to filter for logs with within your timestamp range with log levels of `WARN` and `ERROR`:

[source,text]
----
@timestamp >= "2023-09-14T00:00:00" and @timestamp <= "2023-09-15T23:59:59" and log.level : "ERROR" or log.level : "WARN"
----

Under the *Documents* tab, you'll see the filtered log data matching your query.

[role="screenshot"]
image::images/logs-kql-filter.png[Filter data by log level using KQL]

Make sure the logs you're looking for fall within the time range in Logs Explorer. If you don't see your logs, update the time range by clicking the image:images/time-filter-icon.png[calendar icon, width=36px].

For more on using Logs Explorer, refer to the {kibana-ref}/discover.html[Discover] documentation.

[discrete]
[[logs-filter-qdsl]]
=== Filter logs with Query DSL

{ref}/query-dsl.html[Query DSL] is a JSON-based language used to send requests to {es} and retrieve data from indices and data streams. Filter your log data using Query DSL from *Dev Tools*.

For example, you might want to troubleshoot an issue that happened on a specific date or at a specific time. To do this, use a boolean query with a {ref}/query-dsl-range-query.html[range query] to filter for the specific timestamp range and a {ref}/query-dsl-term-query.html[term query] to filter for `WARN` and `ERROR` log levels.

First, from *Dev Tools*, add some logs with varying timestamps and log levels to your data stream with the following command:

[source,console]
----
POST logs-example-default/_bulk
{ "create": {} }
{ "message": "2023-09-15T08:15:20.234Z WARN 192.168.1.101 Disk usage exceeds 90%." }
{ "create": {} }
{ "message": "2023-09-14T10:30:45.789Z ERROR 192.168.1.102 Critical system failure detected." }
{ "create": {} }
{ "message": "2023-09-10T14:20:45.789Z ERROR 192.168.1.105 Database connection lost." }
{ "create": {} }
{ "message": "2023-09-20T09:40:32.345Z INFO 192.168.1.106 User logout initiated." }
----

Let's say you want to look into an event that occurred between September 14th and 15th. The following boolean query filters for logs with timestamps during those days that also have a log level of `ERROR` or `WARN`.

[source,console]
----
POST /logs-example-default/_search
{
"query": {
"bool": {
"filter": [
{
"range": {
"@timestamp": {
"gte": "2023-09-14T00:00:00",
"lte": "2023-09-15T23:59:59"
}
}
},
{
"terms": {
"log.level": ["WARN", "ERROR"]
}
}
]
}
}
}
----

The filtered results should show `WARN` and `ERROR` logs that occurred within the timestamp range:

[source,JSON]
----
{
...
"hits": {
...
"hits": [
{
"_index": ".ds-logs-example-default-2023.09.25-000001",
"_id": "JkwPzooBTddK4OtTQToP",
"_score": 0,
"_source": {
"message": "192.168.1.101 Disk usage exceeds 90%.",
"log": {
"level": "WARN"
},
"@timestamp": "2023-09-15T08:15:20.234Z"
}
},
{
"_index": ".ds-logs-example-default-2023.09.25-000001",
"_id": "A5YSzooBMYFrNGNwH75O",
"_score": 0,
"_source": {
"message": "192.168.1.102 Critical system failure detected.",
"log": {
"level": "ERROR"
},
"@timestamp": "2023-09-14T10:30:45.789Z"
}
}
]
}
}
----

[discrete]
[[logs-aggregate]]
== Aggregate logs
Use aggregation to analyze and summarize your log data to find patterns and gain insight. {ref}/search-aggregations-bucket.html[Bucket aggregations] organize log data into meaningful groups making it easier to identify patterns, trends, and anomalies within your logs.

For example, you might want to understand error distribution by analyzing the count of logs per log level.

First, from *Dev Tools*, add some logs with varying log levels to your data stream using the following command:

[source,console]
----
POST logs-example-default/_bulk
{ "create": {} }
{ "message": "2023-09-15T08:15:20.234Z WARN 192.168.1.101 Disk usage exceeds 90%." }
{ "create": {} }
{ "message": "2023-09-14T10:30:45.789Z ERROR 192.168.1.102 Critical system failure detected." }
{ "create": {} }
{ "message": "2023-09-15T12:45:55.123Z INFO 192.168.1.103 Application successfully started." }
{ "create": {} }
{ "message": "2023-09-14T15:20:10.789Z WARN 192.168.1.104 Network latency exceeding threshold." }
{ "create": {} }
{ "message": "2023-09-10T14:20:45.789Z ERROR 192.168.1.105 Database connection lost." }
{ "create": {} }
{ "message": "2023-09-20T09:40:32.345Z INFO 192.168.1.106 User logout initiated." }
{ "create": {} }
{ "message": "2023-09-21T15:20:55.678Z DEBUG 192.168.1.102 Database connection established." }
----

Next, run this command to aggregate your log data using the `log.level` field:

[source,console]
----
POST logs-example-default/_search?size=0&filter_path=aggregations
{
"size": 0,<1>
"aggs": {
"log_level_distribution": {
"terms": {
"field": "log.level"
}
}
}
}
----
<1> Searches with an aggregation return both the query results and the aggregation, so you would see the logs matching the data and the aggregation. Setting `size` to `0` limits the results to aggregations.

The results should show the number of logs in each log level:

[source,JSON]
----
{
"aggregations": {
"error_distribution": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "ERROR",
"doc_count": 2
},
{
"key": "INFO",
"doc_count": 2
},
{
"key": "WARN",
"doc_count": 2
},
{
"key": "DEBUG",
"doc_count": 1
}
]
}
}
}
----

You can also combine aggregations and queries. For example, you might want to limit the scope of the previous aggregation by adding a range query:

[source,console]
----
GET /logs-example-default/_search
{
"size": 0,
"query": {
"range": {
"@timestamp": {
"gte": "2023-09-14T00:00:00",
"lte": "2023-09-15T23:59:59"
}
}
},
"aggs": {
"my-agg-name": {
"terms": {
"field": "log.level"
}
}
}
}
----

The results should show an aggregate of logs that occurred within your timestamp range:

[source,JSON]
----
{
...
"hits": {
...
"hits": []
},
"aggregations": {
"my-agg-name": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "WARN",
"doc_count": 2
},
{
"key": "ERROR",
"doc_count": 1
},
{
"key": "INFO",
"doc_count": 1
}
]
}
}
}
----

For more on aggregation types and available aggregations, refer to the {ref}/search-aggregations.html[Aggregations] documentation.

11 changes: 7 additions & 4 deletions docs/en/observability/logs-overview.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -5,8 +5,11 @@
<titleabbrev>Logs</titleabbrev>
++++

Elastic Observability allows you to deploy and manage logs at a petabyte scale, giving you insights into your logs in minutes. You can also search across your logs in one place, troubleshoot in real-time, and detect patterns and outliers with categorization and anomaly detection. For more information, see the following links:
Elastic Observability allows you to deploy and manage logs at a petabyte scale, giving you insights into your logs in minutes. You can also search across your logs in one place, troubleshoot in real-time, and detect patterns and outliers with categorization and anomaly detection. For more information, refer to the following links:

- The <<logs-checklist>> is an overview on sending log data, configuring logs, and analyzing logs.
- <<logs-stream>> walks you through sending a log file to Elasticsearch using the standalone {agent} and querying it in {kib}.
- <<monitor-logs>> has information on visualizing and analyzing logs.
- <<logs-checklist, Logs resource guide>> – See an overview on sending log data, configuring logs, and analyzing logs.
- <<logs-stream>> – Send log files to {es} using a standalone {agent}.
- <<logs-parse>> – Parse your log data and extract structured fields then use those structured fields to filter and analyze your logs.
- <<logs-filter>> – Filter and aggregate your log data to find specific information, gain insight, and monitor your systems more efficiently.
- <<monitor-logs>> – Find information on visualizing and analyzing logs in {kib}.
- <<logs-troubleshooting>> – Find solutions for errors you might encounter while onboarding your logs.
Loading

0 comments on commit af0a069

Please sign in to comment.