[[logs-filter-and-aggregate]]
= Filter and aggregate logs

Filter and aggregate your log data to find specific information, gain insight, and monitor your systems more efficiently. You can filter and aggregate based on structured fields like timestamps, log levels, and IP addresses that you've extracted from your log data.

This guide shows you how to:

* <<logs-filter>> — Narrow down your log data by applying specific criteria.
* <<logs-aggregate>> — Analyze and summarize data to find patterns and gain insight.

[discrete]
[[logs-filter-and-aggregate-prereq]]
== Before you get started

The examples on this page use the following ingest pipeline and index template, which you can set in *Dev Tools*. If you haven't used ingest pipelines and index templates to parse your log data and extract structured fields yet, start with the <<logs-parse>> documentation.

Set the ingest pipeline with the following command:

[source,console]
----
PUT _ingest/pipeline/logs-example-default
{
  "description": "Extracts the timestamp, log level, and host IP",
  "processors": [
    {
      "dissect": {
        "field": "message",
        "pattern": "%{@timestamp} %{log.level} %{host.ip} %{message}"
      }
    }
  ]
}
----
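
If you want to check how the pipeline parses a message before indexing any data, you can optionally test it with the simulate pipeline API. For example, the following request (the sample document is only illustrative) returns the structured fields the pipeline would extract:

[source,console]
----
POST _ingest/pipeline/logs-example-default/_simulate
{
  "docs": [
    {
      "_source": {
        "message": "2023-09-15T08:15:20.234Z WARN 192.168.1.101 Disk usage exceeds 90%."
      }
    }
  ]
}
----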

Set the index template with the following command:

[source,console]
----
PUT _index_template/logs-example-default-template
{
  "index_patterns": [ "logs-example-*" ],
  "data_stream": { },
  "priority": 500,
  "template": {
    "settings": {
      "index.default_pipeline": "logs-example-default" <1>
    }
  },
  "composed_of": [
    "logs-mappings",
    "logs-settings",
    "logs@custom",
    "ecs@dynamic_templates"
  ],
  "ignore_missing_component_templates": ["logs@custom"]
}
----
<1> The `index.default_pipeline` setting applies the `logs-example-default` ingest pipeline you set in the previous step to incoming documents.
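
To confirm the template was stored, you can optionally retrieve it:

[source,console]
----
GET _index_template/logs-example-default-template
----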

[discrete]
[[logs-filter]]
== Filter logs

Filter your data using the fields you've extracted so you can focus on log data with specific log levels, timestamp ranges, or host IPs. You can filter your log data in different ways:

- <<logs-filter-logs-explorer>> – Filter and visualize log data in {kib} using Logs Explorer.
- <<logs-filter-qdsl>> – Filter log data from Dev Tools using Query DSL.

[discrete]
[[logs-filter-logs-explorer]]
=== Filter logs in Logs Explorer

Logs Explorer is a {kib} tool that automatically provides views of your log data based on integrations and data streams. You can find Logs Explorer in the Observability menu under *Logs*. Use the {kibana-ref}/kuery-query.html[{kib} Query Language (KQL)] in the search bar to narrow down the log data displayed in Logs Explorer.

For example, you might want to look into an event that occurred within a specific time range (between September 14th and 15th).

Add some logs with varying timestamps and log levels to your data stream:

. In {kib}, go to *Management* -> *Dev Tools*.
. In the *Console* tab, run the following command:

[source,console]
----
POST logs-example-default/_bulk
{ "create": {} }
{ "message": "2023-09-15T08:15:20.234Z WARN 192.168.1.101 Disk usage exceeds 90%." }
{ "create": {} }
{ "message": "2023-09-14T10:30:45.789Z ERROR 192.168.1.102 Critical system failure detected." }
{ "create": {} }
{ "message": "2023-09-10T14:20:45.789Z ERROR 192.168.1.105 Database connection lost." }
{ "create": {} }
{ "message": "2023-09-20T09:40:32.345Z INFO 192.168.1.106 User logout initiated." }
----

From Logs Explorer, add the following KQL query in the search bar to filter for logs within your timestamp range that have a log level of `WARN` or `ERROR`:

[source,text]
----
@timestamp >= "2023-09-14T00:00:00" and @timestamp <= "2023-09-15T23:59:59" and (log.level : "ERROR" or log.level : "WARN")
----

Under the *Documents* tab, you'll see the filtered log data matching your query.

[role="screenshot"]
image::images/logs-kql-filter.png[Filter data by log level using KQL]

Make sure the logs you're looking for fall within the time range in Logs Explorer. If you don't see your logs, update the time range by clicking the image:images/time-filter-icon.png[calendar icon, width=36px].
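
You can filter on any of the structured fields extracted by the ingest pipeline in the same way. As a sketch, a query like the following (using one of the host IPs from the sample data above) would narrow the results to warnings from a single host:

[source,text]
----
host.ip : "192.168.1.101" and log.level : "WARN"
----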

For more on using Logs Explorer, refer to the {kibana-ref}/discover.html[Discover] documentation.

[discrete]
[[logs-filter-qdsl]]
=== Filter logs with Query DSL

{ref}/query-dsl.html[Query DSL] is a JSON-based language used to send requests to {es} and retrieve data from indices and data streams. Filter your log data using Query DSL from *Dev Tools*.

For example, you might want to troubleshoot an issue that happened on a specific date or at a specific time. To do this, use a boolean query with a {ref}/query-dsl-range-query.html[range query] to filter for the specific timestamp range and a {ref}/query-dsl-terms-query.html[terms query] to filter for the `WARN` and `ERROR` log levels.

First, from *Dev Tools*, add some logs with varying timestamps and log levels to your data stream with the following command:

[source,console]
----
POST logs-example-default/_bulk
{ "create": {} }
{ "message": "2023-09-15T08:15:20.234Z WARN 192.168.1.101 Disk usage exceeds 90%." }
{ "create": {} }
{ "message": "2023-09-14T10:30:45.789Z ERROR 192.168.1.102 Critical system failure detected." }
{ "create": {} }
{ "message": "2023-09-10T14:20:45.789Z ERROR 192.168.1.105 Database connection lost." }
{ "create": {} }
{ "message": "2023-09-20T09:40:32.345Z INFO 192.168.1.106 User logout initiated." }
----

Let's say you want to look into an event that occurred between September 14th and 15th. The following boolean query filters for logs with timestamps during those days that also have a log level of `ERROR` or `WARN`.

[source,console]
----
POST /logs-example-default/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "@timestamp": {
              "gte": "2023-09-14T00:00:00",
              "lte": "2023-09-15T23:59:59"
            }
          }
        },
        {
          "terms": {
            "log.level": ["WARN", "ERROR"]
          }
        }
      ]
    }
  }
}
----

The filtered results should show `WARN` and `ERROR` logs that occurred within the timestamp range:

[source,JSON]
----
{
  ...
  "hits": {
    ...
    "hits": [
      {
        "_index": ".ds-logs-example-default-2023.09.25-000001",
        "_id": "JkwPzooBTddK4OtTQToP",
        "_score": 0,
        "_source": {
          "message": "192.168.1.101 Disk usage exceeds 90%.",
          "log": {
            "level": "WARN"
          },
          "@timestamp": "2023-09-15T08:15:20.234Z"
        }
      },
      {
        "_index": ".ds-logs-example-default-2023.09.25-000001",
        "_id": "A5YSzooBMYFrNGNwH75O",
        "_score": 0,
        "_source": {
          "message": "192.168.1.102 Critical system failure detected.",
          "log": {
            "level": "ERROR"
          },
          "@timestamp": "2023-09-14T10:30:45.789Z"
        }
      }
    ]
  }
}
----
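
Query DSL isn't limited to timestamps and log levels. As a sketch, you could isolate the logs from a single host in the sample data by filtering on the extracted `host.ip` field with a {ref}/query-dsl-term-query.html[term query]:

[source,console]
----
POST /logs-example-default/_search
{
  "query": {
    "term": {
      "host.ip": "192.168.1.102"
    }
  }
}
----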

[discrete]
[[logs-aggregate]]
== Aggregate logs

Use aggregation to analyze and summarize your log data to find patterns and gain insight. {ref}/search-aggregations-bucket.html[Bucket aggregations] organize log data into meaningful groups, making it easier to identify patterns, trends, and anomalies within your logs.

For example, you might want to understand error distribution by analyzing the count of logs per log level.

First, from *Dev Tools*, add some logs with varying log levels to your data stream using the following command:

[source,console]
----
POST logs-example-default/_bulk
{ "create": {} }
{ "message": "2023-09-15T08:15:20.234Z WARN 192.168.1.101 Disk usage exceeds 90%." }
{ "create": {} }
{ "message": "2023-09-14T10:30:45.789Z ERROR 192.168.1.102 Critical system failure detected." }
{ "create": {} }
{ "message": "2023-09-15T12:45:55.123Z INFO 192.168.1.103 Application successfully started." }
{ "create": {} }
{ "message": "2023-09-14T15:20:10.789Z WARN 192.168.1.104 Network latency exceeding threshold." }
{ "create": {} }
{ "message": "2023-09-10T14:20:45.789Z ERROR 192.168.1.105 Database connection lost." }
{ "create": {} }
{ "message": "2023-09-20T09:40:32.345Z INFO 192.168.1.106 User logout initiated." }
{ "create": {} }
{ "message": "2023-09-21T15:20:55.678Z DEBUG 192.168.1.102 Database connection established." }
----

Next, run this command to aggregate your log data using the `log.level` field:

[source,console]
----
POST logs-example-default/_search?filter_path=aggregations
{
  "size": 0, <1>
  "aggs": {
    "log_level_distribution": {
      "terms": {
        "field": "log.level"
      }
    }
  }
}
----
<1> Searches with an aggregation return both the query results and the aggregation, so you would see the matching log documents as well as the aggregation results. Setting `size` to `0` limits the results to the aggregations.

The results should show the number of logs in each log level:

[source,JSON]
----
{
  "aggregations": {
    "log_level_distribution": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "ERROR",
          "doc_count": 2
        },
        {
          "key": "INFO",
          "doc_count": 2
        },
        {
          "key": "WARN",
          "doc_count": 2
        },
        {
          "key": "DEBUG",
          "doc_count": 1
        }
      ]
    }
  }
}
----
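
The terms aggregation is just one of the available bucket aggregations. As a sketch of another common pattern, a `date_histogram` aggregation buckets your logs by time instead of by field value, which can help you spot spikes in logging activity (the `logs_over_time` aggregation name below is arbitrary):

[source,console]
----
POST logs-example-default/_search?filter_path=aggregations
{
  "size": 0,
  "aggs": {
    "logs_over_time": {
      "date_histogram": {
        "field": "@timestamp",
        "calendar_interval": "day"
      }
    }
  }
}
----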

You can also combine aggregations and queries. For example, you might want to limit the scope of the previous aggregation by adding a range query:

[source,console]
----
GET /logs-example-default/_search
{
  "size": 0,
  "query": {
    "range": {
      "@timestamp": {
        "gte": "2023-09-14T00:00:00",
        "lte": "2023-09-15T23:59:59"
      }
    }
  },
  "aggs": {
    "my-agg-name": {
      "terms": {
        "field": "log.level"
      }
    }
  }
}
----

The results should show an aggregate of logs that occurred within your timestamp range:

[source,JSON]
----
{
  ...
  "hits": {
    ...
    "hits": []
  },
  "aggregations": {
    "my-agg-name": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "WARN",
          "doc_count": 2
        },
        {
          "key": "ERROR",
          "doc_count": 1
        },
        {
          "key": "INFO",
          "doc_count": 1
        }
      ]
    }
  }
}
----
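
Aggregations can also be nested. As a sketch, adding a `date_histogram` sub-aggregation under the terms aggregation breaks down each log level by day (the aggregation names here are illustrative):

[source,console]
----
GET /logs-example-default/_search
{
  "size": 0,
  "aggs": {
    "log_level_distribution": {
      "terms": {
        "field": "log.level"
      },
      "aggs": {
        "levels_over_time": {
          "date_histogram": {
            "field": "@timestamp",
            "calendar_interval": "day"
          }
        }
      }
    }
  }
}
----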

For more on aggregation types and available aggregations, refer to the {ref}/search-aggregations.html[Aggregations] documentation.