Update and edit logsdb docs for logsdb / synthetic source GA #118303

Merged
merged 27 commits into from
Dec 11, 2024
Changes from 14 commits
Commits
5f5db01
Update licensing; fix screenshots; edit generally
marciw Dec 9, 2024
b908878
Small edit for clarity and style
marciw Dec 9, 2024
4d8736e
Merge branch 'main' into mw-logsdb
marciw Dec 9, 2024
4a01131
Update docs/reference/index-modules.asciidoc
marciw Dec 10, 2024
ce7a558
Apply changes from review
marciw Dec 10, 2024
737ea70
Address review comments
marciw Dec 10, 2024
b2a34e2
Merge branch 'main' into mw-logsdb
marciw Dec 10, 2024
c5cbf91
Match similar change from review
marciw Dec 10, 2024
d0cf775
Merge branch 'mw-logsdb' of github.com:marciw/elasticsearch into mw-l…
marciw Dec 10, 2024
f01ce45
More changes from review
marciw Dec 10, 2024
0ddf0e3
Apply suggestions from review
marciw Dec 10, 2024
0765a05
Apply suggestions from review
marciw Dec 10, 2024
c200598
Update docs/reference/data-streams/logs.asciidoc
marciw Dec 10, 2024
718aaf8
Apply suggestions from review
marciw Dec 10, 2024
80f9dfa
Apply suggestions from review
marciw Dec 10, 2024
adc0941
Merge branch 'mw-logsdb' of github.com:marciw/elasticsearch into mw-l…
marciw Dec 10, 2024
468c776
Merge branch 'main' into mw-logsdb
marciw Dec 10, 2024
d9dc4da
Change to general subscription note
marciw Dec 10, 2024
773cb92
Merge branch 'main' into mw-logsdb
marciw Dec 10, 2024
140c3f8
Apply suggestions from review
marciw Dec 10, 2024
7c3fb05
Apply suggestions from review
marciw Dec 10, 2024
e9bfa0e
Apply suggestions from review; additional edits
marciw Dec 10, 2024
4ebd4fa
Merge branch 'main' into mw-logsdb
marciw Dec 10, 2024
c9419d1
Apply suggestions from review; clarity tweaks
marciw Dec 11, 2024
61da7ab
Merge branch 'main' into mw-logsdb
marciw Dec 11, 2024
4620a37
Restore previous paragraph structure and context
marciw Dec 11, 2024
fccda9c
Merge branch 'main' into mw-logsdb
marciw Dec 11, 2024
31 changes: 14 additions & 17 deletions docs/reference/data-streams/logs.asciidoc
@@ -1,8 +1,8 @@
[[logs-data-stream]]
== Logs data stream

-IMPORTANT: The {es} `logsdb` index mode is generally available for Elastic Cloud Hosted
-and Self-Managed customers as of version 8.17 and is enabled by default for
+IMPORTANT: The {es} `logsdb` index mode is generally available in Elastic Cloud Hosted
+and self-managed Elasticsearch as of version 8.17, and is enabled by default for
logs in https://www.elastic.co/elasticsearch/serverless[{serverless-full}].

A logs data stream is a data stream type that stores log data more efficiently.
@@ -32,12 +32,12 @@ PUT _index_template/my-index-template
----
// TEST

-<1> Index mode setting
-<2> Index template priority: By default, Elasticsearch ships with a `logs-*-*` index template with a priority of 100. To make sure your index template takes priority over the default `logs-*-*` template, set its `priority` to a number higher than 100. For more information, see <<avoid-index-pattern-collisions,Avoid index pattern collisions>>.
+<1> The index mode setting.
+<2> The index template priority. By default, Elasticsearch ships with a `logs-*-*` index template with a priority of 100. To make sure your index template takes priority over the default `logs-*-*` template, set its `priority` to a number higher than 100. For more information, see <<avoid-index-pattern-collisions,Avoid index pattern collisions>>.

After the index template is created, new indices that use the template will be configured as a logs data stream. You can start indexing data and <<use-a-data-stream,using the data stream>>.

-You can also set the index mode and adjust other template settings in <<index-mgmt,Stack Management in {kib}>>.
+You can also set the index mode and adjust other template settings in <<index-mgmt,the Elastic UI>>.

////
[source,console]
@@ -53,16 +53,17 @@ DELETE _index_template/my-index-template
[[logsdb-synthetic-source]]
=== Synthetic source

-By default, `logsdb` mode uses <<synthetic-source,synthetic `_source`>>, which omits storing the original `_source`
+If you have the required https://www.elastic.co/subscriptions[subscription], `logsdb` index mode uses <<synthetic-source,synthetic `_source`>>, which omits storing the original `_source`
field. Instead, the document source is synthesized from doc values or stored fields upon document retrieval.

-Before using synthetic source, make sure to review the <<synthetic-source,restriction and modifications>>. To prevent modifications of a particular object or field, you can <<synthetic-source-keep,minimize synthetic source modifications>>.
+Before using synthetic source, make sure to review the <<synthetic-source,restrictions>>.

When working with multi-value fields, the `index.mapping.synthetic_source_keep` setting controls how field values
are preserved for <<synthetic-source,synthetic source>> reconstruction. In `logsdb`, the default value is `arrays`,
which retains both duplicate values and the order of entries. However, the exact structure of
array elements and objects is not necessarily retained. Preserving duplicates and ordering can be critical for some log fields, such as DNS A records, HTTP headers, and log entries that represent sequential or repeated events.

+If you don't have the required https://www.elastic.co/subscriptions[subscription], `logsdb` mode uses the original `_source` field.
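
As a rough illustration of the `index.mapping.synthetic_source_keep` setting discussed in this hunk, the following sketch sets it explicitly in an index template. The template name, index pattern, and priority are hypothetical and do not come from this PR, and the value shown simply restates the `logsdb` default described above. Check the synthetic `_source` reference for the values your version accepts.

```console
PUT _index_template/my-logs-keep-template
{
  "index_patterns": ["logs-myapp-*"],
  "data_stream": {},
  "priority": 500,
  "template": {
    "settings": {
      "index.mode": "logsdb",
      "index.mapping.synthetic_source_keep": "arrays"
    }
  }
}
```

Stating the default explicitly changes nothing at index time; it mainly documents that array order and duplicates are expected to be preserved for the fields in this template.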

[discrete]
[[logsdb-sort-settings]]
@@ -83,23 +84,19 @@ The `min` mode sorts indices by the minimum value of multi-value fields.
Missing values are sorted to appear `_first`.

You can override these default sort settings. For example, to sort on different fields
-and change the order, modify `index.sort.field` and `index.sort.order`.
+and change the order, manually configure `index.sort.field` and `index.sort.order`. For more details, see
+<<index-modules-index-sorting>>.

When using the default sort settings, the `host.name` field is automatically injected into the index mappings as a `keyword` field to ensure that sorting can be applied. This guarantees that logs are efficiently sorted and
retrieved based on the `host.name` and `@timestamp` fields.

-NOTE: If `subobjects` is set to `true` (default), the `host.name` field is mapped as an object field
+NOTE: If `subobjects` is set to `true` (default), the `host` field is mapped as an object field
named `host` with a `name` child field of type `keyword`. If `subobjects` is set to `false`,
a single `host.name` field is mapped as a `keyword` field.

-After an index is created, the sort settings cannot be modified. To use different sort settings,
-create a new index. For data streams, you can update the relevant component templates and then
+To apply different sort settings, update the data stream's component templates, and then
perform or wait for an index <<data-streams-rollover,rollover>>.

-Keep in mind that sort
-settings can influence indexing throughput and query latency, and may affect compression efficiency. For more details, see
-<<index-modules-index-sorting>>.

NOTE: For <<data-streams, data streams>>, the `@timestamp` field is automatically injected if it's not already present. If you apply custom sort settings, the `@timestamp` field is injected into the mappings but is not
automatically added to the list of sort fields.
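
A minimal sketch of overriding the default sort settings described in this hunk. The template name, index pattern, priority, `service.name` field, and data stream name are all hypothetical and do not come from this PR. Because custom sort settings do not add `@timestamp` to the sort fields automatically, the sketch lists it explicitly.

```console
PUT _index_template/my-logs-custom-sort
{
  "index_patterns": ["logs-myapp-*"],
  "data_stream": {},
  "priority": 500,
  "template": {
    "settings": {
      "index.mode": "logsdb",
      "index.sort.field": ["service.name", "@timestamp"],
      "index.sort.order": ["asc", "desc"]
    },
    "mappings": {
      "properties": {
        "service.name": { "type": "keyword" }
      }
    }
  }
}

POST logs-myapp-default/_rollover
```

The rollover request is only needed if the data stream already exists. Backing indices created before the rollover keep their original sort settings.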

@@ -141,7 +138,7 @@ segment-level keyword dictionary. This compression is used when multiple consecu
[[logsdb-ignored-settings]]
=== `ignore` settings

-`logsdb` index mode uses `ignore` settings to optimize performance. You can override these settings as needed.
+The `logsdb` index mode uses the following `ignore` settings. You can override these settings as needed.

[discrete]
[[logsdb-ignore-malformed]]
@@ -187,7 +184,7 @@ reconstructing the original value.

[discrete]
[[logsdb-settings-summary]]
-=== Full settings reference
+=== Settings reference

The `logsdb` index mode uses the following settings:

4 changes: 2 additions & 2 deletions docs/reference/index-modules.asciidoc
@@ -113,9 +113,9 @@ Index mode supports the following values:

`standard`::: Standard indexing with default settings.

-<<tsds,`time_series`>>::: Index mode optimized for storage of metrics documented in <<tsds-index-settings,TSDS Settings>>.
+`tsds`::: Index mode optimized for storage of metrics. For more information, see <<tsds-index-settings>>.

-<<logs-data-stream,`logsdb`>>::: Index mode optimized for storage of logs. By default, this setting uses <<synthetic-source,synthetic `_source`>> and sorts on the `hostname` and `timestamp` fields. You can override the defaults to <<index-modules-index-sorting,sort>> on different fields.
+`logsdb`::: Index mode optimized for <<logs-data-stream,logs>>.

[[routing-partition-size]] `index.routing_partition_size`::

3 changes: 1 addition & 2 deletions docs/reference/indices/put-index-template.asciidoc
@@ -44,7 +44,6 @@ DELETE _index_template/template_*
* If the {es} {security-features} are enabled, you must have the
`manage_index_templates` or `manage` <<privileges-list-cluster,cluster
privilege>> to use this API.
-* The `logsDB` index mode requires an https://www.elastic.co/pricing[Enterprise subscription].

[[put-index-template-api-desc]]
==== {api-description-title}
@@ -117,7 +116,7 @@ See <<create-index-template,create an index template>>.
`index_mode`::
(Optional, string) Type of data stream to create. Valid values are `null`
(standard data stream), `time_series` (<<tsds,time series data stream>>) and `logsdb`
-(<<logs-data-stream,logs data stream>>, which requires an https://www.elastic.co/pricing[Enterprise subscription]).
+(<<logs-data-stream,logs data stream>>).
+
The template's `index_mode` sets the `index.mode` of the backing index.
=====
6 changes: 2 additions & 4 deletions docs/reference/mapping/fields/synthetic-source.asciidoc
@@ -1,12 +1,10 @@
[[synthetic-source]]
==== Synthetic `_source`

-IMPORTANT: This feature requires an https://www.elastic.co/subscriptions[Enterprise subscription].

Though very handy to have around, the source field takes up a significant amount
of space on disk. Instead of storing source documents on disk exactly as you
send them, Elasticsearch can reconstruct source content on the fly upon retrieval.
-Enable this by using the value `synthetic` for the index setting `index.mapping.source.mode`:
+To enable this https://www.elastic.co/subscriptions[subscription] feature, use the value `synthetic` for the index setting `index.mapping.source.mode`:

[source,console,id=enable-synthetic-source-example]
----
@@ -25,7 +23,7 @@ PUT idx
----
// TESTSETUP

-While this on the fly reconstruction is *generally* slower than saving the source
+While this on-the-fly reconstruction is _generally_ slower than saving the source
documents verbatim and loading them at query time, it saves a lot of storage
space. Additional latency can be avoided by not loading `_source` field in queries when it is not needed.
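
One way to skip loading `_source` when it is not needed, sketched against the `idx` index from the example in this file. The `@timestamp` field is an assumption, since the index mappings are not visible in this diff view.

```console
GET idx/_search
{
  "_source": false,
  "fields": ["@timestamp"],
  "query": {
    "match_all": {}
  }
}
```

With `_source` disabled in the request, hits carry only the values requested through `fields`, so the synthetic source does not have to be reconstructed for this search.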
