Commit
Incorporate review feedback
karenzone committed Dec 6, 2024
1 parent b7ff238 commit fa049fc
Showing 5 changed files with 98 additions and 86 deletions.
1 change: 1 addition & 0 deletions docs/en/ingest-guide/index.asciidoc
@@ -8,6 +8,7 @@ include::{docs-root}/shared/attributes.asciidoc[]

include::ingest-intro.asciidoc[]
include::ingest-tools.asciidoc[]
include::ingest-additional-proc.asciidoc[]
//include::ingest-static.asciidoc[]
//include::ingest-timestamped.asciidoc[]
include::ingest-solutions.asciidoc[]
27 changes: 27 additions & 0 deletions docs/en/ingest-guide/ingest-additional-proc.asciidoc
@@ -0,0 +1,27 @@
[[ingest-addl-proc]]
== Additional ingest processing

You can start with {agent} and Elastic {integrations-docs}[integrations], and still
take advantage of additional processing options if you need them.

{agent} processors::
You can use link:{fleet-guide}/elastic-agent-processor-configuration.html[{agent} processors] to sanitize or enrich raw data at the source.
Use {agent} processors if you need to control what data is sent across the wire, or if you need to enrich the raw data with information available on the host.
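As a rough sketch, an {agent} processors section might look like the following YAML fragment. The processor names follow the Beats-style processor syntax that {agent} supports; the target and field names here are made up for illustration:

```yaml
# Illustrative processors block for an Agent input (field names are examples)
processors:
  - add_fields:            # enrich events with host-local context
      target: cloud
      fields:
        env: production
  - drop_fields:           # control what is sent across the wire
      fields: ["host.mac", "host.ip"]
      ignore_missing: true
```

See the {agent} processor configuration docs for the full list of supported processors and options.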

{es} ingest pipelines::
You can use {es} link:{ref}/[ingest pipelines] to enrich incoming data or normalize field data before the data is indexed.
{es} ingest pipelines enable you to manipulate the data as it comes in.
This approach helps you avoid adding processing overhead to the hosts from which you're collecting data.
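As a minimal sketch, an ingest pipeline is defined by a JSON body sent to the `_ingest/pipeline` API. The pipeline id, field names, and processors below are illustrative, not from this guide:

```python
# Illustrative ingest pipeline body: rename a raw field and add a tag.
# Field names and the pipeline id are made up for this sketch.
import json

pipeline = {
    "description": "Normalize incoming web logs (illustrative)",
    "processors": [
        {"rename": {"field": "msg", "target_field": "message",
                    "ignore_missing": True}},
        {"set": {"field": "event.dataset", "value": "web.access"}},
    ],
}

# With a running cluster, you could register it with the official Python
# client roughly like this (assumed usage; check the client docs):
#   from elasticsearch import Elasticsearch
#   Elasticsearch("http://localhost:9200").ingest.put_pipeline(
#       id="weblogs-normalize", **pipeline)

print(json.dumps(pipeline, indent=2))
```

Because the pipeline runs inside {es}, the rename and set steps above add no load to the collecting hosts.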

{es} runtime fields::
You can use {es} link:{ref}/runtime.html[runtime fields] to define or alter the schema at query time.
You can start working with your data without needing to understand how it is
structured, add fields to existing documents without reindexing your data,
override the value returned from an indexed field, or define fields for a
specific use without modifying the underlying schema.
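For example, a search request can define a runtime field in its `runtime_mappings` section and query it as if it were indexed. The field names and Painless script below are illustrative assumptions:

```python
# Illustrative search body that defines a runtime field at query time.
# "duration_s" and "duration_ms" are made-up field names for this sketch.
import json

search_body = {
    "runtime_mappings": {
        "duration_ms": {
            "type": "double",
            # Painless script: derive milliseconds from an indexed seconds field
            "script": {"source": "emit(doc['duration_s'].value * 1000)"},
        }
    },
    # Runtime fields are queryable even though they are never indexed
    "query": {"range": {"duration_ms": {"gte": 250}}},
}

print(json.dumps(search_body, indent=2))
```

Nothing is reindexed here: the `duration_ms` field exists only for the duration of the query.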

{ls} `elastic_integration` filter::
You can use the {ls} link:{logstash-ref}/[`elastic_integration`] filter and
other link:{logstash-ref}/filter-plugins.html[{ls} filters] to
link:{logstash-ref}/ea-integrations.html[extend Elastic integrations] by
transforming data before it goes to {es}.
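As a rough sketch, a {ls} pipeline using this filter might look like the following. The connection settings are placeholders, and the exact option names are assumptions; see the plugin documentation for the real configuration:

```text
# Illustrative Logstash pipeline: receive Agent data, apply integration
# processing, then forward to Elasticsearch. All values are placeholders.
input {
  elastic_agent {
    port => 5044
  }
}
filter {
  elastic_integration {
    cloud_id => "<cloud id>"    # placeholder
    api_key  => "<api key>"     # placeholder
  }
}
output {
  elasticsearch {
    hosts => ["https://localhost:9200"]
  }
}
```

The filter applies the integration's ingest processing inside {ls}, so you can add further filters before the data reaches {es}.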
25 changes: 16 additions & 9 deletions docs/en/ingest-guide/ingest-intro.asciidoc
@@ -14,14 +14,14 @@ reshape your data before it goes to {es}.
You can ingest:

* **General content** (data without timestamps), such as HTML pages, catalogs, and files
* **Timestamped (time series) data**, such as logs, metrics, and traces for Search, Security, Observability, or your own solution
* **Timestamped (time series) data**, such as logs, metrics, and traces for Elastic Security, Observability, Search solutions, or for your own custom solutions

[[ingest-best-approach]]
.What's the best approach for ingesting data?
****
The best choice for ingesting data is the _simplest option_ that _meets your needs_ and _satisfies your use case_.
[discrete]
[[ingest-general]]
=== Ingesting general content

**Best practice for general content**. Choose the ingest tool that aligns with your data source.
Elastic offers tools designed to ingest specific types of general content.
The content type determines the best ingest option.

* To index **documents** directly into {es}, use the {es} link:{ref}/docs.html[document APIs].
* To send **application data** directly to {es}, use an link:https://www.elastic.co/guide/en/elasticsearch/client/index.html[{es}
@@ -32,10 +32,18 @@ language client].

If you would like to try things out before you add your own data, try using our {kibana-ref}/connect-to-elasticsearch.html#_add_sample_data[sample data].
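To make the document-API option concrete, here is a minimal sketch. The index name, document id, and fields are made up, and the client call is shown as an assumed usage of the official Python language client:

```python
# Illustrative document for the Elasticsearch document APIs.
# Index name, id, and fields are invented for this sketch.
doc = {
    "title": "Product catalog entry",
    "price": 19.99,
    "in_stock": True,
}

# Assumed usage of the official Python client (needs a running cluster,
# so it is commented out here):
#   from elasticsearch import Elasticsearch
#   es = Elasticsearch("http://localhost:9200")
#   es.index(index="catalog", id="sku-123", document=doc)

print(doc["title"])
```

The same document shape works with any of the language clients; only the call syntax differs.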

**Best practice for timestamped data**. Start with Elastic Agent and an Elastic integration.
[discrete]
[[ingest-timestamped]]
=== Ingesting timestamped data



[[ingest-best-timestamped]]
.What's the best approach for ingesting timestamped data?
****
The best approach for ingesting data is the _simplest option_ that _meets your needs_ and _satisfies your use case_.
Usually, the _simplest option_ for ingesting timestamped data is {agent} paired with an Elastic integration.
In most cases, the _simplest option_ for ingesting timestamped data is using {agent} paired with an Elastic integration.
* Install {fleet-guide}[Elastic Agent] on the computer(s) from which you want to collect data.
* Add the {integrations-docs}[Elastic integration] for the data source to your deployment.
@@ -49,5 +57,4 @@ to search for available integrations.
If you don't find an integration for your data source or if you need
additional processing to extend the integration, we still have you covered.
Check out <<ingest-addl-proc,additional processing>> for a sneak peek.
****
23 changes: 11 additions & 12 deletions docs/en/ingest-guide/ingest-solutions.asciidoc
@@ -15,7 +15,7 @@ To use {fleet-guide}[Elastic Agent] and {integrations-docs}[Elastic integrations
with Elastic solutions:
1. Create an link:https://www.elastic.co/cloud[{ecloud}] deployment for your solution.
If you don't have a {ecloud} account, you can sign up for a link:https://cloud.elastic.co/registration[free trial] get started.
If you don't have an {ecloud} account, you can sign up for a link:https://cloud.elastic.co/registration[free trial] to get started.
2. Add the {integrations-docs}[Elastic integration] for your data source to the deployment.
3. link:{fleet-guide}/elastic-agent-installation.html[Install {agent}] on the systems whose data you want to collect.
****
@@ -51,12 +51,8 @@ monitor and gain insights into logs, metrics, and application traces.
The resources and guides in this section illustrate how to ingest data and use
it with the Observability solution.

**Resources**

* link:{fleet-guide}/elastic-agent-installation.html[Install {agent}]
* link:https://www.elastic.co/integrations/data-integrations?solution=observability[Elastic Observability integrations]

**Guides for popular Observability use case**
**Guides for popular Observability use cases**

* link:{estc-welcome}/getting-started-observability.html[Monitor applications and systems with Elastic Observability]
* link:https://www.elastic.co/guide/en/observability/current/logs-metrics-get-started.html[Get started with logs and metrics]
@@ -67,6 +63,10 @@ it with the Observability solution.
** link:{serverless-docs}/observability/quickstarts/monitor-hosts-with-elastic-agent[Monitor hosts with {agent} ({serverless-short})]
** link:{serverless-docs}/observability/quickstarts/k8s-logs-metrics[Monitor your K8s cluster with {agent} ({serverless-short})]

**Resources**

* link:{fleet-guide}/elastic-agent-installation.html[Install {agent}]
* link:https://www.elastic.co/integrations/data-integrations?solution=observability[Elastic Observability integrations]

[discrete]
[[ingest-for-security]]
@@ -77,17 +77,17 @@ link:https://www.elastic.co/security[Elastic Security] to analyze and take
action on your data.
The resources and guides in this section illustrate how to ingest data and use it with the Security solution.

**Guides for popular Security use cases**

* link:https://www.elastic.co/guide/en/starting-with-the-elasticsearch-platform-and-its-solutions/current/getting-started-siem-security.html[Use Elastic Security for SIEM]
* link:https://www.elastic.co/guide/en/starting-with-the-elasticsearch-platform-and-its-solutions/current/getting-started-endpoint-security.html[Protect hosts with endpoint threat intelligence from Elastic Security]

**Resources**

* link:{fleet-guide}/elastic-agent-installation.html[Install {agent}]
* link:https://www.elastic.co/integrations/data-integrations?solution=search[Elastic Security integrations]
* link:{security-guide}/es-overview.html[Elastic Security documentation]

**Guides for popular Security use case**

* link:https://www.elastic.co/guide/en/starting-with-the-elasticsearch-platform-and-its-solutions/current/getting-started-siem-security.html[Use Elastic Security for SIEM]
* link:https://www.elastic.co/guide/en/starting-with-the-elasticsearch-platform-and-its-solutions/current/getting-started-endpoint-security.html[Protect hosts with endpoint threat intelligence from Elastic Security]


[discrete]
[[ingest-for-custom]]
@@ -106,6 +106,5 @@ Bring your ideas and use {es} and the {stack} to store, search, and visualize your data.
** link:https://www.elastic.co/guide/en/elasticsearch/client/index.html[{es} language clients]
** link:https://www.elastic.co/web-crawler[Elastic web crawler]
** link:{ref}/es-connectors.html[Elastic connectors]

* link:{estc-welcome}/getting-started-general-purpose.html[Tutorial: Get started with vector search and generative AI]

108 changes: 43 additions & 65 deletions docs/en/ingest-guide/ingest-tools.asciidoc
@@ -1,22 +1,33 @@
[[ingest-tools]]
== Tools for ingesting data
== Tools for ingesting time-series data


Elastic and others offer tools to help you get your data from the original data source into {es}.
Some tools are designed for particular data sources, and others are multi-purpose.

// Iterative messaging as our recommended strategy morphs.
// This section is the summary. "Here's the story _now_."
// Hint at upcoming changes, but do it cautiously and responsibly.
// Modular and co-located to make additions/updates/deprecations easier as our story matures.

Elastic and others offer tools to help you get your data from the original data source into {es}.
Some tools are designed for particular data sources, and others are multi-purpose.

{agent} and Elastic integrations::
In this section, we'll help you determine which option is best for you.

* <<ingest-ea>>
* <<ingest-beats>>
* <<ingest-otel>>
* <<ingest-logstash>>

[discrete]
[[ingest-ea]]
=== {agent} and Elastic integrations

A single link:{fleet-guide}[{agent}] can collect multiple types of data when it is link:{fleet-guide}/elastic-agent-installation.html[installed] on a host computer.
You can use standalone {agent}s and manage them locally on the systems where they are installed, or you can manage all of your agents and policies with the link:{fleet-guide}/manage-agents-in-fleet.html[Fleet UI in {kib}].
+

Use {agent} with one of hundreds of link:{integrations-docs}[Elastic integrations] to simplify collecting, transforming, and visualizing data.
Integrations include default ingestion rules, dashboards, and visualizations to help you start analyzing your data right away.
Check out the {integrations-docs}/all_integrations[Integration quick reference] to search for available integrations that can reduce your time to value.
+

{agent} is the best option for collecting timestamped data for most data sources
and use cases.
@@ -25,34 +25,46 @@ link:{fleet-guide}/elastic-agent-processor-configuration.html[{agent}
processors], link:{logstash-ref}[{ls}], or additional processing features in
{es}.
Check out <<ingest-addl-proc,additional processing>> to see options.
+

Ready to try link:{fleet-guide}[{agent}]? Check out the link:{fleet-guide}/elastic-agent-installation.html[installation instructions].
+
**Beats.** link:{beats-ref}/beats-reference.html[Beats] are the original Elastic lightweight data shippers, and their capabilities live on in Elastic Agent.
When you use Elastic Agent, you're getting core Beats functionality and more added features.

[discrete]
[[ingest-beats]]
=== {beats}

link:{beats-ref}/beats-reference.html[Beats] are the original Elastic lightweight data shippers, and their capabilities live on in Elastic Agent.
When you use Elastic Agent, you're getting core Beats functionality plus added features.


Beats require that you install a separate Beat for each type of data you want to collect.
A single Elastic Agent installed on a host can collect and transport multiple types of data.
+
**Best practice:** Use link:{fleet-guide}[Elastic Agent] whenever possible.
If your data source is not yet supported by Elastic Agent, use Beats.
Check out {beats} and {agent} link:{fleet-guide}/beats-agent-comparison.html#additional-capabilities-beats-and-agent[comparison] for more info.

**Best practice:** Use link:{fleet-guide}[{agent}] whenever possible.
If your data source is not yet supported by {agent}, use {beats}.
Check out the {beats} and {agent} link:{fleet-guide}/beats-agent-comparison.html#additional-capabilities-beats-and-agent[comparison] for more info.
When you are ready to upgrade, check out link:{fleet-guide}/migrate-beats-to-agent.html[Migrate from {beats} to {agent}].

OpenTelemetry (OTel) collectors::
[discrete]
[[ingest-otel]]
=== OpenTelemetry (OTel) collectors

link:https://opentelemetry.io/docs[OpenTelemetry] is a vendor-neutral observability framework for collecting, processing, and exporting telemetry data.
Elastic is a member of the Cloud Native Computing Foundation (CNCF) and an active contributor to the OpenTelemetry project.
+

In addition to supporting upstream OTel development, Elastic provides link:https://github.com/elastic/opentelemetry[Elastic Distributions of OpenTelemetry], specifically designed to work with Elastic Observability.
We're also expanding link:{fleet-guide}[{agent}] to use OTel collection.

Logstash::
[discrete]
[[ingest-logstash]]
=== Logstash

link:{logstash-ref}[{ls}] is a versatile open source data ETL (extract, transform, load) engine that can expand your ingest capabilities.
{ls} can _collect data_ from a wide variety of data sources with {ls} link:{logstash-ref}/input-plugins.html[input
plugins], _enrich and transform_ the data with {ls} link:{logstash-ref}/filter-plugins.html[filter plugins], and _output_ the
data to {es} and other destinations with the {ls} link:{logstash-ref}/output-plugins.html[output plugins].
+
Most users never need to use {ls}, but it's available if you need it for:
+

Many users never need to use {ls}, but it's available if you need it for:

* **Data collection** (if an Elastic integration isn't available).
{agent} and Elastic {integrations-docs}/all_integrations[integrations] provide many features out-of-the-box, so be sure to search or browse integrations for your data source.
If you don't find an Elastic integration for your data source, check {ls} for an {logstash-ref}/input-plugins.html[input plugin] for your data source.
@@ -64,48 +87,3 @@ link:{ingest-guide}/lspq.html[persistence or buffering],
additional link:{ingest-guide}/ls-enrich.html[data enrichment],
link:{ingest-guide}/ls-networkbridge.html[proxying] as a way to bridge network connections, or the ability to route data to
link:{ingest-guide}/ls-multi.html[multiple destinations].

Language clients::
link:https://www.elastic.co/guide/en/elasticsearch/client/index.html[Elastic
language clients] help you send **application data**, such as from NodeJS or Python,
directly to {es} for search and analysis.
//ToDo: Figure out trademark considerations.

APIs::
Use the {es} link:{ref}/docs.html[document APIs] to index **documents** directly into {es}.

File uploader::
Use the {kib} link:{kibana-ref}/connect-to-elasticsearch.html#upload-data-kibana[file uploader] to index **single files** into {es}.
This tool can be helpful for testing with small numbers of files.

Web crawler::
Use the Elastic link:https://www.elastic.co/web-crawler[web crawler] to index **web page content**.

Connectors::
Use link:{ref}/es-connectors.html[connectors] to index **data from third-party sources**, such as Amazon S3, GMail, Outlook, and Salesforce.
//ToDo: Figure out trademark considerations.

Elastic serverless forwarder::
The link:https://www.elastic.co/guide/en/esf/current/aws-elastic-serverless-forwarder.html[Elastic Serverless Forwarder] is an Amazon Web Services (AWS) Lambda function that ships logs from your AWS environment to {es}.

[discrete]
[[ingest-addl-proc]]
== Tools and features for additional processing
You can start with {agent} and Elastic {integrations-docs}[integrations], and still
take advantage of additional processing options if you need them.
You can use:

* link:{fleet-guide}/elastic-agent-processor-configuration.html[{agent} processors] to sanitize or enrich raw data at the source.
Use {agent} processors if you need to control what data is sent across the wire, or if you need to enrich the raw data with information available on the host.
* {es} link:{ref}/[ingest pipelines] to enrich incoming data or normalize field data before the data is indexed.
{es} ingest pipelines enable you to manipulate the data as it comes in.
This approach helps you avoid adding processing overhead to the hosts from which you're collecting data.

* {es} link:{ref}/runtime.html[runtime fields] to define or alter the schema at query time.
You can use runtime fields at query time to start working with your data without needing to understand how it is structured,
add fields to existing documents without reindexing your data,
override the value returned from an indexed field, and/or
define fields for a specific use without modifying the underlying schema.

* {ls} `elastic_integration filter` to link:{logstash-ref}/ea-integrations.html[extend Elastic integrations], and other {ls} link:{logstash-ref}/filter-plugins.html[filter plugins] to transform data before it goes to {es}.
