forked from opensearch-project/documentation-website
Commit
Add S3 sink documentation (opensearch-project#4340)
* Add S3 sink documentation
  Signed-off-by: Naarcha-AWS <[email protected]>
* Update _data-prepper/pipelines/configuration/sinks/s3.md
  Signed-off-by: Naarcha-AWS <[email protected]>
* Update _data-prepper/pipelines/configuration/sinks/s3.md
  Signed-off-by: Naarcha-AWS <[email protected]>
* Update _data-prepper/pipelines/configuration/sinks/s3.md
  Signed-off-by: Naarcha-AWS <[email protected]>
* Apply suggestions from code review
  Co-authored-by: Chris Moore <[email protected]>
  Signed-off-by: Naarcha-AWS <[email protected]>
* Apply suggestions from code review
  Co-authored-by: Nathan Bower <[email protected]>
  Signed-off-by: Naarcha-AWS <[email protected]>

---------

Signed-off-by: Naarcha-AWS <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>
Co-authored-by: Chris Moore <[email protected]>
Co-authored-by: Nathan Bower <[email protected]>
1 parent 3fc0be3, commit bba8561
Showing 5 changed files with 103 additions and 30 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,25 +1,31 @@ | ||
--- | ||
layout: default | ||
title: file sink | ||
title: file | ||
parent: Sinks | ||
grand_parent: Pipelines | ||
nav_order: 45 | ||
--- | ||
|
||
# file sink | ||
# file | ||
|
||
## Overview | ||
Use the `file` sink to create a flat file output, usually a `.log` file. | ||
|
||
You can use the `file` sink to create a flat file output. The following table describes options you can configure for the `file` sink. | ||
## Configuration options | ||
|
||
The following table describes options you can configure for the `file` sink. | ||
|
||
Option | Required | Type | Description | ||
:--- | :--- | :--- | :--- | ||
path | Yes | String | Path for the output file (e.g. `logs/my-transformed-log.log`). | ||
|
||
<!--- ## Configuration | ||
## Usage | ||
|
||
Content will be added to this section. | ||
The following example shows basic usage of the `file` sink: | ||
|
||
## Metrics | ||
``` | ||
sample-pipeline: | ||
  sink: | ||
    - file: | ||
        path: path/to/output-file | ||
``` | ||
|
||
Content will be added to this section. ---> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,25 +1,30 @@ | ||
--- | ||
layout: default | ||
title: Pipeline sink | ||
title: pipeline | ||
parent: Sinks | ||
grand_parent: Pipelines | ||
nav_order: 45 | ||
nav_order: 55 | ||
--- | ||
|
||
# Pipeline sink | ||
# pipeline | ||
|
||
## Overview | ||
Use the `pipeline` sink to write to another pipeline. | ||
|
||
You can use the `pipeline` sink to write to another pipeline. | ||
## Configuration options | ||
|
||
The `pipeline` sink supports the following configuration options. | ||
|
||
Option | Required | Type | Description | ||
:--- | :--- | :--- | :--- | ||
name | Yes | String | Name of the pipeline to write to. | ||
|
||
<!--- ## Configuration | ||
Content will be added to this section. | ||
## Usage | ||
|
||
## Metrics | ||
The following example configures a `pipeline` sink that writes to a pipeline named `movies`: | ||
|
||
Content will be added to this section. ---> | ||
``` | ||
sample-pipeline: | ||
  sink: | ||
    - pipeline: | ||
        name: movies | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,70 @@ | ||
--- | ||
layout: default | ||
title: s3 | ||
parent: Sinks | ||
grand_parent: Pipelines | ||
nav_order: 55 | ||
--- | ||
|
||
# s3 | ||
|
||
The `s3` sink sends records to an Amazon Simple Storage Service (Amazon S3) bucket using the S3 client. | ||
|
||
## Usage | ||
|
||
The following example creates a pipeline configured with an `s3` sink. It sets the event count and size thresholds that determine when the pipeline sends record events, and it sets the codec type to `ndjson`: | ||
|
||
``` | ||
pipeline: | ||
  ... | ||
  sink: | ||
    - s3: | ||
        aws: | ||
          region: us-east-1 | ||
          sts_role_arn: arn:aws:iam::123456789012:role/Data-Prepper | ||
          sts_header_overrides: | ||
        max_retries: 5 | ||
        bucket: | ||
          name: bucket_name | ||
        object_key: | ||
          path_prefix: my-elb/%{yyyy}/%{MM}/%{dd}/ | ||
        threshold: | ||
          event_count: 2000 | ||
          maximum_size: 50mb | ||
          event_collect_timeout: 15s | ||
        codec: | ||
          ndjson: | ||
        buffer_type: in_memory | ||
``` | ||
|
||
## Configuration | ||
|
||
Use the following options when customizing the `s3` sink. | ||
|
||
Option | Required | Type | Description | ||
:--- | :--- | :--- | :--- | ||
`bucket` | Yes | String | The S3 bucket in which the sink stores data. The `name` must match the name of your object store. | ||
`region` | No | String | The AWS Region to use when connecting to S3. Defaults to the [standard SDK behavior to determine the Region](https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/region-selection.html). | ||
`sts_role_arn` | No | String | The [AWS Security Token Service](https://docs.aws.amazon.com/STS/latest/APIReference/welcome.html) (AWS STS) role that the `s3` sink assumes when sending a request to S3. Defaults to the [standard SDK behavior for credentials](https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/credentials.html). | ||
`sts_external_id` | No | String | The external ID to attach to AssumeRole requests from AWS STS. | ||
`max_retries` | No | Integer | The maximum number of times a single request should retry when ingesting data to S3. Defaults to `5`. | ||
`object_key` | No | Object | Sets the `path_prefix` and the `file_pattern` of the object store. Defaults to the S3 object `events-%{yyyy-MM-dd'T'hh-mm-ss}` found inside the root directory of the bucket. | ||
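
As a rough illustration of the preceding table, the following fragment sketches how the connection options might be combined. It is not a complete pipeline. The nesting of `region` and `sts_role_arn` under `aws` follows the usage example, while placing `sts_external_id` under `aws` is an assumption, and the external ID value shown is a hypothetical placeholder:

```
sample-pipeline:
  sink:
    - s3:
        aws:
          region: us-east-1
          sts_role_arn: arn:aws:iam::123456789012:role/Data-Prepper
          # Hypothetical placeholder; placement under `aws` is an assumption
          sts_external_id: my-external-id
        bucket:
          name: bucket_name
        object_key:
          path_prefix: my-elb/%{yyyy}/%{MM}/%{dd}/
```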
|
||
## Threshold configuration options | ||
|
||
Use the following options to set ingestion thresholds for the `s3` sink. | ||
|
||
Option | Required | Type | Description | ||
:--- | :--- | :--- | :--- | ||
`event_count` | Yes | Integer | The maximum number of events that the sink collects before sending them to the S3 bucket. | ||
`maximum_size` | Yes | String | The maximum number of bytes that the sink collects before sending events to the S3 bucket. Defaults to `50mb`. | ||
`event_collect_timeout` | Yes | String | Sets the time period during which events are collected before they are sent to the S3 bucket. All values are strings that represent a duration, either in ISO 8601 notation, such as `PT20.345S`, or in simple notation, such as `60s` or `1500ms`. | ||
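
The threshold options might be combined as in the following fragment, a minimal sketch that reuses the values from the usage example and omits all other sink options. The pipeline name is a placeholder, and the ISO 8601 duration is the sample value from the preceding table:

```
sample-pipeline:
  sink:
    - s3:
        bucket:
          name: bucket_name
        threshold:
          event_count: 2000
          maximum_size: 50mb
          # ISO 8601 notation; simple notation such as 60s or 1500ms is also accepted
          event_collect_timeout: PT20.345S
```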
|
||
## buffer_type | ||
|
||
`buffer_type` is an optional configuration that determines how the sink temporarily stores records before flushing them to an S3 bucket. Use one of the following options: | ||
|
||
- `local_file`: Flushes the record into a file on your machine. | ||
- `in_memory`: Stores the record in memory. | ||
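
For comparison with the usage example, which sets `in_memory`, the following fragment sketches a sink that buffers to a local file instead. Placing `buffer_type` directly under `s3` is assumed from the usage example, and all other sink options are omitted:

```
sample-pipeline:
  sink:
    - s3:
        bucket:
          name: bucket_name
        # Buffer records in a file on the local machine instead of in memory
        buffer_type: local_file
```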
|
||
|