diff --git a/docs/user-guide/logs/manage-pipelines.md b/docs/user-guide/logs/manage-pipelines.md
index 8276c748c..5899c63d2 100644
--- a/docs/user-guide/logs/manage-pipelines.md
+++ b/docs/user-guide/logs/manage-pipelines.md
@@ -212,4 +212,128 @@ The output is as follows:
 }
 ```
 
-The output shows that the pipeline successfully processed the log data. The `rows` field contains the processed data, and the `schema` field contains the schema information of the processed data. You can use this information to verify the correctness of the pipeline configuration.
\ No newline at end of file
+The output shows that the pipeline successfully processed the log data. The `rows` field contains the processed data, and the `schema` field contains the schema information of the processed data. You can use this information to verify the correctness of the pipeline configuration.
+
+### Test a Failed Pipeline
+
+Assume that the pipeline configuration is as follows:
+
+
+```bash
+curl -X "POST" "http://localhost:4000/v1/events/pipelines/test" \
+     -H 'Content-Type: application/x-yaml' \
+     -d $'processors:
+  - date:
+      field: time
+      formats:
+        - "%Y-%m-%d %H:%M:%S%.3f"
+      ignore_missing: true
+  - gsub:
+      fields:
+        - message
+      pattern: "\\\."
+      replacement:
+        - "-"
+      ignore_missing: true
+
+transform:
+  - fields:
+      - message
+    type: string
+  - field: time
+    type: time
+    index: timestamp'
+```
+
+The pipeline configuration contains an error. The `gsub` processor expects the `replacement` field to be a string, but the current configuration provides an array. As a result, pipeline creation fails with the following error message:
+
+
+```json
+{"error":"Failed to parse pipeline: 'replacement' must be a string"}
+```
+
+Therefore, we need to modify the configuration of the `gsub` processor and change the value of the `replacement` field to a string.
+
+```bash
+curl -X "POST" "http://localhost:4000/v1/events/pipelines/test" \
+     -H 'Content-Type: application/x-yaml' \
+     -d $'processors:
+  - date:
+      field: time
+      formats:
+        - "%Y-%m-%d %H:%M:%S%.3f"
+      ignore_missing: true
+  - gsub:
+      fields:
+        - message
+      pattern: "\\\."
+      replacement: "-"
+      ignore_missing: true
+
+transform:
+  - fields:
+      - message
+    type: string
+  - field: time
+    type: time
+    index: timestamp'
+```
+
+The pipeline is now created successfully, and we can test it using the `dryrun` interface. We will test it with malformed log data in which the value of the `message` field is a number, which causes the pipeline to fail during processing.
+
+
+```bash
+curl -X "POST" "http://localhost:4000/v1/events/pipelines/dryrun?pipeline_name=test" \
+     -H 'Content-Type: application/json' \
+     -d $'{"message": 1998.08,"time":"2024-05-25 20:16:37.217"}'
+
+{"error":"Failed to execute pipeline, reason: gsub processor: expect string or array string, but got Float64(1998.08)"}
+```
+
+The output indicates that processing failed because the `gsub` processor expects a string rather than a floating-point number. We need to adjust the format of the log data so that the pipeline can process it correctly.
+Let's change the value of the `message` field to a string and test the pipeline again.
+
+```bash
+curl -X "POST" "http://localhost:4000/v1/events/pipelines/dryrun?pipeline_name=test" \
+     -H 'Content-Type: application/json' \
+     -d $'{"message": "1998.08","time":"2024-05-25 20:16:37.217"}'
+```
+
+The pipeline now processes the data successfully, and the output is as follows:
+
+```json
+{
+    "rows": [
+        [
+            {
+                "data_type": "STRING",
+                "key": "message",
+                "semantic_type": "FIELD",
+                "value": "1998-08"
+            },
+            {
+                "data_type": "TIMESTAMP_NANOSECOND",
+                "key": "time",
+                "semantic_type": "TIMESTAMP",
+                "value": "2024-05-25 20:16:37.217+0000"
+            }
+        ]
+    ],
+    "schema": [
+        {
+            "colume_type": "FIELD",
+            "data_type": "STRING",
+            "fulltext": false,
+            "name": "message"
+        },
+        {
+            "colume_type": "TIMESTAMP",
+            "data_type": "TIMESTAMP_NANOSECOND",
+            "fulltext": false,
+            "name": "time"
+        }
+    ]
+}
+```
+
+The `.` in the string `1998.08` has been replaced with `-`, which shows that the pipeline processed the data successfully.
\ No newline at end of file
diff --git a/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/logs/manage-pipelines.md b/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/logs/manage-pipelines.md
index add6917bd..da5a68870 100644
--- a/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/logs/manage-pipelines.md
+++ b/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/logs/manage-pipelines.md
@@ -212,4 +212,126 @@ curl -X "POST" "http://localhost:4000/v1/events/pipelines/dryrun?pipeline_name=t
 }
 ```
 
-The output shows that the pipeline successfully processed the log data. The `rows` field contains the processed data, and the `schema` field contains the schema information of the processed data. You can use this information to verify the correctness of the pipeline configuration.
\ No newline at end of file
+The output shows that the pipeline successfully processed the log data. The `rows` field contains the processed data, and the `schema` field contains the schema information of the processed data. You can use this information to verify the correctness of the pipeline configuration.
+
+### Test a Failed Pipeline
+
+Next, we test a failing pipeline. Assume that the pipeline configuration is as follows:
+
+```bash
+curl -X "POST" "http://localhost:4000/v1/events/pipelines/test" \
+     -H 'Content-Type: application/x-yaml' \
+     -d $'processors:
+  - date:
+      field: time
+      formats:
+        - "%Y-%m-%d %H:%M:%S%.3f"
+      ignore_missing: true
+  - gsub:
+      fields:
+        - message
+      pattern: "\\\."
+      replacement:
+        - "-"
+      ignore_missing: true
+
+transform:
+  - fields:
+      - message
+    type: string
+  - field: time
+    type: time
+    index: timestamp'
+```
+
+The pipeline configuration contains an error. The `gsub` processor expects the `replacement` field to be a string, but the current configuration provides an array. As a result, pipeline creation fails with the following error message:
+
+
+```json
+{"error":"Failed to parse pipeline: 'replacement' must be a string"}
+```
+
+Therefore, you need to modify the configuration of the `gsub` processor and change the value of the `replacement` field to a string.
+
+```bash
+curl -X "POST" "http://localhost:4000/v1/events/pipelines/test" \
+     -H 'Content-Type: application/x-yaml' \
+     -d $'processors:
+  - date:
+      field: time
+      formats:
+        - "%Y-%m-%d %H:%M:%S%.3f"
+      ignore_missing: true
+  - gsub:
+      fields:
+        - message
+      pattern: "\\\."
+      replacement: "-"
+      ignore_missing: true
+
+transform:
+  - fields:
+      - message
+    type: string
+  - field: time
+    type: time
+    index: timestamp'
+```
+
+The pipeline is now created successfully, and you can test it using the `dryrun` interface. We test it with malformed log data in which the value of the `message` field is a number, which causes the pipeline to fail during processing.
+
+```bash
+curl -X "POST" "http://localhost:4000/v1/events/pipelines/dryrun?pipeline_name=test" \
+     -H 'Content-Type: application/json' \
+     -d $'{"message": 1998.08,"time":"2024-05-25 20:16:37.217"}'
+
+{"error":"Failed to execute pipeline, reason: gsub processor: expect string or array string, but got Float64(1998.08)"}
+```
+
+The output shows that processing failed because the `gsub` processor expects a string rather than a floating-point number. We need to adjust the format of the log data so that the pipeline can process it correctly.
+Let's change the value of the `message` field to a string and test the pipeline again.
+
+```bash
+curl -X "POST" "http://localhost:4000/v1/events/pipelines/dryrun?pipeline_name=test" \
+     -H 'Content-Type: application/json' \
+     -d $'{"message": "1998.08","time":"2024-05-25 20:16:37.217"}'
+```
+
+The pipeline now processes the data successfully, and the output is as follows:
+
+```json
+{
+    "rows": [
+        [
+            {
+                "data_type": "STRING",
+                "key": "message",
+                "semantic_type": "FIELD",
+                "value": "1998-08"
+            },
+            {
+                "data_type": "TIMESTAMP_NANOSECOND",
+                "key": "time",
+                "semantic_type": "TIMESTAMP",
+                "value": "2024-05-25 20:16:37.217+0000"
+            }
+        ]
+    ],
+    "schema": [
+        {
+            "colume_type": "FIELD",
+            "data_type": "STRING",
+            "fulltext": false,
+            "name": "message"
+        },
+        {
+            "colume_type": "TIMESTAMP",
+            "data_type": "TIMESTAMP_NANOSECOND",
+            "fulltext": false,
+            "name": "time"
+        }
+    ]
+}
+```
+
+As you can see, the `.` in the string `1998.08` has been replaced with `-`, and the pipeline processed the data successfully.
\ No newline at end of file