S3 plugin not functioning correctly for GZ files from Firehose #180
Comments
Hi @yaauie, thanks a lot.
Hi @yaauie! I was hoping to check on the plan to merge the above changes into the plugin. Regards,
@apatnaik14 I am in a similar boat! Did you have any luck with the workarounds you tried?
Hey @mrudrara,
Thanks @Luk3rson! Really appreciate it. Did you ever run into issues with too many Lambda invocations?
@Luk3rson, can you share the function, maybe as a gist? Thanks in advance.
Hi @mrudrara,
Hi @Luk3rson, really appreciate it. Meanwhile, while working with an AWS Support engineer, they also recommended "Data Transformation with Lambda".
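The "Data Transformation with Lambda" approach mentioned above means attaching a processing Lambda to the Firehose delivery stream so the objects land in S3 already decompressed. A minimal sketch of such a processor is below; the function name, the decision to drop non-`DATA_MESSAGE` records, and the newline-joined output format are assumptions for illustration, not code shared in this thread:

```python
import base64
import gzip
import json

def handler(event, context):
    """Hypothetical Firehose transformation Lambda: decompress the
    gzipped CloudWatch Logs payload and re-emit plain-text log lines."""
    output = []
    for record in event["records"]:
        # Firehose hands each record base64-encoded; CloudWatch Logs
        # subscription data inside it is additionally gzip-compressed.
        payload = json.loads(gzip.decompress(base64.b64decode(record["data"])))
        if payload.get("messageType") != "DATA_MESSAGE":
            # CONTROL_MESSAGE records are health checks; drop them.
            output.append({"recordId": record["recordId"],
                           "result": "Dropped",
                           "data": record["data"]})
            continue
        lines = "\n".join(e["message"] for e in payload["logEvents"]) + "\n"
        output.append({"recordId": record["recordId"],
                       "result": "Ok",
                       "data": base64.b64encode(lines.encode("utf-8")).decode("utf-8")})
    return {"records": output}
```

Firehose would invoke this on each batch before delivery, so the S3 input plugin then reads plain text and the gzip issue never arises.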
If the folder only contains gzipped logs, you can set the gzip_pattern option on the s3 input plugin (https://www.elastic.co/guide/en/logstash/current/plugins-inputs-s3.html#plugins-inputs-s3-gzip_pattern) so that the input plugin treats the files as gzip without needing the Lambda to append a .gz extension.
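As a sketch of that suggestion: gzip_pattern takes a regex matched against object keys (it defaults to matching only `.gz`/`.gzip` suffixes), so a catch-all pattern makes the plugin decompress everything. This assumes the bucket/prefix holds only compressed objects; adjust the pattern to your key layout:

```
input {
  s3 {
    bucket       => "test"
    region       => "us-east-1"
    # Treat every object as gzip-compressed, since Firehose writes
    # compressed objects without a .gz extension. ".*" is a deliberately
    # broad example pattern; narrow it if the bucket mixes formats.
    gzip_pattern => ".*"
  }
}
```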
I was testing the s3 plugin for a production POC where a Firehose delivery stream is delivering CloudWatch logs into an S3 bucket, from which I am reading them into Logstash with the S3 plugin.
My Logstash config is below:
input {
  s3 {
    bucket => "test"
    region => "us-east-1"
    role_arn => "test"
    interval => 10
    additional_settings => {
      "force_path_style" => true
      "follow_redirects" => false
    }
  }
}
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    sniffing => false
    index => "s3-logs-%{+YYYY-MM-dd}"
  }
  stdout { codec => rubydebug }
}
When I start up Logstash locally, I can see the data reaching Logstash, but it is not in a proper format, as below:
{
          "type" => "s3",
       "message" => "\u001F�\b\u0000\u0000\u0000\u0000\u0000\u0000\u0000͒�n\u00131\u0010�_��\u0015�����x���MC)\u0005D\u0016!**************************************",
      "@version" => "1",
    "@timestamp" => 2019-07-12T15:32:37.328Z
}
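The leading characters of that message field are telling: `\u001F`, the unrenderable byte 0x8b, and `\b` (0x08) are the gzip magic number followed by the DEFLATE method byte, which means the plugin indexed the raw compressed bytes instead of decompressing them. A quick local check (a hypothetical helper, not part of the plugin) confirms any gzip stream starts this way:

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # every gzip stream starts with these two bytes

def looks_gzipped(first_bytes: bytes) -> bool:
    """Return True if the buffer starts with the gzip magic number."""
    return first_bytes[:2] == GZIP_MAGIC

# A gzip stream built in memory begins with the magic bytes,
# followed by 0x08 (the DEFLATE compression method).
sample = gzip.compress(b"some log line\n")
print(looks_gzipped(sample))          # → True
print(sample[:3] == b"\x1f\x8b\x08")  # → True
```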
I also tried adding codec => "gzip_lines" to the configuration, but then Logstash was not able to process the files at all. The documentation suggests the S3 plugin supports GZ files out of the box. Could anyone point out what I am doing wrong?
Regards,
Arpan
Please find the version and OS information below.