Processing logs direct from Kinesis (NOT via cloudwatch logs)? #120

Open
max-rocket-internet opened this issue Apr 8, 2019 · 8 comments

Comments

@max-rocket-internet

We are using Graylog 2.5.1+34194da and want to skip CloudWatch Logs and send our logs directly to Kinesis using aws-fluent-plugin-kinesis.

We ran a quick test but saw some GZIP-related errors. The same issue was mentioned at the end of #86.

Should this work?

@danotorrey
Contributor

@max-rocket-internet Thanks for the details. The AWS Logs and AWS Flow Logs inputs in Graylog were designed to work directly with CloudWatch, so there is some hard-coded processing (GZIP and CloudWatch JSON object decoding). These are most likely causing the errors you are seeing.
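
To illustrate the decoding step described above, here is a minimal Python sketch (not Graylog's actual implementation, which is not shown in this thread). It assumes the record layout that CloudWatch Logs subscriptions deliver to Kinesis: a gzip-compressed JSON envelope with a `logEvents` array. The function name `decode_cloudwatch_record` is hypothetical, and the error text simply mirrors the "Not in GZIP format" message reported later in this thread.

```python
import gzip
import json
from typing import List

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def decode_cloudwatch_record(data: bytes) -> List[str]:
    """Decode one Kinesis record written by a CloudWatch Logs subscription.

    Hypothetical helper: expects gzip-compressed JSON containing a
    "logEvents" list, and returns the raw log messages.
    """
    if not data.startswith(GZIP_MAGIC):
        # A record written straight to Kinesis (for example plain JSON)
        # is rejected here, before any decompression is attempted.
        raise ValueError("Not in GZIP format")
    envelope = json.loads(gzip.decompress(data))
    return [event["message"] for event in envelope.get("logEvents", [])]
```

A record that a log shipper writes directly to the stream, without the gzip wrapper, takes the `ValueError` branch, which is consistent with the errors described in this issue.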

Can you help me understand what the payload (log messages) being written to the Kinesis stream looks like? We are planning new AWS development, and I would like to make sure we consider whether we can support this use case. I can see how it would be useful for Graylog to be able to subscribe to a Kinesis stream and read a user-defined payload from it.

Can you please also help us understand the reason for skipping CloudWatch altogether? Any info we can gather about how users use Graylog with AWS will definitely help us plan our development efforts.

Looping in @kroepke for reference.

@max-rocket-internet
Author

Hey @danotorrey
Thanks for the reply.

> The AWS Logs and AWS Flow Logs inputs in Graylog were designed to work directly with CloudWatch, so there is some hard-coded processing (GZIP and CloudWatch JSON object decoding). These are most likely causing the errors you are seeing.

Ah, makes sense.

> Can you help me understand what the payload (log messages) being written to the Kinesis stream looks like?

Sure. It's quite simple: it's just aws-fluent-plugin-kinesis running from the Docker image fluent/fluentd-kubernetes-daemonset:v1.3.3-debian-kinesis-1.3, made by fluentd. It runs as a daemonset on our k8s nodes and collects logs. Nothing fancy; it's a pretty standard logging daemonset, the same as any other fluentd or fluent-bit setup but with a different output.

> Can you please also help us understand the reason for skipping CloudWatch altogether?

Also quite simple: why would we want our logs in CloudWatch Logs at all? It's just a stepping stone on the way to Kinesis and then Graylog. Our log volume is about 1 TB per day, and CloudWatch Logs currently costs us about $18k/month, so that's also a pretty big motivator 😅

@srlucken

Hello @max-rocket-internet - Have you found a workaround for this? I have many Kinesis streams I want to integrate with Graylog; however, I'm getting the same "Not in GZIP format" error message you mentioned.

@max-rocket-internet
Author

@srlucken

Nope. I don't think a workaround is possible; it's simply not supported. We've moved to Datadog now.

@kroepke
Member

kroepke commented May 31, 2019

> Hello @max-rocket-internet - Have you found a workaround for this? I have many Kinesis streams I want to integrate with Graylog; however, I'm getting the same "Not in GZIP format" error message you mentioned.

@srlucken Which formats would you expect to be sending to Graylog this way?
The OP was using the fluent plugin, which apparently has a fixed proto2 transport encoding, so support for that would need to be implemented directly anyway. Other formats might be easier to support as we improve our AWS Kinesis support.

Thanks!
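
As a rough illustration of why a protobuf-framed record needs dedicated handling, the Python sketch below recognizes the Kinesis Producer Library (KPL) aggregated record framing: a 4-byte magic number, a protobuf message, and a trailing 16-byte MD5 checksum of that message. Whether the OP's plugin output actually used this framing is an assumption based on the comment above, and the function name `split_kpl_aggregated` is hypothetical. Decoding the protobuf body itself is out of scope here.

```python
import hashlib
from typing import Optional

# Magic prefix of a KPL aggregated record.
KPL_MAGIC = b"\xf3\x89\x9a\xc2"
MD5_SIZE = 16  # length of the trailing MD5 digest in bytes


def split_kpl_aggregated(data: bytes) -> Optional[bytes]:
    """Return the protobuf body if `data` looks like a KPL aggregated record.

    Returns None when the magic prefix is missing, the record is too short,
    or the trailing checksum does not match the body.
    """
    if not data.startswith(KPL_MAGIC):
        return None
    if len(data) <= len(KPL_MAGIC) + MD5_SIZE:
        return None
    body = data[len(KPL_MAGIC):-MD5_SIZE]
    checksum = data[-MD5_SIZE:]
    if hashlib.md5(body).digest() != checksum:
        return None
    return body  # still protobuf-encoded; needs a separate decoding step
```

A reader that only checks for gzip would reject such a record outright, while plain JSON records would return `None` here and could be handled as text.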

@srlucken

Hello @kroepke - Currently the main format we're sending to Graylog is JSON. Regarding improved AWS Kinesis support, does Graylog have anything in development or on a roadmap that might meet this need in the near future?

@danotorrey
Contributor

@srlucken Direct Kinesis support for arbitrary/custom log formats is definitely on our radar and will likely be supported in a future release. We are still working out the details for how to handle the various log formats that might be supplied.

Are you writing a distinct JSON document within the data payload for each Kinesis record? The current thinking is that we could extract the payload, convert it to a string, and then either parse the JSON directly and extract distinct fields, or provide some other means of parsing. Can you provide a sample of what your JSON payload looks like? This will help us as we continue to investigate this.

@srlucken

srlucken commented Jun 6, 2019

@danotorrey Thank you for the response. Here is a sample JSON payload.

{ "type": "type", "auth": { "client_token": "clientToken", "accessor": "accessor", "display_name": "displayName", "policies": [ "policy", "policy" ], "token_policies": [ "token", "policy" ], "metadata": { "role_name": "roleName" }, "entity_id": "entityId", "token_type": "tokenType" }, "request": { "id": "id", "operation": "operation", "client_token": "token:token", "client_token_accessor": "token:token", "namespace": { "id": "id", "path": "path/" }, "path": "path/path", "data": null, "policy_override": false, "remote_address": "1.1.1.1", "wrap_ttl": 0, "headers": {} }, "error": "" }
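
For a payload shaped like the sample above, the approach @danotorrey outlines (extract the payload, convert it to a string, parse the JSON, extract distinct fields) could be sketched in Python as follows. This is only an illustration, not Graylog code: the function names `flatten` and `record_to_fields`, the `_` separator for nested keys, and the choice to keep arrays as single values are all assumptions.

```python
import json
from typing import Any, Dict


def flatten(value: Any, prefix: str = "", sep: str = "_") -> Dict[str, Any]:
    """Flatten nested JSON objects into a single level of field names.

    Nested keys are joined with `sep`. Lists and scalars (including None)
    are kept as field values. Empty objects produce no fields.
    """
    fields: Dict[str, Any] = {}
    if isinstance(value, dict):
        for key, child in value.items():
            name = f"{prefix}{sep}{key}" if prefix else key
            fields.update(flatten(child, name, sep))
    else:
        fields[prefix] = value
    return fields


def record_to_fields(data: bytes) -> Dict[str, Any]:
    """Treat one Kinesis record as a UTF-8 JSON document and extract fields."""
    text = data.decode("utf-8")   # convert the raw payload to a string
    document = json.loads(text)   # parse the JSON document
    if not isinstance(document, dict):
        raise ValueError("expected one JSON object per record")
    return flatten(document)
```

With the sample payload, this would yield field names such as `auth_metadata_role_name` and `request_remote_address`, while `request.headers` (an empty object) would produce no field.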

@deeshe added the feature label Jun 17, 2019