enhance fluentbit process logging for plugin #249

Open
nwsparks opened this issue May 13, 2022 · 8 comments

@nwsparks

When templating fails, it's very difficult to identify the source. The logs are flooded with messages like this:

time="2022-05-13T14:33:22Z" level=error msg="[cloudwatch 0] parsing log_group_name template '/eks/eks/$(kubernetes['namespace_name'])/$(kubernetes['labels']['k8s-app'])' (using value of default_log_group_name instead): k8s-app: sub-tag name not found"

but there is nothing in the message that identifies where it is coming from.
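For context, a cloudwatch output configured along these lines produces that error whenever a record lacks the k8s-app label. This is only a minimal sketch: the log_group_name template is the one from the error message above, while the Match pattern, region, stream prefix, and default group name are illustrative assumptions.

[OUTPUT]
    Name                   cloudwatch
    Match                  kube.*
    region                 us-east-1
    # templated group name; resolution fails for pods without the k8s-app label
    log_group_name         /eks/eks/$(kubernetes['namespace_name'])/$(kubernetes['labels']['k8s-app'])
    # group used instead whenever the template cannot be resolved
    default_log_group_name fluentbit-default
    log_stream_prefix      fluentbit-
    auto_create_group      true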

@zhonghui12
Contributor

@nwsparks could you please explain a bit more about how you would like the logging to be improved?

@nwsparks
Author

@zhonghui12 if the error log I posted (emitted when templating fails) included the name of the pod that triggered it, that would make it much easier to identify the source of the errors.

These error logs generate a huge volume of output, and in a system with many deployments it is very difficult to track down the source.

nwsparks changed the title from "enhance logging" to "enhance fluentbit process logging for plugin" on May 20, 2022
@nwsparks
Author

Edited the subject to make it clearer that this is about the fluentbit process logs.

@jeremiasroma

Facing a similar issue. In one hour it generated 53 million records with the same error, "sub-tag name not found":

[Screenshot, 2022-08-09: record volume ingested into CloudWatch]

By the way, the day on the screenshot was the day we deployed the component to the cluster. We disabled it due to the high CloudWatch ingestion costs.

Cluster details:

EKS version: 1.22
helm chart repo: https://aws.github.io/eks-charts
helm chart release_name: aws-for-fluent-bit
helm chart version: 0.1.18

@chaosun-abnormalsecurity

Getting the same issue. I checked the logs that were forwarded to the default log group in CloudWatch, and saw that kubernetes['labels'] clearly contains the sub-tag.

@Mattie112

Did you ever fix this?

We used to run https://github.com/DNXLabs/terraform-aws-eks-cloudwatch-logs

That uses the following config:

  set {
    name  = "cloudWatch.logGroupName"
    value = "/aws/eks/${var.cluster_name}/$(kubernetes['labels']['app'])"
  }

But that repo is no longer maintained, so I switched to the Helm chart https://artifacthub.io/packages/helm/aws/aws-for-fluent-bit and set the config value listed above. But now I get the following errors:

time="2023-03-21T09:49:24Z" level=error msg="[cloudwatch 0] parsing log_group_name template '/aws/eks/staging/$(kubernetes['labels']['app'])' (using value of default_log_group_name instead): app: sub-tag name not found"

@jeremiasroma

jeremiasroma commented Jun 8, 2023

@Mattie112 If you're still using the DNXLabs module, please update it to version 0.1.5.
One of the changes is that, instead of using the label "app", which may not exist, I've set "app.kubernetes.io/name", as it's one of the default Kubernetes labels.

According to the plugin's documentation, a new version of the CloudWatch plugin has been released that brings better performance and other improvements.

The main issue we've seen with the old cloudwatch plugin is that it couldn't handle logs if a label did not exist in the pod definition. The new cloudwatch_logs plugin uses a logGroupTemplate rather than a fixed logGroupName; if that template cannot be resolved, it falls back to the default log group "/aws/eks/fluentbit-cloudwatch/logs".
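For illustration, a cloudwatch_logs output using that template-plus-fallback behaviour might look like the sketch below. The fallback group name is the one mentioned above; the Match pattern, region, label, and stream prefix are assumptions.

[OUTPUT]
    Name               cloudwatch_logs
    Match              kube.*
    region             us-east-1
    # fallback group used when the template below cannot be resolved
    log_group_name     /aws/eks/fluentbit-cloudwatch/logs
    # built from record accessors; a missing key causes a fallback to log_group_name
    log_group_template /aws/eks/fluentbit-cloudwatch/$kubernetes['labels']['app.kubernetes.io/name']
    log_stream_prefix  fluentbit-
    auto_create_group  On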

Thanks

@Mattie112

Mattie112 commented Jun 8, 2023

Thanks! I have switched to https://github.com/aws/aws-for-fluent-bit as I prefer to have something that is maintained :) I don't really see the added benefit of the DNXLabs module other than saving a few lines of code.

And indeed, I am now using namespace_name instead of the app label. I would still prefer the label, but the namespace is fine for 99% of our cases :)
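For reference, a namespace-based template for the Go cloudwatch plugin would look something like the line below; the /aws/eks/staging prefix is taken from the earlier error message and is otherwise just an example.

    # group per namespace rather than per app label
    log_group_name /aws/eks/staging/$(kubernetes['namespace_name'])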
