This repository has been archived by the owner on Aug 31, 2022. It is now read-only.

Missing events #159

Open
cristifalcas opened this issue Nov 7, 2021 · 9 comments

Comments

@cristifalcas

The latest version is not catching most events.
Compared with version 0.10, nothing shows up in my logs now.

@omauger
Contributor

omauger commented Nov 8, 2021

Hi,
Have you modified your configuration to increase the throttlePeriod to catch your missing events?

If so, please share the value you set and an example of a missing event in JSON.

@cristifalcas
Author

I set throttlePeriod from 5 to 300 and nothing changed.

Issues:

  1. pkg/kube/watcher.go needs to import _ "k8s.io/client-go/plugin/pkg/client/auth/gcp" in order to run it locally.
  2. It throws a lot of warnings about client-side throttling. I put this in main.go to fix it (see the sketch after this list):

       kubeconfig.QPS = 1e6
       kubeconfig.Burst = 1e6

  3. When the config does not set a namespace, it seems to drop most of the events; very few are printed. Hardcoding the namespace shows a lot more events, but unfortunately only for that namespace.
  4. It doesn't receive events from custom sources (kube-downscaler).
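
A minimal sketch of the two local fixes above (the GCP auth import plus the QPS/Burst override), assuming the exporter builds its rest.Config via clientcmd; the kubeconfig flag is illustrative, not the exporter's actual CLI:

package main

import (
	"flag"
	"log"

	"k8s.io/client-go/tools/clientcmd"

	// Blank import registers the GCP auth provider so kubeconfig contexts
	// that use it keep working when the exporter runs outside the cluster.
	_ "k8s.io/client-go/plugin/pkg/client/auth/gcp"
)

func main() {
	kubeconfigPath := flag.String("kubeconfig", "", "path to a kubeconfig file (illustrative flag)")
	flag.Parse()

	// Falls back to the in-cluster config when the path is empty.
	kubeconfig, err := clientcmd.BuildConfigFromFlags("", *kubeconfigPath)
	if err != nil {
		log.Fatal(err)
	}

	// Raise client-go's default rate limits (QPS 5, Burst 10) so the per-event
	// label/annotation lookups are not throttled client-side.
	kubeconfig.QPS = 1e6
	kubeconfig.Burst = 1e6

	// ... build clients and start the exporter with this kubeconfig ...
}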

Config:

logLevel: debug
#logFormat: json
throttlePeriod: 5
# namespace: "custom-ns"
leaderElection:
  enabled: False
route:
  routes:
    - match:
        - receiver: "stdout"
receivers:
  - name: "stdout"
    stdout: {}

@mustafaakin
Contributor

I've run some manual tests in a minikube cluster before I released this feature, but that small volume probably didn't amount to much. I will test it more to find out what's going on.

I don't use GCP. @cristifalcas do other auth mechanisms need to be imported that way too? It seems similar to how database drivers are registered in Go.
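
For comparison, client-go wires in its auth providers through blank imports, much like database/sql drivers; a minimal sketch, assuming you want every bundled provider rather than only GCP:

package main

import (
	// The umbrella package registers all auth provider plugins bundled with
	// client-go (gcp, azure, oidc, ...), analogous to registering a
	// database/sql driver via a blank import.
	_ "k8s.io/client-go/plugin/pkg/client/auth"
)

func main() {
	// The rest of the program is unchanged; the import above only has the
	// side effect of registering the auth providers.
}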

@cristifalcas
Author

@mustafaakin sorry, it didn't cross my mind that this is used on other clusters as well :). I don't know how it behaves outside GCP.

I found the source of the slowness in my case. The calls to GetObject(reference, l.clientset, l.dynClient) break everything; I think they return too slowly. Returning nil, nil in GetLabelsWithCache and GetAnnotationsWithCache fixed all my issues.

I have to admit that there are hundreds of events per second in my cluster.
Maybe when it watches all namespaces it gets overwhelmed trying to fetch the labels and annotations for every event?
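
A hypothetical sketch of that workaround; the LabelCache type and the method signature here are assumptions for illustration, not the exporter's exact code:

package kube

import corev1 "k8s.io/api/core/v1"

// LabelCache stands in for the exporter's label cache; the real type also
// holds the clientset, the dynamic client and an internal cache (assumption).
type LabelCache struct{}

// GetLabelsWithCache short-circuits the per-event GetObject lookup so no API
// round-trip is made. Events are then emitted without enriched labels, which
// keeps the pipeline from falling behind at hundreds of events per second.
func (l *LabelCache) GetLabelsWithCache(ref *corev1.ObjectReference) (map[string]string, error) {
	return nil, nil
}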

@superbrothers

superbrothers commented Nov 15, 2021

Edit: Event tracking was restored when I changed the throttlePeriod value 🙇


This is also the case in my environment. After upgrading to v0.11, event-exporter can no longer track events at all.

logLevel: error
logFormat: json
route:
  routes:
  - drop:
    - type: Normal
    match:
    - receiver: dump
      kind: Pod
receivers:
- name: dump
  stdout: {}

@DaveOHenry

DaveOHenry commented Nov 22, 2021

Most likely not related to the original issue, but I also spotted missing events: #163

@ujo-trackunit

ujo-trackunit commented Jan 3, 2022

Also seeing issues with throttling in 0.11:

I0103 11:46:44.912506       1 request.go:665] Waited for 1.157283329s due to client-side throttling, not priority and fairness, request: GET:https://172.20.0.1:443/apis/apiregistration.k8s.io/v1beta1?timeout=32s

and fewer events. The issue of no events at all after upgrading was fixed by not setting a layout when using json as the logFormat; my config looks like this:

config.yaml: |
  leaderElection: {}
  logFormat: json
  logLevel: info
  receivers:
  - file:
      path: /dev/stdout
    name: dump
  route:
    routes:
    - match:
      - receiver: dump

@ncgee

ncgee commented Feb 9, 2022

Having similar issues with similar configurations, using throttle periods anywhere between 1 and 300. Both the stdout and elasticsearch sinks show a similar (if not exactly the same) number of events (e.g. ~270 in the last hour), but nowhere near the actual number of events (~1700).
Watching the logs and events simultaneously, it seems to happen most often when a larger number of events occurs concurrently or in rapid succession. Current configuration:

  config.yaml: |
    leaderElection: {}
    logFormat: json
    logLevel: info
    receivers:
    - elasticsearch:
        deDot: true
        hosts:
        - https://elasticsearch:9200
        indexFormat: events-{2006-01-02}
        tls:
          insecureSkipVerify: true
      name: elasticsearch
    - name: stdout
      stdout:
        layout:
          customEvent: |
            {{ toJson . }}
    route:
      routes:
      - match:
        - receiver: stdout
        - receiver: elasticsearch
    throttlePeriod: 10

@ankit-arora-369

@ncgee I am also facing the same issue: the difference between the number of events coming in and the number of events reaching the sink is far too high.
Hundreds of events arrive per second, and kube-events-exporter shows them in its logs, but very few "sink" events appear.

Did you find a solution for this?

I have described everything here #192

@omauger Can you please help here?
