Kong proxy goes unresponsive with normal traffic - #8152 issue - kubelet restarts the container since liveness probe fails #11710
Comments
@kranthirachakonda How large is your configuration (number of routes/services/consumers), roughly?
@hanshuebner approx
@kranthirachakonda The number of Kong entities seems to be moderate, so it is not a sizing issue. Are you able to monitor DNS traffic made by the Kong pod?
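For reference, one way to watch the DNS queries the proxy pod makes is a short capture on port 53. The deployment, namespace, and container names below are assumptions about the install, and this only works if tcpdump is available in the image (otherwise an ephemeral debug container or a node-level capture is needed):

```sh
# Capture ~200 DNS packets from the Kong proxy pod (names are assumptions)
kubectl exec -n kong deploy/kong-proxy -c proxy -- \
  tcpdump -n -i any -c 200 port 53
```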
@hanshuebner @nowNick - Yes, DNS traffic increased, and we tried the changes below; none of them stopped the Kong proxy from going unresponsive:
- Changed ndots from 5 to 2
- dns_valid_ttl: 53
- dns_valid_ttl: 30
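For anyone following along, this is a rough sketch of how those two settings can be applied to a Kong Deployment. The `kong-proxy` name and `kong` namespace are assumptions; `dns_valid_ttl` is a Kong configuration property and can be set through the `KONG_DNS_VALID_TTL` environment variable, while `ndots` is a pod-level DNS option:

```sh
# Set Kong's DNS cache TTL via its environment-variable override (names are assumptions)
kubectl set env -n kong deployment/kong-proxy KONG_DNS_VALID_TTL=30

# Lower ndots for the pod via dnsConfig (merge patch on the pod template)
kubectl patch deployment kong-proxy -n kong --type merge -p \
  '{"spec":{"template":{"spec":{"dnsConfig":{"options":[{"name":"ndots","value":"2"}]}}}}}'
```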
@kranthirachakonda Can you provide us with some information regarding the DNS traffic that you see? Are the requests that are sent all different, or are there multiple requests for the same name? How many requests do you see?
We are able to reproduce the issue in our non-prod environment as well, with a couple of service endpoints that use a mix of internal and external hostnames (FQDNs). I suspect that when one or a few of the problematic routes/services are invoked, the worker processes run out of timers or some other capacity, which makes Kong unresponsive. In that scenario the /status page, and any other service, does not respond for ~45 seconds. Based on our Grafana charts and real-time top I don't see high CPU/memory usage: CPU max is 500m and memory max is 512Mi. The DNS traffic and the traffic onto the Kong proxy are normal, e.g.:
Where do you see those? When Kong hangs, do you see a lot of open network connections?
Yeah, about 200 TIME_WAITs. I see HTTP latency of about 100s for a few calls, and sometimes 100% CPU usage for the Kong proxy container alone. I am able to reproduce the same issue on version 3.2.2 as well. Any help on how I can debug further?
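A quick, non-authoritative way to look at both things at once is to dump socket-state counts and per-process CPU inside the container, so a single nginx worker pinned at 100% stands out. Names are assumptions, and `ss`/`top` must exist in the image:

```sh
# Socket state summary (TIME_WAIT, ESTABLISHED, etc.)
kubectl exec -n kong deploy/kong-proxy -c proxy -- ss -s

# One batch snapshot of per-process CPU; look for a spinning nginx worker
kubectl exec -n kong deploy/kong-proxy -c proxy -- top -b -n 1
```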
We found the issue: timeouts from one of the external APIs, combined with our custom plugins, caused worker processes to go to 100% CPU. Updating those fixed our issue.
@kranthirachakonda Can you please share how you identified the root cause? I think I'm facing the same issue here; I'm new to Kong, so it would be great if you could help me with some tricks to identify it.
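Not an official procedure, but one way to start narrowing it down is to correlate the CPU spikes with the proxy error log, looking for upstream timeouts and plugin errors around the time /status stops answering. Deployment and namespace names below are assumptions:

```sh
# Scan the last 30 minutes of proxy logs for timeout/plugin-related errors
kubectl logs -n kong deploy/kong-proxy -c proxy --since=30m \
  | grep -Ei 'upstream timed out|plugin|timeout' | tail -n 50
```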
Discussed in #11709
Originally posted by kranthirachakonda October 7, 2023
Looks like we are running into Kong issue #8152 with 2.8.3 and also 3.0.1. We recently added a new route/service and there is an increase in traffic, but not a huge one (maybe ~15%). Only the proxy container inside the Kong pods gets restarted, because /status times out. We tried changing the probe timeout/period intervals, but that didn't help much. We don't see our containers going above their CPU/memory requests.
The behavior is identical: the Kong proxy container goes unresponsive for about 4-5 minutes and then processes requests again. We also observed that while I set a memory request of 4G, the proxy container never goes above 512Mi; not sure if there are default limits within the LuaVM, threads, etc. Can you please help me debug further?
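For context on the probe tuning mentioned above, this is roughly what adjusting the liveness probe timing looks like. The Deployment name, namespace, container index, and values are all assumptions, and in this case it only delays the restart rather than fixing the hang:

```sh
# Relax the proxy container's liveness probe timing (names/indexes are assumptions)
kubectl patch deployment kong-proxy -n kong --type json -p '[
  {"op":"replace","path":"/spec/template/spec/containers/0/livenessProbe/timeoutSeconds","value":10},
  {"op":"replace","path":"/spec/template/spec/containers/0/livenessProbe/failureThreshold","value":6}
]'
```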
db-less
kong:2.8.3
kubernetes-ingress-controller:2.8.2
Events:
These are a few errors I see while the proxy container is in a state where the status page fails to respond within 5s, causing kubelet to restart it when the liveness probe fails.
Admin page
Status page
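To reproduce the probe's view of the hang by hand, a call like the following prints the HTTP code and the time the status endpoint took to answer. It assumes the status API is exposed on port 8100 inside the pod, that curl is present in the image, and the usual name assumptions apply:

```sh
# Time /status from inside the proxy pod; a healthy endpoint answers well under 5s
kubectl exec -n kong deploy/kong-proxy -c proxy -- \
  curl -s -o /dev/null -w '%{http_code} %{time_total}s\n' http://localhost:8100/status
```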