[Bug]: Upgrading to 1.4.1 out of memory issue #2902
Comments
Hey @styoo4001, I'm really puzzled as to what could be the cause of that. By chance, is it possible for you to enable logging and send the logs to us?
I.e. we'd be interested in the logs in the very timeframe of the memory spike.
Alternatively, using the https://github.com/DataDog/ddprof native profiler and sharing the memory profiles with us could also be very interesting!
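For anyone following along, here is a minimal sketch of how the requested debug logging could be enabled; it assumes the documented DD_TRACE_DEBUG switch of dd-trace-php and should be adapted to the version actually deployed:

```sh
# A minimal sketch, not an official procedure: enable verbose tracer logging
# in the affected container's environment, then capture PHP's error_log output
# around the time of the memory spike.
export DD_TRACE_DEBUG=1         # verbose dd-trace-php logging
export DD_TRACE_STARTUP_LOGS=1  # one-time diagnostics when a traced request starts

# Quick sanity check that the ddtrace extension is loaded, and which version:
php -r 'echo phpversion("ddtrace"), PHP_EOL;'
```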
@bwoebi
A new version has already come out; I'm not sure if this log will help, but I've attached it.
The new version does not fix the memory growth issue, though. Sadly, there's nothing suspicious inside these logs. Would you be willing to run the native profiler? I'm pretty certain that one will help us more. Please also tell us which service and what time frame.
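For reference, a rough sketch of what running the ddprof native profiler around php-fpm might look like; the flag spellings and the service name below are assumptions and should be checked against ddprof's own --help:

```sh
# Hypothetical invocation: ddprof wraps the process it profiles. Verify the
# exact flags against https://github.com/DataDog/ddprof before relying on this.
ddprof --service my-api-service --environment production \
    php-fpm --nodaemonize
```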
Bug report
Hello,
Our company has been utilizing DataDog effectively, and we are running services in a Kubernetes environment.
Recently, approximately 30 minutes after deploying a specific API server feature, CPU usage spiked significantly, triggering additional pod scaling and a sharp rise in memory consumption that caused some pods to crash (out of memory). Notably, the php-fpm process itself did not exhibit any abnormal behavior.
After rolling back to last week's ArgoCD deployment image, the system stabilized. Initially, we assumed the issue was related to the deployed code and investigated it thoroughly, but we couldn't find any evidence in APM or the DataDog profiler. Crucially, the profiler showed the application's memory usage staying below 1 GB, yet under certain conditions pod memory usage spiked from 1 GB to 8 GB within a minute.
Eventually, we discovered that the issue stemmed from the tracer version being set to latest when building the deployment image. After pinning it to last week's version (1.3.1) and redeploying, the system stabilized.
We confirmed that the issue was resolved following this change.
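For others who run into this, a sketch of pinning the tracer to a known-good release at image build time instead of tracking latest; the URL follows the usual GitHub release-asset pattern and 1.3.1 is simply the version reported as stable in this thread:

```sh
# Sketch only: pin the dd-trace-php installer to an explicit release rather
# than "latest", so an image rebuild cannot silently pull in a newer tracer.
TRACER_VERSION=1.3.1
curl -LO "https://github.com/DataDog/dd-trace-php/releases/download/${TRACER_VERSION}/datadog-setup.php"
php datadog-setup.php --php-bin=all
```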
[Screenshots: left, usual condition; right, unusual condition]
My PHP settings are:
PHP version: 8.1.13
Tracer or profiler version: 1.4.1
Installed extensions: No response
Output of phpinfo(): No response
Upgrading from: 1.3.1