We are running uWSGI as the primary WSGI server for a Django-based project with the following configuration:
[uwsgi]
chdir=/var/www/rest-api-python/
module=etc.wsgi:application
home=/var/www/rest-api-python/env
master=true
processes=4
# Processes should be equal to no of vcpus of machine
threads=15
socket=/var/run/python/django_server.sock
chmod-socket=750
#vacuum = true
uid=www-data
gid=www-data
touch-reload=/var/run/python/reload
master-fifo=/var/www/pipes/rest-api-python-fifo
lazy-apps=true
env=DJANGO_SETTINGS_MODULE=etc.settings
safe-pidfile=/var/run/python/django-server.pid
harakiri=1000
#limit-as = 128
max-requests=30000
route-uri=^/python/(.*) rewrite:/$1
enable-threads=true
disable-logging=true
log-4xx=true
log-5xx=true
env=prometheus_multiproc_dir=/var/run/python/prometheus/
listen=250
thunder-lock=true
What we observe is that under high request load all the workers die at around the same time, making the service unavailable. The workers are eventually respawned, but no new requests are served in the meantime. These are the uWSGI logs from one such incident:
[deadlock-detector] a process holding a robust mutex died. recovering...
[deadlock-detector] a process holding a robust mutex died. recovering...
Fri Feb 16 07:30:38 2024 - worker 2 (pid: 2214) is taking too much time to die...NO MERCY !!!
DAMN ! worker 2 (pid: 2214) died, killed by signal 9 :( trying respawn ...
Respawned uWSGI worker 2 (new pid: 67074)
Fri Feb 16 07:30:40 2024 - worker 1 (pid: 2213) is taking too much time to die...NO MERCY !!!
Fri Feb 16 07:30:40 2024 - worker 3 (pid: 2215) is taking too much time to die...NO MERCY !!!
Fri Feb 16 07:30:40 2024 - worker 4 (pid: 2216) is taking too much time to die...NO MERCY !!!
WSGI app 0 (mountpoint='') ready in 2 seconds on interpreter 0x55936c12fb00 pid: 67074 (default app)
DAMN ! worker 1 (pid: 2213) died, killed by signal 9 :( trying respawn ...
Respawned uWSGI worker 1 (new pid: 67108)
DAMN ! worker 3 (pid: 2215) died, killed by signal 9 :( trying respawn ...
Respawned uWSGI worker 3 (new pid: 67109)
DAMN ! worker 4 (pid: 2216) died, killed by signal 9 :( trying respawn ...
Respawned uWSGI worker 4 (new pid: 67110)
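To see how closely the forced kills cluster in time, log lines like the ones above can be summarized with a small script. This is only a minimal sketch, not a uWSGI tool: the function name `summarize` and the regular expressions are assumptions derived from the message wording shown above, and other uWSGI versions may word these lines differently.

```python
import re
from collections import Counter

# Patterns are based only on the log lines quoted in this issue (assumption:
# the wording is stable for this uWSGI build).
DIED_RE = re.compile(r"worker (\d+) \(pid: (\d+)\) died, killed by signal (\d+)")
NO_MERCY_RE = re.compile(
    r"^(?P<ts>\w{3} \w{3} +\d+ \d{2}:\d{2}:\d{2} \d{4}) - worker (?P<worker>\d+) "
    r"\(pid: (?P<pid>\d+)\) is taking too much time to die"
)


def summarize(lines):
    """Return (Counter of kill signals, list of (timestamp, worker, pid) forced kills)."""
    signals = Counter()
    forced = []
    for line in lines:
        m = DIED_RE.search(line)
        if m:
            signals[int(m.group(3))] += 1
            continue
        m = NO_MERCY_RE.search(line)
        if m:
            forced.append((m.group("ts"), int(m.group("worker")), int(m.group("pid"))))
    return signals, forced


# A few lines copied from the log excerpt above.
SAMPLE = """\
Fri Feb 16 07:30:38 2024 - worker 2 (pid: 2214) is taking too much time to die...NO MERCY !!!
DAMN ! worker 2 (pid: 2214) died, killed by signal 9 :( trying respawn ...
Respawned uWSGI worker 2 (new pid: 67074)
Fri Feb 16 07:30:40 2024 - worker 1 (pid: 2213) is taking too much time to die...NO MERCY !!!
""".splitlines()

if __name__ == "__main__":
    signals, forced = summarize(SAMPLE)
    print(dict(signals))
    print(forced)
```

Running the same function over the full log file (for example `summarize(open(path))`, where `path` is wherever the uWSGI log is written) gives one timestamped entry per worker that exceeded the mercy period.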
Sometimes we also see the following error just after the deadlock error:
corrupted double-linked list
worker 2 killed successfully (pid: 133287)
Respawned uWSGI worker 2 (new pid: 201887)
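"corrupted double-linked list" is a glibc heap-consistency message, after which glibc aborts the process with SIGABRT. One way to find out which Python code was running at that moment is the standard-library `faulthandler` module, which dumps the Python tracebacks of all threads on SIGSEGV, SIGFPE, SIGABRT, SIGBUS and SIGILL. The following is a minimal sketch under assumptions: the helper name `enable_crash_tracebacks` and the per-process log file name are hypothetical, and calling it from the project's WSGI module is a suggestion, not something the original setup does.

```python
import faulthandler
import os
import sys


def enable_crash_tracebacks(log_dir=None):
    """Dump tracebacks of all threads if this process gets a fatal signal.

    Hypothetical helper: with log_dir=None the dump goes to stderr,
    otherwise to a per-process file inside log_dir.
    """
    if log_dir is None:
        stream = sys.stderr
    else:
        path = os.path.join(log_dir, "faulthandler-%d.log" % os.getpid())
        # The file object must stay open for as long as faulthandler is enabled.
        stream = open(path, "a")
    faulthandler.enable(file=stream, all_threads=True)
    return stream
```

Because the configuration above sets `lazy-apps=true`, each worker loads the application itself, so a call placed at import time in the WSGI module would run once per worker and produce one file per worker pid. A SIGKILL (signal 9) cannot be intercepted, so this only helps for the abort and segfault cases.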
Can you please let us know if we have any configuration-level issues here or if it's something completely unrelated?
Ulimits of our machine:
real-time non-blocking time (microseconds, -R) unlimited
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 62311
max locked memory (kbytes, -l) 2000660
max memory size (kbytes, -m) unlimited
open files (-n) 65536
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 62311
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
Furthermore, I am also attaching the uWSGI startup logs, in case they help:
*** Starting uWSGI 2.0.18 (64bit) on [Wed Feb 21 06:21:02 2024] ***
compiled with version: 7.5.0 on 14 July 2020 07:19:16
os: Linux-6.2.0-1012-aws #12~22.04.1-Ubuntu SMP Thu Sep 7 14:01:24 UTC 2023
nodename: ip-10-61-33-114
machine: x86_64
clock source: unix
pcre jit disabled
detected number of CPU cores: 8
current working directory: /
detected binary path: /var/www/rest-api-python/env/bin/uwsgi
*** dumping internal routing table ***
[rule: 0] subject: request_uri regexp: ^/python/(.*) action: rewrite:/$1
*** end of the internal routing table ***
chdir() to /var/www/rest-api-python/
your processes number limit is 62311
your memory page size is 4096 bytes
*** WARNING: you have enabled harakiri without post buffering. Slow upload could be rejected on post-unbuffered webservers ***
detected max file descriptor number: 1024
lock engine: pthread robust mutexes
thunder lock: enabled
uwsgi socket 0 bound to UNIX address /var/run/python/django_server.sock fd 4
Python version: 3.6.9 (default, May 5 2020, 05:01:21) [GCC 7.5.0]
PEP 405 virtualenv detected: /var/www/rest-api-python/env
Set PythonHome to /var/www/rest-api-python/env
Python main interpreter initialized at 0x55f181a66b00
python threads support enabled
your server socket listen backlog is limited to 250 connections
your mercy for graceful operations on workers is 60 seconds
mapped 1096520 bytes (1070 KB) for 60 cores
*** Operational MODE: preforking+threaded ***
*** uWSGI is running in multiple interpreter mode ***
spawned uWSGI master process (pid: 789)
spawned uWSGI worker 1 (pid: 972, cores: 15)
writing pidfile to /var/run/python/django-server.pid
spawned uWSGI worker 2 (pid: 973, cores: 15)
writing pidfile to /var/run/python/django-server.pid
spawned uWSGI worker 3 (pid: 974, cores: 15)
writing pidfile to /var/run/python/django-server.pid
spawned uWSGI worker 4 (pid: 975, cores: 15)
writing pidfile to /var/run/python/django-server.pid
writing pidfile to /var/run/python/django-server.pid
unable to stat() /var/run/python/reload, events will be triggered as soon as the file is created
WSGI app 0 (mountpoint='') ready in 7 seconds on interpreter 0x55f181a66b00 pid: 975 (default app)
WSGI app 0 (mountpoint='') ready in 7 seconds on interpreter 0x55f181a66b00 pid: 974 (default app)
WSGI app 0 (mountpoint='') ready in 7 seconds on interpreter 0x55f181a66b00 pid: 973 (default app)
WSGI app 0 (mountpoint='') ready in 7 seconds on interpreter 0x55f181a66b00 pid: 972 (default app)
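The startup log reports "detected max file descriptor number: 1024", while the shell ulimit output shows an open-files limit of 65536, so the limits of the service process evidently differ from those of the interactive shell. On Linux the limits that actually apply to a running process can be read from `/proc/<pid>/limits`. A minimal sketch of that check follows; the function names are hypothetical.

```python
import re


def parse_proc_limits(text):
    """Parse the contents of /proc/<pid>/limits into {name: (soft, hard)}."""
    limits = {}
    for line in text.splitlines()[1:]:  # skip the header row
        # Columns are separated by runs of two or more spaces.
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) >= 3:
            limits[fields[0]] = (fields[1], fields[2])
    return limits


def open_files_limit(pid):
    """Return the (soft, hard) open-files limit of a running process (Linux only)."""
    with open("/proc/%d/limits" % pid) as fh:
        return parse_proc_limits(fh.read())["Max open files"]


if __name__ == "__main__":
    import os

    # Checks the current process; pass a uWSGI master or worker pid instead.
    print(open_files_limit(os.getpid()))
```

Calling `open_files_limit` with the pid of the uWSGI master or of a worker shows the soft and hard limits that process really runs with, which can then be compared with the shell values listed earlier.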
We see different messages on different occasions together with the deadlock detection, for example "malloc_consolidate(): unaligned fastbin chunk detected" and "corrupted double-linked list".
System config:
Python: 3.6.9
OS: Ubuntu "22.04.3 LTS (Jammy Jellyfish)"
Django: 2.2.4
uWSGI: 2.0.18
Our machine config is as follows:
vCPUs | 8
Memory (GiB) | 16.0
Memory per vCPU (GiB) | 2.0
Physical Processor | AMD EPYC 7R13 Processor
Clock Speed (GHz) | 3.6
CPU Architecture | x86_64