Redis-server killed by oom_killer at 19GB memory allocation #974
Seconded. I got these stats from redis just before it got killed again. I was interrupted at 80% of scanning a /21.
Okay. Been looking at it some more, and I saw a lot of memory usage, even though redis-server had been killed just a moment ago. I dropped the db that seemed to be transient:
-server_time_usec:1709405329080351
-uptime_in_seconds:588
+server_time_usec:1709406156005681
+uptime_in_seconds:1415
...
# Memory
-used_memory:3344099096
-used_memory_human:3.11G
-used_memory_rss:3001507840
-used_memory_rss_human:2.80G
+used_memory:171441592
+used_memory_human:163.50M
+used_memory_rss:194514944
+used_memory_rss_human:185.50M
...
# Keyspace
db0:keys=1,expires=0,avg_ttl=0
db1:keys=177456,expires=0,avg_ttl=0
-db6:keys=3468,expires=0,avg_ttl=0
With just 3400 entries in db6, it was apparently using those 2GB. The entries consisted of:
$ sudo docker exec -it greenbone-community-container_redis-server_1 redis-cli -s /run/redis/redis.sock -n 6
redis /run/redis/redis.sock[6]> keys *
1) "Cache/node.example.com/8200/excluding_404_body/URL_/ui/vcav-bootstrap/rest/vcav-providers/config.neon"
2) "Cache/node.example.com/8200/excluding_404_body/URL_/ui/vcav-bootstrap/rest/WEB-INF/local.properties"
3) "Cache/node.example.com/8200/excluding_404_body/URL_/ui/vropspluginui/rest/services/.env.example"
4) "Cache/node.example.com/8200/excluding_404_body/URL_/ui/vcav-bootstrap/rest/vcav-providers/local.properties"
...
3466) "Cache/node.example.com/8200/excluding_404_body/URL_/ui/psc-ui/resources/core/config/databases.yml"
3467) "Cache/node.example.com/8200/excluding_404_body/URL_/ui/vcav-bootstrap/rest/vcav-providers/example.key"
3468) "Cache/node.example.com/8200/excluding_404_body/URL_/ui/h5-vsan/rest/proxy/service/phinx.yml" Not sure what was in those values, but it was big enough.. |
Moved this issue to the ospd-openvas repository to let the @greenbone/scanner-maintainers have a look at it.
Could be a duplicate of greenbone/openvas-scanner#1488 and might not even be an issue in ospd-openvas. More reading:
Okay. I increased memory (and CPU) on the box, and now there is no excessive Redis key usage anymore. By increasing mem, I probably implemented fix no. 1 suggested at https://forum.greenbone.net/t/oom-killing-redis-on-large-scan-with-openvas/14251/2 -- "Prevent overloading the system [...]". I guess there is still something buggy, but I have no time to look into this more deeply at the moment. Thanks for the suggested links!
In the meantime I tried adding some limits, and now I'm running into this when doing a single scan:
redis-server:
  image: greenbone/redis-server
  command:
    # https://forum.greenbone.net/t/redis-oom-killed-on-one-host-scan/15722/5
    - /bin/sh
    - -c
    - 'rm -f /run/redis/redis.sock && cat /etc/redis/redis.conf >/run/redis/redis.conf && printf "%s\n" "maxmemory 12884901888" "maxmemory-policy volatile-ttl" "maxclients 150" "tcp-keepalive 15" >>/run/redis/redis.conf && redis-server /run/redis/redis.conf'
  logging:
    driver: journald
  restart: on-failure
  volumes:
    - redis_socket_vol:/run/redis/
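For reference, the printf in the command above appends these directives to the generated /run/redis/redis.conf (12884901888 bytes = 12 GiB):

```
maxmemory 12884901888
maxmemory-policy volatile-ttl
maxclients 150
tcp-keepalive 15
```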
🤔
Okay. It turned out the problem was indeed the "caching of web pages during CGI scanning" mentioned in oom-killing-redis-on-large-scan-with-openvas. A HashiCorp Vault server was yielding responses of about 770kB for lots and lots of scan URLs. I did the following mini-patch:
--- greenbone-community-container_vt_data_vol/_data/http_keepalive.inc.orig 2024-03-18 15:46:31.480951508 +0100
+++ greenbone-community-container_vt_data_vol/_data/http_keepalive.inc 2024-03-18 15:52:51.764904305 +0100
@@ -726,7 +726,8 @@ function http_get_cache( port, item, hos
# Internal Server Errors (5xx)
# Too Many Requests (429)
# Request Timeout (408)
- if( res !~ "^HTTP/1\.[01] (5(0[0-9]|1[01])|4(08|29))" )
+ # Size of response must be less than 1.5*64k
+ if( res !~ "^HTTP/1\.[01] (5(0[0-9]|1[01])|4(08|29))" && strlen( res ) < 98304 )
replace_kb_item( name:"Cache/" + host + "/" + port + "/" + key + "/URL_" + item, value:res );
}
And now I finally got past that server without things breaking.
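To double-check that the patch keeps db6 in bounds during a scan, a quick look along these lines should do (same container and socket as above):

```
# number of cached keys in db6, and overall memory use (watch used_memory_human)
$ sudo docker exec greenbone-community-container_redis-server_1 redis-cli -s /run/redis/redis.sock -n 6 DBSIZE
$ sudo docker exec greenbone-community-container_redis-server_1 redis-cli -s /run/redis/redis.sock INFO memory
```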
Ok. That looks good. Of course it would be better if we could make that [...]. I did not see an easy way to query Redis for the total size in a keyspace. But maybe the kbdb [1] itself could keep track of key count and value size. Thinking out loud... [1]
Doesn't look like there is something available for that yet. Alternatively, set a TTL on these values, so that they can be purged with volatile-ttl. (edit)
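As a sketch of that idea (not something ospd-openvas does today), giving the cache entries a TTL would let the volatile-ttl policy from the compose snippet above evict them under memory pressure:

```
redis /run/redis/redis.sock[6]> EXPIRE "Cache/node.example.com/8200/excluding_404_body/URL_/ui/vcav-bootstrap/rest/vcav-providers/config.neon" 3600
redis /run/redis/redis.sock[6]> TTL "Cache/node.example.com/8200/excluding_404_body/URL_/ui/vcav-bootstrap/rest/vcav-providers/config.neon"
```

Of course the real fix would be for the NVT/kbdb layer to set that TTL when it stores the item, not to run it by hand.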
This really looks like a duplicate of greenbone/openvas-scanner#1488 and IMHO should be closed as such (as this is also not a problem in ospd-openvas). Maybe you could post your (btw. great) analysis over there to have all info collected in one place?
If this is the only problematic dir for this specific target system, it should also be possible to exclude that whole dir in the setting.
I summarized the stuff here: greenbone/openvas-scanner#1488 (comment). I'm fine with closing.
I have a relatively small installation of Greenbone Community Containers running on Docker (one VM), with only 85 targets and 8 tasks.
The VM has 6 [email protected] and 16GB of vRAM.
When I start more than one scan in Greenbone, all scans get stopped because openvas crashes.
When the problem happens, Redis tries to allocate more memory than the total memory plus swap of the VM.
The redis-server prints this log:
vm.overcommit_memory is set in sysctl.conf like this:
vm.overcommit_ratio is calculated from the swap size.
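The exact values are not shown above; for reference, Redis's own startup warning recommends vm.overcommit_memory = 1, while vm.overcommit_ratio only has an effect in strict-accounting mode, so a sysctl.conf of the kind described would look roughly like this (values hypothetical):

```
# /etc/sysctl.conf (hypothetical values; the report above does not include the real ones)
vm.overcommit_memory = 2    # strict accounting; 1 = always overcommit (what Redis recommends)
vm.overcommit_ratio = 80    # only used when overcommit_memory = 2
```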
After Redis is killed, the openvas log shows: