This repository has been archived by the owner on Oct 2, 2023. It is now read-only.

ClamAV stopped scanning files #93

Open · nidhigwari opened this issue Jul 21, 2022 · 11 comments

Comments

@nidhigwari commented Jul 21, 2022

We use ClamAV to scan files for our application, using the S3/SQS/ClamAV integration. It seems to have stopped working suddenly.
ClamAV version: 0.103.6

@andreaswittig (Contributor) commented:

Sorry, we do not provide support for this free/open-source project. Check out our solution bucketAV, which includes professional support: https://bucketav.com

andreaswittig closed this as not planned on Jul 21, 2022
@rmerrellgr commented:

Oddly enough, we had this happen to us yesterday as well. Every instance of the s3-virusscan that we had running on a t3.micro suddenly died at the same time. Log inspection led us to find that they had all run out of RAM and the OOM killer killed clamd, but when systemd tried restarting it, it couldn't. I don't know enough about how clamd works when it phones home to get signature updates, but one theory is that it pulled an update yesterday that maxed out all the RAM on the smaller instances. We fixed it by launching new instances.
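
For anyone wanting to confirm the same theory on their own instances, here is a minimal sketch that scans the kernel journal for oom-killer events involving clamd. It assumes systemd-journald is readable by the current user; on Amazon Linux the same kernel lines also land in /var/log/messages.

```python
# Minimal sketch: look for oom-killer events that involved clamd.
# Assumes journalctl is available and readable by the current user.
import subprocess

def clamd_oom_events(since="2 days ago"):
    # `journalctl -k` restricts output to kernel messages, where the OOM killer logs.
    out = subprocess.run(
        ["journalctl", "-k", "--since", since, "--no-pager"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [
        line for line in out.splitlines()
        if "oom-killer" in line or ("Killed process" in line and "clamd" in line)
    ]

if __name__ == "__main__":
    for line in clamd_oom_events():
        print(line)
```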

andreaswittig reopened this on Jul 21, 2022
@andreaswittig (Contributor) commented:

@nidhigwari Sorry, I was too fast and harsh.

@rmerrellgr Thanks for providing more context.

@nidhigwari (Author) commented:

Thanks @rmerrellgr!
We have launched new instances, but the service is still not working. We also see a freshclam-related error: "WARNING: FreshClam previously received error code 429 or 403 from the ClamAV Content Delivery Network (CDN). This means that you have been rate limited or blocked by the CDN."

@andreaswittig (Contributor) commented Jul 21, 2022

@nidhigwari ClamAV introduced very strict throttling limits. We have been running into those limits as well and are now hosting our own mirror of the malware database.
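
For illustration only, a rough sketch of what such a mirror can look like: serve previously downloaded .cvd/.cdiff files over plain HTTP and point freshclam on the scan instances at that host (e.g. via PrivateMirror in freshclam.conf). The directory, port, and how the files are kept current (for example with the cvdupdate tool) are assumptions, not part of this project.

```python
# Rough sketch of a private ClamAV database mirror: serve already-downloaded
# .cvd/.cdiff files over HTTP so freshclam on the scan instances can pull from
# this host instead of the public CDN. MIRROR_DIR and PORT are assumptions;
# keeping the files up to date is out of scope here.
import functools
import http.server

MIRROR_DIR = "/srv/clamav-mirror"  # contains main.cvd, daily.cvd, bytecode.cvd, ...
PORT = 8080

Handler = functools.partial(http.server.SimpleHTTPRequestHandler, directory=MIRROR_DIR)

with http.server.ThreadingHTTPServer(("0.0.0.0", PORT), Handler) as httpd:
    print(f"Serving ClamAV databases from {MIRROR_DIR} on port {PORT}")
    httpd.serve_forever()
```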

@andreaswittig (Contributor) commented:

> @rmerrellgr: Oddly enough, we had this happen to us yesterday as well. Every instance of the s3-virusscan that we had running on a t3.micro suddenly died at the same time. […]

Is it possible that you tried to scan a "large" S3 object? Did you check the dead-letter queue?
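
In case it helps with that check, here is a quick boto3 sketch for looking at the dead-letter queue depth. The queue name is an assumption; substitute whatever name your stack actually created.

```python
# Quick sketch: check how many messages ended up in the dead-letter queue.
# The queue name below is an assumption; use the one your stack created.
import boto3

sqs = boto3.client("sqs")

dlq_url = sqs.get_queue_url(QueueName="s3-virusscan-dead-letter-queue")["QueueUrl"]
attrs = sqs.get_queue_attributes(
    QueueUrl=dlq_url,
    AttributeNames=["ApproximateNumberOfMessages"],
)["Attributes"]

print("Messages in dead-letter queue:", attrs["ApproximateNumberOfMessages"])
```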

@michaelwittig (Contributor) commented:

@rmerrellgr what is the value of the SwapSize parameter?

@awsnicolemurray commented:

> @nidhigwari ClamAV introduced very strict throttling limits. We have been running into those limits as well and are now hosting our own mirror of the malware database.

What is the recommendation? How does the customer determine whether the issue is caused by throttling? Currently no files are being scanned, and the issue impacts the dev, staging, and prod environments. All appear to have been impacted on the same day.

Please help us understand what changes were made since July 15th so we can determine the best course of action for troubleshooting.

@rmerrellgr commented Jul 21, 2022

@andreaswittig Nope, no large file scans (no scans at all for some time before the crash, actually). But as I suspected, this is what we found in the logs:

  Jul 20 11:41:09 clamd[27447]: Database correctly reloaded (8622752 signatures)
  Jul 20 11:41:11 clamd[27447]: Activating the newly loaded database...
  Jul 20 11:41:13 kernel: amazon-cloudwat invoked oom-killer:
  (followed by 100+ lines of OOM killer output, which ultimately led to clamd being killed)
  Jul 20 11:41:13 kernel: Killed process 27447 (clamd)
  Jul 20 11:41:13 systemd: Unit [email protected] entered failed state.
  Jul 20 11:41:14 systemd: [email protected] holdoff time over, scheduling restart.
  Jul 20 11:48:14 systemd: [email protected] start operation timed out. Terminating.

At that point it just loops forever, trying to start back up but failing. I decided it would be easier to launch replacement instances and be done with it.

I think it's safe to say that this isn't a Widdix problem. We have production workloads running on larger instance types, and they did not suffer the same fate. I just found it peculiar that our dev servers died unexpectedly and then someone else reported that theirs did as well. I do not believe any action needs to be taken on your part, however.

And to answer your other question, these t3.micro instances have SwapSize set to 2 in the CloudFormation config.
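
Side note, in case it is useful for comparing instances: a small sketch for checking whether the configured swap is actually active, by reading /proc/meminfo (Linux only).

```python
# Small sketch: report memory and swap figures from /proc/meminfo (Linux only),
# useful for checking whether the configured swap is actually active.
def meminfo():
    info = {}
    with open("/proc/meminfo") as f:
        for line in f:
            key, value = line.split(":", 1)
            info[key] = value.strip()
    return info

if __name__ == "__main__":
    m = meminfo()
    for key in ("MemTotal", "MemAvailable", "SwapTotal", "SwapFree"):
        print(f"{key}: {m.get(key, 'n/a')}")
```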

@andreaswittig (Contributor) commented:

@awsnicolemurray I'd recommend checking the logs.
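
For example, a minimal sketch for spotting the CDN rate-limit warnings in the freshclam log; the log path is an assumption and varies by distribution and configuration.

```python
# Minimal sketch: look for CDN rate-limit / block markers in the freshclam log.
# The log path is an assumption and differs per distribution and configuration.
FRESHCLAM_LOG = "/var/log/freshclam.log"

MARKERS = ("429", "403", "rate limited", "blocked by the CDN")

with open(FRESHCLAM_LOG) as f:
    hits = [line.rstrip() for line in f if any(m in line for m in MARKERS)]

if hits:
    print("Possible CDN rate limiting / blocking:")
    for line in hits[-10:]:
        print(" ", line)
else:
    print("No rate-limit markers found in", FRESHCLAM_LOG)
```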

@andreaswittig (Contributor) commented:

@rmerrellgr Interesting, I haven't observed anything like this before.
