Find the number of times SpamAssassin has to run

In this article we’ll cover how you can find the number of times SpamAssassin has to run for your users on your VPS or dedicated server.

Can SpamAssassin slow down a server?
Enabling SpamAssassin for your users is a great way to reduce the amount of spam they deal with. However, if one user on your server receives an excessive amount of spam that SpamAssassin has to process, server load can increase.

When should I review SpamAssassin logs?
If you have a server load monitoring script set up to email you when the load on your server spikes, or if you’ve reviewed my article on advanced server load monitoring and noticed that your server’s load spikes at times, it’s a good idea to review how often SpamAssassin runs for the accounts on your server.

How can SpamAssassin slow down a server?
If one user on your server signs up for every marketing list they come across, or posts their email address in public places, they could eventually receive hundreds if not thousands of spam messages a day. Having your server scan all of these messages can make websites run slower, or delay other users trying to access their own email.

Please note that in order to follow the steps below, you’ll need to have root access to your VPS or dedicated server, so that you have access to the SpamAssassin logs.

Locate users with highest SpamAssassin executions

  1. Log in to your server via SSH as the root user.
  2. Run the following command:
    grep "SpamAssassin as" /var/log/exim_mainlog | awk -F"SpamAssassin as " '{print $2}' |
    awk '{print $1}' | sort | uniq -c | sort -n

    Code breakdown:

    grep "SpamAssassin as" /var/log/exim_mainlog
        Locate mentions of SpamAssassin in the Exim mail log.
    awk -F"SpamAssassin as " '{print $2}' | awk '{print $1}'
        Use the awk command with the field separator set to "SpamAssassin as " and print the second field, which is everything after that text. Then use awk again to print only the first column of that data (the username).
    sort | uniq -c | sort -n
        Sort the usernames, count how many times each one appears, and finally sort the counts numerically from lowest to highest.

    You should get back something like:
    3783 userna1
    4339 userna6
    5111 userna3
    6588 userna5

    So now we know that the userna5 user has had SpamAssassin run on at least 6,588 emails.

  3. Now we can see how often SpamAssassin has to scan messages for this user, hour by hour, with the following command:
    grep "SpamAssassin as userna5" /var/log/exim_mainlog | sed -e 's#-# #g' -e 's#:# #g' |
    awk '{print $1"-"$2"-"$3,$4}' | uniq -c

    Code breakdown:

    grep "SpamAssassin as userna5" /var/log/exim_mainlog
        Locate mentions of SpamAssassin in the Exim mail log for userna5, the user with the highest number of messages.
    sed -e 's#-# #g' -e 's#:# #g'
        Use the sed command to replace the hyphens (-) and colons (:) in the Exim mail log time stamps with spaces, so that each part of the date and time becomes its own column.
    awk '{print $1"-"$2"-"$3,$4}'
        Use the awk command to rebuild the date from the first three columns and print it along with the hour column.
    uniq -c
        Count the repeated time stamps, to see how many times SpamAssassin had to run each hour.

    You should get back something like:

    15 2013-01-16 00
    25 2013-01-16 01
    28 2013-01-16 02
    31 2013-01-16 03
    32 2013-01-16 04
    26 2013-01-16 05
    40 2013-01-16 06
    70 2013-01-16 07
    126 2013-01-16 08
    117 2013-01-16 09
    154 2013-01-16 10
    183 2013-01-16 11
    186 2013-01-16 12
    155 2013-01-16 13
    128 2013-01-16 14
    145 2013-01-16 15
    69 2013-01-16 16

    So that’s 1,530 times that SpamAssassin had to run so far today for that one user, and you can see that during some hours it had to run as many as 186 times.
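
If you run these checks often, the two pipelines from the steps above could be wrapped in small shell functions. The following is only a minimal sketch and not part of the original steps: the function names sa_runs_per_user and sa_runs_per_hour and the LOG variable are hypothetical names chosen for illustration. The per-hour function also swaps the sed step for awk's substr, and adds grep -w so that a name like userna5 does not also match a longer name such as userna50.

```shell
#!/bin/bash
# Sketch only: LOG, sa_runs_per_user and sa_runs_per_hour are hypothetical names.
# LOG defaults to the log path used in this article; override it to work on a copy.
LOG="${LOG:-/var/log/exim_mainlog}"

# Count SpamAssassin runs per user, busiest user listed last.
sa_runs_per_user() {
    grep "SpamAssassin as" "$LOG" \
        | awk -F"SpamAssassin as " '{print $2}' \
        | awk '{print $1}' | sort | uniq -c | sort -n
}

# Count SpamAssassin runs per hour for one user (passed as the first argument).
# Assumes each log line starts with "YYYY-MM-DD HH:MM:SS", as in the output above.
sa_runs_per_hour() {
    grep -w "SpamAssassin as $1" "$LOG" \
        | awk '{print $1, substr($2, 1, 2)}' | uniq -c
}
```

After loading these functions in a root shell, running sa_runs_per_user | tail -5 would list the five busiest users, and sa_runs_per_hour userna5 would give the hourly breakdown for one of them.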

You should now know how to locate users on your server that have high levels of SpamAssassin activity. If their usage is affecting your other users, you can disable SpamAssassin for these users and ask them to use their own local mail filtering, or a third-party spam filtering service.
