Chapter 30. Showing SPAM statistics

Table of Contents

30.1. Introduction and purpose
30.2. Step 1: Parsing the log file
30.3. Step 2: Creating the graph
30.4. Step 3: Uploading the image file to a server
30.5. Step 0: The full driver script

30.1. Introduction and purpose

This worked example shows how to set up a graph that is periodically generated offline to show spam statistics for a mail server. The graph shows daily summaries of three things: a) the total number of identified spam mails, b) the number of spam mails that were deleted immediately, and c) the number of suspected spam mails that were stored in the "spam" folder instead of being deleted immediately.

An example of the graph we will generate is shown in Figure 30.1.

Figure 30.1. Spam statistics

The graph above makes two assumptions:

  1. The spam setup has two levels of identification that determine how spam is handled. Suspected email is either deleted immediately (by delivering it to /dev/null) or stored in the user's spam folder.

  2. The log files from the mail server are available for analysis.

In the following we will construct a complete PHP command line script that is run periodically, analyses the email logs, and produces a graph similar to the one shown above. The script assumes that the log file uses the procmail log format, so that the folder in which each mail is stored is logged.
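To make the log format assumption concrete, the following is a minimal sketch of how the folder line of a single procmail log entry might be classified. It assumes the usual procmail layout, where each delivery is logged as a "From " line followed by a " Subject:" line and a "  Folder:" line. The function name classifyFolder and the folder name "spam" are hypothetical and are not taken from the parser class used in this chapter.

```php
<?php
// Hypothetical helper: classify the "Folder:" line of one procmail log entry.
// Assumes deleted spam is delivered to /dev/null and suspected spam to a
// folder whose base name is "spam"; adjust both to the local setup.
function classifyFolder(string $line): ?string
{
    // A folder line looks like "  Folder: /dev/null      1234"
    if (!preg_match('/^\s*Folder:\s+(\S+)/', $line, $m)) {
        return null;            // Not a folder line
    }
    if ($m[1] === '/dev/null') {
        return 'deleted';
    }
    if (basename($m[1]) === 'spam') {
        return 'suspected';
    }
    return 'other';             // Regular mail
}

// One sample log entry (made-up data, for illustration only)
$entry = [
    'From someone@example.com  Mon Jan  1 12:00:00 2024',
    ' Subject: Cheap watches',
    '  Folder: /dev/null                                     1234',
];
foreach ($entry as $line) {
    $kind = classifyFolder($line);
    if ($kind !== null) {
        echo $kind, "\n";
    }
}
```

Lines that are not folder lines yield null, so a caller can feed the whole log through the function and count only the non-null results per day.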

Warning

On very high-load email servers, doing log file analysis in PHP is probably not a good idea, since both run time and memory use can become a problem. We do not claim that the scripts below are optimized enough to be used on high-volume mail servers.

The script will consist of three parts:

  1. A parser to scan the log file and create the data

  2. A suitable graph script to create an accumulated bar graph

  3. Uploading of the created image file with the graph to a server where it will be displayed

We will therefore use three classes, one for each of the steps above.

To define the different files and FTP credentials we will use the following symbolic constants, which must be defined according to the system setup. Constants that must be adjusted are marked as "...".

<?php
// FTP Server credentials
DEFINE('FTP_SERVER','...');
DEFINE('FTP_UID','...');
DEFINE('FTP_PWD','...');
 
// Directory on FTP server where the image should be stored
DEFINE('FTP_DIR','...');
 
// Which procmail logfile to read
DEFINE('PROCMAIL_LOGFILE','...');
 
// Two-week window (in days) to display
DEFINE('WINDOWSIZE',14);
 
// Where to store the temporary image file
DEFINE('IMGFILE','/tmp/spamstat.png');
?>
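Since a forgotten placeholder would otherwise only show up later as a failed FTP login or a missing log file, it may be worth checking the constants before the script does any work. The following is a minimal sketch of such a check. The function findUnsetConstants and the DEMO_ constants are hypothetical and are not part of the original script.

```php
<?php
// Hypothetical helper: return the names of those constants that are either
// undefined or still hold the "..." placeholder.
function findUnsetConstants(array $names): array
{
    $unset = [];
    foreach ($names as $name) {
        if (!defined($name) || constant($name) === '...') {
            $unset[] = $name;
        }
    }
    return $unset;
}

// Demonstration with made-up constants
define('DEMO_SERVER', 'ftp.example.com');   // Configured
define('DEMO_UID', '...');                  // Still a placeholder
// DEMO_PWD is deliberately left undefined

$missing = findUnsetConstants(['DEMO_SERVER', 'DEMO_UID', 'DEMO_PWD']);
echo 'Please configure: ', implode(', ', $missing), "\n";
```

In the real script the helper would be called with the names FTP_SERVER, FTP_UID, FTP_PWD, FTP_DIR and PROCMAIL_LOGFILE, and the script would stop if the returned list is not empty.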

The whole process is then driven by the following relatively small main script:

<?php
// Use text-based error handling and log potential errors to the
// default system logger
JpGraphError::SetImageFlag(false); 
JpGraphError::SetLogFile('syslog');
JpGraphError::SetTitle('Spamstat Message: ');
 
//---------------------------------------------------------------------------    
// Step 1) Get the statistics. We return a window of WINDOWSIZE days
//---------------------------------------------------------------------------
$parser = new ParseProcmailLogFile(PROCMAIL_LOGFILE);
list($xdata, $ydata, $y2data) = $parser->GetStat(WINDOWSIZE);
 
//---------------------------------------------------------------------------
// Step 2) Create the graph and store it in the file IMGFILE
//---------------------------------------------------------------------------
$width = 650; $height = 420;
$sgraph = new SpamGraph($width,$height);
$sgraph->Create(IMGFILE,$xdata,$ydata,$y2data);
 
//---------------------------------------------------------------------------
// Step 3) Upload the file to the FTP_SERVER server and store the
// local file IMGFILE with the same base name in directory FTP_DIR  
//---------------------------------------------------------------------------
$ftp = new FTPUploader(FTP_SERVER,FTP_UID,FTP_PWD);
$ftp->Upload(IMGFILE,FTP_DIR);
?>

For brevity we have excluded the lines that define the symbolic constants above, as well as the inclusion of the necessary library files.
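The main script only relies on GetStat() returning three parallel arrays limited to the most recent WINDOWSIZE days. As an illustration of that windowing step, the following is a minimal sketch under stated assumptions: the per-day counters are keyed by ISO date, every day has both counters, and deleted mails go into the first data array and suspected mails into the second (the text above does not specify which is which). The function lastDaysWindow is hypothetical and is not the implementation of GetStat().

```php
<?php
// Hypothetical helper: reduce per-day counters to three parallel arrays
// covering the most recent $windowSize days.
// $perDay maps 'YYYY-MM-DD' => ['deleted' => int, 'suspected' => int].
function lastDaysWindow(array $perDay, int $windowSize): array
{
    ksort($perDay);                              // ISO dates sort chronologically
    $window = array_slice($perDay, -$windowSize, null, true);
    $xdata  = array_keys($window);               // Day labels for the X axis
    $ydata  = array_column($window, 'deleted');
    $y2data = array_column($window, 'suspected');
    return [$xdata, $ydata, $y2data];
}

// Made-up counters, deliberately out of order
$perDay = [
    '2024-01-03' => ['deleted' => 7, 'suspected' => 1],
    '2024-01-01' => ['deleted' => 5, 'suspected' => 2],
    '2024-01-02' => ['deleted' => 9, 'suspected' => 0],
];
list($xdata, $ydata, $y2data) = lastDaysWindow($perDay, 2);
print_r($xdata);
```

Keeping the three arrays index-aligned is what allows the graph class to plot them as stacked bars per day without any further lookup.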

In the following sections we will briefly discuss each of the three support classes.

Note

A live example can be found on the JpGraph home page, where this script is run by a daily cron job. The graph is available at http://www.aditus.nu/jpgraph/spamstat.php