Preparing the SQL database in SpamAssassin

January 19, 2026 by Roberto Puzzanghera 6 comments

Choosing the right perl DataBase Interface (DBI) module

Depending on whether you use MySQL or MariaDB, and if you use MySQL things also depend on the version, you have to install DBD::mysql or DBD::MariaDB perl DBI module. Other connectors for SQLite or Oracle are not covered here.

Here's a summary of the situation, which has become quite complicated:

DB server DB server version Connector Connector version
MariaDB any DBD::MariaDB latest version
MySQL 8.x or newer DBD::mysql 5.x
MySQL prior to v. 8 DBD::mysql 4.054 or latest of 4.x series

Install the perl module as follows (-T avoids the tests, capitalization is important in module's names):

MariaDB users:

cpan -T install DBD::MariaDB

Users with MySQL v. 8.x or newer:

cpan -T install DBD::mysql

MySQL users with version prior to v. 8.x have to call the specific version in this way, otherwise the latest version from 5.x series will be installed:

cpan -T install DVEEDEN/DBD-mysql-4.054.tar.gz

Setting up the database

Enter the mysql prompt as root and create the database and the user:

> mysql -u root -p

CREATE DATABASE spamassassin;
CREATE USER 'spamassassin'@'localhost' IDENTIFIED BY '***';
GRANT USAGE ON * . * TO 'spamassassin'@'localhost' IDENTIFIED BY '***' WITH MAX_QUERIES_PER_HOUR 0 MAX_CONNECTIONS_PER_HOUR 0 MAX_UPDATES_PER_HOUR 0 MAX_USER_CONNECTIONS 0 ;

Setting up the tables' layout

Let's prepare in advance the layout for the tables of the vaious plugins that we are going to install later:

USE spamassassin;

-- userprefs
CREATE TABLE userpref (
  username varchar(100) NOT NULL default '',
  preference varchar(50) NOT NULL default '',
  value varchar(255) NOT NULL default '',
  prefid int(11) NOT NULL auto_increment,
  ts timestamp NOT NULL DEFAULT current_timestamp() ON UPDATE current_timestamp(),
  PRIMARY KEY  (prefid),
  KEY username (username)
) ENGINE=InnoDB; 
GRANT ALL PRIVILEGES ON spamassassin . * TO 'spamassassin'@'localhost';
-- The column holding the timestamp can be useful in the future to purge very old records

-- TxRep
CREATE TABLE txrep (
  username varchar(100) NOT NULL default '',
  email varchar(255) NOT NULL default '',
  ip varchar(40) NOT NULL default '',
  msgcount int(11) NOT NULL default '0',
  totscore float NOT NULL default '0',
  signedby varchar(255) NOT NULL default '',
  last_hit timestamp NOT NULL default CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (username,email,signedby,ip),
  KEY last_hit (last_hit)
) ENGINE=InnoDB;

-- Bayes
CREATE TABLE bayes_expire (
  id int(11) NOT NULL default '0',
  runtime int(11) NOT NULL default '0',
  KEY bayes_expire_idx1 (id)
) ENGINE=InnoDB;

CREATE TABLE bayes_global_vars (
  variable varchar(30) NOT NULL default '',
  value varchar(200) NOT NULL default '',
  PRIMARY KEY  (variable)
) ENGINE=InnoDB;

INSERT INTO bayes_global_vars VALUES ('VERSION','3');

CREATE TABLE bayes_seen (
  id int(11) NOT NULL default '0',
  msgid varchar(200) binary NOT NULL default '',
  flag char(1) NOT NULL default '',
  PRIMARY KEY  (id,msgid)
) ENGINE=InnoDB;

CREATE TABLE bayes_token (
  id int(11) NOT NULL default '0',
  token binary(5) NOT NULL default '',
  spam_count int(11) NOT NULL default '0',
  ham_count int(11) NOT NULL default '0',
  atime int(11) NOT NULL default '0',
  PRIMARY KEY  (id, token),
  INDEX bayes_token_idx1 (id, atime)
) ENGINE=InnoDB;

CREATE TABLE bayes_vars (
  id int(11) NOT NULL AUTO_INCREMENT,
  username varchar(200) NOT NULL default '',
  spam_count int(11) NOT NULL default '0',
  ham_count int(11) NOT NULL default '0',
  token_count int(11) NOT NULL default '0',
  last_expire int(11) NOT NULL default '0',
  last_atime_delta int(11) NOT NULL default '0',
  last_expire_reduce int(11) NOT NULL default '0',
  oldest_token_age int(11) NOT NULL default '2147483647',
  newest_token_age int(11) NOT NULL default '0',
  PRIMARY KEY  (id),
  UNIQUE bayes_vars_idx1 (username)
) ENGINE=InnoDB;

-- DecodeShortURLs
CREATE TABLE `short_url_cache` (
  `short_url` VARCHAR(255) NOT NULL,
  `decoded_url` VARCHAR(512) NOT NULL,
  `hits` INT NOT NULL DEFAULT 1,
  `created` INT(11) NOT NULL,
  `modified` INT(11) NOT NULL,
  PRIMARY KEY (`short_url`)
) ENGINE = InnoDB;
-- Maintaining index for cleaning is likely more expensive than occasional full table scan
ALTER TABLE `short_url_cache` ADD INDEX `short_url_created` (`created`);

-- Redirectors
CREATE TABLE `redir_url_cache` (
  `redir_url` VARCHAR(255) NOT NULL,
  `target_url` VARCHAR(512) NOT NULL,
  `hits` INT NOT NULL DEFAULT 1,
  `created` INT(11) NOT NULL,
  `modified` INT(11) NOT NULL,
  PRIMARY KEY (`redir_url`)
) ENGINE = InnoDB;
-- Maintaining index for cleaning is likely more expensive than occasional full table scan
ALTER TABLE `redir_url_cache` ADD INDEX `redir_url_created` (`created`);

Configuring DecodeShortURLs and DecodeShortURLs plugins

DecodeShortURLs / UrlShortener automatically expands short URLs (tinyurl, bit.ly, etc.) into canonical URLs and normalizes URLs for scoring. It also has a cache in the database to avoid repeated requests.

Redirectors follows real HTTP redirects (301, 302, meta-refresh, etc.) to get the final destination of a URL. It also caches the result in the database.

The settings for the various plugins will be presented in the following pages. Let's configure here in this page just the DecodeShortURLs and Redirectors plugins, as their settings are very simple.

First of all enable them uncommenting the following in the v402.pre file

loadplugin Mail::SpamAssassin::Plugin::Redirectors

and the following in the v400.pre file:

loadplugin Mail::SpamAssassin::Plugin::DecodeShortURLs

Now add their settings in the 90-sql.conf file, which we will use to save all SQL related settings:

# spamassassin MySQL user pwd
MYSQL_PWD=xxxxxxx
# mysql host
MYSQL_HOST=localhost
# perl DBI driver
PERL_DBI="DBD:MariaDB"

cat >> /etc/mail/spamassassin/90-sql.cf << __EOF__
# DecodeShortURLs plugin 
# https://spamassassin.apache.org/full/4.0.x/doc/Mail_SpamAssassin_Plugin_DecodeShortURLs.html 
# https://github.com/apache/spamassassin/blob/trunk/sql/decodeshorturl_mysql.sql 
url_shortener_cache_type        dbi
url_shortener_cache_dsn         ${PERL_DBI}:database=spamassassin;host=localhost
url_shortener_cache_username    spamassassin 
url_shortener_cache_password    ${MYSQL_PWD} 

# Redirectors plugin 
# https://spamassassin.apache.org/full/4.0.x/doc/Mail_SpamAssassin_Plugin_Redirectors.html 
# https://github.com/apache/spamassassin/blob/trunk/sql/redirectors_mysql.sql 
url_redirector_cache_type       dbi 
url_redirector_cache_dsn        ${PERL_DBI}:database=spamassassin;host=localhost
url_redirector_cache_username   spamassassin 
url_redirector_cache_password   ${MYSQL_PWD}

__EOF

In the above command, just set your MariaDB/MySQL password for the user spamassassin, and the connector (driver) that you are using in the PERL_DBI variable (DBD:MariaDB or DBD:mysql).

The MYSQL_HOST is usually localhost if your database server lives in the same host of SpamAssassin.

Purging the cache

Create a file /usr/local/bin/purge_sa_cache.sh in order to purge the records older than one year from the cache:

#!/bin/bash

DB_USER="spamassassin"
DB_PASS="password"
DB_NAME="spamassassin"
DB_HOST="localhost"
DB_PORT="3306"
MYSQL_BIN=/usr/bin/mysql
# purge Redirectors $MYSQL_BIN -u"$DB_USER" -p"$DB_PASS" -h "$DB_HOST" -P "$DB_PORT" "$DB_NAME" << __EOF__ DELETE FROM redir_url_cache WHERE created < UNIX_TIMESTAMP(DATE_SUB(NOW(), INTERVAL 1 YEAR));
# purge DecodeShortURLs DELETE FROM short_url_cache WHERE created < UNIX_TIMESTAMP(DATE_SUB(NOW(), INTERVAL 1 YEAR)); __EOF__

Adjust the privileges

chmod +x /usr/local/bin/purge_sa_cache.sh
chmod go-wrx /usr/local/bin/purge_sa_cache.sh

Add to the crontab:

cat >> /etc/cron.d/qmail << __EOF__

# purge SpamAssassin's cache
0 3 * * 0 root /usr/local/bin/purge_sa_cache.sh >> /var/log/cron 2>&1
__EOF__

Comments

Redirector and Decode Short Url dsn config issue

Hi Roberto ,

In the config example given above for 90-sql.cf  Redirector Plugin and DecodeShort URL Plugin the dsn config does not work  and gives a error in the spamd.log as below 

Thu May 14 04:58:49 2026 [111446] warn: Redirectors: invalid cache configuration
Thu May 14 05:06:18 2026 [111446] warn: Redirectors: invalid cache configuration

This is the current config given by you in 90-sql.cf

PERL_DBI="DBD:MariaDB"

url_redirector_cache_type dbi
url_redirector_cache_dsn ${PERL_DBI}:database=spamassassin;host=localhost;
url_redirector_cache_username spamassassin

While I have modified it to the below and I can see the data populate in the Mariadb database and the error logs have also stoppeed.

PERL_DBI="DBI:MariaDB"

url_redirector_cache_type dbi
#url_redirector_cache_dsn            DBI:MariaDB:database=spamassassin;host=localhost;port=3306
url_redirector_cache_dsn              DBI:MariaDB:spamassassin:localhost
url_redirector_cache_username    spamassassin
url_redirector_cache_password    ${MYSQL_PWD}

Based on the above I did also modify the DecodeShortURLs plugin dsn setting in 90-sql.cf

url_shortener_cache_type             dbi
url_shortener_cache_dsn              DBI:MariaDB:spamassassin:localhost
url_shortener_cache_username    spamassassin
url_shortener_cache_password    ${MYSQL_PWD}

Can you check if you also get the following error in your spamd.log file 

warn: Redirectors: invalid cache configuration

Regards

Shailendra

Reply |

Re: Redirector and Decode Short Url dsn config issue

Hi Shailendra,

no, I don't get that error. Anyway I'll add the port specification to the configuration

Reply |

Redirector and Decode Short Url dsn config issue

If you specify the port number in the dsn config

${PERL_DBI}:database=spamassassin;host=localhost;port=3306

getting the below error in logs 

config: SQL error: Connection error: port cannot be specified when host is localhost or embedded
Wed May 13 11:24:46 2026 [267493] warn: Redirectors: cache connect failed: DBI connect('database=spamassassin;host=localhost;port=3306','spamassassin',...) failed: Connection error: port cannot be specified when host is localhost or embedded at /usr/local/share/perl5/Mail/SpamAssassin/Plugin/Redirectors.pm line 546.

This I have checked on 2 servers using MariaDB version 10.5.29. 

Will dig further for the root cause .  Though things were working smoothly was just doing some housekeeping and came across a few of the error messages . 

Reply |

Re: Redirector and Decode Short Url dsn config issue

I didn't get the same issue because I have mariadb in the localnet, so my situation is something like

${PERL_DBI}:database=spamassassin;host=192.168.5.66;port=3306

If you can help on spotting the cause of this error, that would be very much appreciated

Reply |

Redirector and Decode Short Url dsn config issue

Point on your db sits on the localnet while on a few of my installations its on the localhost (same server) itself. 

MariaDB (and MySQL) treats the host localhost as a special case that defaults to using a Unix socket (on Linux/macOS) or shared memory/named pipes (on Windows) rather than a network port.

Workaround would be 

Change the host from localhost to 127.0.0.1 in your connection string or configuration. Using the IP address forces the client to use TCP/IP, which then allows (and requires) a port number.

I would assume that majority of the installations would be on the localhost, they might surely run into this issue on MariaDB/Mysql.  Well this did not use to be the case with Spamassassin 4.0.1. with the upgrade to v4.0.2 and the /usr/local/share/perl5/Mail/SpamAssassin/Plugin/Redirectors.pm comming into picture I guess port config or unix socket is embedded in to the code, hence the error. 

My old server on Spamassassin v4.0.1 did not had this issue but just a few days back I upgraded it to v4.0.2 and that is when I noticed this issue and then happenned to check on my newer servers and stumbled upon the error logs.

Reply |

Re: Redirector and Decode Short Url dsn config issue

Unfortunately for mariadb/mysql spamassassin@localhost and spamassassin@127.0.0.1 are two different users, so I would have to change the configuration here.

If you have at least one server in 127.0.0.1, can you suggest a working configuration?

Reply |

Recent comments
See also...
Recent posts

RSS feeds