Version and revision: V1.0
For: NethServer Administrators
Skill: Beginner
Published: 2021-02-21
Review: tbd
Contact: @capote
Manual training of the bayes filter (rspamd)
For various reasons it may be necessary to manually train the bayes filter of rspamd:
- the database is corrupt
- fast learning of freshly implemented systems to shorten the learning phase
- reinitialize the database because users have made too many classification mistakes
This user guide describes how to manually train the bayes filter using collected sample spam mails.
System preparation
- You need ssh access to your system or use the web app “Terminal” in Cockpit.
- You need a program that can unpack 7z files.
- How to install the 7z unpack program:
~# yum install p7zip
This will install the 7zip program. Please note that the command to call the utility is not 7zip or p7zip, but 7za. See the following articles to get started with 7zip:
Extract .7z File in Linux
Create .7z File in Linux
Create .7z File From Folder Recursively in Linux
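Because the package name (p7zip) and the command name (7za) differ, it can help to check for the command itself before continuing. The following is a minimal sketch; the helper name require_cmd and the archive name samples.7z are only illustrations and do not come from the packages.

```shell
#!/bin/sh
# Return 0 if the named command is on PATH; otherwise print a hint and return 1.
# require_cmd is a hypothetical helper name used for this sketch only.
require_cmd() {
    command -v "$1" >/dev/null 2>&1 && return 0
    echo "missing command: $1" >&2
    return 1
}

# Example calls (not run automatically):
#   require_cmd 7za || yum install p7zip
#   7za x 2021-01.7z          # extract an archive, keeping its folder structure
#   7za a samples.7z 2021/    # create an archive from a folder (samples.7z is a made-up name)
```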
Use Case 1: manual training for a freshly installed system
- log in to your system
- download spam samples from http://untroubled.org/spam/
- Example:
~# wget http://untroubled.org/spam/2021-01.7z
- unpack samples:
~# 7za x 2021-01.7z
- check the current number of learned samples
~# rspamc stat
- note the value in the line total learns (on a fresh system: total learns: 0)
- train the filter with the downloaded samples:
~# rspamc learn_spam 2021/*
- check the number of learned samples again with rspamc stat and compare it with the earlier value: the number of total learns should have increased
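The before/after comparison from the steps above can be sketched as a small script. It assumes that rspamc stat prints a line of the form "Total learns: <n>" (the match ignores case); the function names total_learns and train_and_compare are made up for this example.

```shell
#!/bin/sh
# Print the number from the "Total learns" line of `rspamc stat` output read on stdin.
# Assumption: the line looks like "Total learns: 123" (upper or lower case).
total_learns() {
    awk -F': *' 'tolower($1) == "total learns" { print $2; exit }'
}

# Hypothetical wrapper: learn all samples below a directory as spam and
# print the counter before and after the training run.
train_and_compare() {
    sample_dir=$1
    before=$(rspamc stat | total_learns)
    rspamc learn_spam "$sample_dir"/*
    after=$(rspamc stat | total_learns)
    echo "total learns: $before -> $after"
}

# Example call (not run automatically):
#   train_and_compare 2021
```

If the second number is not higher than the first, the samples were most likely not accepted, and the output of rspamc learn_spam is the first place to look.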
Use Case 2: manual training after resetting the database
- back up the rspamd database (it is better to stop Redis before you copy the file; replace yymmdd with the current date):
~# cp /var/lib/redis/rspamd/dump.rdb /var/lib/redis/rspamd/dump.rdb_bak_yymmdd
- reset the bayes data (Source: Wiki)
~# redis-cli -s /var/run/redis-rspamd/rspamd --scan --pattern 'BAYES_*' | xargs redis-cli -s /var/run/redis-rspamd/rspamd del
~# redis-cli -s /var/run/redis-rspamd/rspamd --scan --pattern 'RS*' | xargs redis-cli -s /var/run/redis-rspamd/rspamd del
- train the bayes filter as described in Use Case 1
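The backup and reset steps of Use Case 2 can be combined into one script. This is only a sketch under a few assumptions: the service name redis-rspamd is taken from the systemctl output in this guide, xargs -r is the GNU option that skips the call when no key matches, and the function names are hypothetical.

```shell
#!/bin/sh
# Paths as used in this guide.
RDB=/var/lib/redis/rspamd/dump.rdb
SOCK=/var/run/redis-rspamd/rspamd

# Build the backup file name <file>_bak_<yymmdd>; the date defaults to today.
backup_path() {
    echo "${1}_bak_${2:-$(date +%y%m%d)}"
}

# Delete all keys matching a pattern. The pattern is quoted by the caller so
# the shell does not expand it; xargs -r (GNU) does nothing on empty input.
delete_keys() {
    redis-cli -s "$SOCK" --scan --pattern "$1" | xargs -r redis-cli -s "$SOCK" del
}

# Hypothetical wrapper, not run automatically.
# Assumption: the Redis instance for rspamd runs as the unit redis-rspamd.
backup_and_reset() {
    systemctl stop redis-rspamd || return 1
    cp -p "$RDB" "$(backup_path "$RDB")" || return 1
    systemctl start redis-rspamd || return 1
    delete_keys 'BAYES_*'
    delete_keys 'RS*'
}
```

The copy is made while Redis is stopped, and the keys are deleted only after the service is running again, because redis-cli needs the socket.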
Useful commands for deeper investigation
Sometimes more in-depth information is helpful, especially when support is needed.
- current state of the service
~# systemctl status redis redis-rspamd
● redis.service - Redis persistent key-value database
   Loaded: loaded (/usr/lib/systemd/system/redis.service; disabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/redis.service.d
           └─limit.conf
   Active: active (running) since Mon 2021-02-22 18:20:20 CET; 9s ago
 Main PID: 31539 (redis-server)
   CGroup: /system.slice/redis.service
           └─31539 /usr/bin/redis-server 127.0.0.1:6379

Feb 22 18:20:20 ns-srv01.dargels.de systemd[1]: Starting Redis persistent key-value database...
Feb 22 18:20:20 ns-srv01.dargels.de systemd[1]: Started Redis persistent key-value database.

● redis-rspamd.service - Redis persistent key-value database Rspamd
   Loaded: loaded (/usr/lib/systemd/system/redis-rspamd.service; static; vendor preset: disabled)
   Active: active (running) since Mon 2021-02-22 12:30:09 CET; 5h 50min ago
 Main PID: 737 (redis-server)
   CGroup: /system.slice/redis-rspamd.service
           └─737 /usr/bin/redis-server 127.0.0.1:0

Feb 22 12:30:09 ns-srv01.dargels.de systemd[1]: Started Redis persistent key-value database Rspamd.
- example of an inactive service
~# systemctl status redis
● redis.service - Redis persistent key-value database
   Loaded: loaded (/usr/lib/systemd/system/redis.service; disabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/redis.service.d
           └─limit.conf
   Active: inactive (dead)
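For a support request, the Active: line is usually the part of this output that matters. The sketch below pulls the state word out of a saved systemctl status output; the function name active_state is made up for this example.

```shell
#!/bin/sh
# Print the state word (active, inactive, failed, ...) from the first
# "Active:" line of `systemctl status` output read on stdin.
active_state() {
    awk '$1 == "Active:" { print $2; exit }'
}

# Example call (not run automatically):
#   systemctl status redis-rspamd | active_state
```

On a live system, systemctl is-active redis redis-rspamd prints the same state word directly, one line per unit, so the parser is mainly useful for output that has already been saved or pasted.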
- check that the Redis database of rspamd responds and contains the bayes keys
~# redis-cli -s /var/run/redis-rspamd/rspamd --scan --pattern 'BAYES_*'
BAYES_SPAM_keys
BAYES_HAM_keys
- monitor rspamd
~# redis-cli -s /var/run/redis-rspamd/rspamd monitor
OK