RBLs - SpamTagger/SpamTagger-Plus GitHub Wiki

RBLs are "Real time Blacklists/Blocklists". As the name indicates, they are a mechanism for checking various reputations with always up-to-date information. This is done using DNS requests to a variety of reputation sources for a variety of different message elements.
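As a rough illustration of the DNS mechanism, the sketch below builds the name that is queried for an IPv4 address and resolves it. It follows the common DNSBL convention of reversing the octets and appending the list's zone. The function names and the zone `rbl.example.org` are hypothetical, and this is not MailCleaner's own lookup code.

```python
import ipaddress
import socket
from typing import Optional


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name queried for an IPv4 address in a DNSBL zone."""
    # Validate the address, then reverse its octets (192.0.2.10 -> 10.2.0.192).
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone.rstrip(".")


def dnsbl_lookup(ip: str, zone: str) -> Optional[str]:
    """Return the address the list answers with, or None if nothing resolves."""
    try:
        return socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        # "No such name" (not listed) and resolver failures both end up here;
        # a real implementation would want to tell them apart.
        return None


# Hypothetical zone; substitute the dnsname of a real list to try it.
print(dnsbl_query_name("192.0.2.10", "rbl.example.org."))
```

A lookup that returns an address means the IP is listed; the meaning of the returned address is covered under "RBL return codes".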

RBL activation at different stages

In MailCleaner, these reputation checks can be configured to check different reputation sources during different phases of the filtering process.

Reject mail listed by an RBL (SMTP stage)

A message can be directly refused if it is listed by an RBL that is activated in Configuration->SMTP->SMTP Checks.

This is the strictest possible action and should be reserved for RBL sources which provide a very low false-positive rate. Since the message is rejected outright, it will not be recoverable. This has the benefit of reducing filtering load (i.e. CPU and RAM usage) as well as limiting disk usage for quarantined items, and removing the risk of the user releasing a dangerous message accidentally.

Also note that because the message is rejected during the SMTP phase, the sender is made aware, via a 5xx SMTP code, that the message was not delivered. For false-positives, this is good because the sender will receive a bounce email. For true-positives, this is potentially bad because it informs a spammer that they need to change their tactics.

You will need to restart the Incoming MTA (Exim Stage 1) whenever changes are made to the SMTP RBL list.

Guarantee Quarantining of mail listed by an RBL (PreRBL and UriRBL modules)

There are two modules in Configuration->Anti-spam, the PreRBLs module and UriRBLs module. The difference between these is described below.

If a listing is found for an RBL in either of these modules, it will flag the message as spam so that it will definitely get quarantined (or tagged), except in the case of a sender whitelist.

This option is 'safer' when it comes to false-positives because the message is recoverable from the spam quarantine, but it also means that resources are used to scan the message and the sender is not informed that the message was held in the quarantine. RBL sources with occasional false-positives and/or a fast review and de-listing process when they occur should be used at this level.

Increase the chances that a message will be held as spam

This option does not guarantee that a message will be held when an RBL listing is found, but it will increase the chances. This is done by enabling the lists as contributors to the SpamC score in Configuration->Anti-spam->SpamC.

The SpamC module uses a large variety of tests to determine whether a message is spam. When an RBL is enabled in this configuration, it is simply considered as one of many aspects which could indicate that the message is spam. In general, this will add a couple of points to the sum of all other tests. If a message receives 5 or more points total, the message is held as spam. This will often allow for borderline spam messages to be blocked while otherwise safe-looking messages can still get through despite the listing.
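The scoring described above can be sketched as a simple sum compared against the threshold of 5 points. The rule names and point values below are hypothetical and chosen only to illustrate the arithmetic; the real SpamC rules and their scores are not shown in this page.

```python
from typing import Dict, Tuple

SPAM_THRESHOLD = 5.0  # "5 or more points total" per the description above


def spamc_verdict(rule_scores: Dict[str, float]) -> Tuple[float, bool]:
    """Sum the points from every rule that hit and compare to the threshold."""
    total = sum(rule_scores.values())
    return total, total >= SPAM_THRESHOLD


# Hypothetical rule names and scores, for illustration only.
borderline = {"HYPOTHETICAL_CONTENT_RULE": 3.5, "HYPOTHETICAL_RBL_HIT": 2.0}
clean_looking = {"HYPOTHETICAL_RBL_HIT": 2.0}

print(spamc_verdict(borderline))     # the RBL hit tips the total over 5
print(spamc_verdict(clean_looking))  # the RBL hit alone stays under 5
```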

This option is the least strict and hence most tolerant to false-positives. It should be used when the RBL source makes no guarantees about false-positives or when the list has a lengthy removal process (eg. a minimum listing period before automatic removal).

As the ordering of these options suggests, enabling an RBL in one option has no effect if the same RBL is already enabled in an earlier one, since the stricter options take precedence or occur earlier in the scanning process.
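One way to picture this precedence is a small model that, given the places where an RBL has been enabled, reports the only one that actually matters. The stage labels and the function name are assumptions made for this sketch; it is not how MailCleaner represents its configuration.

```python
from typing import Iterable, Optional

# Ordered from strictest/earliest to least strict/latest.
STAGE_ORDER = ("SMTP Checks", "PreRBLs/UriRBLs", "SpamC")


def effective_stage(enabled_in: Iterable[str]) -> Optional[str]:
    """Return the first stage, in precedence order, where the RBL is enabled."""
    enabled = set(enabled_in)
    for stage in STAGE_ORDER:
        if stage in enabled:
            return stage
    return None  # the RBL is not enabled anywhere


# Enabled both for SMTP rejection and as a SpamC contributor:
# only the SMTP rejection has any effect.
print(effective_stage(["SpamC", "SMTP Checks"]))
```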

Types of content checked by RBLs

MailCleaner classifies RBLs into two general groups: PreRBLs (aka "IPRBLs") and UriRBLs. The latter is a bit of a misnomer, as it also contains RBLs for other "content".

PreRBLs specifically check the reputation of IPs (and sometimes hostnames). These are very quick and difficult to subvert, because they check the publicly available IP/hostname before any message content is sent. They are generally subverted either by using a distributed attack (BotNet), so that no single IP is impacted by the reputation of the others, or by piggy-backing on the good reputation of a large host (e.g. large FreeMail vendors, or third-party newsletter senders). Otherwise, PreRBLs are very effective.

UriRBLs and other content reputation lists are slower, because they require the message content to have been received before it can be evaluated. These all run during the Filtering Engine phase rather than during the SMTP phase, so they cannot result in outright rejections. Aside from the reputation of URLs, these are able to search for sender/domain reputations, bitcoin addresses, a hash of the whole email, or many other identifiers.

RBL Whitelists

Note that there are a few sources listed in the TrustedSources module (Configuration->Anti-spam->TrustedSources). These use the same technology, but the RBLs there have listings for IPs that are known to be trustworthy. If these are enabled, the message is treated the same as if the sender had been whitelisted. This is to say that it will override any results from the remaining Anti-spam modules and the mail will be delivered.

However, because this also happens after the SMTP phase, if the same IP is listed by one of the RBLs in Configuration->SMTP->SMTP Checks, then the message will be rejected before that whitelist is ever consulted.

Logs

If you search for any message in Management->Tracing, or in the /var/mailcleaner/log/mailscanner/ logs, you will see several different log lines related to RBLs. Here is an explanation of those lines:

Jan  1 00:00:00 mailcleaner MailScanner[25226]: PreRBLs found sender hostname: hostname.domain.com for x.x.x.x on message 1nldZQ-0001GN-Q4

This line is for information purposes only. It is simply providing information on the correlation of hostnames and IP address at the time that the DNS information was checked.

Jan  1 00:00:00 mailcleaner MailScanner[25226]: PreRBLs (1nldZQ-0001GN-Q4) x.x.x.x SORBS => Hit 127.0.0.6

This line indicates that an RBL lookup actually found a match. This usually means that the message will go on to be blocked, since only the enabled RBLs at any given stage are actually checked. The example above indicates that the IP x.x.x.x was checked against the SORBS RBL and resulted in finding a listing with response 127.0.0.6. Those response codes are discussed below.

The only cases in which this line appears but the message is not determined to be spam are when a whitelist overrides the result, or when that Anti-spam module is enabled but not 'decisive' (i.e. unable to impact the filtering result; a sort of debugging/dry-run mode).

Jan  1 00:00:00 mailcleaner MailScanner[25226]: PreRBLs result is spam (SORBS) for 1nldZQ-0001GN-Q4

This is the actual result logged by the relevant Anti-spam module. Unlike the previous line, it does not include the IP that was checked or the exact return code. It will only appear if the module is decisive.

Jan  1 00:00:00 mailcleaner MailScanner[25226]: Message 1nldZQ-0001GN-Q4 from x.x.x.x ([email protected]) to [email protected] is spam, Newsl (score=0.0, required=5.0, position : 0, not decisive), NiceBayes (0.0%, position : 6, not decisive), PreRBLs (SORBS, position : 8, spam decisive)

This is the final decision by MailScanner after considering all other modules. In this example it shows that despite other modules not perceiving the message to be spam, the listing by SORBS is decisive.
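When reviewing many messages, the "Hit" lines can be pulled out of a log with a regular expression. The sketch below is written against the single PreRBLs example shown above; the function name is hypothetical, and it is an assumption that other modules and other MailCleaner versions log hits in the same shape.

```python
import re
from typing import Dict, Optional

# Matches e.g. "PreRBLs (1nldZQ-0001GN-Q4) x.x.x.x SORBS => Hit 127.0.0.6"
HIT_PATTERN = re.compile(
    r"(?P<module>\w+) \((?P<msgid>[^)]+)\) "
    r"(?P<checked>\S+) (?P<rbl>\S+) => Hit (?P<code>\d+\.\d+\.\d+\.\d+)"
)


def parse_rbl_hit(line: str) -> Optional[Dict[str, str]]:
    """Return the fields of an RBL 'Hit' log line, or None for other lines."""
    match = HIT_PATTERN.search(line)
    return match.groupdict() if match else None


hit_line = (
    "Jan  1 00:00:00 mailcleaner MailScanner[25226]: "
    "PreRBLs (1nldZQ-0001GN-Q4) x.x.x.x SORBS => Hit 127.0.0.6"
)
print(parse_rbl_hit(hit_line))
```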

RBL return codes

The return codes for different RBL sources are not typically of interest to users; if it has a 'Hit' listed at all, it is considered spam, regardless of the code. However, since the detail is provided and it can differ slightly per source, more information is provided here.

In general, the convention is to always return an address in the 'local' space (127.0.0.X). This ensures that no real, resolvable IP address is used. With some sources, the final octet of the returned IP is significant. It may indicate the age of the listing, the precision (exact hostname vs. a different subdomain of a blocked root domain), or other factors. Some RBL sources will use dedicated IP responses for non-listings or other errors (e.g. requesting a hostname from an IP-only RBL).

For the first two methods - SMTP Checks and dedicated Anti-spam modules - the result is seen as binary for MailCleaner. In the case that a certain return code is innocuous (eg. some have a code for "previously listed"), that code is simply ignored. Otherwise, if there is any return code, it is treated as spam.
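The binary interpretation described above amounts to the following check. The function name and the example "ignored" code are hypothetical; which codes a given source treats as innocuous depends on that source.

```python
from typing import FrozenSet, Optional


def is_hit(response: Optional[str], ignored: FrozenSet[str] = frozenset()) -> bool:
    """Treat any returned code as a listing unless it is explicitly ignored."""
    if response is None:
        return False  # nothing was returned, so there is no listing
    return response not in ignored


# Hypothetical: a source that answers 127.0.0.255 for "previously listed".
innocuous = frozenset({"127.0.0.255"})

print(is_hit("127.0.0.6", innocuous))    # any other code counts as a hit
print(is_hit("127.0.0.255", innocuous))  # the innocuous code is ignored
```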

For SpamC, each RBL source is capable of having a different set of rules hit with a different score. So, the severity of the listing can impact the number of points that are applied.

RBL configurations

MailCleaner reads in the configuration for each RBL source from files in /usr/mailcleaner/etc/rbls. For example, the URIBL.cf file contains:

name=URIBL
type=URIRBL
dnsname=multi.uribl.com.
sublist=127.0.0.2,URIBL_BLACK,URIBL blacklist
sublist=127.0.0.8,URIBL_RED,URIBL redlist
sublist=127.0.0.14,URIBL_TEST,URIBL testlist
callonip=1

This includes:

name - The human-readable name for the source.
type - The type of RBL source, usually URIRBL or IPRBL.
dnsname - The DNS suffix to be queried. eg. if this were an IPRBL, it would query 1.0.0.127.multi.uribl.com for 127.0.0.1, and as a URIRBL it would query www.example.com.multi.uribl.com for www.example.com.
callonip - Indicates whether a URIRBL supports reputation results for IP addresses in URLs (eg. https://127.0.0.1/evil_link.php).
sublist - Specific return code identifiers with a SpamC-readable code name and human-readable description.
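To make the file format concrete, here is a minimal sketch of a reader for it, based only on the URIBL.cf example above. It is not MailCleaner's own parser: the function name is hypothetical, and skipping blank lines and `#` comments is an assumption.

```python
from typing import Any, Dict

EXAMPLE = """\
name=URIBL
type=URIRBL
dnsname=multi.uribl.com.
sublist=127.0.0.2,URIBL_BLACK,URIBL blacklist
sublist=127.0.0.8,URIBL_RED,URIBL redlist
sublist=127.0.0.14,URIBL_TEST,URIBL testlist
callonip=1
"""


def parse_rbl_config(text: str) -> Dict[str, Any]:
    """Parse key=value lines; 'sublist' may repeat, so collect it in a list."""
    config: Dict[str, Any] = {"sublist": []}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):  # assumption: comments use '#'
            continue
        key, _, value = line.partition("=")
        if key == "sublist":
            # return code, SpamC-readable rule name, human-readable description
            code, rule, description = value.split(",", 2)
            config["sublist"].append(
                {"code": code, "rule": rule, "description": description}
            )
        else:
            config[key] = value
    return config


parsed_config = parse_rbl_config(EXAMPLE)
print(parsed_config["dnsname"], [s["rule"] for s in parsed_config["sublist"]])
```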

You can feasibly use this information to configure custom RBL sources that are not supported by MailCleaner. If this is a public source, please feel free to create a pull request to add this source for other users.