Frequently Asked Questions
General Questions
- What is spamdyke and what does it do?
- How do I install spamdyke?
- How do I upgrade spamdyke? What's the significance of the version numbers?
- How do I get support for spamdyke? Is there a mailing list?
- I don't use qmail. Can I still use spamdyke?
- Do I have to install the programs from the "utils" folder? Does spamdyke use them? Do they use spamdyke or each other?
- I love spamdyke! Can I send you money? Can I have your children?
Feature Questions
- Does spamdyke run its filters in any particular order?
- My users authenticate with SMTP AUTH. Can I still use spamdyke?
- My users authenticate with POP3-before-SMTP. Can I still use spamdyke?
- I want to block all emails unless the sender authenticates. Can spamdyke do that?
- Does spamdyke support TLS?
- I want to whitelist a large number of IP addresses; can I use wildcards?
Feature Suggestions
- On the mailing list, you often promise changes in an upcoming version. How can I find out what you're working on?
- Why doesn't spamdyke use CDB files or a database? A database would be faster and better than text files and strange directory structures!
- Why doesn't spamdyke filter using Sender Policy Framework (SPF), Sender ID, Certified Server Validation (CSV), DomainKeys or DomainKeys Identified Mail (DKIM)?
- Why can't spamdyke filter based on message headers like "From:" or "Received:"? Why can't spamdyke block large messages or strip attachments?
- Why can't spamdyke validate a recipient address before accepting a message?
- My graylist folders are getting huge -- many, many entries. I think this is a problem. Why can't spamdyke automatically delete the old ones?
Troubleshooting
- Graylisting isn't working! What am I doing wrong?
- I want to use spamdyke but some of my users roam and connect from strange places. How can I allow them to send email but still filter spam?
- I installed spamdyke and now I'm seeing a lot of timeouts in my logs. Why?
- I installed spamdyke and now my server is very slow! Incoming connections have to wait 20 seconds or longer before they see the greeting banner. What can I do to speed it up?
- I use spamdyke to prevent relaying (because my qmail isn't patched to provide SMTP AUTH) but SpamAssassin has stopped scanning incoming messages. What gives?
What is spamdyke and what does it do?
In a sentence, spamdyke is a drop-in qmail filter for stopping spam at connection-time.
"drop-in" means it can be installed without patching or recompiling qmail, without installing or updating libraries, without drastically reconfiguring anything and without having to become a qmail expert.
"connection-time" means spamdyke evaluates and rejects spam while the remote server is still delivering it. Other filters and anti-spam solutions focus on classifying spam after qmail has accepted it. The spam still has to go somewhere. Even if it's filed in a folder, it still occupies disk space, consumes server resources and must be deleted by someone. When spamdyke rejects the incoming spam completely, no one has to deal with it. It's never on the server at all.
For a complete description of spamdyke and all its features, see the README page.
How do I install spamdyke?
For installation instructions, see the INSTALL.txt file.
How do I upgrade spamdyke? What's the significance of the version numbers?
Typically, upgrading spamdyke is as simple as compiling the new version and copying the new spamdyke binary over the old one. However, sometimes the new version is not backwards-compatible with the old version and simply replacing the binary will cause problems. The Upgrading file has details on the backwards-compatibility of each version.
The version numbers are used to show the type of changes in each version:
- When major, non-backwards-compatible changes are made, the "major" version number is incremented. For example, 3.0.0 is not completely backwards compatible with 2.6.3, so the major version number changed from 2 to 3.
- When new, backwards-compatible features are added, only the "minor" version number is incremented. For example, 3.1.0 includes new features but is backwards compatible with 3.0.0, so the minor version number changed from 0 to 1.
- When no features are added but bugs are fixed, only the "revision" version number is incremented. For example, 3.0.1 only contained bug fixes, so the revision version number changed from 0 to 1.
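The numbering scheme above can be turned into a quick check before upgrading. This is only a minimal sketch, assuming version strings always have the three-part major.minor.revision form described here; the function name is made up for illustration and is not part of spamdyke.

```python
def upgrade_kind(old, new):
    """Classify an upgrade between two "major.minor.revision" strings."""
    old_parts = tuple(int(p) for p in old.split("."))
    new_parts = tuple(int(p) for p in new.split("."))
    if len(old_parts) != 3 or len(new_parts) != 3:
        raise ValueError("expected major.minor.revision")
    if new_parts[0] != old_parts[0]:
        return "major: not backwards compatible, read the Upgrading file first"
    if new_parts[1] != old_parts[1]:
        return "minor: new features, backwards compatible"
    if new_parts[2] != old_parts[2]:
        return "revision: bug fixes only"
    return "same version"

# The three examples from the list above.
for old, new in [("2.6.3", "3.0.0"), ("3.0.0", "3.1.0"), ("3.0.0", "3.0.1")]:
    print(old, "->", new, "=", upgrade_kind(old, new))
```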
How do I get support for spamdyke? Is there a mailing list?
Yes! Visit the mailing list page to sign up: www.spamdyke.org/mailman/listinfo/spamdyke-users. The mailing list archives are searchable at: www.mail-archive.com/spamdyke-users@spamdyke.org.
All of the documentation and releases can be found on the spamdyke website at spamdyke.org.
If you can't find answers there, send an email to me at: samc (at) silence (dot) org
I don't use qmail. Can I still use spamdyke?
Not at this time.
Some background: Qmail starts a new process (qmail-smtpd) from a listening daemon (tcpserver or xinetd) every time a new connection is established. spamdyke slips in between those two, so the daemon starts a new copy of spamdyke for each connection and spamdyke starts the qmail process.
Most other mail servers don't work this way; they use a single long-running daemon for handling incoming requests. There's no way to insert spamdyke without rewriting the mail server daemon (no thanks!).
Having said that, spamdyke could be modified to listen for incoming connections itself (replacing qmail's tcpserver) and establish a new connection to the "real" server (presumably listening on a different port, perhaps running on a different interface or machine). This requires some refactoring in the spamdyke code but it's not an insurmountable task.
Look for this feature in a future version.
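The per-connection chain described above (the listening daemon starts a wrapper, the wrapper starts the real server and sits between it and the client) can be illustrated with a toy. This is a loose sketch of the idea, not spamdyke's code: the function name, the line-based filter and the echoing stand-in for the real server are all assumptions made for the example.

```python
import subprocess
import sys

def relay_through_wrapper(server_command, client_lines, allow):
    """Start the wrapped server and pass it only the lines `allow` accepts."""
    server = subprocess.Popen(server_command, stdin=subprocess.PIPE,
                              stdout=subprocess.PIPE, text=True)
    forwarded = "".join(line for line in client_lines if allow(line))
    output, _ = server.communicate(forwarded)
    return output

# Stand-in for the real server: a child process that echoes what it receives.
echo_server = [sys.executable, "-c",
               "import sys; sys.stdout.write(sys.stdin.read())"]

print(relay_through_wrapper(echo_server,
                            ["HELO client.example\n", "junk\n"],
                            lambda line: line != "junk\n"))
```

A new child is started on every call, which mirrors the one-process-per-connection model. A single long-running daemon offers no equivalent place to insert a wrapper like this.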
Do I have to install the programs from the "utils" folder? Does spamdyke use them? Do they use spamdyke or each other?
No. domainsplit and domain2path are just small utilities for use in writing scripts. spamdyke doesn't use them or depend on them. Conversely, they don't use or depend on spamdyke.
dnsa, dnsmx, dnsns, dnsptr, dnssoa and dnstxt are just small, self-contained examples of how to make A, MX, NS, PTR, SOA and TXT DNS queries using libc. They're not really useful on their own. I made them available because I was looking for examples like them when I was writing the DNS code in spamdyke and I couldn't find any. Hopefully someone else will be able to learn from them. Aren't I just a swell guy?
I love spamdyke! Can I send you money? Can I have your children?
Well this is so sudden... we just met.
Thank you for your generosity, but I must decline. I wrote spamdyke to meet my own needs, not to make money. If you really feel that strongly, drop me an email to let me know you're using spamdyke and you like it. Ask questions, request features, report bugs if you find any. I love getting emails about spamdyke. :)
Does spamdyke run its filters in any particular order?
Yes. spamdyke evaluates its filters in the following order (of course a filter is skipped if it's disabled):
- Check for an rDNS name
- Check for an IP address in a country code rDNS name
- Check for an rDNS whitelist entry
- Check for an rDNS blacklist entry
- Check for an IP whitelist entry
- Check for an IP blacklist entry
- Check for an IP address and keyword in the rDNS name
- Check if the rDNS name resolves
- Check DNS whitelists
- Check right-hand-side whitelists
- Check DNS RBLs
- Check right-hand-side blacklists
- Check for earlytalkers
The intent is to order the filters from least-to-most expensive, so connections will be rejected as quickly as possible. In a typical setup, DNS queries are more expensive than file searches, pattern matching is more expensive than simply checking for a file's existence, etc.
The remaining filters are all checked during the SMTP conversation.
- Limit the number of recipients
- Block unqualified recipient addresses
- Block relaying from unauthorized remote hosts
- Check for sender's domain MX record
- Graylisting
- Check sender whitelists
- Check sender blacklists
- Check right-hand-side whitelists for the sender's domain name
- Check right-hand-side blacklists for the sender's domain name
- Check recipient whitelists
- Check recipient blacklists
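The ordering idea (cheap checks first, skip disabled filters, stop at the first rejection) can be sketched as a simple chain. This is a minimal illustration and not spamdyke's actual code; the filter names, the connection dictionary and its keys are hypothetical, and whitelist handling is deliberately left out.

```python
def run_filters(conn, filters):
    """Run enabled filters in order and stop at the first rejection.

    `filters` is a list of (name, enabled, check) tuples ordered from
    cheapest to most expensive. Each check returns a rejection reason
    or None. Returns (reason, names_of_filters_that_ran).
    """
    ran = []
    for name, enabled, check in filters:
        if not enabled:
            continue  # a disabled filter is skipped entirely
        ran.append(name)
        reason = check(conn)
        if reason is not None:
            return reason, ran
    return None, ran

filters = [
    ("rdns", True,
     lambda c: None if c["rdns"] else "no rDNS name"),
    ("ip-blacklist", True,
     lambda c: "blacklisted IP" if c["ip"] in c["blacklist"] else None),
    # Stands in for a slow DNS query against an RBL.
    ("dns-rbl", True,
     lambda c: "listed on RBL" if c["ip"] in c["rbl"] else None),
]

conn = {"ip": "192.0.2.10", "rdns": "", "blacklist": set(), "rbl": {"192.0.2.10"}}
print(run_filters(conn, filters))
```

Because the connection has no rDNS name, the cheap first check rejects it and the stand-in for the expensive RBL lookup never runs.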
My users authenticate with SMTP AUTH. Can I still use spamdyke?
Yes! As of version 2.5.0, spamdyke understands SMTP AUTH and disables all of its filtering for authenticated users.
See the README page for complete details.
My users authenticate with POP3-before-SMTP. Can I still use spamdyke?
Probably not. If your POP3 server writes authenticated IP addresses to a file, you can use that file as an IP whitelist with spamdyke. If it keeps track of authenticated IP addresses in some other way, you're out of luck.
POP3-before-SMTP is really a kludge anyway; consider using SMTP AUTH instead.
I want to block all emails unless the sender authenticates. Can spamdyke do that?
Yes. First, enable SMTP AUTH. Then, create an IP blacklist file that will block all IP addresses:
0.0.0.0/0.0.0.0
Does spamdyke support TLS?
As of version 2.6.0, spamdyke supports TLS (the successor to SSL). spamdyke will detect TLS and pass it through seamlessly. When it only passes TLS through, none of its post-connect filters (e.g. graylisting) will work because the traffic is encrypted.
However, if spamdyke has access to a server certificate, it will handle the TLS itself and all of its filters will work as before. Bonus: spamdyke will provide TLS even if your qmail has not been patched to provide TLS!
spamdyke also works very well with SMTP-over-SSL (SMTPS) using external daemons like stunnel. The SSL decryption takes place before spamdyke receives the traffic, so spamdyke never knows it's happening.
See the README page for complete details.
I want to whitelist a large number of IP addresses; can I use wildcards?
Yes, as of spamdyke version 2.2.0. The whitelist and blacklist IP files will work with partial IP addresses to represent ranges.
As of spamdyke version 2.4.0, whitelist and blacklist IP files can also contain dotted-quad IP addresses followed by a netmask given as a number of bits. spamdyke also supports dotted-quad IP addresses followed by a netmask given as a dotted quad (for example, 0.0.0.0/0.0.0.0 matches every address).
As of spamdyke version 3.0.0, whitelisted IP addresses can also be found by using a DNS realtime whitelist. This is like a DNS RBL that lists whitelisted IP addresses instead of blacklisted ones.
For complete details, see the README page.
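To see how the three kinds of entries describe ranges, here is a minimal matching sketch. It is not spamdyke's parser: the function names are made up, and the exact file syntax is an assumption (a partial address written as its leading octets, such as 11.22.33, and a slash before either kind of netmask). Check the README for the real format.

```python
import ipaddress

def ip_to_int(text):
    """Convert a dotted-quad IPv4 address to an integer."""
    return int(ipaddress.IPv4Address(text))

def entry_matches(entry, ip):
    """Return True if `ip` falls inside the range described by `entry`."""
    if "/" in entry:
        base, mask = entry.split("/", 1)
        if "." in mask:
            mask_int = ip_to_int(mask)  # netmask as a dotted quad
        else:
            bits = int(mask)  # netmask as a number of bits
            mask_int = (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF
        return (ip_to_int(ip) & mask_int) == (ip_to_int(base) & mask_int)
    octets = entry.rstrip(".").split(".")
    if len(octets) < 4:
        # Partial address: compare only the octets that are present.
        return ip.split(".")[:len(octets)] == octets
    return entry == ip

for entry in ["11.22.33", "11.22.33.0/24", "11.22.33.0/255.255.255.0"]:
    print(entry, entry_matches(entry, "11.22.33.44"))
```

The same arithmetic shows why 0.0.0.0/0.0.0.0 covers everything: a netmask of all zero bits means no bits of the address are compared.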
On the mailing list, you often promise changes in an upcoming version. How can I find out what you're working on?
The easiest way is to simply ask. Send an email.
Other than that, check out the Changelog. There are notes at the top of the file to indicate the intended major changes in the next few versions. Those notes are not a completely accurate predictor of what will happen but they'll give you an idea of what's happening.
Why doesn't spamdyke use CDB files or a database? A database would be faster and better than text files and strange directory structures!
Reasons why files and directory structures are preferable to databases:
- Speed. Because spamdyke doesn't run as a daemon, a database engine must be loaded and initialized for every incoming connection. That takes time, so the database engine must be very fast to keep the overall time lower than using plain text files.
- Memory. Because one copy of spamdyke is started for every incoming connection, memory usage is a big concern on busy servers. Most qmail installations use DJB's softlimit program to limit memory usage for exactly this reason. Database libraries use memory, often quite a bit, to load/cache/parse data. spamdyke must be able to fit within a reasonable limit.
- Concurrency. On a busy mail server, hundreds (possibly thousands) of spamdyke processes could be running at the same time. A database engine would have to handle that kind of simultaneous access without failing.
- Portability. At the moment, spamdyke runs on every Unix-like platform I've tested. The only external library it uses is OpenSSL (for TLS support) and even that is optional. I believe spamdyke's simplicity is largely responsible for its popularity -- nothing extra must be installed, no existing programs must be recompiled. Requiring a mail server administrator to install a database engine could scare away potential users.
- Accessibility. By using only plain text files and directories, spamdyke is easy for anyone to administer and reconfigure with standard command line tools. Any administrator can understand a file of whitelisted IP addresses without help. Very few administrators know SQL. This is very important to me -- I hate proprietary file formats that can only be accessed with special tools. I can't remember how many times I've stared at a malfunctioning Sendmail server and been unable to determine if a given option was even available, much less enabled. This point and the next point are very closely related.
- Safety. Plain text files and directories are easy to understand, easy to back up and easy to restore. They can be printed out, emailed, imported into other programs, etc. Most importantly, it's very difficult to corrupt a plain text file by "improperly" stopping spamdyke or losing power. In an emergency, when an administrator is trying to restore a mail server while users are screaming at him, I don't want him wondering if spamdyke's files are intact. If he has any doubts, he should be able to visually verify them with any available text editor. (I've had to restore corrupted Exchange Private Mail Stores in the middle of the night. I wouldn't wish that experience on anyone.)
- Availability. Mail servers should depend on as few external systems as possible. spamdyke already depends heavily on DNS but that's unfortunately unavoidable. I've done everything I can to make sure spamdyke fails gracefully if DNS is down. Fortunately, DNS servers are (usually) very stable and very reliable. Database engines are not in the same league -- while many databases are fairly reliable they still have much higher downtime than DNS servers. As stated above, I don't want to force a stressed administrator to restore his database server just to get mail flowing again. If I were forced to do that, I would choose instead to uninstall spamdyke (and I would never use it again).
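Numbering the points above from 1 (speed) to 7 (availability): database servers like MySQL conflict with all 7 points. Embedded database engines like SQLite conflict with points 1, 2, 5 and 6 (speed, memory, accessibility and safety). CDB files conflict with points 5 and 6 (accessibility and safety).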
This doesn't mean spamdyke will never use a database. It just means there's a lot to think about. A database would have to solve a really tough problem before it would be considered. Whether it saves some coding time is really not a factor -- the time required to install and administer spamdyke is more important.
Why doesn't spamdyke filter using Sender Policy Framework (SPF), Sender ID, Certified Server Validation (CSV), DomainKeys or DomainKeys Identified Mail (DKIM)?
Each of those systems is very complex, so adding support to spamdyke will not be a small task. They're also all different, so supporting all of them would be a major undertaking. At this point, there are other ways to improve spamdyke that will take less effort and provide much more benefit.
Additionally, I'm not convinced any of those systems will make any difference. They were each designed to prevent spam being sent from forged addresses so, for example, you won't get spam from president@whitehouse.gov. However, most spammers own their domains (often many thousands of them) and control their own DNS, so they can (and do) add SPF/CSV/etc records so their spam follows the SPF/CSV/etc rules. (Hint: Spammers understand and administer their DNS records better than most ISPs do.) The major ISPs (AOL, Yahoo!, GMail) use these systems and still have spam problems; this is pretty good evidence they aren't living up to the hype.
There is a (small) chance these systems could stop spam coming from botnets. If that happens, however, the spammers will just start relaying their spam through the mail servers of the compromised machine's ISP. That's why having every ISP block all outbound port 25 traffic won't stop botnet spam either.
If you really, really want to filter your email using one or more of these frameworks, try using SpamAssassin. It already supports most of them and you'll be able to judge their effectiveness for yourself.
Why can't spamdyke filter based on message headers like "From:" or "Received:"? Why can't spamdyke block large messages or strip attachments?
To understand why, a little explanation of the SMTP protocol is necessary.
After some conversational preliminaries, an SMTP client gives the server a sender address. This address is referred to as the "envelope sender" and it's used as a return address if the message bounces. Here's the part most users don't understand: The "envelope sender" doesn't have to be the same address you see on the "From" line. The two aren't related at all; the "From" line is mostly for show (this is why spam "From" lines are so often forged).
If the server accepts the "envelope sender" address, the client sends the addresses of the message recipients, one at a time. The recipient addresses are for message delivery. The server must accept or reject each recipient individually. Once again, the "To" and "Cc" lines aren't related at all; they're mostly for show.
If the server accepts one or more recipients, the client sends the message. The server accepts or rejects the message and the conversation is over.
spamdyke decides whether to accept or reject senders and recipients during the first two steps because those steps must be acknowledged separately. Once the client sends the actual message, it can't be rejected.
Well, yes, technically, according to the RFCs, it can be rejected. However, most MTAs don't handle this situation correctly and bounce the message if they see anything other than a success code after sending the message. This is a Bad Thing. If the rejection was for graylisting, the client won't retry delivery. If the rejection was for one recipient out of many, the client (and therefore the sending user) won't understand why the error was given. The message will just bounce and that's it.
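The conversation described above can be sketched as a toy server that answers each envelope command separately. This is an illustration only, not how spamdyke or qmail is implemented; the function and its callbacks are hypothetical, and the reply codes are the standard SMTP ones (250 accepted, 550 rejected, 354 go ahead and send the message, 503 no valid recipients yet).

```python
def smtp_replies(commands, allow_sender, allow_recipient):
    """Return the reply code a toy server would send for each command."""
    replies = []
    accepted_recipients = 0
    for command in commands:
        verb, _, arg = command.partition(":")
        verb = verb.upper()
        address = arg.strip("<> ")
        if verb == "MAIL FROM":
            # The envelope sender gets its own answer.
            replies.append(250 if allow_sender(address) else 550)
        elif verb == "RCPT TO":
            # Every recipient is accepted or rejected individually.
            if allow_recipient(address):
                accepted_recipients += 1
                replies.append(250)
            else:
                replies.append(550)
        elif verb == "DATA":
            # The message itself, including its "From:" and "To:" header
            # lines, only arrives after this point.
            replies.append(354 if accepted_recipients else 503)
    return replies

session = [
    "MAIL FROM:<alice@example.org>",
    "RCPT TO:<bob@example.com>",
    "RCPT TO:<nobody@example.net>",
    "DATA",
]
print(smtp_replies(session, lambda s: True, lambda r: r.endswith("@example.com")))
```

Every decision in the sketch is made before the toy server has seen a single header line, which is why header-based filtering doesn't fit at this stage.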
There are many, many tools for filtering messages based on headers and content that are far smarter and more feature-rich than spamdyke. If you really want to filter based on message headers, please use one of them in addition to spamdyke. Personally, I use (and recommend) SpamAssassin and ClamAV.
Why can't spamdyke validate a recipient address before accepting a message?
This would be very nice to have but it's not very easy to implement. Qmail doesn't make it easy to determine if an address is valid and vpopmail (if present) only complicates matters. Since it's better to accept spam than reject legitimate email, this feature doesn't exist (yet).
If you have a suggestion or method for checking recipient validity, I'd love to know about it. Note: recompiling qmail or reusing qmail's source code is not an option.
My graylist folders are getting huge -- many, many entries. I think this is a problem. Why can't spamdyke automatically delete the old ones?
It's not really a problem, since most of the files contain no data. In other words, they don't take up any space and shouldn't cause a problem unless your filesystem runs low on inodes.
However, if they really bother you, a simple search for files by date range can be used to remove the old entries. On Linux, the following command should work (your mileage may vary):
find GRAYLIST_FOLDER -type f -mmin +$((SPAMDYKE_MAX_GRAYLIST_SECS/60)) -print0 | xargs -0 rm -f
Replace GRAYLIST_FOLDER with the path to your graylist folder, as given in the spamdyke command line. Also replace SPAMDYKE_MAX_GRAYLIST_SECS with the maximum graylist age from the spamdyke command line (-M).
spamdyke doesn't automatically delete the old entries because it doesn't run as a daemon. In other words, if spamdyke suddenly decided to clean up the graylist folder, it would have to do that work while receiving an incoming email message. If your graylist folder is large, the cleanup could cause a large enough delay to bounce the message. Also, how is spamdyke to decide when to clean up? Because a new spamdyke process is started for each incoming connection and those processes don't communicate with each other, there's a good chance multiple spamdyke instances would attempt to clean up the folder at the same time. That would delay multiple incoming messages and place an unnecessary load on the server.
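Before deleting anything, it can be reassuring to list what a cleanup would remove. The following is a rough dry-run sketch of the same idea as the find command above, run separately from spamdyke (for example from a scheduled job). The function name is made up, and it only reports files; it assumes, like the find command, that a file's modification time reflects the age of the entry.

```python
import os
import tempfile
import time

def stale_entries(graylist_dir, max_age_secs, now=None):
    """Return paths of regular files not modified within max_age_secs."""
    now = time.time() if now is None else now
    stale = []
    for root, _dirs, files in os.walk(graylist_dir):
        for name in files:
            path = os.path.join(root, name)
            if now - os.path.getmtime(path) > max_age_secs:
                stale.append(path)
    return stale

# Demonstration in a throwaway directory: one fresh entry, one two days old.
with tempfile.TemporaryDirectory() as demo_dir:
    fresh = os.path.join(demo_dir, "fresh")
    old = os.path.join(demo_dir, "old")
    for path in (fresh, old):
        open(path, "w").close()
    two_days_ago = time.time() - 2 * 86400
    os.utime(old, (two_days_ago, two_days_ago))
    found = [os.path.basename(p) for p in stale_entries(demo_dir, 86400)]
    print(found)
```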
Graylisting isn't working! What am I doing wrong?
Most often, graylisting doesn't work because there are no domain folders. spamdyke is designed to allow some flexibility when configuring graylisting, so you can enable it for some domains and disable it for others. To enable graylisting for a domain, you must create a folder within your graylist folder named for the domain you want to graylist. For example, suppose you created a graylist folder:
/home/vpopmail/graylist.d
and you pass it to spamdyke on the command line:
/usr/local/bin/spamdyke -g /home/vpopmail/graylist.d ...
To graylist mail for example.com, you must also create another folder:
/home/vpopmail/graylist.d/example.com
spamdyke will then graylist mail for the example.com domain. Create a folder for each domain you want to graylist.
spamdyke can automatically catch configuration problems like this. See the README page for details.
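If you host many domains, the per-domain folders can be created in one pass. A minimal sketch, assuming the layout described above (one folder per domain directly under the graylist folder); the function name is made up, and the demonstration uses a temporary directory rather than a real path such as /home/vpopmail/graylist.d.

```python
import os
import tempfile

def create_graylist_folders(graylist_dir, domains):
    """Create one folder per domain under graylist_dir; return the paths."""
    paths = []
    for domain in domains:
        path = os.path.join(graylist_dir, domain)
        os.makedirs(path, exist_ok=True)  # harmless if it already exists
        paths.append(path)
    return paths

# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as demo_dir:
    created = create_graylist_folders(demo_dir, ["example.com", "example.net"])
    listing = sorted(os.listdir(demo_dir))
    print(listing)
```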
I want to use spamdyke but some of my users roam and connect from strange places. How can I allow them to send email but still filter spam?
As of version 2.5.0, this is no problem. spamdyke understands SMTP AUTH, so it can authenticate your users and bypass all of its filtering just for them. Bonus: spamdyke will provide SMTP AUTH even if your qmail has not been patched to provide SMTP AUTH!
See the README page for complete details.
I installed spamdyke and now I'm seeing a lot of timeouts in my logs. Why?
Badly written software on the remote hosts. It seems a lot of spam software doesn't handle error codes at all. It just attempts delivery and expects success. When an error code is sent, the software just sits and waits for the success code it wants. Eventually, the connection times out. Sometimes, a remote server will take a long time to begin delivering a large (legitimate) message, which can cause timeouts.
qmail enforces a 20 minute idle timeout but it does so silently (no logging). It's possible you were already getting timeouts and just didn't know it until spamdyke began logging them.
If you suspect legitimate connections are timing out, there are two things you can do. First, you can increase or disable spamdyke's timeouts. Of course, qmail's 20 minute idle timeout will still apply but at least you'll be back where you were before.
Second, you can use spamdyke's full logging feature to log all incoming connections to files. The log files contain timestamps so you can see how quickly the remote server is sending data and where it's stopping. Hopefully that will yield some clues you can use to fix the problem.
I installed spamdyke and now my server is very slow! Incoming connections have to wait 20 seconds or longer before they see the greeting banner. What can I do to speed it up?
Most often, delays like these are due to DNS traffic. Depending on the enabled filters, spamdyke can perform the following DNS queries for each incoming connection:
- Find the reverse DNS name for the remote server
- Find the IP address for the remote server's reverse DNS name
- Check the realtime whitelists for TXT records matching the remote server's IP address
- Check the realtime whitelists for A records matching the remote server's IP address
- Check the realtime blacklists for TXT records matching the remote server's IP address
- Check the realtime blacklists for A records matching the remote server's IP address
- Check the righthand-side whitelists for TXT records matching the remote server's rDNS name
- Check the righthand-side whitelists for A records matching the remote server's rDNS name
- Check the righthand-side blacklists for TXT records matching the remote server's rDNS name
- Check the righthand-side blacklists for A records matching the remote server's rDNS name
- Check the righthand-side whitelists for TXT records matching the sender's domain name
- Check the righthand-side whitelists for A records matching the sender's domain name
- Check the righthand-side blacklists for TXT records matching the sender's domain name
- Check the righthand-side blacklists for A records matching the sender's domain name
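To get a feel for what those lookups involve, here is a sketch of how query names are usually built for DNS-based lists. It follows the common DNSBL convention (reversed IP octets prepended to the list's zone, or a host name prepended as-is). I'm assuming spamdyke follows that convention, and the zone names are placeholders rather than real lists.

```python
def dnsbl_query_name(ip, zone):
    """Build the name looked up when checking an IPv4 address against a list."""
    return ".".join(reversed(ip.split("."))) + "." + zone

def rhs_query_name(name, zone):
    """Build the name looked up when checking a host or domain name."""
    return name.rstrip(".") + "." + zone

def queries_for_connection(ip, rdns, sender_domain, ip_zones, rhs_zones):
    """List (record_type, name) pairs, one TXT and one A lookup per list."""
    names = [dnsbl_query_name(ip, zone) for zone in ip_zones]
    for zone in rhs_zones:
        names.append(rhs_query_name(rdns, zone))
        names.append(rhs_query_name(sender_domain, zone))
    return [(rtype, name) for name in names for rtype in ("TXT", "A")]

queries = queries_for_connection(
    "192.0.2.99", "mail.example.com", "example.com",
    ip_zones=["dnsbl.example.org"], rhs_zones=["rhsbl.example.org"])
for rtype, name in queries:
    print(rtype, name)
```

Even with a single list of each kind, that is six lookups on top of the two reverse DNS queries, so a slow or unreachable list adds up quickly.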
If you have disabled all of the filters and spamdyke is still running slowly, you may have found a bug. Please report it!
I use spamdyke to prevent relaying (because my qmail isn't patched to provide SMTP AUTH) but SpamAssassin has stopped scanning incoming messages. What gives?
You must be using qmail-scanner from qmail-scanner.sourceforge.net. That package has an interesting flaw: it assumes any time the environment variable RELAYCLIENT is set, no scanning should be performed. When spamdyke prevents relaying, it always sets that variable to keep qmail from interfering with its relaying decision.
To reenable scanning, modify your /etc/tcp.smtp file and add QS_SPAMASSASSIN to all of the connections you want scanned. For example:
127.:allow,RELAYCLIENT="",QMAILQUEUE="/var/qmail/bin/qmail-scanner-queue"
:allow,QMAILQUEUE="/var/qmail/bin/qmail-scanner-queue",QS_SPAMASSASSIN=""
Then run:
qmailctl cdb
and you should be fine.