Spam Rejection Hall of Shame

The Internet is awash in a sea of spam. Email users complain to their mail providers, and the mail providers try to Do Something about the spam. It won't come as a surprise that some of the things mail providers try to do are misguided, not to say evil. This even includes vendors of mail systems, who should know better.

Among the most frequent (and most evil) things people do to try to combat spam are sending synthetic bounces and silently discarding messages.

The Hall of Shame

Mirapoint (maker of an email "appliance," so they should know better!): Silent discards (October, 2006)

Why Synthetic Bounces are Evil

The term "synthetic bounce" might not be familiar. Some people call this a "delayed bounce." To describe a synthetic bounce, we first have to describe a real bounce. When someone attempts to send a message that the destination server knows it can't (or won't) deliver, the destination server replies with an SMTP permanent error code. These are the codes that start with a five. 550 recipient unknown is one that many people have seen. The sending server formats a delivery failure message and returns it to the sender's individual mailbox. The message has "bounced off" the server for which it was destined; it never got inside.
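For readers who like to see the codes spelled out, here is a minimal sketch in Python of how the first digit of an SMTP reply code sorts replies into classes. The function name is made up for illustration; the classes themselves follow the SMTP standard. A real bounce is any reply that lands in the last class.

```python
def classify_smtp_reply(code):
    """Sort an SMTP reply code into a class by its first digit."""
    classes = {
        2: "success",
        3: "more input expected",
        4: "temporary failure",  # the sending server should retry later
        5: "permanent failure",  # a real bounce: the sending server gives up
    }
    first_digit = code // 100
    if first_digit not in classes:
        raise ValueError(f"not an SMTP reply code: {code}")
    return classes[first_digit]


if __name__ == "__main__":
    for code in (250, 451, 550):
        print(code, classify_smtp_reply(code))
```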

Some, maybe many, misguided attempts at spam filtering allow the destination server to accept a message. After it's inside, it gets scanned for virus infection and filtered to see whether it might be spam. If it turns out to be something you don't want to drop into the recipient's mailbox, you have a problem because you have already accepted the message for delivery.

Things go from misguided to evil when you decide to create, that is, synthesize, your own bounce message and send it back. Think about it for a minute; you didn't bounce the message, you caught it, and now you want to throw it back. The problem is, to whom do you throw it? If the message is legit, then you can just send your fake bounce to the return address. But duhhhhh! The reason you want to return the message is that you're pretty sure it isn't legit!

OK, quick... how many spammers and virus writers put their own return addresses on the garbage they send out?

Right in one guess! Zero! If the message isn't legitimate, neither is the return address, and your synthetic bounce will go astray. One of two things will happen. If the world is a lucky place at that moment, it'll come back to you because the return address was completely fake. Now you have Yet Another piece of bogus mail to deal with. That's somewhat fair because you created the bogus mail in the first place.

More likely, though, the address belongs to somebody. Spammers tend to use legitimate addresses, just not their own! Virus writers pick addresses at random from the address books of machines they've already infected. And some "fake" addresses will match real mailboxes. So, some innocent third party will get your synthetic bounce. You've wasted their time, stolen their bandwidth, and deposited into their mailbox a piece of junk they can't do anything with. But they're likely to read it and puzzle over it before they figure that out. For a longer analysis of this problem, check this article: http://www.ironport.com/company/pp_enterprise_it_planet_05-02-2006.html. Do note that the problem is only with synthetic bounces; real, SMTP connection-level bounces go only to the originating server. That's what should happen.

Synthetic bounces are evil!

Why Silently Discarding Messages is Evil

The problem with silently discarding messages is false positives. We are trying to discard spam, and the spammers are trying to make their junk look legit so we won't discard it. It is inevitable that some legitimate email will be misidentified as spam. How much? Not very much, but what if one of the discarded messages is the one offering you that new job, or that book contract? You never see it. The sender, believing the message has been delivered, thinks you're ignoring it. Bad. Evil, even.

It's also unnecessary. There are effective ways of stopping 70% or more of spam without discarding anything.

Some people would rather risk a few legitimate messages being discarded than deal with the remaining 25-30% of spam, and there are ways to allow them to make that decision without forcing it on every user of a mail domain.

Silently discarding messages is evil.

The Right Way to Reject Spam

Stopping spam and virus-infected email should be a two-stage process: SMTP rejection at the edge of your network and cleaning and tagging within. You can stop more than two-thirds of spam before it ever gets inside your network. That may reduce the volume that you have to clean and tag to a tolerable level.

Reject at the SMTP Level

If your mail gateway server can determine that a particular message cannot or should not be delivered, your server should issue an SMTP permanent error. If a message rejected this way was really legitimate, the sender will get a non-delivery message from his own mail server and will know that he must try another way of contacting the intended recipient.

Messages that cannot be delivered are those that are addressed to unknown users. That means your mail gateway has to know who your legitimate users are. That sounds like a duhhh! statement, but some organizations use a "firewall" server that just relays everything. That creates an unnecessary internal load. If you have a separate mail relay machine at the edge of your network, use something like LDAP to tell it who your users are. That way, it can reject mail for bogus addresses.

Some people erroneously believe that allowing a gateway mail relay to know who their users are somehow makes them more vulnerable to dictionary attack harvesting. It's not true, though. If your relay accepts mail for every address, the spammers will just assume that all the addresses they try are deliverable, and your spam load will go up!

If you have internal-use-only mailing lists, reject mail for those, too. After all, they're for internal use, right?
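Here is a rough sketch, in Python, of the decision a gateway might make for each recipient address. Everything in it is an assumption for illustration: the addresses, the 10.0.0.0/8 "inside" network, and the function name are all made up, and a real gateway would get its user list from a directory (LDAP, say) instead of a hard-coded set.

```python
import ipaddress

# Hypothetical data; a real gateway would load these from a directory.
KNOWN_USERS = {"alice@example.edu", "bob@example.edu"}
INTERNAL_ONLY_LISTS = {"all-staff@example.edu"}
INTERNAL_NETWORK = ipaddress.ip_network("10.0.0.0/8")


def rcpt_response(recipient, client_ip):
    """Return the (code, text) SMTP reply for one recipient address."""
    address = recipient.strip().lower()
    from_inside = ipaddress.ip_address(client_ip) in INTERNAL_NETWORK

    if address in INTERNAL_ONLY_LISTS:
        if from_inside:
            return (250, "OK")
        # Internal lists are closed to the outside world.
        return (550, "Mailing list is for internal use only")
    if address in KNOWN_USERS:
        return (250, "OK")
    # Unknown user: refuse now so the sender's own server reports it.
    return (550, "Recipient unknown")


if __name__ == "__main__":
    print(rcpt_response("alice@example.edu", "192.0.2.10"))
    print(rcpt_response("nobody@example.edu", "192.0.2.10"))
    print(rcpt_response("all-staff@example.edu", "192.0.2.10"))
```

The point of the sketch is that every refusal is a permanent SMTP error issued while the connection is still open, so nothing bogus ever gets inside.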

You can reject mail from known spam sources by using a "realtime black list" or RBL. These are maintained by subscription services or volunteers, at costs from free to several thousand dollars a year. There are many levels of accuracy and aggressiveness available. I believe one of the best is the Spamhaus SBL+XBL list. It is also one of the least expensive.
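Many such lists are published through DNS: you reverse the octets of the connecting IP address, append the list's zone name, and look the result up. If the name resolves, the address is listed. The sketch below shows that convention in Python. The zone dnsbl.example.org is a placeholder, not a real list; use whatever zone your list provider documents. Note also that this sketch treats any lookup failure as "not listed," which is an assumption you may not want in production.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up for an IPv4 address in a blacklist zone."""
    # IPv4Address raises ValueError for anything that isn't a valid address.
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone):
    """Return True if the blacklist zone has an entry for this address."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        # No such name (or the lookup failed): treat as not listed.
        return False
    return True


if __name__ == "__main__":
    # Placeholder zone; this prints the query name without touching the network.
    print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

A gateway would call something like is_listed() as soon as a client connects, and answer a listed client with a permanent error before accepting any mail from it.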

You will have some false positives with RBLs, especially the more aggressive ones. When mail is rejected at the SMTP level, false positives are bad, but not evil, because the sender will be informed that his message wasn't delivered.

Some "email filter appliances" may be able to detect virus payloads while the SMTP connection is open. If you can detect them, reject them! Again, false positives are bad, but not evil, because the sender will be notified.

However, once you've accepted a message for delivery, you may not discard it. Whether it's a frog or a prince, once you've kissed it, it's yours!

Clean and Tag Accepted Messages

If a message that has been accepted at the SMTP level is later found to have a virus infection, remove the infected attachment and deliver the message. Sure, it's probably junk, but it just might be that book contract! Let the recipient decide. (You might also consider tagging the message as described below.)
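To make "remove the infected attachment and deliver the message" concrete, here is a small sketch using Python's standard email package. The looks_infected() function is a stand-in for a real virus scanner and simply flags a made-up marker string; the function names are hypothetical. The sketch only looks at top-level attachments, which is a simplification.

```python
from email.message import EmailMessage


def strip_infected_attachments(msg, is_infected):
    """Remove top-level attachments flagged by is_infected; return how many."""
    if not msg.is_multipart():
        return 0
    bad_ids = {id(part) for part in msg.iter_attachments() if is_infected(part)}
    # Keep the body and every clean attachment, in their original order.
    kept = [part for part in msg.get_payload() if id(part) not in bad_ids]
    msg.set_payload(kept)
    return len(bad_ids)


def looks_infected(part):
    """Stand-in for a real scanner: flags a fixed, made-up marker."""
    data = part.get_payload(decode=True) or b""
    return b"FAKE-VIRUS" in data


# Build a sample message with one "infected" and one clean attachment.
msg = EmailMessage()
msg["Subject"] = "Your book contract"
msg.set_content("Contract attached.")
msg.add_attachment(b"FAKE-VIRUS", maintype="application",
                   subtype="octet-stream", filename="bad.exe")
msg.add_attachment(b"the contract", maintype="application",
                   subtype="octet-stream", filename="contract.bin")

removed = strip_infected_attachments(msg, looks_infected)
remaining = [part.get_filename() for part in msg.iter_attachments()]
```

The message itself survives with its text and its clean attachment, so the recipient still gets to decide what to do with it.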

If a message that's been accepted appears to be spam based on some filtering criterion, tag it as probable spam and deliver it.

One option is to deliver mail tagged as spam to a Junk Mail or Quarantine folder that's accessible to the recipient. Recipients should be encouraged to look in the Junk Mail folder from time to time, both to check for misclassified mail (false positives) and to purge the spam. It might not be too misguided to auto-delete mail from the Junk Mail folder after it has been there, say, 30 days.
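A 30-day purge like that is simple enough to sketch. The version below is only an illustration in Python: it assumes the Junk Mail folder can be represented as a list of (message id, date filed) pairs, which is a made-up layout, and it returns the messages to keep.

```python
from datetime import datetime, timedelta


def purge_old_junk(junk, now, max_age_days=30):
    """Return the junk entries that are still within the retention period."""
    cutoff = now - timedelta(days=max_age_days)
    return [(msg_id, filed) for msg_id, filed in junk if filed >= cutoff]


if __name__ == "__main__":
    now = datetime(2007, 1, 8)
    junk = [("old", datetime(2006, 11, 1)), ("recent", datetime(2007, 1, 2))]
    print(purge_old_junk(junk, now))
```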

If your mail recipients use a variety of clients, as is common for ISPs and universities, you probably can't enforce the use of a Junk Mail folder, and some mail programs may not even have such an option. You should still tag messages that appear to be spam and help your users make the best use of those tags. You can publish on the Web instructions for using your particular tags with popular mail programs like Thunderbird and Eudora. One university that's pretty generally on the ball (Nova Southeastern University) prepends "** SPAM Score 7.0 **" (or whatever the number might be) to the subject of a message thought to be spam. Almost any email client can filter on the subject, and even people who choose not to try filtering can see immediately that a message has been tagged as spam. Altering the subject line is likely to be much more useful than adding a custom header that will be effectively invisible to most of your users. Best practice, as exhibited by Nova, is to do both. Most recipients will use the tag in the subject, but detailed information remains available to sophisticated mail users in the form of custom headers.
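Doing both kinds of tagging takes only a few lines. This sketch uses Python's standard email package and copies the subject format quoted above; the header name X-Spam-Score is an assumption chosen for illustration, not necessarily what Nova or any particular filter uses.

```python
from email.message import EmailMessage


def tag_as_spam(msg, score):
    """Prepend a visible tag to the subject and record the score in a header."""
    original = str(msg.get("Subject", ""))
    # Delete before setting: the modern email API refuses a second Subject.
    del msg["Subject"]
    msg["Subject"] = f"** SPAM Score {score:.1f} ** {original}"
    # Hypothetical custom header for users who filter on details.
    del msg["X-Spam-Score"]
    msg["X-Spam-Score"] = f"{score:.1f}"


if __name__ == "__main__":
    msg = EmailMessage()
    msg["Subject"] = "Cheap pills"
    msg.set_content("Buy now!")
    tag_as_spam(msg, 7)
    print(msg["Subject"])
    print(msg["X-Spam-Score"])
```

The message is still delivered; the tag just makes it easy for the recipient, or the recipient's mail program, to sort it.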

I promised you a way to allow some users to auto-delete mail flagged as spam without forcing that choice on everyone. Tagging is it. Instead of a rule that files tagged messages in a Junk Mail folder, those who want to can write a rule that just deletes the messages. Auto-deleting is still a Bad Idea, but if it's the recipient's choice and not some mail administrator's choice, it's not evil.
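In code terms, the recipient's rule is nothing more than a test on the subject line with the action left up to the recipient. Here is a sketch in Python, assuming the "** SPAM Score 7.0 **" subject format described earlier; the function name and the action names "junk," "delete," and "inbox" are made up for the example.

```python
import re

# Matches a subject that begins with a tag like "** SPAM Score 7.0 **".
SPAM_TAG = re.compile(r"^\*\* SPAM Score \d+(?:\.\d+)? \*\*")


def route_message(subject, action_for_spam="junk"):
    """Return where a message should go, based on the recipient's own choice."""
    if SPAM_TAG.match(subject):
        return action_for_spam
    return "inbox"


if __name__ == "__main__":
    print(route_message("** SPAM Score 7.0 ** Cheap pills"))
    print(route_message("** SPAM Score 9.5 ** Cheap pills", "delete"))
    print(route_message("Lunch on Friday?"))
```

The cautious user leaves the default alone and checks the Junk Mail folder now and then; the user who would rather take the risk changes one setting, and nobody else is affected.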

So, you see, it is possible to confront spam without being evil.

Last updated: 2007-01-08 22:23