This is the blog of Adam Kalsey. Unusual depth and complexity. Rich, full body with a hint of nutty earthiness.
Freshness Warning
This blog post is over 20 years old. It's possible that the information you read below isn't current and the links no longer work.
22 Jan 2003
For months, I’ve laughed in the face of spam. I’ve bared my email address in public, daring the spammers to come and find me. "Let the spam flow," I cried, for I was prepared. I had SpamAssassin on my side.
In the last 5 months, SpamAssassin has caught over 60MB of spam with nary a false positive or spam slipping through. But now chinks have appeared in my armor. In the last half-hour, four spams have slipped through. Each one was scored just under the five points required in order for a message to be tagged as spam.
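To make the scoring concrete, here is a minimal Python sketch of threshold-based tagging. The rule names and weights are invented for illustration and are not SpamAssassin's actual rules or scores; only the five-point cutoff comes from the paragraph above.

```python
# Hypothetical rules: each maps a name to (test function, weight).
# These are illustrative assumptions, not SpamAssassin's real rule set.
RULES = {
    "subject_all_caps": (lambda msg: msg["subject"].isupper(), 1.5),
    "mentions_free": (lambda msg: "free" in msg["body"].lower(), 1.2),
    "many_exclamations": (lambda msg: msg["body"].count("!") >= 3, 1.0),
    "contains_url": (lambda msg: "http://" in msg["body"], 1.1),
}

THRESHOLD = 5.0  # the five-point cutoff mentioned in the post


def score(msg):
    """Sum the weights of every rule that matches the message."""
    return sum(weight for test, weight in RULES.values() if test(msg))


def is_spam(msg):
    """Tag the message as spam only if its score reaches the threshold."""
    return score(msg) >= THRESHOLD


# A message that trips every rule above but still lands just under the cutoff.
sample = {"subject": "BUY NOW", "body": "Free offer!!! http://example.com"}
print(round(score(sample), 2), is_spam(sample))
```

With these made-up weights the sample trips all four rules and still falls short of five points, which is the kind of near miss described above: a message written to stay just under the line is delivered untagged.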
The latest SpamAssassin (2.5) has Bayesian filtering built in. I'm still running 2.43, though, because I want to make sure that the new release is stable before updating my servers.
... it's the classic problem of filtering systems with feedback mechanisms... It seems the only way to keep up is to install new versions of SA before the spammers do... Despite the odd slip-through, I couldn't imagine living without it... It catches 1000+ messages for me a month... There are commercial products available that I can't imagine being nearly as thorough as SA and being sold for massive sums... See also: http://www.jacobsen.no/anders/blog/archives/2002/08/23/is_spamassassin_helping_the_spammers_too_much.html http://www.jacobsen.no/anders/blog/archives/2002/08/28/to_filter_or_not_to_filter.html http://www.jacobsen.no/anders/blog/archives/2002/10/16/spam_hype_and_spamassassin.html
It's not that I don't like spamassassin - I used it before switching to spamprobe. It's just that spamassassin is a bit resource hungry and takes longer due to hostname lookups and such. That's stuff I can live without -- spamprobe includes the headers and server names when analyzing the mail anyway. Though of course, I used to run spamassassin on an old Cyrix 100MHz laptop with a whopping 16 megs of RAM. :-)
This discussion has been closed.
Johan Svensson
January 23, 2003 4:44 AM
Personally, I run SpamProbe (spamprobe.sf.net), which uses Bayesian logic. For all the recent hype about the Bayes formula, it's damn good. Unfortunately (heh) I don't get any spam at all to my domain, so I had to import a bunch of Hotmail spam to train the filter. Sure enough, after about 300 messages from Hotmail, it picked up every piece of spam originating from my Hotmail account and nuked it. Though of course, many spams there tend to be repeated messages. Paul Graham's latest paper sounds promising, and adds a new layer to the previously completely naive filter; his suggestion is to distinguish between different headers, the body and URLs in the message. Personally, I think a system based on Bayesian logic is the way to go. It's not the Be All, End All of spam prevention, but it's so incredibly effective that it's a bad idea not to have it among your filters. I've noticed that some of the Hotmail spam I've gotten recently has changed style -- it's a short sentence with non-spammy words followed by a URL. That's certainly not much for SpamAssassin to chew on.
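As a rough illustration of the idea in this comment, here is a minimal Python sketch of a Bayesian filter whose tokens are tagged by where they appear (a header name, the body, or a URL). It is a simplified assumption, not SpamProbe's implementation and not the exact formula from Paul Graham's paper: per-token probabilities use plain add-one smoothing, and the `BayesFilter` class, `tokenize` function, and `context*word` token format are names made up for this example.

```python
import re
from collections import Counter

WORD = re.compile(r"[a-z0-9$'-]+")
URL = re.compile(r"https?://\S+")


def tokenize(headers, body):
    """Return tokens tagged with their context: header name, 'url', or 'body'."""
    tokens = []
    for name, value in headers.items():
        for word in WORD.findall(value.lower()):
            tokens.append(f"{name.lower()}*{word}")
    for url in URL.findall(body):
        tokens.append(f"url*{url.lower()}")
    body_text = URL.sub(" ", body)  # keep URL text out of the body tokens
    for word in WORD.findall(body_text.lower()):
        tokens.append(f"body*{word}")
    return tokens


class BayesFilter:
    def __init__(self):
        self.spam_counts = Counter()  # messages containing each token, per class
        self.ham_counts = Counter()
        self.n_spam = 0
        self.n_ham = 0

    def train(self, headers, body, is_spam):
        tokens = set(tokenize(headers, body))  # count each token once per message
        if is_spam:
            self.spam_counts.update(tokens)
            self.n_spam += 1
        else:
            self.ham_counts.update(tokens)
            self.n_ham += 1

    def token_prob(self, token):
        """Probability that a message containing this token is spam (add-one smoothed)."""
        p_spam = (self.spam_counts[token] + 1) / (self.n_spam + 2)
        p_ham = (self.ham_counts[token] + 1) / (self.n_ham + 2)
        return p_spam / (p_spam + p_ham)

    def spam_probability(self, headers, body):
        """Combine token probabilities as prod(p) / (prod(p) + prod(1 - p))."""
        prod_spam = 1.0
        prod_ham = 1.0
        for token in set(tokenize(headers, body)):
            p = self.token_prob(token)
            prod_spam *= p
            prod_ham *= 1 - p
        return prod_spam / (prod_spam + prod_ham)


f = BayesFilter()
f.train({"Subject": "free money"}, "click http://spam.example/win now", True)
f.train({"Subject": "free offer"}, "win money now http://spam.example/win", True)
f.train({"Subject": "meeting notes"}, "see you at lunch tomorrow", False)
f.train({"Subject": "lunch tomorrow"}, "notes attached for the meeting", False)

print(f.spam_probability({"Subject": "free money"}, "win now http://spam.example/win"))
print(f.spam_probability({"Subject": "meeting tomorrow"}, "lunch notes"))
```

Treating a URL as its own token matters for the short messages described above: even when the sentence itself contains only innocuous words, a URL that has already appeared in trained spam still pushes the combined probability up.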