Security & Privacy
Distributed comment spam prevention
3 Sep 2003
Earlier I mentioned some ideas for preventing comment spam. Thanks to a TrackBack ping, I found out that Simon Willison had been discussing the same thing yesterday. I need to read Simon more often. This is the second time that I’ve been working on something only to find out that he’s doing something similar.
Simon’s offering a blacklist of domains that are used in the spam he receives, and that gave me an idea: combine a distributed blacklist with my distributed anti-spam concept. Participating sites would send the IP address, URL, and a digest of each comment body (an MD5 hash would work) to a central server or a cloud of servers. If the server saw the same comment being posted in multiple places within a short time period, it would send a ping to all participating sites. The ping would contain the IP address and URL of the spammer. The sites would then use this information to ban further comments from that URL and IP address. Ideally the ban would be temporary to minimize the impact of false positives, but that would be up to the site’s software.
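To make the idea concrete, here is a rough sketch of both halves: the central server's duplicate detection and a participating site's temporary ban list. Everything named here is hypothetical and only for illustration: `comment_digest`, `SpamTracker`, `BanList`, the ten-minute window, the three-site threshold, and the one-hour ban are my assumptions, not part of any existing system. Normalizing whitespace and case before hashing is also an assumption. MD5 serves purely as a fingerprint here, not as a security measure.

```python
import hashlib
import time
from collections import defaultdict, deque


def comment_digest(body):
    """MD5 fingerprint of a comment body (whitespace and case normalized, by assumption)."""
    normalized = " ".join(body.split()).lower()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


class SpamTracker:
    """Hypothetical central server: flags a digest reported by several sites in a short window."""

    def __init__(self, window_seconds=600, site_threshold=3):
        self.window_seconds = window_seconds
        self.site_threshold = site_threshold
        # digest -> deque of (timestamp, reporting site, poster IP, URL in the comment)
        self._reports = defaultdict(deque)

    def report(self, site, ip, url, digest, now=None):
        """Record one report. Return the ping payload if the threshold is met, else None."""
        now = time.time() if now is None else now
        reports = self._reports[digest]
        reports.append((now, site, ip, url))
        # Forget reports that have fallen out of the time window.
        while reports and now - reports[0][0] > self.window_seconds:
            reports.popleft()
        distinct_sites = {r[1] for r in reports}
        if len(distinct_sites) < self.site_threshold:
            return None
        return {
            "ips": sorted({r[2] for r in reports}),
            "urls": sorted({r[3] for r in reports}),
        }


class BanList:
    """Hypothetical site-side ban list whose entries expire on their own."""

    def __init__(self, ban_seconds=3600):
        self.ban_seconds = ban_seconds
        self._expires = {}  # IP or URL -> time at which the ban lapses

    def apply_ping(self, ping, now=None):
        now = time.time() if now is None else now
        for key in ping["ips"] + ping["urls"]:
            self._expires[key] = now + self.ban_seconds

    def is_banned(self, key, now=None):
        now = time.time() if now is None else now
        return self._expires.get(key, 0) > now
```

A site would call `report` for each incoming comment and feed any ping it receives into `apply_ping`. Because every ban carries an expiry time, a false positive clears itself without anyone having to step in. An exact-match digest is easy for a spammer to defeat by varying the text slightly, so a real system would probably need a fuzzier fingerprint.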
Essentially, this would create an organic system that responds to wholesale comment spamming in real time. This wouldn’t solve the problem of someone posting an individual comment on a single site, but that’s not really the way spammers work. For spam to be effective, it needs enormous volume. And the only way to have that sort of posting volume is to automate it.